[GitHub] spark pull request: [SPARK-4072][Core]Display Streaming blocks in ...

2015-06-24 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6672#issuecomment-114882358 If we decide not to add `SparkListenerBlockUpdated` to `EventLoggingListener`, I will revert the changes to JsonProtocol. --- If your project is set up for it, you can

[GitHub] spark pull request: [SPARK-4072][Core]Display Streaming blocks in ...

2015-06-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6672#discussion_r33147257 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockUpdatedInfo.scala --- @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-4072][Core]Display Streaming blocks in ...

2015-06-24 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6672#issuecomment-114872773 I've noticed that you're not persisting the SparkListenerBlockUpdated events in the event logging listener. Since there might be tons of these events, I can understand

[GitHub] spark pull request: [SPARK-4072][Core]Display Streaming blocks in ...

2015-06-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6672#discussion_r33150039 --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala --- @@ -848,6 +908,60 @@ private[spark] object JsonProtocol { new ExecutorInfo

[GitHub] spark pull request: [SPARK-4072][Core]Display Streaming blocks in ...

2015-06-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6672#discussion_r33150980 --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala --- @@ -407,6 +414,52 @@ private[spark] object JsonProtocol { (Log Urls

[GitHub] spark pull request: [SPARK-4072][Core]Display Streaming blocks in ...

2015-06-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6672#discussion_r33155599 --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala --- @@ -848,6 +908,60 @@ private[spark] object JsonProtocol { new ExecutorInfo

[GitHub] spark pull request: [SPARK-6980] [CORE] Akka timeout exceptions in...

2015-06-24 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6205#issuecomment-114884212 retest this please

[GitHub] spark pull request: [SPARK-4072][Core]Display Streaming blocks in ...

2015-06-24 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6672#issuecomment-115108078 OK. Let's not log SparkListenerBlockUpdated. I have reverted JsonProtocol.

[GitHub] spark pull request: [SPARK-6980] [CORE] Akka timeout exceptions in...

2015-06-24 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6205#issuecomment-115115181 "Is this something that we need to prevent from accidentally happening again?" I encountered this issue. The Scala compiler will try to pack all parameters

[GitHub] spark pull request: [SPARK-6980] [CORE] Akka timeout exceptions in...

2015-06-25 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6205#issuecomment-115118058 To fix this line, can we put the 600 seconds in a conf property? The other option for a RpcTimeout would look like this BlockManagerHeartbeat(blockManagerId

[GitHub] spark pull request: [SPARK-4072][Core]Display Streaming blocks in ...

2015-06-25 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6672#issuecomment-115152867 retest this please

[GitHub] spark pull request: [SPARK-6980] [CORE] Akka timeout exceptions in...

2015-06-25 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6205#issuecomment-115119854 As per the comment here: https://github.com/scala/scala/blob/2dcc4c42fdcefee08add9dbdcf619ab5da745674/src/compiler/scala/tools/nsc/typechecker/Typers.scala#L3329

[GitHub] spark pull request: [SPARK-8376][Docs]Add common lang3 to the Spar...

2015-06-18 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6829#issuecomment-113339743 @srowen do you have an example of publishing both the single jar and the assembly jar? Two approaches I'm thinking about: 1. Upload the assembly jar

[GitHub] spark pull request: [SPARK-8376][Docs]Add common lang3 to the Spar...

2015-06-18 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6829#issuecomment-113348608 Publishing the single jar would be helpful if people find some dependency conflicts or want to upgrade the version of a dependency library, and want to resolve

[GitHub] spark pull request: [SPARK-8378][Streaming]Add the Python API for ...

2015-06-19 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6830#discussion_r32834064 --- Diff: docs/streaming-flume-integration.md --- @@ -129,6 +138,12 @@ configuring Flume agents

[GitHub] spark pull request: [SPARK-8378][Streaming]Add the Python API for ...

2015-06-19 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6830#issuecomment-113532044 Added the Python unit tests. I refactored the Flume unit tests and extracted the code shared by the Scala and Python unit tests into FlumeTestUtils

[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-22 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6883#issuecomment-114354847 @rxin I remember you said you would like such an improvement to be added to Spark SQL rather than Spark Core. What are your thoughts on this one?

[GitHub] spark pull request: [SPARK-8376][Docs]Add common lang3 to the Spar...

2015-06-22 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6829#issuecomment-114356316 I agree with @srowen that Maven isn't really the right place to distribute assemblies. For the assembly jars, all we need to do is provide download links

[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-23 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6810#discussion_r33015113 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala --- @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-4072][Core]Display Streaming blocks in ...

2015-06-22 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6672#issuecomment-114108458 retest this please

[GitHub] spark pull request: [SPARK-8399][Streaming][Web UI] Overlap betwee...

2015-06-23 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6845#issuecomment-114423828 I just tried this PR. There is still one problem: the graphs in the inner table do not align with the outer ones. ![issue](https

[GitHub] spark pull request: [SPARK-8378][Streaming]Add the Python API for ...

2015-06-23 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6830#issuecomment-114428019 @JoshRosen could you review the last commit?

[GitHub] spark pull request: [SPARK-8378][Streaming]Add the Python API for ...

2015-06-18 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6830#discussion_r32791523 --- Diff: external/flume-assembly/pom.xml --- @@ -0,0 +1,134 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- + ~ Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-8378][Streaming]Add the Python API for ...

2015-06-18 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6830#discussion_r32791712 --- Diff: external/flume/src/main/scala/org/apache/spark/streaming/flume/FlumeUtils.scala --- @@ -236,3 +242,71 @@ object FlumeUtils

[GitHub] spark pull request: [SPARK-8378][Streaming]Add the Python API for ...

2015-06-18 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6830#issuecomment-113323255 @zsxwing This is pretty good to me, but it's not obvious that this will work. Can you add Flume python tests? See how the KafkaTestUtils (present in src not test

[GitHub] spark pull request: [SPARK-8399][Streaming][Web UI] Overlap betwee...

2015-06-23 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6845#issuecomment-114683528 LGTM

[GitHub] spark pull request: [SPARK-6980] [CORE] Akka timeout exceptions in...

2015-06-23 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6205#issuecomment-114686364 Oops... Forgot to make `RpcUtils` private.

[GitHub] spark pull request: [SPARK-6980] [CORE] Akka timeout exceptions in...

2015-06-23 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6205#discussion_r33110081 --- Diff: core/src/main/scala/org/apache/spark/rpc/RpcEnv.scala --- @@ -182,3 +184,109 @@ private[spark] object RpcAddress { RpcAddress(host, port

[GitHub] spark pull request: [SPARK-6980] [CORE] Akka timeout exceptions in...

2015-06-23 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6205#discussion_r33110309 --- Diff: core/src/main/scala/org/apache/spark/rpc/RpcEnv.scala --- @@ -182,3 +184,109 @@ private[spark] object RpcAddress { RpcAddress(host, port

[GitHub] spark pull request: [SPARK-6980] [CORE] Akka timeout exceptions in...

2015-06-23 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6205#discussion_r33110444 --- Diff: core/src/test/scala/org/apache/spark/rpc/akka/AkkaRpcEnvSuite.scala --- @@ -17,8 +17,18 @@ package org.apache.spark.rpc.akka

[GitHub] spark pull request: [SPARK-6980] [CORE] Akka timeout exceptions in...

2015-06-23 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6205#discussion_r33110433 --- Diff: core/src/test/scala/org/apache/spark/rpc/akka/AkkaRpcEnvSuite.scala --- @@ -47,4 +57,71 @@ class AkkaRpcEnvSuite extends RpcEnvSuite

[GitHub] spark pull request: [SPARK-6980] [CORE] Akka timeout exceptions in...

2015-06-23 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6205#discussion_r33110588 --- Diff: core/src/test/scala/org/apache/spark/rpc/RpcEnvSuite.scala --- @@ -539,6 +546,37 @@ abstract class RpcEnvSuite extends SparkFunSuite

[GitHub] spark pull request: [SPARK-6980] [CORE] Akka timeout exceptions in...

2015-06-23 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6205#discussion_r33110546 --- Diff: core/src/test/scala/org/apache/spark/rpc/RpcEnvSuite.scala --- @@ -539,6 +546,37 @@ abstract class RpcEnvSuite extends SparkFunSuite

[GitHub] spark pull request: [SPARK-7799][Streaming]Add streaming-akka pr...

2015-06-16 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/6841 [SPARK-7799][Streaming]Add streaming-akka project This PR includes the following changes: 1. Add streaming-akka project and org.apache.spark.streaming.akka.AkkaUtils for creating

[GitHub] spark pull request: [SPARK-8373][PySpark]Add emptyRDD to pyspark a...

2015-06-15 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6826#issuecomment-112020029 retest this please

[GitHub] spark pull request: [SPARK-8376][Docs]Add common lang3 to the Spar...

2015-06-15 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/6829 [SPARK-8376][Docs]Add common lang3 to the Spark Flume Sink doc Commons Lang 3 has been added as one of the dependencies of Spark Flume Sink since #5703. This PR updates the doc for it. You can

[GitHub] spark pull request: [SPARK-8373][PySpark]Add emptyRDD to pyspark a...

2015-06-15 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6826#discussion_r32416752 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala --- @@ -425,6 +425,11 @@ private[spark] object PythonRDD extends Logging

[GitHub] spark pull request: [SPARK-8376][Docs]Add common lang3 to the Spar...

2015-06-15 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6829#issuecomment-112063658 "This is built into the assembly though, right?" No. Spark Flume Sink does not assemble the dependencies. Actually, now we don't have an assembly jar for Flume

[GitHub] spark pull request: [SPARK-8373][PySpark]Add emptyRDD to pyspark a...

2015-06-15 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6826#discussion_r32416715 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala --- @@ -425,6 +425,11 @@ private[spark] object PythonRDD extends Logging

[GitHub] spark pull request: [SPARK-8376][Docs]Add common lang3 to the Spar...

2015-06-15 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6829#issuecomment-112071017 OK but surely it's easier to make an assembly target than tell people they have to piece together the dependencies and keep updating docs about it? ping @tdas

[GitHub] spark pull request: [SPARK-8378][Streaming]Add the Python API for ...

2015-06-15 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/6830 [SPARK-8378][Streaming]Add the Python API for Flume You can merge this pull request into a Git repository by running: $ git pull https://github.com/zsxwing/spark flume-python Alternatively

[GitHub] spark pull request: [SPARK-8373][PySpark]Add emptyRDD to pyspark a...

2015-06-15 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6826#issuecomment-112093726 /cc @davies

[GitHub] spark pull request: [SPARK-8378][Streaming]Add the Python API for ...

2015-06-15 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6830#discussion_r32428495 --- Diff: external/flume-assembly/pom.xml --- @@ -0,0 +1,134 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- + ~ Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-6939][Streaming][WebUI] Add timeline an...

2015-06-15 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/5533#issuecomment-112098723 @hotienvu could you open a JIRA and describe the problem in more detail, such as the error information and the invalid link?

[GitHub] spark pull request: [SPARK-7961][SQL]Refactor SQLConf to display b...

2015-06-12 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6747#issuecomment-111403711 "The schema for set commands is wrong (we need some column name)" Added schemas for all set commands and split results into columns. For the configuration

[GitHub] spark pull request: [SPARK-7961][SQL]Refactor SQLConf to display b...

2015-06-12 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6747#issuecomment-111409215 It seems a bunch of boilerplate could be removed and we could get some more type safety by making the conf entries inner classes of SQLConf with methods like .value

[GitHub] spark pull request: [SPARK-7961][SQL]Refactor SQLConf to display b...

2015-06-12 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6747#discussion_r32294601 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala --- @@ -25,74 +25,281 @@ import scala.collection.JavaConversions._ import

[GitHub] spark pull request: [SPARK-7961][SQL]Refactor SQLConf to display b...

2015-06-12 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6747#discussion_r32294623 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala --- @@ -25,74 +25,281 @@ import scala.collection.JavaConversions._ import

[GitHub] spark pull request: [SPARK-7961][SQL]Refactor SQLConf to display b...

2015-06-12 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6747#discussion_r32294635 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala --- @@ -25,74 +25,281 @@ import scala.collection.JavaConversions._ import

[GitHub] spark pull request: [SPARK-7961][SQL]Refactor SQLConf to display b...

2015-06-12 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6747#discussion_r32294655 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala --- @@ -219,74 +422,78 @@ private[sql] class SQLConf extends Serializable

[GitHub] spark pull request: [SPARK-7961][SQL]Refactor SQLConf to display b...

2015-06-12 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6747#discussion_r32294674 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala --- @@ -25,74 +25,281 @@ import scala.collection.JavaConversions._ import

[GitHub] spark pull request: [SPARK-8080][STREAMING] Receiver.store with It...

2015-06-12 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6707#issuecomment-111389446 This PR looks good to me. I feel a bit weird about counting ByteBufferBlock as 1, but I cannot find a better solution.

[GitHub] spark pull request: [SPARK-7993] [SQL] Improved DataFrame.show() o...

2015-06-12 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/6784 [SPARK-7993] [SQL] Improved DataFrame.show() output Closes #6633 You can merge this pull request into a Git repository by running: $ git pull https://github.com/zsxwing/spark pr6633

[GitHub] spark pull request: [SPARK-8309] [CORE] Support for more than 12M ...

2015-06-13 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6763#issuecomment-111709951 @SlavikBaranov Could you check how `BytesToBytesMap.putNewKey` grows the capacity? I think you can use a similar approach to increase the max capacity from `0.7 * (1

[GitHub] spark pull request: [SPARK-8309] [CORE] Support for more than 12M ...

2015-06-13 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6763#discussion_r32370463 --- Diff: core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala --- @@ -278,7 +279,7 @@ object OpenHashSet { val INVALID_POS

[GitHub] spark pull request: [SPARK-8373][PySpark] Remove PythonRDD.emptyRD...

2015-06-17 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/6867 [SPARK-8373][PySpark] Remove PythonRDD.emptyRDD This is a follow-up PR to remove unused `PythonRDD.emptyRDD` added by #6826 You can merge this pull request into a Git repository by running

[GitHub] spark pull request: [SPARK-8373][PySpark]Add emptyRDD to pyspark a...

2015-06-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6826#discussion_r32694794 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala --- @@ -425,6 +425,11 @@ private[spark] object PythonRDD extends Logging

[GitHub] spark pull request: [SPARK-8404][Streaming][Tests] Use thread-safe...

2015-06-18 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6852#issuecomment-113063763 @srowen Sorry. The issue I mentioned in https://github.com/apache/spark/pull/6852#issuecomment-112722654 won't happen. I put ~~strikethrough~~ there.

[GitHub] spark pull request: [SPARK-7913][Core]Increase the maximum capacit...

2015-06-18 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6456#issuecomment-113071321 @srowen I will make `AppendOnlyMap` consistent with OpenHashMap later. For `2^29`, because unlike OpenHashMap, AppendOnlyMap uses a single array to store both

[GitHub] spark pull request: [SPARK-8434][SQL]Add a pretty parameter to s...

2015-06-18 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6877#issuecomment-113079543 cc @rxin

[GitHub] spark pull request: [SPARK-8434][SQL]Add a pretty parameter to s...

2015-06-18 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/6877 [SPARK-8434][SQL]Add a pretty parameter to show to display long strings Sometimes the user may want to show the complete content of cells. Now `sql("set -v").show()` displays: ![screen shot

[GitHub] spark pull request: [SPARK-8378][Streaming]Add the Python API for ...

2015-06-18 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6830#issuecomment-113081831 I just followed the kafka-assembly module. Is it easy to make Maven publish the assembly jar?

[GitHub] spark pull request: [SPARK-7913][Core]Make AppendOnlyMap use the s...

2015-06-18 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/6879 [SPARK-7913][Core]Make AppendOnlyMap use the same growth strategy of OpenHashSet and consistent exception message This is a follow up PR for #6456 to make AppendOnlyMap consistent with OpenHashSet

[GitHub] spark pull request: [SPARK-7913][Core]Make AppendOnlyMap use the s...

2015-06-18 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6879#issuecomment-113105185 retest this please

[GitHub] spark pull request: [SPARK-8404][Streaming][Tests] Use thread-safe...

2015-06-17 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/6852 [SPARK-8404][Streaming][Tests] Use thread-safe collections to make the tests more reliable KafkaStreamSuite, DirectKafkaStreamSuite, JavaKafkaStreamSuite and JavaDirectKafkaStreamSuite use non

[GitHub] spark pull request: [SPARK-8161] Set externalBlockStoreInitialized...

2015-06-17 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6702#issuecomment-112668213 @shimingfei I think people are busy with Spark Summit.

[GitHub] spark pull request: [SPARK-8404][Streaming][Tests] Use thread-safe...

2015-06-17 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6852#issuecomment-112707759 "In all cases foreachRDD executes serially on the driver; what's the thread safety issue?" The code in foreachRDD runs in `JobScheduler.jobExecutor

[GitHub] spark pull request: [SPARK-8404][Streaming][Tests] Use thread-safe...

2015-06-17 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6852#issuecomment-112722654 "There are certainly some memory barriers in between the writes and reads without this." For these tests, there is no memory barrier because the checking code
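
The visibility concern in this thread can be illustrated with a minimal Java sketch. It is not the code from the PR: the class name `ThreadSafeCollectSketch`, the method `collectAndPoll`, and the single writer thread standing in for the job executor are all hypothetical. It only shows the general pattern of one thread appending results to a thread-safe collection while a checking thread polls it, so that the reader does not depend on an incidental memory barrier to see the writes.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class ThreadSafeCollectSketch {

    // Hypothetical helper: a writer thread (standing in for a job executor)
    // appends `count` records while the calling thread polls the collection
    // until it has seen all of them or the timeout expires.
    // Returns the number of records the polling thread observed.
    static int collectAndPoll(int count, long timeoutMillis) {
        // ConcurrentLinkedQueue is thread-safe: an element added by the writer
        // is guaranteed to become visible to the reader that later accesses it.
        Queue<Integer> results = new ConcurrentLinkedQueue<>();

        Thread writer = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                results.add(i);
            }
        });
        writer.start();

        long deadline = System.currentTimeMillis() + timeoutMillis;
        // Poll instead of joining the writer, similar to a test that checks
        // results with a retry loop.
        while (results.size() < count && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return results.size();
    }

    public static void main(String[] args) {
        System.out.println(collectAndPoll(1000, 10_000));
    }
}
```

With a plain `ArrayList` or `HashMap` in place of the queue, the same polling loop would have no visibility guarantee under the Java memory model, which is the risk being discussed here.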

[GitHub] spark pull request: [SPARK-8404][Streaming][Tests] Use thread-safe...

2015-06-17 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6852#issuecomment-112751779 Whether the writes are visible is what I'm concerned about.

[GitHub] spark pull request: [SPARK-8404][Streaming][Tests] Use thread-safe...

2015-06-17 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6852#issuecomment-112727686 retest this please

[GitHub] spark pull request: [SPARK-8404][Streaming][Tests] Use thread-safe...

2015-06-17 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6852#issuecomment-112700126 cc @tdas

[GitHub] spark pull request: [SPARK-7993] [SQL] Improved DataFrame.show() o...

2015-06-12 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6784#issuecomment-111492113 /cc @rxin

[GitHub] spark pull request: [SPARK-7913][Core]Increase the maximum capacit...

2015-06-10 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6456#issuecomment-110996193 ping @andrewor14

[GitHub] spark pull request: [SPARK-7961][SQL]Refactor SQLConf to display b...

2015-06-10 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6747#issuecomment-110789261 A simple example of the `conf` command: ```Scala scala> sqlContext.sql("conf").collect().foreach(v => println(v.toSeq.mkString("\t

[GitHub] spark pull request: [SPARK-7961][SQL]Refactor SQLConf to display b...

2015-06-11 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6747#discussion_r32195751 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala --- @@ -25,74 +25,283 @@ import scala.collection.JavaConversions._ import

[GitHub] spark pull request: [SPARK-7961][SQL]Refactor SQLConf to display b...

2015-06-11 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6747#discussion_r32196679 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSQLParser.scala --- @@ -57,6 +57,7 @@ private[sql] class SparkSQLParser(fallback: String

[GitHub] spark pull request: [SPARK-7042] [BUILD] use the standard akka art...

2015-06-11 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6492#issuecomment-111013435 3.2.5 has the same issue, too ``` mvn -Phadoop-1 dependency:tree | grep akka [INFO] +- org.spark-project.akka:akka-remote_2.10:jar:2.3.4-spark:compile

[GitHub] spark pull request: [SPARK-7961][SQL]Refactor SQLConf to display b...

2015-06-11 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6747#issuecomment-111312728 Addressed all comments

[GitHub] spark pull request: [SPARK-8373][PySpark]Add emptyRDD to pyspark a...

2015-06-15 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/6826 [SPARK-8373][PySpark]Add emptyRDD to pyspark and fix the issue when calling sum on an empty RDD This PR fixes the sum issue and also adds `emptyRDD` so that it's easy to create a test case. You

[GitHub] spark pull request: [SPARK-8309] [CORE] Support for more than 12M ...

2015-06-15 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6763#discussion_r32396635 --- Diff: core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala --- @@ -278,7 +279,7 @@ object OpenHashSet { val INVALID_POS

[GitHub] spark pull request: [SPARK-8309] [CORE] Support for more than 12M ...

2015-06-15 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6763#issuecomment-111965952 I agree with the concern about the worst-case scenario. Maybe the error message should be improved. `Can't make capacity bigger than 2^30 elements` will be confusing

[GitHub] spark pull request: [SPARK-7913][Core]Increase the maximum capacit...

2015-05-28 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6456#discussion_r31286191 --- Diff: core/src/main/scala/org/apache/spark/util/collection/PartitionedSerializedPairBuffer.scala --- @@ -89,11 +93,17 @@ private[spark] class

[GitHub] spark pull request: [SPARK-7931][STREAMING] Do not restart receive...

2015-05-28 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6483#issuecomment-106647895 LGTM except the minor comment.

[GitHub] spark pull request: [SPARK-7931][STREAMING] Do not restart receive...

2015-05-28 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6483#discussion_r31292922 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/SocketInputDStream.scala --- @@ -74,13 +76,16 @@ class SocketReceiver[T: ClassTag

[GitHub] spark pull request: [SPARK-7855] Move bypassMergeSort-handling fro...

2015-05-29 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6397#discussion_r31319829 --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java --- @@ -0,0 +1,184 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7855] Move bypassMergeSort-handling fro...

2015-05-29 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6397#issuecomment-106800872 LGTM except the `partitioner.getPartition(key)` issue.

[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...

2015-05-31 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/2765#issuecomment-107278995 @wangxiaojing could you update this PR? It conflicts with master.

[GitHub] spark pull request: [SPARK-7989][Core][Tests] Fix flaky tests in E...

2015-05-31 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/6546 [SPARK-7989][Core][Tests] Fix flaky tests in ExternalShuffleServiceSuite and SparkListenerWithClusterSuite The flaky tests in ExternalShuffleServiceSuite and SparkListenerWithClusterSuite

[GitHub] spark pull request: [SQL] [TEST] [MINOR] Follow-up of PR #6493, us...

2015-06-01 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6547#discussion_r31402480 --- Diff: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala --- @@ -29,6 +27,8 @@ import

[GitHub] spark pull request: [SPARK-7989][Core][Tests] Fix flaky tests in E...

2015-06-01 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6546#issuecomment-107421913 retest this please

[GitHub] spark pull request: [SPARK-7989][Core][Tests] Fix flaky tests in E...

2015-06-01 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6546#issuecomment-107381194 @srowen These two failures happened in my PR #6457. I fixed them in #6457 to pass Jenkins. But I think they are not a part of #6457 and it's better to fix them

[GitHub] spark pull request: [SPARK-8001][Core]Make AsynchronousListenerBus...

2015-06-01 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6550#issuecomment-107345032 @andrewor14 could you take a look?

[GitHub] spark pull request: [SPARK-8001][Core]Make AsynchronousListenerBus...

2015-06-01 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/6550 [SPARK-8001][Core]Make AsynchronousListenerBus.waitUntilEmpty throw TimeoutException if timeout Some places forget to call `assert` to check the return value

[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...

2015-06-01 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r31428558 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -183,8 +189,46 @@ class FileInputDStream[K, V, F

[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...

2015-06-01 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r31429608 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -223,6 +266,11 @@ class FileInputDStream[K, V, F

[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...

2015-06-01 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r31427177 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -183,8 +189,46 @@ class FileInputDStream[K, V, F

[GitHub] spark pull request: [SQL] [TEST] [MINOR] Follow-up of PR #6493, us...

2015-06-01 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6547#discussion_r31478931 --- Diff: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala --- @@ -29,6 +27,8 @@ import

[GitHub] spark pull request: [SPARK-7989][Core][Tests] Fix flaky tests in E...

2015-06-01 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6546#issuecomment-107748117 If this is just for testing, how about we put it into a SparkListener in the tests instead, and avoid changing JobProgressListener? You could put it into a common

[GitHub] spark pull request: [SPARK-7958] [STREAMING] Handled exception in ...

2015-06-01 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/6559#discussion_r31482885 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobScheduler.scala --- @@ -126,6 +126,8 @@ class JobScheduler(val ssc

[GitHub] spark pull request: [SPARK-7958] [STREAMING] Handled exception in ...

2015-06-01 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/6559#issuecomment-107751303 LGTM
