[GitHub] spark issue #21721: [SPARK-24748][SS] Support for reporting custom metrics v...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21721 **[Test build #94097 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94097/testReport)** for PR 21721 at commit [`1775c2a`](https://github.com/apache/spark/commit/1775c2a1db2bf790ddf1cad0113c7ead2409ba65). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21889 **[Test build #94101 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94101/testReport)** for PR 21889 at commit [`37e0a97`](https://github.com/apache/spark/commit/37e0a97c32f28006e9af1143549cbdae5319df49). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21915: [SPARK-24954][Core] Fail fast on job submit if run a bar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21915 **[Test build #94090 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94090/testReport)** for PR 21915 at commit [`0796f76`](https://github.com/apache/spark/commit/0796f760c60da9bb8b5cadeee2e751dd898cf8cf). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21957: [SPARK-24994][SQL] When the data type of the field is co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21957 **[Test build #94094 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94094/testReport)** for PR 21957 at commit [`24c061f`](https://github.com/apache/spark/commit/24c061fbf2e5c894729443171e16cbadfc004db3). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21586: [SPARK-24586][SQL] Upcast should not allow casting from ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21586 **[Test build #94103 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94103/testReport)** for PR 21586 at commit [`c89d12e`](https://github.com/apache/spark/commit/c89d12e7b32987cbe4a081fc417fb38022061cc5). * This patch **fails due to an unknown error code, -9**. * This patch **does not merge cleanly**. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint for SQL...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21911 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21933: [SPARK-24917][CORE] make chunk size configurable
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21933 **[Test build #94098 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94098/testReport)** for PR 21933 at commit [`0251bd5`](https://github.com/apache/spark/commit/0251bd517e7fd3e695cb8366ffa03de8c9e2900b). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21948: [SPARK-24991][SQL] use InternalRow in DataSourceWriter
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21948 **[Test build #94106 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94106/testReport)** for PR 21948 at commit [`86817c7`](https://github.com/apache/spark/commit/86817c7ee36f1344e977bb5af14aeb56232c17d5). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class MemoryWriterCommitMessage(partition: Int, data: Seq[Row])` * `case class MemoryWriterFactory(outputMode: OutputMode, schema: StructType)` * `class MemoryDataWriter(partition: Int, outputMode: OutputMode, schema: StructType)` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20345: [SPARK-23172][SQL] Expand the ReorderJoin rule to handle...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20345 **[Test build #94093 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94093/testReport)** for PR 20345 at commit [`39462fb`](https://github.com/apache/spark/commit/39462fbee952ec574b4c04d7718fd73bb5f56d9d). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21975: [WIP][SPARK-25001][BUILD] Fix miscellaneous build warnin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21975 **[Test build #94096 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94096/testReport)** for PR 21975 at commit [`2354e10`](https://github.com/apache/spark/commit/2354e10bd82f4770fc58feb5ac2738dd0dd39070). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `public abstract class AbstractLauncher> ` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint for SQL...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21911 **[Test build #94108 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94108/testReport)** for PR 21911 at commit [`739aeb4`](https://github.com/apache/spark/commit/739aeb44e9b9bb15b74271e2b42fb3dfe6f1c8fe). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21966: [SPARK-23915][SQL][followup] Add array_except function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21966 **[Test build #94099 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94099/testReport)** for PR 21966 at commit [`16b9949`](https://github.com/apache/spark/commit/16b9949285c7133b89b3e6624cd8f5684abd3df5). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21965: [SPARK-23909][SQL] Add filter function.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21965 **[Test build #94107 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94107/testReport)** for PR 21965 at commit [`ace19dd`](https://github.com/apache/spark/commit/ace19dd7230598350838aa60fc93b32a08642acd). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class ArrayFilter(` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21955: [SPARK-18057][FOLLOW-UP][SS] Update Kafka client version...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21955 **[Test build #94086 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94086/testReport)** for PR 21955 at commit [`6155eb8`](https://github.com/apache/spark/commit/6155eb8f2692e258e07767c5487b2f75c587e21a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21981 Merged to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21981 I am not sure. ML is not my area, but I am pretty sure the people you know are basically the people I know. Ping me if a PR is minor or trivial like this one; I can review and merge.
[GitHub] spark issue #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint for SQL...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21911 **[Test build #94108 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94108/testReport)** for PR 21911 at commit [`739aeb4`](https://github.com/apache/spark/commit/739aeb44e9b9bb15b74271e2b42fb3dfe6f1c8fe). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21981 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21981 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94105/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21981 **[Test build #94105 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94105/testReport)** for PR 21981 at commit [`c571279`](https://github.com/apache/spark/commit/c571279a2885544cd9b565e7370984520fe92176). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21952: [SPARK-24993] [SQL] Make Avro Fast Again
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21952 The regression happens when writing. It looks like we don't use `df.count` when benchmarking the write time?
[GitHub] spark issue #21965: [SPARK-23909][SQL] Add filter function.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21965 **[Test build #94107 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94107/testReport)** for PR 21965 at commit [`ace19dd`](https://github.com/apache/spark/commit/ace19dd7230598350838aa60fc93b32a08642acd). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21965: [SPARK-23909][SQL] Add filter function.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21965 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21965: [SPARK-23909][SQL] Add filter function.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21965 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1737/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21952: [SPARK-24993] [SQL] Make Avro Fast Again
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21952 I noticed that the benchmark uses `df.count`. Is it possible that column pruning has some issues in master?
[GitHub] spark issue #21965: [SPARK-23909][SQL] Add filter function.
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/21965 Jenkins, retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21948: [SPARK-24991][SQL] use InternalRow in DataSourceWriter
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21948 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21948: [SPARK-24991][SQL] use InternalRow in DataSourceWriter
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21948 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1736/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21948: [SPARK-24991][SQL] use InternalRow in DataSourceWriter
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21948 **[Test build #94106 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94106/testReport)** for PR 21948 at commit [`86817c7`](https://github.com/apache/spark/commit/86817c7ee36f1344e977bb5af14aeb56232c17d5). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21948: [SPARK-24991][SQL] use InternalRow in DataSourceWriter
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21948 @rdblue I have documented the object reuse behavior and asked data sources to handle it. Please take a look, thanks!
[GitHub] spark issue #21898: [SPARK-24817][Core] Implement BarrierTaskContext.barrier...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21898

> Also, if there shouldn't exist two active attempts at the same time for a barrier stage, maybe we should store attemptId as a state variable. Basically, if we see a new attempt ID, we should abort the old attempts.

Actually, I'm not sure we can guarantee that. Killing tasks may take some time, so it's always possible that a new stage attempt is launched and a zombie task that hasn't been killed yet then sends a barrier sync message.
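One way to picture the zombie-task concern is a coordinator that remembers the newest attempt ID it has seen per stage and refuses sync requests from older attempts. The following is only a minimal sketch of that idea, not code from the PR: `AttemptTracker` and `accept` are hypothetical names, and it assumes stage attempt IDs only ever increase.

```scala
import scala.collection.mutable

// Hypothetical helper, not part of the PR: tracks the newest attempt per stage.
class AttemptTracker {
  // stageId -> highest stageAttemptId seen so far
  private val latestAttempt = mutable.Map.empty[Int, Int]

  /** Returns true if the request comes from the newest known attempt of the stage. */
  def accept(stageId: Int, stageAttemptId: Int): Boolean = synchronized {
    val latest = latestAttempt.getOrElse(stageId, -1)
    if (stageAttemptId > latest) {
      // A newer attempt appeared; older attempts become stale from now on.
      latestAttempt(stageId) = stageAttemptId
      true
    } else {
      // Same attempt is fine; anything older is treated as a zombie.
      stageAttemptId == latest
    }
  }
}
```

With this shape, a late message from attempt 0 after attempt 1 has registered is simply rejected, regardless of how long the kill takes to reach the zombie task.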
[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/21981 BTW, @HyukjinKwon, do you know who's still reviewing the ML PRs? I have a few old PRs and I really want to know which ones are considered meaningful.
[GitHub] spark issue #21979: [SPARK-25009][CORE]Standalone Cluster mode application s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21979 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21895: [SPARK-24948][SHS] Delegate check access permissions to ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21895 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94084/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21979: [SPARK-25009][CORE]Standalone Cluster mode application s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21979 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94082/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21895: [SPARK-24948][SHS] Delegate check access permissions to ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21895 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21979: [SPARK-25009][CORE]Standalone Cluster mode application s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21979 **[Test build #94082 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94082/testReport)** for PR 21979 at commit [`e753ff8`](https://github.com/apache/spark/commit/e753ff8a4be5b1b08dc2165d04fd3af46cfcc546). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21895: [SPARK-24948][SHS] Delegate check access permissions to ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21895 **[Test build #94084 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94084/testReport)** for PR 21895 at commit [`c620fff`](https://github.com/apache/spark/commit/c620fff90d20ba1b62e1277317754d5f14567f79). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21952: [SPARK-24993] [SQL] Make Avro Fast Again
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21952 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94104/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21952: [SPARK-24993] [SQL] Make Avro Fast Again
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21952 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21898: [SPARK-24817][Core] Implement BarrierTaskContext.barrier...
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/21898 Also, if there shouldn't exist two active attempts at the same time for a barrier stage, maybe we should store attemptId as a state variable. Basically, if we see a new attempt ID, we should abort the old attempts. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21952: [SPARK-24993] [SQL] Make Avro Fast Again
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21952 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21952: [SPARK-24993] [SQL] Make Avro Fast Again
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21952 **[Test build #94104 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94104/testReport)** for PR 21952 at commit [`ec17d58`](https://github.com/apache/spark/commit/ec17d58ea674ffba6e2c07284a26f6b3a1e7357e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/21981 Thanks for the review @HyukjinKwon. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21898: [SPARK-24817][Core] Implement BarrierTaskContext.barrier...
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/21898

Here is what I mean:

~~~scala
case class ContextBarrierId(stageId: Int, stageAttemptId: Int)

class ContextBarrierState(val numTasks: Int) {
  private var epoch: Int = 0
  private val requesters: ArrayBuffer[RpcCallContext] = ...
  private val timerTask: TimerTask = new TimerTask { ... }

  def handleRequest(requester: RpcCallContext, barrierEpoch: Int): Unit = synchronized {
    // start timer if this is the first
    // throw exception
    // always check if requests = numTasks and if yes reply all, clean requesters, and increment counter
  }

  private def startTimer(): Unit

  def clear(): Unit = synchronized {
    // set epoch to -1
    // clear requesters
    // cancel timer if active
  }
}

val states = new ConcurrentHashMap[ContextBarrierId, ContextBarrierState]
...
case RequestToSync( ) =>
  val id = ContextBarrierId(...)
  states.putIfAbsent(id, new ContextBarrierState(numTasks))
  val state = states.get(id)
  state.handleRequest(context, barrierEpoch)
...
def onStop() {
  states.forEachValue(_.clear())
  states.clear()
}
...
~~~
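The per-barrier state outlined above can be tried out without Spark. What follows is a rough, self-contained sketch under stated assumptions, not the PR's implementation: `Requester` is a hypothetical stand-in for `RpcCallContext`, the timer is left out entirely, and treating an epoch mismatch as the failure case is a guess at what "throw exception" refers to.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for RpcCallContext; only what this sketch needs.
trait Requester {
  def reply(epoch: Int): Unit
  def fail(reason: String): Unit
}

// Sketch of the per-barrier state; the timer from the outline is omitted.
class BarrierStateSketch(val numTasks: Int) {
  private var epoch: Int = 0
  private val requesters = ArrayBuffer.empty[Requester]

  def handleRequest(requester: Requester, barrierEpoch: Int): Unit = synchronized {
    if (barrierEpoch != epoch) {
      // Assumption: a request for a different epoch is an error.
      requester.fail(s"expected epoch $epoch but got $barrierEpoch")
    } else {
      requesters += requester
      if (requesters.size == numTasks) {
        // Every task has arrived: reply to all, reset, and move to the next epoch.
        val finished = epoch
        requesters.foreach(_.reply(finished))
        requesters.clear()
        epoch += 1
      }
    }
  }

  def currentEpoch: Int = synchronized(epoch)

  def clear(): Unit = synchronized {
    epoch = -1
    requesters.clear()
  }
}
```

Keeping all mutation inside `synchronized` methods on the state object means the surrounding map only has to hand out one state per barrier ID; the state itself serializes concurrent requests.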
[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21981 **[Test build #94105 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94105/testReport)** for PR 21981 at commit [`c571279`](https://github.com/apache/spark/commit/c571279a2885544cd9b565e7370984520fe92176). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21981 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1735/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21981 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21952: [SPARK-24993] [SQL] Make Avro Fast Again
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21952

Ah, finally I can reproduce this. Reproducing it requires allocating the array feature with length 16000; when I reduced it to 1600, the regression was largely relieved.

`com.databricks.spark.avro` is faster only on Spark 2.3. When used with the current master branch, it isn't faster than the built-in avro datasource. Maybe something in master causes this regression.

```scala
> "com.databricks.spark.avro - Spark 2.3"

scala> spark.sparkContext.parallelize(writeTimes.slice(50, 150)).toDF("writeTimes").describe("writeTimes").show()
+-------+-------------------+
|summary|         writeTimes|
+-------+-------------------+
|  count|                100|
|   mean|         0.97110999|
| stddev|0.01940836797556013|
|    min|              0.941|
|    max|              1.037|
+-------+-------------------+

scala> spark.sparkContext.parallelize(readTimes.slice(50, 150)).toDF("readTimes").describe("readTimes").show()
+-------+-------------------+
|summary|          readTimes|
+-------+-------------------+
|  count|                100|
|   mean|            0.36022|
| stddev|0.05807476546520342|
|    min|              0.287|
|    max|              0.626|
+-------+-------------------+

> "avro"

scala> spark.sparkContext.parallelize(writeTimes.slice(50, 150)).toDF("writeTimes").describe("writeTimes").show()
+-------+-------------------+
|summary|         writeTimes|
+-------+-------------------+
|  count|                100|
|   mean|         1.73716999|
| stddev|0.03504399976018602|
|    min|              1.695|
|    max|              1.886|
+-------+-------------------+

scala> spark.sparkContext.parallelize(readTimes.slice(50, 150)).toDF("readTimes").describe("readTimes").show()
+-------+-------------------+
|summary|          readTimes|
+-------+-------------------+
|  count|                100|
|   mean|        0.323489994|
| stddev|0.06235617714615632|
|    min|              0.263|
|    max|              0.781|
+-------+-------------------+
```
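For readers who want to sanity-check numbers like these without a Spark session, the same five statistics that `describe` reports can be computed over a plain collection of timings. This is just an illustrative sketch: `TimingStats`, `Summary` and `summarize` are made-up names, the standard deviation is the sample one (dividing by n - 1), and returning 0.0 for a single measurement is a choice made for this sketch.

```scala
object TimingStats {
  // The five statistics shown in the tables: count, mean, stddev, min, max.
  final case class Summary(count: Int, mean: Double, stddev: Double, min: Double, max: Double)

  def summarize(times: Seq[Double]): Summary = {
    require(times.nonEmpty, "need at least one measurement")
    val n = times.size
    val mean = times.sum / n
    // Sample variance; defined as 0.0 here when there is only one measurement.
    val variance =
      if (n > 1) times.map(t => (t - mean) * (t - mean)).sum / (n - 1) else 0.0
    Summary(n, mean, math.sqrt(variance), times.min, times.max)
  }
}
```

Something like `TimingStats.summarize(writeTimes.slice(50, 150))` would then mirror the slices used above, assuming `writeTimes` is a `Seq[Double]` of seconds per run.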