Github user maasg commented on the issue:
https://github.com/apache/spark/pull/21194
@zsxwing Thanks for dropping by. This patch is about fixing the rate ramp-up
when `rowsPerSecond <= rampUpTime`, which makes the Rate Source produce no
data until `rampUpTime` (see
[SPARK-24
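For context, a minimal sketch of how a linear ramp-up computed with integer arithmetic can produce this symptom. This is an illustration, not Spark's actual code; the names `valueAtSecond` and `speedDeltaPerSecond` are assumptions. With `rowsPerSecond = 2` and a 10-second ramp-up, the per-second delta truncates to 0, so the cumulative count stays at 0 for the whole ramp-up window:

```scala
// Sketch of a linear ramp-up: cumulative rows emitted up to a given second.
// The integer division below is the pitfall: when rowsPerSecond <= rampUpTimeSeconds,
// speedDeltaPerSecond becomes 0 and no rows appear until the ramp-up ends.
def valueAtSecond(seconds: Long, rowsPerSecond: Long, rampUpTimeSeconds: Long): Long = {
  // per-second increase of the emission speed (truncating integer division)
  val speedDeltaPerSecond = rowsPerSecond / (rampUpTimeSeconds + 1)
  if (seconds <= rampUpTimeSeconds) {
    // cumulative rows during ramp-up: arithmetic series 0 + d + 2d + ... + d * seconds
    speedDeltaPerSecond * seconds * (seconds + 1) / 2
  } else {
    // after ramp-up, the full rate applies on top of the rows emitted while ramping up
    valueAtSecond(rampUpTimeSeconds, rowsPerSecond, rampUpTimeSeconds) +
      (seconds - rampUpTimeSeconds) * rowsPerSecond
  }
}
```

For example, `valueAtSecond(10, 2, 10)` returns 0 (no data for the entire ramp-up), while with `rowsPerSecond = 10` and a 4-second ramp-up the delta is 2 and the series behaves as expected.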
Github user maasg commented on the issue:
https://github.com/apache/spark/pull/21194
@holdenk as we discussed at Strata, it would be great if you could give me
your opinion on the approach taken in this PR.
Github user maasg commented on the issue:
https://github.com/apache/spark/pull/21194
pinging @zsxwing: as the original author of the `RateSourceProvider`,
could you review this PR?
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
Github user maasg commented on a diff in the pull request:
https://github.com/apache/spark/pull/21194#discussion_r185891712
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/sources/RateStreamProviderSuite.scala
---
@@ -173,55 +173,154 @@ class
Github user maasg commented on the issue:
https://github.com/apache/spark/pull/21188
Hi Jerry,
There's an improvement over the original situation, but the initial ramp-up
phase starts only when the time gets very close to `rampUpTime`. Here is
another example that shows
Github user maasg commented on the issue:
https://github.com/apache/spark/pull/21194
@xuanyuanking thanks for the review. I understand that the changes are
broader than what the ticket might imply, but I believe the new implementation
is much simpler to understand and delivers
Github user maasg commented on the issue:
https://github.com/apache/spark/pull/21188
Hi Jerry,
I don't think the issue is solved with this patch. I plugged the new
function into my notebook and it still shows a rather flat ramp-up:
![image](https://user
GitHub user maasg opened a pull request:
https://github.com/apache/spark/pull/21194
[SPARK-24046][SS] Fix rate source when rowsPerSecond <= rampUpTime
## What changes were proposed in this pull request?
Fixes the ramp-up of the rate source for the case `rowsPerSec
Github user maasg commented on the issue:
https://github.com/apache/spark/pull/18923
Outdated by Datasource V2 implementation. Closing.
Github user maasg closed the pull request at:
https://github.com/apache/spark/pull/18923
Github user maasg commented on the issue:
https://github.com/apache/spark/pull/18923
@zsxwing sorry, lost track of this. Will do. Thanks!
Github user maasg commented on a diff in the pull request:
https://github.com/apache/spark/pull/18923#discussion_r132794142
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/console.scala
---
@@ -49,7 +49,7 @@ class ConsoleSink(options: Map[String, String
GitHub user maasg opened a pull request:
https://github.com/apache/spark/pull/18923
[SPARK-21710][SS] Fix OOM on ConsoleSink with large inputs
## What changes were proposed in this pull request?
Replace a full `collect` with a `take` using the expected number of
elements
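The idea behind the fix can be sketched in plain Scala, as a stand-in for the `Dataset` operations involved: a `collect`-style call materializes the entire batch on the driver (the OOM risk), while `take` fetches only the rows the console will actually print. The name `numRowsToShow` is hypothetical; the real sink reads its row limit from the sink options.

```scala
// Stand-in for an arbitrarily large input batch; materializing it fully
// (the collect-like path) would exhaust driver memory.
val hugeBatch = Iterator.from(0)

// The take-based path: pull only the rows that will be displayed.
val numRowsToShow = 20 // hypothetical; the sink reads this from its options
val shown = hugeBatch.take(numRowsToShow).toArray
```

Because `Iterator.take` is lazy, only `numRowsToShow` elements are ever produced, mirroring how `Dataset.take(n)` limits what is shipped to the driver.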
Github user maasg commented on the pull request:
https://github.com/apache/spark/pull/4027#issuecomment-99211900
@tdas Would you have an opinion on this? Are there alternatives to
guarantee the even spread of Streaming receivers over the nodes of a Mesos cluster?
Github user maasg commented on the pull request:
https://github.com/apache/spark/pull/4027#issuecomment-93029751
One of the issues this PR solves is ensuring that jobs can be forced to
spread over several nodes. This is particularly important for Spark Streaming,
as parallelizing
Github user maasg commented on a diff in the pull request:
https://github.com/apache/spark/pull/4027#discussion_r22938076
--- Diff: docs/running-on-mesos.md ---
@@ -226,6 +226,20 @@ See the [configuration page](configuration.html) for
information on Spark config
The final
Github user maasg commented on a diff in the pull request:
https://github.com/apache/spark/pull/4027#discussion_r22938097
--- Diff: docs/running-on-mesos.md ---
@@ -226,6 +226,20 @@ See the [configuration page](configuration.html) for
information on Spark config
The final
Github user maasg commented on a diff in the pull request:
https://github.com/apache/spark/pull/4027#discussion_r22938137
--- Diff: docs/running-on-mesos.md ---
@@ -226,6 +226,20 @@ See the [configuration page](configuration.html) for
information on Spark config
The final
Github user maasg commented on a diff in the pull request:
https://github.com/apache/spark/pull/3466#discussion_r20924293
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/StreamingSource.scala ---
@@ -70,4 +78,14 @@ private[streaming] class StreamingSource(ssc
Github user maasg commented on the pull request:
https://github.com/apache/spark/pull/756#issuecomment-43221668
Thanks for the updates! +1 (after the fact :) )
Github user maasg commented on a diff in the pull request:
https://github.com/apache/spark/pull/756#discussion_r12578665
--- Diff: docs/running-on-mesos.md ---
@@ -3,19 +3,109 @@ layout: global
title: Running Spark on Mesos
---
-Spark can run on clusters managed
Github user maasg commented on the pull request:
https://github.com/apache/spark/pull/756#issuecomment-42945385
Great work.
I'd love to see some more background of the dynamics of Spark running on
Mesos. It has been a tough learning experience to get our Spark + Spark
Streaming
Github user maasg commented on a diff in the pull request:
https://github.com/apache/spark/pull/756#discussion_r12578801
--- Diff: docs/running-on-mesos.md ---
@@ -25,31 +115,52 @@ val conf = new SparkConf()
val sc = new SparkContext(conf)
{% endhighlight