Here is some example code that is bundled with Spark:

for (i <- 1 to ITERATIONS) {
  println("On iteration " + i)
  val gradient = points.map { p =>
    (1 / (1 + exp(-p.y * (w dot p.x))) - 1) * p.y * p.x
  }.reduce(_ + _)
  w -= gradient
}

As you can see, an action is called on every iteration. I guess this is the
same pattern I have in my code. So why do you think this is not the right
thing to do with Spark?
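
For what it's worth, here is a minimal sketch of how I read the filter + map / flatMap suggestion quoted below. It is only my assumption about what such a rewrite could look like: a plain Scala Seq stands in for the RDD, and Point, thresholds, onePassPerQuery and singlePass are hypothetical names, not anything from my application or from Spark.

```scala
// Plain Scala collections stand in for the RDD; all names are illustrative.
object BatchedQueries {
  final case class Point(id: Int, value: Double)

  // One full pass over the data per query: mirrors the
  // `rdd.map(...).collect` loop from my original message.
  def onePassPerQuery(data: Seq[Point], thresholds: Seq[Double]): Map[Double, Seq[Int]] =
    thresholds.map { t =>
      t -> data.filter(_.value > t).map(_.id)
    }.toMap

  // A single pass: each point emits one (query, id) pair per query it
  // satisfies, and the pairs are grouped by query afterwards.
  // Note: a query with no matching point is absent from this result,
  // whereas onePassPerQuery maps it to an empty Seq.
  def singlePass(data: Seq[Point], thresholds: Seq[Double]): Map[Double, Seq[Int]] =
    data
      .flatMap(p => thresholds.filter(t => p.value > t).map(t => t -> p.id))
      .groupBy(_._1)
      .map { case (t, pairs) => t -> pairs.map(_._2) }

  def main(args: Array[String]): Unit = {
    val data = Seq(Point(1, 0.5), Point(2, 1.5), Point(3, 2.5))
    val thresholds = Seq(1.0, 2.0)
    println(singlePass(data, thresholds))
  }
}
```

On an RDD I would expect the flatMap version to run as a single job rather than one stage per query, but it only works when all the queries are known up front. That is not the case in the gradient loop above, where each iteration depends on the w computed by the previous one.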




On Tue, Feb 18, 2014 at 12:44 AM, Guillaume Pitel <
[email protected]> wrote:

>  Whatever you want to do, if you really have to do it that way, don't use
> Spark. And the answer to your question is: Spark automatically
> "interleaves" stages that can be interleaved.
>
> Now, I do not believe that you really want to do that. You probably should
> just do a filter + map or a flatMap. But explain what you're trying to
> achieve so we can recommend a better way.
>
> Guillaume
>
> With so little information about what your code is actually doing, what
> you have shared looks likely to be an anti-pattern to me.  Doing many
> collect actions is something to be avoided if at all possible, since this
> forces a lot of network communication to materialize the results back
> within the driver process, and network communication severely constrains
> performance.
>
>
> On Mon, Feb 17, 2014 at 9:51 AM, David Thomas <[email protected]> wrote:
>
>>   I have a spark application that has the below structure:
>>
>>  while(...) { // 10-100k iterations
>>    rdd.map(...).collect
>> }
>>
>>  Basically, I have an RDD and I need to query it multiple times.
>>
>>  Now when I run this, for each iteration, Spark creates a new stage (each
>> stage having multiple tasks). What I find is that the stage execution takes
>> about 1 second and most of the time is spent scheduling the tasks. Since a
>> stage is not submitted until the previous stage is completed, this loop
>> takes a long time to complete. So my question is, is there a way to
>> interleave multiple stage executions? Any other suggestions to improve the
>> above query pattern?
>>
>
>
>
