Hi
Does updateStateByKey pass elements to updateFunc (in the Seq[V] argument) in
the order in which they appear in the RDD?
My guess is no, which means updateFunc needs to be commutative, i.e. its
result must not depend on the order of the new values. Am I correct?
I've asked this question before but there were no takers.

Here are the Scaladocs for updateStateByKey:

  /**
   * Return a new "state" DStream where the state for each key is updated by applying
   * the given function on the previous state of the key and the new values of each key.
   * Hash partitioning is used to generate the RDDs with Spark's default number of partitions.
   * @param updateFunc State update function. If `this` function returns None, then
   *                   corresponding state key-value pair will be eliminated.
   * @tparam S State type
   */
  def updateStateByKey[S: ClassTag](
      updateFunc: (Seq[V], Option[S]) => Option[S]
    ): DStream[(K, S)] = {
    updateStateByKey(updateFunc, defaultPartitioner())
  }
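
To make the concern concrete, here is a minimal sketch in plain Scala (no Spark
dependency) of two functions with the same shape as updateFunc. The names
UpdateFuncSketch, runningSum and lastSeen are hypothetical and only for
illustration. It does not show what Spark actually does with ordering; it only
shows why the answer matters for the function I write.

```scala
object UpdateFuncSketch {
  // Same shape as the updateFunc parameter: (Seq[V], Option[S]) => Option[S]

  // Order-insensitive: a running sum yields the same state for any
  // permutation of newValues.
  val runningSum: (Seq[Int], Option[Int]) => Option[Int] =
    (newValues, state) => Some(state.getOrElse(0) + newValues.sum)

  // Order-sensitive: "keep the last value seen" depends on which element
  // happens to come last in newValues.
  val lastSeen: (Seq[Int], Option[Int]) => Option[Int] =
    (newValues, state) => newValues.lastOption.orElse(state)

  def main(args: Array[String]): Unit = {
    val batch = Seq(1, 2, 3)
    // Compare the resulting state for the batch and for its reverse.
    println(runningSum(batch, Some(10)) == runningSum(batch.reverse, Some(10)))
    println(lastSeen(batch, Some(10)) == lastSeen(batch.reverse, Some(10)))
  }
}
```

If the order of Seq[V] is not guaranteed, something like runningSum would be
safe, while something like lastSeen could give different state from run to run.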
