[ https://issues.apache.org/jira/browse/SPARK-6605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387811#comment-14387811 ]
SaintBacchus edited comment on SPARK-6605 at 3/31/15 1:54 AM:
--------------------------------------------------------------

{{reduceByKeyAndWindow}} has two implementations, and they produce different results when an empty window occurs. We consider this a difference in behavior rather than a bug. A user who wants to drop the empty keys produced by {{ReducedWindowedDStream}} can apply a {{filter}} function to remove them.

was (Author: carlmartin):
{{reduceByKeyAndWindow}} has two implementations, and they produce different results when an empty window occurs. We consider this a difference in behavior rather than a bug. A user who wants to drop the empty keys produced by {{ReducedWindowedDStream}} can apply a {{filter}} function to remove them.

> Same transformation in DStream leads to different result
> --------------------------------------------------------
>
>                 Key: SPARK-6605
>                 URL: https://issues.apache.org/jira/browse/SPARK-6605
>             Project: Spark
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 1.3.0
>            Reporter: SaintBacchus
>             Fix For: 1.4.0
>
>
> The transformation *reduceByKeyAndWindow* has two implementations: one uses
> *WindowedDStream* and the other uses *ReducedWindowedDStream*.
> The two always produce the same result, except when an empty window occurs.
> In a word-count example, if no data arrives for a period longer than the
> window duration, the first *reduceByKeyAndWindow* returns no elements, while
> the second returns many elements whose value is zero.
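
The difference described above can be reproduced with a toy model in plain Python. This is only a sketch of the two strategies, not Spark's actual code: the function names window_recompute and window_incremental are hypothetical, each list element stands for one batch of words, and the window is measured in batches. The first function re-reduces every batch in the window; the second keeps running counts and subtracts the batch that leaves the window, which is why keys survive with a zero value.

```python
def window_recompute(batches, window):
    """Model of the non-incremental variant: re-count every batch in the window."""
    results = []
    for i in range(len(batches)):
        counts = {}
        for batch in batches[max(0, i - window + 1): i + 1]:
            for word in batch:
                counts[word] = counts.get(word, 0) + 1
        results.append(counts)
    return results


def window_incremental(batches, window):
    """Model of the incremental variant: add the new batch, subtract the old one."""
    results = []
    state = {}
    for i, batch in enumerate(batches):
        for word in batch:
            state[word] = state.get(word, 0) + 1
        if i - window >= 0:
            # "Inverse reduce" for the batch that just left the window.
            # The key is kept even when its count drops to zero.
            for word in batches[i - window]:
                state[word] -= 1
        results.append(dict(state))
    return results


# Two batches with data, followed by a quiet period longer than the window.
batches = [["a", "b", "a"], ["b"], [], [], []]
window = 2

full = window_recompute(batches, window)
incr = window_incremental(batches, window)

print(full[3])  # window contains only empty batches
print(incr[3])  # same window, incremental model

# The workaround from the comment: filter out the zero-valued keys.
filtered = {k: v for k, v in incr[3].items() if v != 0}
print(filtered)
```

In this model, filtering on a non-zero value makes the incremental result match the recomputed one for the empty window, which is the effect the suggested {{filter}} is meant to achieve.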