[ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=343011&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343011
 ]

ASF GitHub Bot logged work on BEAM-8335:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/Nov/19 23:42
            Start Date: 13/Nov/19 23:42
    Worklog Time Spent: 10m 
      Work Description: davidyan74 commented on pull request #9953: [BEAM-8335] Adds support for multi-output TestStream
URL: https://github.com/apache/beam/pull/9953#discussion_r346061286
 
 

 ##########
 File path: sdks/python/apache_beam/testing/test_stream_test.py
 ##########
 @@ -127,6 +127,118 @@ def process(self, element=beam.DoFn.ElementParam,
 
     p.run()
 
+  def test_multiple_outputs(self):
+    """Tests that the TestStream supports emitting to multiple PCollections."""
+    test_stream = (TestStream()
+                   .advance_watermark_to(5, tag='letters')
+                   .add_elements(['a', 'b', 'c'], tag='letters')
+                   .advance_watermark_to(10, tag='numbers')
+                   .add_elements(['1', '2', '3'], tag='numbers'))
+
+    class RecordFn(beam.DoFn):
+      def process(self, element=beam.DoFn.ElementParam,
+                  timestamp=beam.DoFn.TimestampParam):
+        yield (element, timestamp)
+
+    options = StandardOptions(streaming=True)
+    p = TestPipeline(options=options)
+
+    main = p | test_stream
+    letters = main['letters'] | 'record letters' >> beam.ParDo(RecordFn())
+    numbers = main['numbers'] | 'record numbers' >> beam.ParDo(RecordFn())
+
+    assert_that(letters, equal_to([
+        ('a', Timestamp(5)),
+        ('b', Timestamp(5)),
+        ('c', Timestamp(5))]), label='assert letters')
+
+    assert_that(numbers, equal_to([
+        ('1', Timestamp(10)),
+        ('2', Timestamp(10)),
+        ('3', Timestamp(10))]), label='assert numbers')
+
+    p.run()
+
+  def test_multiple_outputs_with_watermark_advancement(self):
+    """Tests that the TestStream can independently control output watermarks."""
+
+    # Purposely set the watermark of numbers to 20 then letters to 5 to test
+    # that the watermark advancement is per PCollection.
+    #
+    # This creates two PCollections, (a, b, c) and (1, 2, 3). These will be
+    # emitted at different times so that they will have different windows. The
+    # watermark advancement is checked by checking their windows. If the
+    # watermark does not advance, then the windows will be [-inf, -inf). If the
+    # windows do not advance separately, then the PCollections will both be
+    # windowed in [15, 30).
+    test_stream = (TestStream()
+                   .advance_watermark_to(20, tag='numbers')
+                   .advance_watermark_to(5, tag='letters')
+                   .add_elements(['a', 'b', 'c'], tag='letters')
+                   .advance_watermark_to(10, tag='letters')
+                   .add_elements(['1', '2', '3'], tag='numbers')
+                   .advance_watermark_to(30, tag='numbers'))
+
+    options = StandardOptions(streaming=True)
+    p = TestPipeline(options=options)
+
+    main = p | test_stream
+
+    # Use an AfterWatermark trigger with an early firing to test that the
+    # watermark is advancing properly and that the element is being emitted in
+    # the correct window.
+    letters = (main['letters']
+               | 'letter windows' >> beam.WindowInto(
+                   FixedWindows(15),
+                   trigger=trigger.AfterWatermark(early=trigger.AfterCount(1)),
+                   accumulation_mode=trigger.AccumulationMode.DISCARDING)
+               | 'letter with key' >> beam.Map(lambda x: ('k', x))
+               | 'letter gbk' >> beam.GroupByKey())
+
+    numbers = (main['numbers']
+               | 'number windows' >> beam.WindowInto(
+                   FixedWindows(15),
+                   trigger=trigger.AfterWatermark(early=trigger.AfterCount(1)),
+                   accumulation_mode=trigger.AccumulationMode.DISCARDING)
+               | 'number with key' >> beam.Map(lambda x: ('k', x))
+               | 'number gbk' >> beam.GroupByKey())
+
+    # The letters were emitted when the watermark was at 5, thus we expect to
+    # see the elements in the [0, 15) window. We used an early trigger to make
+    # sure that the ON_TIME empty pane was also emitted with a TestStream.
+    # This pane has no data because the early trigger causes the elements to
+    # fire before the end of the window and because the accumulation mode
+    # discards any late data.
 
 Review comment:
   nit: I think we shouldn't say "late data" here, since the discarding 
accumulation mode in general discards accumulated data after a trigger 
fires, regardless of whether the data is late or not.
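
   To illustrate the point, here is a toy model of a single window with an 
early firing after every element followed by one on-time firing. It is only 
a sketch in plain Python, not Beam's trigger implementation: the function 
name simulate_panes and its parameters are hypothetical, and in a real 
pipeline the exact early panes depend on how the runner bundles elements.

```python
from typing import List


def simulate_panes(elements: List[str], accumulating: bool) -> List[List[str]]:
    """Toy model of the panes emitted for one window (hypothetical helper).

    Fires early after every element, then fires once more "on time" when
    the watermark is assumed to pass the end of the window.
    """
    panes: List[List[str]] = []
    buffer: List[str] = []
    for element in elements:
        buffer.append(element)
        # Early firing: emit whatever is currently buffered.
        panes.append(list(buffer))
        if not accumulating:
            # Discarding mode drops the buffered elements after every
            # firing, whether or not any of them were late.
            buffer.clear()
    # On-time firing at the end of the window.
    panes.append(list(buffer))
    return panes


discarding = simulate_panes(['a', 'b', 'c'], accumulating=False)
accumulating = simulate_panes(['a', 'b', 'c'], accumulating=True)
print(discarding)
print(accumulating)
```

   In this model the final pane is empty in discarding mode because every 
element already went out in an early pane, and lateness plays no part in it. 
In accumulating mode the final pane repeats all three elements.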
 


Issue Time Tracking
-------------------

    Worklog Id:     (was: 343011)
    Time Spent: 26.5h  (was: 26h 20m)

> Add streaming support to Interactive Beam
> -----------------------------------------
>
>                 Key: BEAM-8335
>                 URL: https://issues.apache.org/jira/browse/BEAM-8335
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-py-interactive
>            Reporter: Sam Rohde
>            Assignee: Sam Rohde
>            Priority: Major
>          Time Spent: 26.5h
>  Remaining Estimate: 0h
>
> This issue tracks the work items to introduce streaming support to the 
> Interactive Beam experience. This will allow users to:
>  * Write and run a streaming job in IPython
>  * Automatically cache records from unbounded sources
>  * Add a replay experience that replays all cached records to simulate the 
> original pipeline execution
>  * Add controls to play/pause/stop/step individual elements from the cached 
> records
>  * Add ability to inspect/visualize unbounded PCollections



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
