[ 
https://issues.apache.org/jira/browse/BEAM-8575?focusedWorklogId=342095&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342095
 ]

ASF GitHub Bot logged work on BEAM-8575:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/Nov/19 19:46
            Start Date: 12/Nov/19 19:46
    Worklog Time Spent: 10m 
      Work Description: robertwb commented on pull request #10070: [BEAM-8575] 
Added a unit test for Reshuffle to test that Reshuffle pr…
URL: https://github.com/apache/beam/pull/10070#discussion_r345408772
 
 

 ##########
 File path: sdks/python/apache_beam/transforms/util_test.py
 ##########
 @@ -423,6 +432,50 @@ def test_reshuffle_streaming_global_window(self):
                 label='after reshuffle')
     pipeline.run()
 
+  def test_reshuffle_preserves_timestamps(self):
+    pipeline = TestPipeline()
+
+    # Create a PCollection and assign each element with a different timestamp.
+    before_reshuffle = (pipeline
+                        | "Four elements" >> beam.Create([
+                            {'name': 'foo', 'timestamp': MIN_TIMESTAMP},
+                            {'name': 'foo', 'timestamp': 0},
+                            {'name': 'bar', 'timestamp': 33},
+                            {'name': 'bar', 'timestamp': MAX_TIMESTAMP},
+                        ])
+                        | "With timestamp" >> beam.Map(
+                            lambda element: 
beam.window.TimestampedValue(element, element['timestamp']))
+    )
+
+    # Reshuffle the PCollection above and assign the timestamp of an element 
to that element again.
+    after_reshuffle = (before_reshuffle
+                       | "Reshuffle" >> beam.Reshuffle()
+                       | "With timestamps again" >> beam.ParDo(AddTimestamp())
+    )
+
+    # Combine each element in before_reshuffle with its timestamp.
+    formatted_before_reshuffle = (before_reshuffle
+                                  | "Get before_reshuffle timestamp" >> 
beam.ParDo(GetTimestamp())
+    )
+
+    # Combine each element in after_reshuffle with its timestamp.
+    formatted_after_reshuffle = (after_reshuffle
+                                 | "Get after_reshuffle timestamp" >> 
beam.ParDo(GetTimestamp())
+    )
+
+    expected_data = ["Timestamp(-9223372036854.775000) - foo",
+                     "Timestamp(0) - foo",
+                     "Timestamp(33) - bar",
+                     "Timestamp(9223372036854.775000) - bar"
+    ]
+
+    # Can't compare formatted_before_reshuffle and formatted_after_reshuffle 
directly, because they are
 
 Review comment:
   More specifically, the equal_to must not be a (deferred) PCollection, but an 
concrete set of values. 
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 342095)
    Time Spent: 3h 50m  (was: 3h 40m)

> Add more Python validates runner tests
> --------------------------------------
>
>                 Key: BEAM-8575
>                 URL: https://issues.apache.org/jira/browse/BEAM-8575
>             Project: Beam
>          Issue Type: Test
>          Components: sdk-py-core, testing
>            Reporter: wendy liu
>            Assignee: wendy liu
>            Priority: Major
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> This is the umbrella issue to track the work of adding more Python tests to 
> improve test coverage.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to