[
https://issues.apache.org/jira/browse/BEAM-10768?focusedWorklogId=489206&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-489206
]
ASF GitHub Bot logged work on BEAM-10768:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 23/Sep/20 04:43
Start Date: 23/Sep/20 04:43
Worklog Time Spent: 10m
Work Description: ibzib commented on a change in pull request #12637:
URL: https://github.com/apache/beam/pull/12637#discussion_r492386778
##########
File path: sdks/python/apache_beam/runners/worker/data_plane_test.py
##########
@@ -108,16 +106,28 @@ def send(instruction_id, transform_id, data):
])
# Multiple interleaved writes to multiple instructions.
- send('1', transform_1, b'abc')
- send('2', transform_1, b'def')
+ stream11 = from_channel.output_stream('1', transform_1)
+ stream11.write(b'abc')
+ stream21 = from_channel.output_stream('2', transform_1)
+ stream21.write(b'def')
+ if not time_based_flush:
+ stream11.close()
self.assertEqual(
list(
itertools.islice(to_channel.input_elements('1', [transform_1]),
1)),
[
beam_fn_api_pb2.Elements.Data(
instruction_id='1', transform_id=transform_1, data=b'abc')
])
- send('2', transform_2, b'ghi')
+ if time_based_flush:
Review comment:
`write` does not provide ordering guarantees across streams in this case.
Elements are stored in a
[queue](https://github.com/apache/beam/blob/7b3d4251d244c10545fb37f1d93ebcad84a98681/sdks/python/apache_beam/runners/worker/data_plane.py#L371)
before being sent, to enable batching. Elements aren't added to that queue
until the [flush
callback](https://github.com/apache/beam/blob/7b3d4251d244c10545fb37f1d93ebcad84a98681/sdks/python/apache_beam/runners/worker/data_plane.py#L493)
is invoked. Because each stream's flush callback is [invoked
periodically](https://github.com/apache/beam/blob/7b3d4251d244c10545fb37f1d93ebcad84a98681/sdks/python/apache_beam/runners/worker/data_plane.py#L182)
starting from when that stream is constructed, there is no guarantee that one
stream's callback is called before the other's.
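To illustrate the point, here is a minimal sketch of the buffering pattern
described in this comment. It is not Beam's implementation: `BufferedStream`,
`drain`, and `run` are hypothetical names, and the periodic timer is replaced
by explicit `flush()` calls so the two possible outcomes can be reproduced
deterministically.

```python
import queue


class BufferedStream:
    """Hypothetical stand-in for a time-flushed output stream (not Beam's API)."""

    def __init__(self, instruction_id, transform_id, send_queue):
        self._key = (instruction_id, transform_id)
        self._buffer = []
        self._send_queue = send_queue

    def write(self, data):
        # A write only touches the per-stream buffer, never the shared queue.
        self._buffer.append(data)

    def flush(self):
        # Assumption: in the real code a periodic timer triggers this callback.
        # Here it is called by hand to control the order.
        if self._buffer:
            self._send_queue.put((self._key, b''.join(self._buffer)))
            self._buffer = []


def drain(send_queue):
    """Returns everything currently in the shared queue, in queue order."""
    items = []
    while not send_queue.empty():
        items.append(send_queue.get_nowait())
    return items


def run(flush_order):
    """Writes b'def' then b'ghi', flushes in the given order, returns the data."""
    send_queue = queue.Queue()
    streams = {
        '21': BufferedStream('2', 'transform_1', send_queue),
        '22': BufferedStream('2', 'transform_2', send_queue),
    }
    streams['21'].write(b'def')  # written first
    streams['22'].write(b'ghi')  # written second
    for name in flush_order:
        streams[name].flush()
    return [data for _, data in drain(send_queue)]
```

Calling `run(['21', '22'])` yields the elements in write order, while
`run(['22', '21'])` yields them reversed. With independent timers, either
flush order can occur, so a test should not assert on the relative order of
elements from different streams.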
##########
File path: sdks/python/apache_beam/runners/worker/data_plane_test.py
##########
@@ -108,16 +106,28 @@ def send(instruction_id, transform_id, data):
])
# Multiple interleaved writes to multiple instructions.
- send('1', transform_1, b'abc')
- send('2', transform_1, b'def')
+ stream11 = from_channel.output_stream('1', transform_1)
+ stream11.write(b'abc')
+ stream21 = from_channel.output_stream('2', transform_1)
+ stream21.write(b'def')
+ if not time_based_flush:
+ stream11.close()
self.assertEqual(
list(
itertools.islice(to_channel.input_elements('1', [transform_1]),
1)),
[
beam_fn_api_pb2.Elements.Data(
instruction_id='1', transform_id=transform_1, data=b'abc')
])
- send('2', transform_2, b'ghi')
+ if time_based_flush:
Review comment:
> Please add details as comment.
Done.
Issue Time Tracking
-------------------
Worklog Id: (was: 489206)
Time Spent: 5h (was: 4h 50m)
> DataChannelTest.test_time_based_flush_grpc_data_channel flake
> -------------------------------------------------------------
>
> Key: BEAM-10768
> URL: https://issues.apache.org/jira/browse/BEAM-10768
> Project: Beam
> Issue Type: Bug
> Components: test-failures
> Reporter: Kyle Weaver
> Assignee: Kyle Weaver
> Priority: P1
> Labels: flaky-test
> Time Spent: 5h
> Remaining Estimate: 0h
>
> =================================== FAILURES ===================================
> ___________ DataChannelTest.test_time_based_flush_grpc_data_channel ____________
> [gw0] darwin -- Python 3.7.8 /Users/runner/work/beam/beam/sdks/python/target/.tox/py37/bin/python
> self = <apache_beam.runners.worker.data_plane_test.DataChannelTest testMethod=test_time_based_flush_grpc_data_channel>
> def test_time_based_flush_grpc_data_channel(self):
> > self._grpc_data_channel_test(True)
> apache_beam/runners/worker/data_plane_test.py:44:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> _
> apache_beam/runners/worker/data_plane_test.py:74: in _grpc_data_channel_test
> data_channel_service, data_channel_client, time_based_flush)
> apache_beam/runners/worker/data_plane_test.py:87: in _data_channel_test
> self._data_channel_test_one_direction(client, server, time_based_flush)
> apache_beam/runners/worker/data_plane_test.py:129: in _data_channel_test_one_direction
> instruction_id='2', transform_id=transform_2, data=b'ghi')
> E AssertionError: Lists differ: [inst[26 chars]id: "2"
> E data: "ghi"
> E , instruction_id: "2"
> E tran[22 chars]ef"
> E ] != [inst[26 chars]id: "1"
> E data: "def"
> E , instruction_id: "2"
> E tran[22 chars]hi"
> E ]
> E
> E First differing element 0:
> E instruction_id: "2"
> E transform_id: "2"
> E data: "ghi"
> E
> E instruction_id: "2"
> E transform_id: "1"
> E data: "def"
> E
> E
> E [instruction_id: "2"
> E + transform_id: "1"
> E + data: "def"
> E + ,
> E + instruction_id: "2"
> E transform_id: "2"
> E data: "ghi"
> E - ,
> E - instruction_id: "2"
> E - transform_id: "1"
> E - data: "def"
> E ]