[ https://issues.apache.org/jira/browse/BEAM-5395?focusedWorklogId=145105&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145105 ]
ASF GitHub Bot logged work on BEAM-5395:
----------------------------------------
Author: ASF GitHub Bot
Created on: 17/Sep/18 22:37
Start Date: 17/Sep/18 22:37
Worklog Time Spent: 10m
Work Description: angoenka commented on a change in pull request #6405:
[BEAM-5395] Chunk data streams.
URL: https://github.com/apache/beam/pull/6405#discussion_r218249585
##########
File path: sdks/python/apache_beam/coders/slow_stream.py
##########
@@ -34,15 +34,18 @@ class OutputStream(object):
   def __init__(self):
     self.data = []
+    self.byte_count = 0
Review comment:
Nit: Shall we rely on `len(self.data)` instead of creating and managing a
new field?
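
For context, here is a minimal sketch of the trade-off behind this nit. It assumes `self.data` is a list of appended byte chunks, as `self.data = []` in the diff suggests. The class `ChunkListStream` and its `size()` method are hypothetical stand-ins, not the Beam implementation.

```python
class ChunkListStream(object):
  """Hypothetical stand-in for OutputStream; not the Beam class."""

  def __init__(self):
    self.data = []        # list of bytes chunks, one per write() call
    self.byte_count = 0   # running total of bytes written

  def write(self, b):
    self.data.append(b)
    self.byte_count += len(b)

  def size(self):
    # Constant-time byte total, independent of the number of chunks.
    return self.byte_count

  def get(self):
    return b''.join(self.data)


stream = ChunkListStream()
stream.write(b'abc')
stream.write(b'de')
chunks = len(stream.data)   # counts chunks, not bytes
total = stream.size()       # counts bytes
```

Under that assumption, `len(self.data)` would count chunks rather than bytes, so a byte total would need either a counter like this one or `sum(len(c) for c in self.data)` on each call.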
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 145105)
Time Spent: 0.5h (was: 20m)
> BeamPython data plane streams data
> ----------------------------------
>
> Key: BEAM-5395
> URL: https://issues.apache.org/jira/browse/BEAM-5395
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-harness
> Reporter: Robert Bradshaw
> Assignee: Robert Bradshaw
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Currently the default implementation is to buffer all data for the bundle.
> Experiments were made splitting at arbitrary byte boundaries, but it appears
> that Java requires messages to be split on element boundaries. For now we
> should implement that in Python (even if this means not being able to split
> up large elements among multiple messages).
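
The element-boundary splitting described in the issue could look roughly like the sketch below. It is illustrative only: the function name `chunk_on_element_boundaries` and the `max_chunk_bytes` parameter are assumptions, and it operates on already-encoded elements rather than on the real data plane API.

```python
def chunk_on_element_boundaries(encoded_elements, max_chunk_bytes):
  """Yield bytes chunks, each holding only whole encoded elements.

  A chunk is emitted once it reaches max_chunk_bytes. A single element
  larger than the limit is emitted whole rather than split.
  """
  buffered = []
  buffered_bytes = 0
  for element in encoded_elements:
    buffered.append(element)
    buffered_bytes += len(element)
    if buffered_bytes >= max_chunk_bytes:
      # Flush only after a complete element has been appended.
      yield b''.join(buffered)
      buffered = []
      buffered_bytes = 0
  if buffered:
    # Flush whatever remains at the end of the bundle.
    yield b''.join(buffered)


chunks = list(
    chunk_on_element_boundaries([b'aa', b'bbb', b'c', b'dddddd'], 4))
```

Because the size check runs only after a whole element is buffered, no chunk ever ends mid-element, at the cost of letting a large element exceed the limit.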