[ https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=233000&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233000 ]
ASF GitHub Bot logged work on BEAM-7029:
----------------------------------------
Author: ASF GitHub Bot
Created on: 25/Apr/19 17:34
Start Date: 25/Apr/19 17:34
Worklog Time Spent: 10m
Work Description: mxm commented on pull request #8322: [BEAM-7029] Add
KafkaIO.Write as an external transform
URL: https://github.com/apache/beam/pull/8322#discussion_r278656684
##########
File path: sdks/python/apache_beam/io/external/kafka.py
##########
@@ -118,35 +118,112 @@ def expand(self, pbegin):
             payload.SerializeToString(),
             self.expansion_service))
-  @staticmethod
-  def _encode_map(dict_obj):
-    kv_list = [(key.encode('utf-8'), val.encode('utf-8'))
-               for key, val in dict_obj.items()]
-    coder = IterableCoder(TupleCoder(
-        [LengthPrefixCoder(BytesCoder()), LengthPrefixCoder(BytesCoder())]))
-    coder_urns = ['beam:coder:iterable:v1',
-                  'beam:coder:kv:v1',
-                  'beam:coder:bytes:v1',
-                  'beam:coder:bytes:v1']
-    return ConfigValue(
-        coder_urn=coder_urns,
-        payload=coder.encode(kv_list))
-
-  @staticmethod
-  def _encode_list(list_obj):
-    encoded_list = [val.encode('utf-8') for val in list_obj]
-    coder = IterableCoder(LengthPrefixCoder(BytesCoder()))
-    coder_urns = ['beam:coder:iterable:v1',
-                  'beam:coder:bytes:v1']
-    return ConfigValue(
-        coder_urn=coder_urns,
-        payload=coder.encode(encoded_list))
-
-  @staticmethod
-  def _encode_str(str_obj):
-    encoded_str = str_obj.encode('utf-8')
-    coder = LengthPrefixCoder(BytesCoder())
-    coder_urns = ['beam:coder:bytes:v1']
-    return ConfigValue(
-        coder_urn=coder_urns,
-        payload=coder.encode(encoded_str))
+
+class WriteToKafka(ptransform.PTransform):
+  """
+    An external PTransform which writes KV data to a specified Kafka topic.
+    If no Kafka Serializer for key/value is provided, then key/value are
+    assumed to be byte arrays.
+
+    Note: To use this transform, you need to start the Java expansion service.
+    Please refer to the portability documentation on how to do that. The
+    expansion service address has to be provided when instantiating this
+    transform. During pipeline translation this transform will be replaced by
+    the Java SDK's KafkaIO.
Review comment:
I wasn't sure how this is displayed in IDEs for Python development, but it
seems fair not to repeat it.
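
For readers following the hunk above: the removed `_encode_map`, `_encode_list` and `_encode_str` helpers all convert strings to UTF-8 bytes and serialize them with length-prefixed byte coders. Below is a minimal standalone sketch of that idea. It does not use Beam's coder classes; the function names are hypothetical, and the 4-byte big-endian element count used for lists and maps is an assumption about the iterable layout, so the output is not guaranteed to match Beam's wire format byte for byte.

```python
import struct


def encode_varint(n):
    """Encode a non-negative int as a base-128 varint, low 7 bits first."""
    if n < 0:
        raise ValueError("varint input must be non-negative")
    out = bytearray()
    while True:
        bits = n & 0x7F
        n >>= 7
        if n:
            out.append(bits | 0x80)  # more bytes follow
        else:
            out.append(bits)
            return bytes(out)


def encode_str(str_obj):
    """UTF-8 encode a string and prefix it with its byte length."""
    data = str_obj.encode('utf-8')
    return encode_varint(len(data)) + data


def encode_list(list_obj):
    """Assumed layout: element count (int32, big-endian), then each string."""
    body = b''.join(encode_str(val) for val in list_obj)
    return struct.pack('>i', len(list_obj)) + body


def encode_map(dict_obj):
    """Assumed layout: pair count, then key and value for each pair."""
    body = b''.join(encode_str(key) + encode_str(val)
                    for key, val in dict_obj.items())
    return struct.pack('>i', len(dict_obj)) + body
```

The length prefix is what lets a reader on the other side split the payload back into individual strings without a delimiter.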
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Issue Time Tracking
-------------------
Worklog Id: (was: 233000)
Time Spent: 12h 40m (was: 12.5h)
> Support KafkaIO to be configured externally for use with other SDKs
> -------------------------------------------------------------------
>
> Key: BEAM-7029
> URL: https://issues.apache.org/jira/browse/BEAM-7029
> Project: Beam
> Issue Type: New Feature
> Components: io-java-kafka, runner-flink, sdk-py-core
> Reporter: Maximilian Michels
> Assignee: Maximilian Michels
> Priority: Major
> Time Spent: 12h 40m
> Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs.
> We should add more useful transforms than just {{GenerateSequence}}.
> {{KafkaIO}} is a good candidate.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)