[
https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=293109&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293109
]
ASF GitHub Bot logged work on BEAM-7029:
----------------------------------------
Author: ASF GitHub Bot
Created on: 12/Aug/19 12:21
Start Date: 12/Aug/19 12:21
Worklog Time Spent: 10m
Work Description: mxm commented on issue #8251: [BEAM-7029] Add
KafkaIO.Read as external transform
URL: https://github.com/apache/beam/pull/8251#issuecomment-520402302
Great to hear that you resolved the issue. The user variable could probably
be inferred from the user's home directory.
>Another thing I noticed is that although the deserializer I specified in
the ReadFromKafka transform is for ByteArray (both for key and value), the
Python pipeline is consuming it as a string without further conversion.
The string consumption will only work for Python 2 though where bytes and
strings are treated equally.
>Would it make sense to have the KafkaRecord metadata as an option in the
portable pipeline? (to be able to access partitions, offsets, timestamps, and
other KafkaRecord properties).
That would make perfect sense. However, we wanted to avoid having to include
KafkaRecord as a type across SDKs. We figured it might be sufficient to have
only the raw data available. As you saw, it is not yet possible to completely
use the KafkaIO consumer without the KafkaRecord cross-language coder.
Hopefully, we will have fixed that soon and introduce a structure for
preserving KafkaRecord meta data across SDKs.
Curious to learn about your performance tests.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 293109)
Time Spent: 16h 10m (was: 16h)
> Support KafkaIO to be configured externally for use with other SDKs
> -------------------------------------------------------------------
>
> Key: BEAM-7029
> URL: https://issues.apache.org/jira/browse/BEAM-7029
> Project: Beam
> Issue Type: New Feature
> Components: io-java-kafka, runner-flink, sdk-py-core
> Reporter: Maximilian Michels
> Assignee: Maximilian Michels
> Priority: Major
> Fix For: 2.13.0
>
> Time Spent: 16h 10m
> Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs.
> We should add more useful transforms then just {{GenerateSequence}}.
> {{KafkaIO}} is a good candidate.
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)