[ 
https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=286866&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-286866
 ]

ASF GitHub Bot logged work on BEAM-7029:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Aug/19 17:14
            Start Date: 01/Aug/19 17:14
    Worklog Time Spent: 10m 
      Work Description: manuelaguilar commented on issue #8251: [BEAM-7029] Add 
KafkaIO.Read as external transform
URL: https://github.com/apache/beam/pull/8251#issuecomment-517372562
 
 
   @mxm I tried the patch and I can see that the right coder is applied to the 
transform. Thank you.
   
   My next step is to pass the transform output (KafkaRecords) to my Python 
pipeline.
   
   How can I hand over KafkaRecord elements from Java to Python? Will it work by 
calling an external transform in the Python pipeline, or is there a simpler way 
to translate the ReadFromKafka output into a Python implementation of 
KafkaRecord (e.g. confluent_kafka)?
   
   For my pipeline test, I added a Map function 'passthrough' (it just logs the 
message and yields the element) in my Python pipeline after reading from Kafka: 
   
   ```
   messages = ( p | 'Read From Kafka' >> ReadFromKafka(
                                             consumer_config={ ......
                                             },
                                             ......
                                             expansion_service='localhost:8097'))
   
   
   messages | 'Passthrough' >> beam.Map(passthrough)
   ```
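
For reference, a passthrough function of the kind described above might look like the sketch below. The actual function is not shown in this comment, so the body and the log message are assumptions, as is the (key, value) tuple used in the usage line. Note that `beam.Map` takes the function's return value as the single output element, so the sketch returns the element; a generator that yields would be the form to use with `beam.FlatMap` or a `DoFn`.

```python
import logging


def passthrough(element):
    """Log the element and hand it back unchanged (hypothetical sketch)."""
    logging.info("Received element: %r", element)
    # beam.Map emits whatever this function returns as one output element,
    # so returning the input keeps the mapping one-to-one.
    return element


# Stand-alone usage with a made-up (key, value) pair; in the pipeline this
# function would be applied as: messages | 'Passthrough' >> beam.Map(passthrough)
record = (b"some-key", b"some-value")
result = passthrough(record)
```

Returning rather than yielding keeps the function usable both as a plain Python callable and as the argument to `beam.Map`.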
   
   As of now, this is the error I get while the portable runner is reading 
from Kafka topics:
   
   ```
   [grpc-default-executor-1] ERROR org.apache.beam.fn.harness.control.BeamFnControlClient - Exception while trying to handle InstructionRequest 5
   java.lang.IllegalArgumentException: Expected DoFn to be FunctionSpec with URN urn:beam:dofn:javasdk:0.1, but URN was beam:dofn:pickled_python_info:v1
           at org.apache.beam.vendor.guava.v20_0.com.google.common.base.Preconditions.checkArgument(Preconditions.java:416)
           at org.apache.beam.runners.core.construction.ParDoTranslation.doFnWithExecutionInformationFromProto(ParDoTranslation.java:572)
           at org.apache.beam.runners.core.construction.ParDoTranslation.getDoFn(ParDoTranslation.java:282)
           at org.apache.beam.fn.harness.DoFnPTransformRunnerFactory$Context.<init>(DoFnPTransformRunnerFactory.java:197)
           at org.apache.beam.fn.harness.DoFnPTransformRunnerFactory.createRunnerForPTransform(DoFnPTransformRunnerFactory.java:96)
           at org.apache.beam.fn.harness.DoFnPTransformRunnerFactory.createRunnerForPTransform(DoFnPTransformRunnerFactory.java:64)
           at org.apache.beam.fn.harness.control.ProcessBundleHandler.createRunnerAndConsumersForPTransformRecursively(ProcessBundleHandler.java:194)
           at org.apache.beam.fn.harness.control.ProcessBundleHandler.createRunnerAndConsumersForPTransformRecursively(ProcessBundleHandler.java:163)
           at org.apache.beam.fn.harness.control.ProcessBundleHandler.processBundle(ProcessBundleHandler.java:290)
           at org.apache.beam.fn.harness.control.BeamFnControlClient.delegateOnInstructionRequestType(BeamFnControlClient.java:160)
           at org.apache.beam.fn.harness.control.BeamFnControlClient.lambda$processInstructionRequests$0(BeamFnControlClient.java:144)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
   ```
   
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 286866)
    Time Spent: 14h 50m  (was: 14h 40m)

> Support KafkaIO to be configured externally for use with other SDKs
> -------------------------------------------------------------------
>
>                 Key: BEAM-7029
>                 URL: https://issues.apache.org/jira/browse/BEAM-7029
>             Project: Beam
>          Issue Type: New Feature
>          Components: io-java-kafka, runner-flink, sdk-py-core
>            Reporter: Maximilian Michels
>            Assignee: Maximilian Michels
>            Priority: Major
>             Fix For: 2.13.0
>
>          Time Spent: 14h 50m
>  Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs. 
> We should add more useful transforms than just {{GenerateSequence}}. 
> {{KafkaIO}} is a good candidate.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
