[
https://issues.apache.org/jira/browse/BEAM-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973710#comment-16973710
]
Julia C commented on BEAM-7885:
-------------------------------
My DoFn's setup() is also not being called before process(), but I'm running in
batch mode with the fn_api direct runner against apache-beam==2.15.0. Could
this be the same issue, or is it likely something different?
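
A minimal sketch of the pattern under discussion, for anyone trying to
reproduce this. It is illustrative only: `DummyModel`, `PredictFn`, and the
`setup_calls` counter are hypothetical names, and the `None` guard in
process() is one possible defensive workaround, not a confirmed fix for this
issue. The import fallback is there only so the sketch runs without Beam
installed.

```python
try:
    import apache_beam as beam
    DoFnBase = beam.DoFn
except ImportError:
    # Assumption: fall back to a plain object so the sketch runs without Beam.
    DoFnBase = object


class DummyModel:
    """Hypothetical stand-in for a real model, e.g. a loaded sklearn estimator."""

    def predict(self, rows):
        # Toy prediction: the sum of each row.
        return [sum(row) for row in rows]


class PredictFn(DoFnBase):
    def __init__(self):
        super().__init__()
        self._model = None      # transient resource, created in setup()
        self.setup_calls = 0    # hypothetical counter, for debugging only

    def setup(self):
        # Expected to be called by the runner before any bundle is processed.
        self.setup_calls += 1
        self._model = DummyModel()

    def process(self, element):
        # Defensive guard: if the runner skipped setup(), initialize lazily
        # instead of failing with "'NoneType' object has no attribute 'predict'".
        if self._model is None:
            self.setup()
        yield self._model.predict([element])[0]
```

Calling process() directly on a fresh instance, without setup(), mimics what
the runner appears to do here. With the guard in place the model is created on
the first element; without it, the call fails with the AttributeError quoted
in the issue description.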
> DoFn.setup() doesn't run for streaming jobs on DirectRunner.
> ------------------------------------------------------------
>
> Key: BEAM-7885
> URL: https://issues.apache.org/jira/browse/BEAM-7885
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Affects Versions: 2.14.0
> Environment: Python
> Reporter: niklas Hansson
> Priority: Minor
>
> Version 2.14.0 of the Python SDK introduced setup and teardown for DoFn.
> The setup method is described as "Called to prepare an instance for
> processing bundles of elements. This is a good place to initialize transient
> in-memory resources, such as network connections."
> However, when I try to use it for an unbounded job (Pub/Sub source), it seems
> that DoFn.setup() is never called and the resources are never initialized.
> [UPDATE] It works on the Dataflow runner but not on the DirectRunner. On the
> Dataflow runner, DoFn.setup() seems to be called multiple times, but then
> never again once the pipeline is processing elements. [UPDATE] On the
> direct runner I get:
> """
> AttributeError: 'NoneType' object has no attribute 'predict' [while running
> 'transform the data']
> """
> My source code: [https://github.com/NikeNano/DataflowSklearnStreaming]
>
> I am happy to contribute example code showing how to use setup as soon as I
> get it running :)
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)