[ https://issues.apache.org/jira/browse/BEAM-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903189#comment-16903189 ]
James Hutchison commented on BEAM-7930:
---------------------------------------

If this isn't already a known issue, I can try to provide more information.

> bundle_processor log spam using python SDK on dataflow runner
> -------------------------------------------------------------
>
>                 Key: BEAM-7930
>                 URL: https://issues.apache.org/jira/browse/BEAM-7930
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>    Affects Versions: 2.13.0
>            Reporter: James Hutchison
>            Priority: Minor
>
> When running my pipeline on Dataflow, I see a large amount of spam in the
> Stackdriver logs from the following messages (note that the numbers in
> them change with every message):
> * [INFO] (bundle_processor.create_operation) No unique name set for
> transform generatedPtransform-67
> * [INFO] (bundle_processor.create_operation) No unique name for transform -19
> * [ERROR] (bundle_processor.create) Missing required coder_id on grpc_port
> for -19; using deprecated fallback.
> I tried setting a breakpoint where these log messages originate while
> using the direct runner, but it was never hit, so I don't know
> specifically what is causing them.
> I also tried using the logging module to change the threshold, and I mocked
> out the logging attribute in the bundle_processor module to change the log
> level to CRITICAL, but I still see the log messages.
> The pipeline is a streaming pipeline that reads from two Pub/Sub topics,
> merges the inputs, runs distinct on the inputs over each processing-time
> window, fetches from an external service, does processing, and inserts into
> Elasticsearch, with failures going into BigQuery. The log messages seem to
> cluster, and they appear early on, before any log messages from any of the
> other steps, so I wonder whether they come from the Pub/Sub read or the
> windowing portion.
> Expected behavior:
> * I don't expect to see these noisy log messages, which seem to indicate
> that something is wrong.
> * The "Missing required coder_id" message is at the ERROR log level, so it
> pollutes the error logs. I would expect it to be at the WARNING or INFO
> level.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
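
For reference, the kind of suppression attempted in the report can be sketched with the standard `logging` module. This is only a minimal sketch: the class name `DropBundleProcessorNoise` and the helper `install_noise_filter` are hypothetical, and it assumes the `(bundle_processor.create_operation)` prefix in the messages corresponds to the log record's module and function name. The report says that changing log levels had no visible effect on Dataflow, so a filter like this may likewise not apply to the process that actually emits the messages there.

```python
import logging

# Message fragments copied from the issue report.
NOISY_FRAGMENTS = (
    "No unique name set for transform",
    "No unique name for transform",
    "Missing required coder_id on grpc_port",
)


class DropBundleProcessorNoise(logging.Filter):
    """Hypothetical filter: drops the reported messages when they come
    from a source file named bundle_processor.py."""

    def filter(self, record):
        if record.module != "bundle_processor":
            return True  # keep records from every other module
        message = record.getMessage()
        # Returning False tells the handler to discard the record.
        return not any(fragment in message for fragment in NOISY_FRAGMENTS)


def install_noise_filter():
    """Attach the filter to every handler on the root logger.

    Filters on handlers see propagated records from all loggers, whereas a
    filter on a logger only sees records logged directly on that logger.
    Returns the number of handlers that received the filter.
    """
    handlers = logging.getLogger().handlers
    for handler in handlers:
        handler.addFilter(DropBundleProcessorNoise())
    return len(handlers)
```

Calling `install_noise_filter()` only affects handlers that are already attached to the root logger of the current process at the time of the call.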