kennknowles opened a new issue, #19567:
URL: https://github.com/apache/beam/issues/19567
When running my pipeline on Dataflow, the Stackdriver logs fill up with large
numbers of the following messages (the numbers in them change with every
message):
* [INFO] (bundle_processor.create_operation) No unique name set for
transform generatedPtransform-67
* [INFO] (bundle_processor.create_operation) No unique name for transform
-19
* [ERROR] (bundle_processor.create) Missing required coder_id on grpc_port
for -19; using deprecated fallback.
I tried running the pipeline locally on the direct runner under a debugger,
with breakpoints set where these log messages originate, but the breakpoints
were never hit, so I don't know specifically what is causing them.
I also tried using the logging module to raise the log threshold, and mocked
out the logging attribute in the bundle_processor module to set the log level
to CRITICAL, but I still see the log messages.
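An alternative to changing the level is to filter by message content. The
following is a minimal sketch using only the standard logging module. It is
untested on Dataflow: it assumes the worker's log handlers are attached to the
root logger and that the code runs inside the worker process. The names
NOISY_PATTERNS, DropNoisyBundleProcessorLogs, and install_filter are
hypothetical, and the patterns are taken from the messages quoted above.

```python
import logging
import re

# Hypothetical patterns, copied from the messages quoted in this report.
NOISY_PATTERNS = [
    re.compile(r"No unique name (set )?for transform"),
    re.compile(r"Missing required coder_id on grpc_port"),
]


class DropNoisyBundleProcessorLogs(logging.Filter):
    """Drops records whose formatted message matches a noisy pattern."""

    def filter(self, record):
        message = record.getMessage()
        # Returning False tells the handler to discard the record.
        return not any(p.search(message) for p in NOISY_PATTERNS)


def install_filter(logger=None):
    """Attaches the filter to each handler of `logger` (root by default).

    The filter goes on handlers rather than on the logger itself, because
    logger-level filters are not applied to records propagated from child
    loggers, while handler-level filters see every record the handler gets.
    """
    if logger is None:
        logger = logging.getLogger()
    noisy_filter = DropNoisyBundleProcessorLogs()
    for handler in logger.handlers:
        handler.addFilter(noisy_filter)
    return noisy_filter
```

Calling install_filter() only affects handlers that are already registered in
the current process, so on a remote runner it would have to be called from
code that executes on the worker. Messages emitted before that point would
still get through.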
The pipeline is a streaming pipeline that reads from two Pub/Sub topics,
merges the inputs, runs distinct over each processing-time window, fetches
from an external service, does some processing, and inserts into Elasticsearch,
with failures going into BigQuery. The log messages seem to cluster, and they
appear early on, before any log messages from the other steps, so I wonder
whether they come from the Pub/Sub read or the windowing portion.
Expected behavior:
* I don't expect to see these noisy log messages, which seem to indicate that
something is wrong
* The missing required coder_id message is logged at the ERROR level, so it
pollutes the error logs. I would expect it to be at the WARNING or INFO level.
Imported from Jira
[BEAM-7930](https://issues.apache.org/jira/browse/BEAM-7930). Original Jira may
contain additional context.
Reported by: jimpremise.