[
https://issues.apache.org/jira/browse/BEAM-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eugene Huang updated BEAM-6107:
-------------------------------
Description:
Hello, my name is Eugene. This is my first bug I am submitting to the Apache
Beam project!
I'm using the Python SDK on Google Cloud Platform to prototype with Beam locally
inside Datalab and then submit jobs to Dataflow. On GCP, I use logging
essentially as a print statement to check the state of objects. For example,
after a GroupByKey, the grouped elements are in a list when I run Beam locally,
but in Dataflow they will sometimes be a generator. Hence, logging is very
important to me.
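The list-versus-generator difference can be reproduced locally without Dataflow. This is a minimal stdlib sketch (not Beam code): it uses itertools.groupby, whose lazy group iterators behave much like the generator-valued groups Dataflow can hand a transform, as a stand-in for Beam's GroupByKey, and materializes each group before logging it.

```python
import itertools
import logging

logging.basicConfig(level=logging.INFO)

data = [("a", 1), ("a", 2), ("b", 3)]

# itertools.groupby yields (key, iterator) pairs; like Dataflow's
# generator-valued groups, each iterator can only be consumed once.
groups = {}
for key, pairs in itertools.groupby(sorted(data), key=lambda kv: kv[0]):
    values = [v for _, v in pairs]  # materialize before logging
    logging.info("key=%s values=%s", key, values)
    groups[key] = values
```

Converting the group to a list before logging makes the logged output look the same whether the runner delivered a list or a one-shot iterable.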
When submitting a job to Dataflow using Beam 2.6.0, the logging works
correctly: when I click on a box in the Dataflow job's user interface, I can
see a bunch of logs. However, when submitting a job to Dataflow using Beam
2.7.0 or 2.8.0, the logs do not appear at all.
I believe this logging problem is specific to Dataflow/GCP and might not apply
to the other runners. The reason is that when I look in Stackdriver for jobs
submitted to Dataflow with Beam versions 2.6.0, 2.7.0, and 2.8.0, all the logs
exist in Stackdriver, so the log messages were recorded correctly. However, in
the Dataflow user interface, the logs do not appear for Beam 2.7.0 or 2.8.0
when you click on the box for a transformation.
It appears that this problem is GCP specific. I'm creating this bug here
because I don't know where else to raise this issue. Sorry if this is not the
appropriate place to post it; if there is a more appropriate place, please
tell me and I'll raise the bug there.
If there are any questions, I can provide screenshots or anything you like.
Thank you,
Eugene
was:
Hello, my name is Eugene. This is my first bug I am submitting to the Apache
Beam project!
I'm using the Python SDK on Google Cloud Platform to prototype with Beam locally
inside Datalab and then submit jobs to Dataflow. On GCP, I use logging
essentially as a print statement to check the state of objects. For example,
after a GroupByKey, the grouped elements are in a list when I run Beam locally,
but in Dataflow they will sometimes be a generator. Hence, logging is very
important to me.
When submitting a job to Dataflow using Beam 2.6.0, the logging works
correctly: when I click on a box in the Dataflow job's user interface, I can
see a bunch of logs. However, when submitting a job to Dataflow using Beam
2.7.0 or 2.8.0, the logs do not appear; it is as if logging never happened.
My guess is that this logging problem is specific to Dataflow/GCP and might not
apply to the other runners. In that case, I'm creating this bug here since I
don't know where else to raise this issue.
If there are any questions, I can provide screenshots or anything you like.
Thank you,
Eugene
> Logging on Dataflow for Python's Beam 2.7.0 and 2.8.0 not working
> -----------------------------------------------------------------
>
> Key: BEAM-6107
> URL: https://issues.apache.org/jira/browse/BEAM-6107
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Affects Versions: 2.7.0, 2.8.0
> Reporter: Eugene Huang
> Assignee: Tyler Akidau
> Priority: Minor
>
> Hello, my name is Eugene. This is my first bug I am submitting to the Apache
> Beam project!
> I'm using the Python SDK on Google Cloud Platform to prototype with Beam
> locally inside Datalab and then submit jobs to Dataflow. On GCP, I use
> logging essentially as a print statement to check the state of objects. For
> example, after a GroupByKey, the grouped elements are in a list when I run
> Beam locally, but in Dataflow they will sometimes be a generator. Hence,
> logging is very important to me.
>
> When submitting a job to Dataflow using Beam 2.6.0, the logging works
> correctly: when I click on a box in the Dataflow job's user interface, I can
> see a bunch of logs. However, when submitting a job to Dataflow using Beam
> 2.7.0 or 2.8.0, the logs do not appear at all.
> I believe this logging problem is specific to Dataflow/GCP and might not
> apply to the other runners. The reason is that when I look in Stackdriver for
> jobs submitted to Dataflow with Beam versions 2.6.0, 2.7.0, and 2.8.0, all
> the logs exist in Stackdriver, so the log messages were recorded correctly.
> However, in the Dataflow user interface, the logs do not appear for Beam
> 2.7.0 or 2.8.0 when you click on the box for a transformation.
> It appears that this problem is GCP specific. I'm creating this bug here
> because I don't know where else to raise this issue. Sorry if this is not the
> appropriate place to post it; if there is a more appropriate place, please
> tell me and I'll raise the bug there.
> If there are any questions, I can provide screenshots or anything you like.
> Thank you,
> Eugene
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)