[
https://issues.apache.org/jira/browse/FLINK-12953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875930#comment-16875930
]
Chad Dombrova commented on FLINK-12953:
---------------------------------------
{quote}Does the web UI you showed is not comes from Apache Beam? You said it
comes from Google Dataflow which is a product of Google Cloud. So, does the web
UI of beam support this feature in your gif? Sorry, I did not use Beam before.
{quote}
No, Beam is a streaming API that is abstracted to work on many different
runners, of which Flink and Dataflow are two. Beam does not come with any web
UI of its own.
{quote}As you mentioned in your comment, it integrated with Stackdriver. IMO
Flink should not support this feature, because it overflows the scope of one
project
{quote}
I agree with this. I mentioned it primarily to demonstrate what's possible with
a logging database such as Stackdriver or Elasticsearch.
{quote}Currently, Flink shows log based on JM/TM view, you need to view the log
for one job from the job view. Do you consider collect those logs print by
tasks in many TM into one place(e.g. one web page)?
{quote}
A collected view is not as important to me as a per-job view, so if I had to
choose only one, it'd be the latter.
{quote}Some people against our opinion(FLINK-11202) because of event
correlation, their thought is event correlation is important for locating
issues. Actually, it's true. So it's a balance. It seems a good way is that we
stay the same, but integrate with some other systems like Elasticsearch to do
these enhanced features you listed.
{quote}
The problem we currently face is that we have a desire for multiple views of
our logged data, but our data source – log files on disk – does not permit it.
So we must choose a file layout that makes the most sense for the most users,
or adopt something like Elasticsearch which would allow us to query and view
data however we like.
I can fully understand why the developers of Flink would not want to make
Elasticsearch a requirement for running their web stack. It's a big dependency
to take on, but it is the de facto solution for this kind of problem for a
reason.
One compromise that would avoid the need for ES is to do something like
the following:
- provide each job with its own log file
- do a simple concatenation of job logs in the Task Manager log view, and
optionally provide some means of choosing multiple jobs to view (like a name
filter)
It's not quite the same as what we have now since messages from multiple jobs
would not be interleaved, but it at least puts all the logs in the same place.
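To make the compromise concrete, here is a minimal sketch of the concatenation step. It is not Flink code: the function name concat_job_logs, the one-file-per-job layout (<job-name>.log in a single directory), and the "====" separator lines are all assumptions made for illustration. The name filter is a shell-style glob.

```python
import fnmatch
from pathlib import Path
from typing import Iterator


def concat_job_logs(log_dir: Path, name_pattern: str = "*") -> Iterator[str]:
    """Yield the lines of every per-job log whose job name matches name_pattern.

    Assumes a hypothetical layout of one file per job, named <job-name>.log.
    Logs are emitted job by job (sorted by file name), so messages from
    different jobs are never interleaved.
    """
    for log_file in sorted(log_dir.glob("*.log")):
        job_name = log_file.stem
        # Case-sensitive glob match on the job name, e.g. "wordcount*".
        if not fnmatch.fnmatchcase(job_name, name_pattern):
            continue
        # Separator line so the reader can tell where each job's log starts.
        yield f"==== {job_name} ===="
        with log_file.open(encoding="utf-8") as handle:
            for line in handle:
                yield line.rstrip("\n")
```

A Task Manager log view could then render list(concat_job_logs(log_dir, "wordcount*")) to show only the jobs whose names match. Ordering within each job is preserved, but timestamps across jobs are not merged, which is the trade-off described above.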
> View logs from Job view in Web Dashboard
> ----------------------------------------
>
> Key: FLINK-12953
> URL: https://issues.apache.org/jira/browse/FLINK-12953
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Web Frontend
> Reporter: Chad Dombrova
> Priority: Major
> Attachments: dataflow-log-ux.gif
>
>
> As a (Beam) developer I want to be able to print/log information from my
> custom transforms, and then monitor that output within the job view of the
> Web Dashboard, so that I don't have to go hunting through the combined log in
> the Task Manager view. The Task Manager log has way too much in it,
> spanning all jobs, including output logged by both Flink and user code.
> A good example of how this UX should work can be found in Google Dataflow:
> - click on a job, and see the logged output for that job
> - click on a transform, and see the logged output for just that transform
> thanks!
> Edited: changed Job Manager to Task Manager