Github user squito commented on the issue:
https://github.com/apache/spark/pull/20532
I agree with @jiangxb1987 ... we already have issues with event logs being
too big: the driver gets backlogged just writing them out, and then the
history server takes a long time to parse those files. There have been recent
improvements to that, but that doesn't mean we should reintroduce the problem.
I'm not saying this doesn't have a use, I'd just like to figure out if this
is the best way to do it. If it only has one very specific use case for
@LantaoJin, then maybe they have an alternative still using public APIs, with
a custom listener as I suggested. I worry a user might turn this on (why not,
more data is better) and then later on hit other scaling challenges and not
realize this was the problem.
Or if this does have some general use case for all users, then maybe it's
fine, but I haven't seen that yet. And maybe there is a better way to do that
... do we need another way to get detailed output metrics from executors, one
that doesn't have some of the scaling challenges of the event log?