[GitHub] spark issue #20532: [SPARK-23353][CORE] Allow ExecutorMetricsUpdate events t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20532 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20532: [SPARK-23353][CORE] Allow ExecutorMetricsUpdate events t...
Github user squito commented on the issue: https://github.com/apache/spark/pull/20532 I agree with @jiangxb1987 ... we already have issues with event logs being too big, as it the driver gets backlogged even writing them out, and then the history server takes a long time to parse those files. There have been recent improvements to that, but doesn't mean we should reintroduce the problem. I'm not saying this doesn't have a use, I'd just like to figure out if this the best way to do it. If it only has one very specific use case for @LantaoJin , then maybe they have an alternative still using public apis, with a custom listener as I suggested. I worry a user might turn this on (why not, more data is better) and then later on hit other scaling challenges and not realize this was the problem. Or if this does have some general use case for all users, then maybe its fine, but I haven't seen that yet. And maybe there is a better way to do that ... do we need another way to get detailed output metrics from executors, that doesn't have some of the scaling challenges of the event log? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20532: [SPARK-23353][CORE] Allow ExecutorMetricsUpdate events t...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20532 I'm still wondering whether event log is supposed to work this way, that as the source for customized analysis. I really feel we shall need some event/metrics logging framework that serves for external analysis, but we really need investigate into more use cases and give a more thoughtful design on that issue. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20532: [SPARK-23353][CORE] Allow ExecutorMetricsUpdate events t...
Github user LantaoJin commented on the issue: https://github.com/apache/spark/pull/20532 Thanks @jerryshao. Changed to simply add a new configuration instead sampling logging. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20532: [SPARK-23353][CORE] Allow ExecutorMetricsUpdate events t...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20532 I would suggest to do it like what we have already done for block update event. Since we already opened a door for block update event, it is also acceptable to leave room for another event. User should be responsible for the big event log file. @jiangxb1987 what is your opinion? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20532: [SPARK-23353][CORE] Allow ExecutorMetricsUpdate events t...
Github user LantaoJin commented on the issue: https://github.com/apache/spark/pull/20532 Thanks everyone. So just close it? Or easily leave an enabled switch like blockUpdated dose? I am all OK. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20532: [SPARK-23353][CORE] Allow ExecutorMetricsUpdate events t...
Github user squito commented on the issue: https://github.com/apache/spark/pull/20532 I can see why you want this sometimes, but I'm trying to figure out if its really valuable for users in general. You could always add a custom listener to log this info. It would go into separate file, not the std event log file, which means you'd have a little more work to do to stitch them together. OTOH that could be a good thing, as it means these history server wouldn't have to parse those extra lines. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20532: [SPARK-23353][CORE] Allow ExecutorMetricsUpdate events t...
Github user LantaoJin commented on the issue: https://github.com/apache/spark/pull/20532 @jerryshao and @jiangxb1987 , thanks for your advice. In 2.1.x, the two Update events (BlockUpdated & ExecutorMetricsUpdate) are all dumb. And in 2.2.x, only BlockUpdated event has a configuration to be logged. So the fair way is just adding an enable configuration to logging ExecutorMetrics for further using. But actually, we are refactoring the heartbeat to report more Executor information to the Driver. And this information will be logged to event log to be analysed. You know the Update events are so mass and they are not sequential. So we don't need all the events to do analysing. For example, we report Heap Usage after GC of all Executors, 10% events also can help us to get the result we want. So just add a configuration "enabled" for ExecutorMetricsUpdate is poorly simple in a big production cluster. BTW, the work I mentioned about will be contributed back to community as well soon. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20532: [SPARK-23353][CORE] Allow ExecutorMetricsUpdate events t...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20532 I agree with @jiangxb1987 . @LantaoJin would you please elaborate the usage scenario of dumping executor metrics to event log? Seems history server doesn't leverage such information necessarily. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20532: [SPARK-23353][CORE] Allow ExecutorMetricsUpdate events t...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20532 I'm also worried that if we want to sample more events in the future, we have to add more configs following this way, which doesn't sound like a perfect choice. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20532: [SPARK-23353][CORE] Allow ExecutorMetricsUpdate events t...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20532 Emmm... in case we want to sample more events, does that means we shall add a new config for each event sampling? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20532: [SPARK-23353][CORE] Allow ExecutorMetricsUpdate events t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20532 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20532: [SPARK-23353][CORE] Allow ExecutorMetricsUpdate events t...
Github user LantaoJin commented on the issue: https://github.com/apache/spark/pull/20532 @jerryshao Could you have a time to help to review? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20532: [SPARK-23353][CORE] Allow ExecutorMetricsUpdate events t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20532 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org