[jira] [Commented] (YUNIKORN-14) Add rest API to retrieve app/container history info
[ https://issues.apache.org/jira/browse/YUNIKORN-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074331#comment-17074331 ]

Adam Antal commented on YUNIKORN-14:

Thanks for the reviews!

> Add rest API to retrieve app/container history info
> ---
>
> Key: YUNIKORN-14
> URL: https://issues.apache.org/jira/browse/YUNIKORN-14
> Project: Apache YuniKorn
> Issue Type: New Feature
> Components: core - scheduler
> Reporter: Wilfred Spiegelenburg
> Assignee: Adam Antal
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.8
>
> Attachments: Yunikorn_UI.png
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> As part of the web UI we can show application and container history.
> The current pages are mocked up and do not show the real history. Before the
> changes can be made on the web UI side we need to provide the history via a
> REST interface so it can be consumed by the UI.
> All web service code is located in package
> [https://github.com/apache/incubator-yunikorn-core/tree/master/pkg/webservice].
> When running the scheduler locally (from K8shim using "make run"), the REST
> APIs can be accessed via
> * [http://localhost:9080/ws/v1/apps]
> * [http://localhost:9080/ws/v1/queues]
> * [http://localhost:9080/ws/v1/nodes]
> We need to add another endpoint to provide data to yunikorn-web to render the
> app/container history page. Please check with [~akhilpb] for the desired data
> format, etc. That issue is tracked via YUNIKORN-8.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: issues-h...@yunikorn.apache.org
[jira] [Commented] (YUNIKORN-14) Add rest API to retrieve app/container history info
[ https://issues.apache.org/jira/browse/YUNIKORN-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17068549#comment-17068549 ]

Adam Antal commented on YUNIKORN-14:

Had a discussion with [~wilfreds] about this issue offline. The proposed solution is the following:
- Keep the current {{HistoricalClusterInfo}} object to store the data.
- Keep the {{HistoricalPartitionInfoUpdater}} object, but change its implementation so that it pulls the metrics from the existing Prometheus metrics endpoints instead of directly from the scheduler. The web UI will then show the same data a user would see in Grafana using the Prometheus endpoints.
- The settings can be kept the same, but the classes might be moved from the cache package to some other package.

> Add rest API to retrieve app/container history info
> ---
>
> Key: YUNIKORN-14
> URL: https://issues.apache.org/jira/browse/YUNIKORN-14
> Project: Apache YuniKorn
> Issue Type: New Feature
> Components: core - scheduler
> Reporter: Wilfred Spiegelenburg
> Assignee: Adam Antal
> Priority: Blocker
> Labels: pull-request-available
> Attachments: Yunikorn_UI.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
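To illustrate the second bullet of the comment above, here is a minimal Go sketch of reading a value out of Prometheus text exposition output instead of asking the scheduler directly. It is not the YuniKorn implementation: the function name {{parseGauge}} and the metric name {{example_running_apps}} are made up for illustration, and the parser only handles samples without labels.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseGauge scans Prometheus text exposition output and returns the value of
// the first unlabelled sample whose metric name matches exactly.
// Hypothetical helper; labelled samples such as name{a="b"} are not handled.
func parseGauge(body, name string) (float64, bool) {
	scanner := bufio.NewScanner(strings.NewReader(body))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		// Skip blank lines and the "# HELP" / "# TYPE" comment lines.
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) < 2 || fields[0] != name {
			continue
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return 0, false
		}
		return v, true
	}
	return 0, false
}

func main() {
	// Assumed sample payload; a real updater would fetch this over HTTP
	// from the metrics endpoint.
	sample := "# HELP example_running_apps Hypothetical gauge.\n" +
		"# TYPE example_running_apps gauge\n" +
		"example_running_apps 7\n"
	if v, ok := parseGauge(sample, "example_running_apps"); ok {
		fmt.Println("running apps:", v)
	}
}
```

With this shape the history updater and Grafana would read the same numbers, which is the point of the proposal.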
[jira] [Commented] (YUNIKORN-14) Add rest API to retrieve app/container history info
[ https://issues.apache.org/jira/browse/YUNIKORN-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17067390#comment-17067390 ]

Wilfred Spiegelenburg commented on YUNIKORN-14:
---

Providing two sets of metrics that are not in sync or can show highly different numbers is a bad idea. We'll get questions and new jiras raised about the fact that the provided web UI is out of sync with the metrics collected.

We should either leverage the metrics implementation or not provide the web UI metrics. Doing two things is asking for problems.

> Add rest API to retrieve app/container history info
> ---
>
> Key: YUNIKORN-14
> URL: https://issues.apache.org/jira/browse/YUNIKORN-14
> Project: Apache YuniKorn
> Issue Type: New Feature
> Components: core - scheduler
> Reporter: Wilfred Spiegelenburg
> Assignee: Adam Antal
> Priority: Major
> Labels: pull-request-available
> Attachments: Yunikorn_UI.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
[jira] [Commented] (YUNIKORN-14) Add rest API to retrieve app/container history info
[ https://issues.apache.org/jira/browse/YUNIKORN-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17067350#comment-17067350 ]

Weiwei Yang commented on YUNIKORN-14:
-

Hi [~wilfreds]

Thanks for the comments.

The history info here just provides very basic info about the cluster, e.g. the number of containers/apps in the last 12h. I think we can leverage this simple solution to give users a basic impression. For comprehensive metrics we have the Prometheus integration, so we can push those to its store for persistence.

Here, we just need a small time-bound cache like the one [~adam.antal] has implemented. It is a pull mode, but that's fine. We do the pull once per minute (or maybe every 30s). Since the data is cached, requests from the web UI never lock the partition, no matter how many there are, so they cannot damage scheduler performance. When a write happens, it simply gets the data from the partition without any calculation, so the impact is trivial.

The push mode is the Prometheus metrics, which we already have, so I don't think we need to build anything similar.

> Add rest API to retrieve app/container history info
> ---
>
> Key: YUNIKORN-14
> URL: https://issues.apache.org/jira/browse/YUNIKORN-14
> Project: Apache YuniKorn
> Issue Type: New Feature
> Components: core - scheduler
> Reporter: Wilfred Spiegelenburg
> Assignee: Adam Antal
> Priority: Major
> Labels: pull-request-available
> Attachments: Yunikorn_UI.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
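The time-bound cache described in the comment above could look roughly like the following Go sketch. It is only an illustration under assumptions: the names {{Sample}}, {{History}}, {{Store}}, {{Snapshot}} and {{Run}} are hypothetical and are not taken from the PR, and the {{pull}} callback stands in for whatever reads the totals from the partition.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Sample is one snapshot of cluster totals (hypothetical fields).
type Sample struct {
	At         time.Time
	Apps       int
	Containers int
}

// History keeps at most limit samples; the oldest sample is dropped first.
type History struct {
	mu      sync.RWMutex
	limit   int
	samples []Sample
}

func NewHistory(limit int) *History {
	return &History{limit: limit}
}

// Store appends a sample and trims the cache back to the limit.
func (h *History) Store(s Sample) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.samples = append(h.samples, s)
	if len(h.samples) > h.limit {
		h.samples = h.samples[len(h.samples)-h.limit:]
	}
}

// Snapshot returns a copy, so web requests only touch the cache lock
// and never the partition.
func (h *History) Snapshot() []Sample {
	h.mu.RLock()
	defer h.mu.RUnlock()
	out := make([]Sample, len(h.samples))
	copy(out, h.samples)
	return out
}

// Run calls pull once per interval and stores the result until stop is closed.
func (h *History) Run(interval time.Duration, pull func() Sample, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			h.Store(pull())
		case <-stop:
			return
		}
	}
}

func main() {
	// 720 entries covers 12 hours at one sample per minute.
	h := NewHistory(720)
	h.Store(Sample{At: time.Now(), Apps: 3, Containers: 12})
	fmt.Println("cached samples:", len(h.Snapshot()))
}
```

In this shape the partition is read once per tick by the single {{Run}} goroutine, while any number of REST requests are served from {{Snapshot}}.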
[jira] [Commented] (YUNIKORN-14) Add rest API to retrieve app/container history info
[ https://issues.apache.org/jira/browse/YUNIKORN-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17067342#comment-17067342 ]

Wilfred Spiegelenburg commented on YUNIKORN-14:
---

I looked at the PR and want to propose a different approach as I see a number of issues. I have mentioned tracking application details in the text but I am not sure if that is needed in the first instance. It would still fit in the design if we want to add that in a second step.

History should be part of {{common}} or the {{scheduler}}, not the {{cache}}, I think.

I would expect that we have multiple generic collectors that can collect history data. One generic collector is started per partition, like the {{PartitionManager}}, in its own goroutine. History and all tracking is always per partition and will not go over that level at any point.

The current implementation uses a pull mechanism to collect the data from the partition. That requires locking the partition on retrieval (locks are currently missing in the solution) and could thus impact scheduling performance if the web interface gets lots of requests. We should not need to impact the partition to retrieve the history. The data should be kept in the collector and retrieved from there.

A change going deeper: why is the history just getting top level partition data? Getting info out for queues or nodes is as important going forward.

I also see an omission here: we lose history data as soon as we remove the partition. It will thus not show us real history for a time period, just the history for the current state going back a fixed time. That would become even more important when we look at queues, nodes or applications. If we go forward we need to be able to track and maintain the history data for a period of time independent of the removal of the partition/node/queue/application.

Tracking history should not be limited by the number of entries but by the time range that we need to keep (24 hours as an example). Having a history entry per minute is the minimum we need. Maybe we even need to go to a 30 or 15 second split. Longer periods mean we could too easily miss short running containers or applications.

The other solution would be to use a push from the different tracked objects into a channel that is read by the history collector. That would mean we do not miss info, but the implementation becomes a bit trickier. We can still sum up to give stats per time range, but that would then become easier to manage for small intervals. That would also not be "on demand" but based on an internal timing of the history collector.

All changes for things we need to track run through the partition info already, so we would just need to instrument one object to keep track of all these things.

Thoughts?

> Add rest API to retrieve app/container history info
> ---
>
> Key: YUNIKORN-14
> URL: https://issues.apache.org/jira/browse/YUNIKORN-14
> Project: Apache YuniKorn
> Issue Type: New Feature
> Components: core - scheduler
> Reporter: Wilfred Spiegelenburg
> Assignee: Adam Antal
> Priority: Major
> Labels: pull-request-available
> Attachments: Yunikorn_UI.png
>
> Time Spent: 10m
> Remaining Estimate: 0h