[ 
https://issues.apache.org/jira/browse/YUNIKORN-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833993#comment-17833993
 ] 

Craig Condit commented on YUNIKORN-2532:
----------------------------------------

Feel free to patch your own YuniKorn instance to use the old log format. Given 
we now have the far superior eventing system available in 1.5, I see little 
reason to expend resources to maintain the existing app summary code. In fact, 
I'd be in favor of deprecating the app summary changes for removal in 1.6 (or 
at the latest 1.7). The functionality is duplicated, so we spend unnecessary 
CPU cycles and memory, and clutter the logs, on something that is far better 
handled elsewhere now.

> Resource usage report has an incompatible format change
> -------------------------------------------------------
>
>                 Key: YUNIKORN-2532
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2532
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler
>            Reporter: Yongjun Zhang
>            Priority: Major
>
> A recent change caused the application resource usage report to have a new 
> format.
> Prior to the change, the format was:
> {code:java}
> YK_APP_SUMMARY: {"appID": "adf53ee0-experiment-organicad-94520240-1-1", 
> "submissionTime": 1712169262131, "startTime": 1712169264134, "finishTime": 
> 1712173619983, "user": 
> "system:serviceaccount:spark-operator-02:spark-operator", "queue": 
> "root.queue-large", "state": "Completed", "rmID": "test-cluster", 
> "resourceUsage": 
> {"insttype-1":{"memory":139178200478515200,"pods":1729129,"vcore":5183062000},"insttype-2":{"memory":113789789798400,"pods":1413,"vcore":4239000}},
>  "preemptedResource": {}}
>   {code}
> With the change, the new format is:
> {code:java}
>  2024-04-04T00:33:08.532Z     INFO    core.scheduler.application.usage        
> objects/application_summary.go:60       YK_APP_SUMMARY: {ApplicationID: 
> afa303d0-test-trino-sparksql--20240404-2-1, SubmissionTime: 1712190615461, 
> StartTime: 1712190617496, FinishTime: 1712190788532, User: 
> system:serviceaccount:spark-operator-01:spark-operator, Queue: 
> root.queue-large, State: Completed, RmID: test-cluster, ResourceUsage: 
> TrackedResource{UNKNOWN:pods=177,UNKNOWN:vcore=354000,UNKNOWN:memory=1431454089216},
>  PreemptedResource: TrackedResource{}, PlaceholderResource: 
> TrackedResource{}}{code}
> There are several incompatibilities:
> 1. The class name TrackedResource was not there before; now it is.
> 2. The instance type was outside the resource part before; now it's embedded.
> 3. The instance type was reported correctly before the change; now it's 
> UNKNOWN.
> #3 may be a different issue, but we observed it at the same time.
> I think we should change the format back to the original one, as this is an 
> incompatible change. What do you think, [~wilfreds], [~pbacsko], [~ccondit]?
> Thanks.
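
To make the incompatibility concrete, here is a minimal sketch of how a log 
consumer might have parsed the old format. It is not YuniKorn code: the 
AppSummary struct and the parseSummary helper are hypothetical, with field 
names taken only from the old-format sample in the description. The sketch 
assumes the payload after the YK_APP_SUMMARY marker is plain JSON, which holds 
for the old sample but not for the new one.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// AppSummary is a hypothetical consumer-side type. Its JSON tags mirror the
// keys visible in the old-format sample; it is not a YuniKorn type.
type AppSummary struct {
	AppID          string `json:"appID"`
	SubmissionTime int64  `json:"submissionTime"`
	StartTime      int64  `json:"startTime"`
	FinishTime     int64  `json:"finishTime"`
	User           string `json:"user"`
	Queue          string `json:"queue"`
	State          string `json:"state"`
	RmID           string `json:"rmID"`
	// Old format: instance type -> resource name -> accumulated usage.
	ResourceUsage     map[string]map[string]int64 `json:"resourceUsage"`
	PreemptedResource map[string]map[string]int64 `json:"preemptedResource"`
}

const marker = "YK_APP_SUMMARY: "

// parseSummary finds the marker anywhere in the line (so a logger prefix is
// tolerated) and decodes everything after it as JSON.
func parseSummary(line string) (*AppSummary, error) {
	idx := strings.Index(line, marker)
	if idx < 0 {
		return nil, fmt.Errorf("marker %q not found", marker)
	}
	var s AppSummary
	if err := json.Unmarshal([]byte(line[idx+len(marker):]), &s); err != nil {
		return nil, fmt.Errorf("payload is not JSON: %w", err)
	}
	return &s, nil
}

func main() {
	// Shortened copy of the old-format sample.
	oldLine := `YK_APP_SUMMARY: {"appID": "app-1", "queue": "root.queue-large", ` +
		`"state": "Completed", "resourceUsage": ` +
		`{"insttype-2":{"memory":113789789798400,"pods":1413,"vcore":4239000}}, ` +
		`"preemptedResource": {}}`
	// Shortened copy of the new-format sample, including the logger prefix.
	newLine := `2024-04-04T00:33:08.532Z INFO core.scheduler.application.usage ` +
		`YK_APP_SUMMARY: {ApplicationID: app-2, Queue: root.queue-large, ` +
		`ResourceUsage: TrackedResource{UNKNOWN:pods=177}}`

	if s, err := parseSummary(oldLine); err == nil {
		fmt.Println("old format pods:", s.ResourceUsage["insttype-2"]["pods"])
	}
	if _, err := parseSummary(newLine); err != nil {
		fmt.Println("new format:", err)
	}
}
```

A consumer written this way decodes the old line but returns an error on the 
new one, because the new payload uses unquoted keys and the TrackedResource{...} 
rendering instead of JSON.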



--
This message was sent by Atlassian Jira
(v8.20.10#820010)