Re: YuniKorn community meetup: Santa Clara, CA

2022-10-03 Thread Chenya Zhang
Big +1!

I'll be in the Bay Area the week of Oct 17th as well, and I can help Wilfred
organize this meetup.

Let me check if there are topics of interest from our side and share them
here. This is very exciting!

Best,
Chenya

On Fri, Sep 30, 2022 at 4:38 PM Weiwei Yang  wrote:

> That sounds great!!
> Can we start to collect the topics and finalize the date ASAP?
>
> Thanks
> Weiwei
>
> On Thu, Sep 29, 2022 at 10:38 PM Wilfred Spiegelenburg <
> wilfr...@apache.org>
> wrote:
>
> > Hi all,
> >
> > There has not been a community meetup in a while. With ApacheCon next
> > week in New Orleans, I will be in the USA for a while.
> >
> > I would like to propose a meetup in Santa Clara, California in the 3rd
> > week of October. That is the week of 17-21 October. Cloudera is willing to
> > host and organise the meetup at their office.
> > We can work on the exact content over the next weeks.
> >
> > Anyone is welcome, please let me know what you think.
> >
> > Wilfred
> >
>


Re: [ANNOUNCE] Apache YuniKorn v1.0.0 release

2022-05-05 Thread Chenya Zhang
Big Congrats!!! :D

On Thu, May 5, 2022 at 3:20 PM Sunil Govindan  wrote:

> Kudos!
>
> This is a great milestone for the YuniKorn community. Thank you.
>
> Sunil
>
> On Wed, May 4, 2022 at 4:58 PM Chaoran Yu  wrote:
>
> > Congrats! Amazing milestone!
> >
> >
> > On Wed, May 4, 2022 at 1:45 PM Weiwei Yang  wrote:
> >
> > > Congrats, and thanks for everyone's contribution!!
> > >
> > > On Wed, May 4, 2022 at 1:31 PM Wilfred Spiegelenburg <
> > wilfr...@apache.org>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > It gives me great pleasure to announce that the Apache YuniKorn
> > community
> > > > has
> > > > voted to release Apache YuniKorn v1.0.0.
> > > >
> > > > Apache YuniKorn v1.0.0 is the first release for the project as an
> > > > Apache top level project. It also marks a major milestone as the
> first
> > > > major release.
> > > >
> > > > It contains 173 fixes and improvements. The release details, list of
> > > > major features and incompatible changes are on the v1.0.0
> announcement
> > > > page [1].
> > > >
> > > > > You can also download the release from the Downloads page [2].
> > > >
> > > > Many thanks to everyone who contributed to the release. This release
> > > > is a direct result of your great contributions.
> > > >
> > > > Wilfred
> > > >
> > > > [1] https://yunikorn.apache.org/release-announce/1.0.0
> > > > [2] https://yunikorn.apache.org/download
> > > >
> > > > -
> > > > To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
> > > > For additional commands, e-mail: dev-h...@yunikorn.apache.org
> > > >
> > > >
> > >
> >
>


[jira] [Created] (YUNIKORN-1071) Update website for YuniKorn Meetup in Feb 2022

2022-02-07 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-1071:
--

 Summary: Update website for YuniKorn Meetup in Feb 2022
 Key: YUNIKORN-1071
 URL: https://issues.apache.org/jira/browse/YUNIKORN-1071
 Project: Apache YuniKorn
  Issue Type: Sub-task
Reporter: Chenya Zhang
Assignee: Chenya Zhang


Update the "Events" section of the website.

https://yunikorn.apache.org/community/events



--
This message was sent by Atlassian Jira
(v8.20.1#820001)




Re: [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC4

2022-01-30 Thread Chenya Zhang
+1 (binding)

Thanks Craig for the efforts!

- Build from source
- Run unit tests
- Run sample workloads
- Checked logs and K8s messages
- Verified the Web UI
- Verified REST endpoints
- Checked no unexpected files

Best,
Chenya

On Fri, Jan 28, 2022 at 4:34 AM Wilfred Spiegelenburg 
wrote:

> +1 (binding)
>
> Checked the following:
> * no unexpected binaries or files
> * signature confirmed
> * hash confirmed
> * build from source
> * deployed into kind 1.22.4 using helm charts and local build
> * checked REST calls
> * checked Web UI
> * killed scheduler pod and made sure it was restarted and available
>
> Wilfred
>
> On Fri, 28 Jan 2022 at 18:21, Weiwei Yang  wrote:
> >
> > +1 (binding)
> >
> > - Build the docker images from the source
> > - Verified the image SHAs are at the correct commit
> > - Install on a local cluster with helm charts, verified installation is
> good
> > - Run simple job and verify the K8s events
> >
> >
> > On Thu, Jan 27, 2022 at 9:04 PM Sunil Govindan 
> wrote:
> >
> > > +1 (binding)
> > >
> > > Thanks Craig for the efforts
> > >
> > >
> > >- Verified checksum and signature
> > >- compiled and built the binaries from source code
> > >- brought a yunikorn cluster locally
> > >- Ran basic jobs
> > >
> > >
> > > Thanks
> > > Sunil
> > >
> > > On Wed, Jan 26, 2022 at 10:11 AM Craig Condit 
> > > wrote:
> > >
> > > > Sorry, description should say RC4.
> > > >
> > > >
> > > > > > On Jan 26, 2022, at 12:10 PM, Craig Condit wrote:
> > > > >
> > > > > Hello everyone,
> > > > >
> > > > > I’d like to call a vote for releasing Apache YuniKorn (incubating)
> > > > 0.12.2 RC3.
> > > > >
> > > > > > The release artifacts have been uploaded here:
> > > > > > https://dist.apache.org/repos/dist/dev/incubator/yunikorn/0.12.2-rc4/
> > > > > >
> > > > > > My public key is located here:
> > > > > > https://downloads.apache.org/incubator/yunikorn/KEYS
> > > > > >
> > > > > > JIRA issues that have been resolved in this release:
> > > > > > https://issues.apache.org/jira/issues/?filter=12351270
> > > > >
> > > > > Git tags for each component are as follows:
> > > > >
> > > > > incubator-yunikorn-scheduler-interface: v0.12.2-1
> > > > > incubator-yunikorn-core: v0.12.2-1
> > > > > incubator-yunikorn-k8shim: v0.12.2-4
> > > > > incubator-yunikorn-web: v0.12.2-1
> > > > > > https://github.com/apache/incubator-yunikorn-release: v0.12.2-4
> > > > >
> > > > > > Once the release is voted on and approved, all repos will be tagged
> > > > 0.12.2 for consistency.
> > > > >
> > > > > Please review and vote. The vote will be open for at least 72
> hours and
> > > > closes on Monday, January 31 2022, 1pm PDT.
> > > > >
> > > > > [ ] +1 Approve
> > > > > [ ] +0 No opinion
> > > > > [ ] -1 Disapprove (and the reason why)
> > > > >
> > > > >
> > > > > Thank you,
> > > > > Craig
> > > >
> > > >
> > > > -
> > > > To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
> > > > For additional commands, e-mail: dev-h...@yunikorn.apache.org
> > > >
> > > >
> > >
>
>
>
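
Several voters in the thread above report confirming the hash of the release artifacts. A minimal sketch of that check in Python, assuming the artifact ships with a `.sha512` sidecar file whose first whitespace-separated token is the hex digest. The file layout and the helper names `sha512_of` and `checksum_matches` are hypothetical and do not come from the thread:

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact: Path, checksum_file: Path) -> bool:
    """Compare the artifact's digest with the first token of the checksum file.

    Assumption: the checksum file looks like "<hex digest>  <file name>".
    """
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_of(artifact) == expected
```

This covers only the hash. The signature check the voters mention needs GPG and the project KEYS file and is not shown here.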


Re: Apache YuniKorn (Incubating) - Community Graduation Vote

2022-01-25 Thread Chenya Zhang
+1 to graduate YuniKorn from the ASF
incubator and become a top-level Apache project!

Best,
Chenya

On Tue, Jan 25, 2022 at 9:09 PM Weiwei Yang  wrote:

> Hi YuniKorn community and mentors
>
> Based on the discussion thread [1], after 2 years time of incubating, it is
> considered that now is a good time to graduate YuniKorn from the ASF
> incubator and become a top-level Apache project. We have reviewed the ASF
> project maturity model [2] and provided some assessment of the project's
> maturity based on the guidelines. Details are included as the following. I
> have enough reasons to believe the project has done sustainable development
> successfully in the Apache way. Please read this and add your vote by
> replying to this email, your feedback will be much appreciated!!! Note,
> this vote is not just for committers or PPMC members, we welcome anyone in
> the community to vote, thanks!
>
> *Code, License, and Copyright*
>
> All code is maintained on GitHub, under the Apache 2.0 license. We have
> reviewed all the dependencies and ensured they do not bring any license
> issues. All the status files, license headers, and copyright are up to
> date.
>
> *Release*
>
> The community has made 5 releases in the past 2 years, i.e., v0.8, v0.9,
> v0.10, v0.11, and v0.12. These releases were done by 5 different release
> managers [3], which indicates the community can create releases independently.
> We also have a well-documented release process and automated tools to help new
> release managers with the process.
>
> *Quality*
>
> The community has developed a comprehensive CI/CD pipeline to guard
> code quality. The pipeline runs per-commit license checks, code-format
> checks, code-coverage checks, unit tests, and end-to-end tests. All of these are built
> as automated GitHub Actions, so new contributors can easily trigger them and view
> results when submitting patches.
>
> *Community*
>
> The community has developed an easy-to-read homepage for the project [4].
> The website hosts all the materials related to the project, including
> versioned documentation, user docs, developer docs, design docs, and
> performance docs. It provides top-level navigation to the software
> download page, which links to all our previous releases. It also has
> pages to help new contributors on-board with the project, such as how to
> join community meetings, event links, etc.
>
> The community shows appreciation to all contributors and welcomes all kinds
> of contributions (not just code). We have built an open, diverse
> community and gathered many people to work together. With that, we have 41
> unique code contributors and some non-code contributors as well. Many of
> them have become committers and PPMC members while working with the
> community. There were 2 new mentors, 8 new committers, and 4 new PPMC members
> from 6 different organizations [5] added in the incubating phase. In total,
> the project has 6 mentors, 23 PPMC members, and 29 committers from at least 14
> different organizations. All this information is available on the
> project website, including some guidelines to help people become
> committers or PPMC members [6]. Community collaboration is done in a
> public, open manner: we use regular bi-weekly/weekly community
> meetings for 2 different timezones [7], dev/user Slack channels, and mailing
> lists for offline discussions.
>
> *Independence*
>
> The project was initially donated by Cloudera, but with a diverse open
> source community, it has been operated as an independent project since it
> entered into ASF incubator. The committers and PPMC members are a group of
> passionate people from at least 14 different organizations, such as
> Alibaba, Apple, Cloudera, Databricks, LinkedIn, Microsoft, Snowflake, etc.
> The project's success does not depend on any single entity.
>
> [1] https://lists.apache.org/thread/dno411y59g2pcy1d3kd7s3kdjz9jw65n
> [2]
> https://community.apache.org/apache-way/apache-project-maturity-model.html
>
> [3] https://yunikorn.apache.org/community/download
> [4] https://yunikorn.apache.org/
> [5] https://incubator.apache.org/projects/yunikorn.html
> [6] https://yunikorn.apache.org/community/people
>
> [7]
>
> https://docs.google.com/document/d/165gzC7uhcKc5XDWiMYSRKBiPQBy2tDtXADUPuhGlUa0
>


Re: Apache YuniKorn (Incubating) - Community Graduation Vote

2022-01-25 Thread Chenya Zhang
Thanks Sunil! BTW, I followed https://infra.apache.org/committer-email.html
to configure my Gmail settings to send from the Apache email address. I
originally used my Gmail address to subscribe to the private list and didn't
receive a confirmation/error email. Hope others won't encounter the same
problem.

Best,
Chenya

On Tue, Jan 25, 2022 at 10:03 AM Sunil Govindan  wrote:

> Yes. I reached out to a couple of them and added them to the list.
> We still have a gap. Will try again.
>
> Thanks
>  Sunil
>
> On Mon, Jan 24, 2022 at 10:45 PM Chenya Zhang wrote:
>
> > Subscribed to the private list! 3 -> 2 ? :)
> >
> > On Mon, Jan 24, 2022 at 6:02 PM Wilfred Spiegelenburg <
> wilfr...@apache.org
> > >
> > wrote:
> >
> > > When you check the status page [1] you will see that a wiki is no
> > > longer required.
> > > We can skip adding it.
> > >
> > > BTW: I added Chenya to the roster, which increases the number of PPMC
> > > members not subscribed to 3 again after it was down to 2.
> > >
> > > Wilfred
> > >
> > > [1] https://incubator.apache.org/projects/yunikorn.html
> > >
> > >
> > > On Mon, 24 Jan 2022 at 13:12, Weiwei Yang  wrote:
> > > >
> > > > Hi Sunil
> > > >
> > > > > I don’t think we ever had a wiki, do we still need to add that? I see
> > > > > some projects leave that empty as well.
> > > >
> > > > Sent from my iPhone
> > > >
> > > > > On Jan 23, 2022, at 2:07 PM, Sunil Govindan 
> > wrote:
> > > > >
> > > > > @Weiwei Yang 
> > > > > Could you please add WIKI as well to this?
> > > > >
> > > > > Thanks
> > > > > Sunil
> > > > >
> > > > >> On Sun, Jan 23, 2022 at 1:33 PM Weiwei Yang 
> > wrote:
> > > > >>
> > > > >> Thank you Felix.
> > > > > >> I have added the initial podling status file:
> > > > > >> https://svn.apache.org/repos/asf/incubator/public/trunk/content/podlings/yunikorn.yml
> > > > > >> Please let me know if that looks good or not.
> > > > >>
> > > > >>> On Sat, Jan 22, 2022 at 10:18 PM Sunil Govindan <
> sun...@apache.org
> > >
> > > wrote:
> > > > >>>
> > > > >>> I will reach out to them.
> > > > >>>
> > > > >>> Thanks
> > > > >>> Sunil
> > > > >>>
> > > > >>> On Sat, Jan 22, 2022 at 9:00 PM Felix Cheung <
> > > felixcheun...@hotmail.com>
> > > > >>> wrote:
> > > > >>>
> > > > > >>>> Pls add the podling status file
> > > > > >>>> https://svn.apache.org/repos/asf/incubator/public/trunk/content/podlings/
> > > > > >>>>
> > > > > >>>> 3 ppmc members have not subscribed to private@
> > > > > >>>>
> > > > > >>>> These can be found on
> > > > > >>>> https://whimsy.apache.org/roster/ppmc/yunikorn
> > > > > >>>>
> > > > >>>> From: Weiwei Yang 
> > > > >>>> Sent: Thursday, January 20, 2022 10:05:55 PM
> > > > >>>> To: dev@yunikorn.apache.org 
> > > > >>>> Cc: priv...@yunikorn.apache.org 
> > > > >>>> Subject: Re: Apache YuniKorn (Incubating) - Community Graduation
> > > Vote
> > > > >>>>
> > > > >>>> hi all
> > > > >>>>
> > > > >>>> Most issues under the graduation preparation JIRA YUNIKORN-1005
> > > > >>>> <https://issues.apache.org/jira/browse/YUNIKORN-1005> are
> fixed.
> > > > >>>> The remaining one is the who-are-we web page, I am currently
> > > collecting
> > > > >>>> info for that, should be done by next week.
> > > > >>>> S

Re: Apache YuniKorn (Incubating) - Community Graduation Vote

2022-01-24 Thread Chenya Zhang
w of what is expected on graduation.
> > >>>>>
> > >>>>> Wilfred
> > >>>>>
> > >>>>> [1] https://www.apache.org/security/
> > >>>>>
> > >>>>> On Tue, 11 Jan 2022 at 18:21, Weiwei Yang  wrote:
> > >>>>>>
> > >>>>>> Hi Wilfred
> > >>>>>>
> > >>>>>> Adding a security@ mailing list sounds like a good idea, but I do not
> > >>>>>> think that is required at the current stage.
> > >>>>>> We can do that post-graduation. For now, the Apache security doc says:
> > >>>>>>
> > >>>>>>> We strongly encourage you to report potential security vulnerabilities
> > >>>>>>> to one of our private security mailing lists first, before disclosing
> > >>>>>>> them in a public forum.
> > >>>>>>
> > >>>>>> I do not see any issue if we use our private@ mailing list for this
> > >>>>>> purpose.
> > >>>>>>
> > >>>>>> On Mon, Jan 10, 2022 at 11:01 PM Wilfred Spiegelenburg <
> > >>>>> wilfr...@apache.org> wrote:
> > >>>>>>>
> > >>>>>>> The private@ is a moderated list. This has two issues: a
> > >> moderator
> > >>>>>>> needs to approve any message not sent by a PMC member. This will
> > >>> slow
> > >>>>>>> down the process of interaction with the reporter. It would also
> > >> not
> > >>>>>>> reach the YuniKorn committers group as not all committers are
> part
> > >>> of
> > >>>>>>> the PMC. Security issues should be handled and worked on by all
> > >>>>>>> committers not just by the PMC members.
> > >>>>>>>
> > >>>>>>> The security notification update made to the website I think does
> > >>> not
> > >>>>>>> line up with the security guidelines referenced in the link
> > >> provided
> > >>>>>>> in the dropdown menu of the YuniKorn site [1]. In that link there
> > >>> is a
> > >>>>>>> well defined way to report security issues. If we need to enhance
> > >>> and
> > >>>>>>> extend what we do we either establish a security@ mailing list
> > >> and
> > >>>>>>> provide a static page with security related information on our
> > >> site
> > >>> or
> > >>>>>>> we leave it as is. My preference would be to establish a
> security@
> > >>>>>>> list and make all committers a member of that list.
> > >>>>>>>
> > >>>>>>> I think we need to roll back the website changes part of
> > >>> YUNIKORN-1006
> > >>>>>>> [2] in PR [3] for the website.
> > >>>>>>>
> > >>>>>>> Wilfred
> > >>>>>>>
> > >>>>>>> [1] https://www.apache.org/security/
> > >>>>>>> [2] https://issues.apache.org/jira/browse/YUNIKORN-1006
> > >>>>>>> [3] https://github.com/apache/incubator-yunikorn-site/pull/105
> > >>>>>>>
> > >>>>>>> On Tue, 11 Jan 2022 at 04:45, Holden Karau wrote:
> > >>>>>>>>
> > >>>>>>>> For "The project provides a well-documented, secure and private
> > >>>>>>>> channel to report security issues, along with a documented way of
> > >>>>>>>> responding to them." the standard that I've seen used is to tell people
> > >>>>>>>> to e-mail private@ when they think they might have a security related
> > >>>>>>>> issue. I think that would probably work well for YuniKorn too.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On Mon, Jan 10, 2022 at 7:04 AM Chenya Zhang <
> > >>>>> chenyazhangche...@gmail.com> wrote:
> > >>>>>>>>>

Re: [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC3

2022-01-22 Thread Chenya Zhang
+1

- Build from source
- Run unit tests
- Run sample workloads
- Checked logs and K8s messages
- Verified the Web UI
- Verified REST endpoints
- Checked no unexpected files

Best,
Chenya

On Fri, Jan 21, 2022 at 11:18 PM Weiwei Yang  wrote:

> +1 (binding)
>
> - Build from source
> - Deploy the scheduler using helm charts
> - Run sample jobs
> - Verify the bug that has been fixed
>
> PS: since this is just adding a bug fix on top of RC2, I did not repeat all
> the validation steps I have done for RC2
>
> On Fri, Jan 21, 2022 at 12:28 PM Chaoran Yu 
> wrote:
>
> > +1 (binding)
> >
> > - Built images from source
> > - Deployed using the Helm chart on a k8s cluster
> > - Verified basic pod scheduling and Spark scheduling
> > - Verified that the last-minute bug in RC2 has been fixed.
> >
> > Thanks Craig for the quick effort!
> > Chaoran
> >
> >
> > On Fri, Jan 21, 2022 at 12:12 PM Sunil Govindan 
> wrote:
> >
> > > +1 (binding)
> > >
> > > - Built from source
> > > - Deployed on local k8s cluster
> > > - Ran basic jobs
> > > - Verified checksum and signature
> > > - UI looks good.
> > >
> > > Thanks
> > > Sunil
> > >
> > > On Fri, Jan 21, 2022 at 10:23 AM Craig Condit 
> > > wrote:
> > >
> > > > Hello everyone,
> > > >
> > > > I’d like to call a vote for releasing Apache YuniKorn (incubating)
> > 0.12.2
> > > > RC3.
> > > >
> > > > > The release artifacts have been uploaded here:
> > > > > https://dist.apache.org/repos/dist/dev/incubator/yunikorn/0.12.2-rc3/
> > > > >
> > > > > My public key is located here:
> > > > > https://downloads.apache.org/incubator/yunikorn/KEYS
> > > > >
> > > > > JIRA issues that have been resolved in this release:
> > > > > https://issues.apache.org/jira/issues/?filter=12351270
> > > >
> > > > Git tags for each component are as follows:
> > > >
> > > > incubator-yunikorn-scheduler-interface: v0.12.2-1
> > > > incubator-yunikorn-core: v0.12.2-1
> > > > incubator-yunikorn-k8shim: v0.12.2-3
> > > > incubator-yunikorn-web: v0.12.2-1
> > > > https://github.com/apache/incubator-yunikorn-release: v0.12.2-3
> > > >
> > > > > Once the release is voted on and approved, all repos will be tagged
> > 0.12.2
> > > > for consistency.
> > > >
> > > > Please review and vote. The vote will be open for at least 72 hours
> and
> > > > closes on Tuesday, January 25 2022, 1pm PDT.
> > > >
> > > > [ ] +1 Approve
> > > > [ ] +0 No opinion
> > > > [ ] -1 Disapprove (and the reason why)
> > > >
> > > >
> > > > Thank you,
> > > > Craig
> > >
> >
>
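
The thread above mixes binding and non-binding votes. A minimal sketch of how such a tally could be evaluated, assuming the usual ASF majority-approval rule for releases (at least three binding +1 votes and more binding +1 than -1). The `Vote` type and `release_vote_passes` helper are hypothetical names used for illustration, and the sketch models only the tally on this list, not any later approval step:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int      # +1, 0, or -1
    binding: bool   # True when the voter marked the vote "(binding)"


def release_vote_passes(votes: list[Vote]) -> bool:
    """Majority approval: >= 3 binding +1 votes and more binding +1 than -1."""
    binding = [v for v in votes if v.binding]
    plus = sum(1 for v in binding if v.value > 0)
    minus = sum(1 for v in binding if v.value < 0)
    return plus >= 3 and plus > minus


# Votes as they appear in the RC3 thread above.
rc3_votes = [
    Vote("Sunil Govindan", +1, True),
    Vote("Chaoran Yu", +1, True),
    Vote("Weiwei Yang", +1, True),
    Vote("Chenya Zhang", +1, False),  # sent as a plain "+1"
]
print(release_vote_passes(rc3_votes))
```

Non-binding votes are kept in the list for the record but do not count toward the threshold.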


Re: Add Chinese translation for documents

2022-01-16 Thread Chenya Zhang
Thanks Xiang for this wonderful contribution! It can help many more folks
to learn about YuniKorn and join our community!

I will also help to review your PRs. Cheers

On Sun, Jan 16, 2022 at 11:23 AM Weiwei Yang  wrote:

> Hi Chen Xiang
>
> This is so awesome!!! Thank you so much for working on this!!
> We have some people in the committer/PPMC list who can speak Chinese, we
> will help to review.
> Thank you again for creating the PR and raising this on the mailing
> list, great work!
>
>
> On Sun, Jan 16, 2022 at 4:33 AM 陈 翔  wrote:
>
> > Hi everyone
> >
> > I spent a lot of time in the second half of last year and this year
> > trying to translate YuniKorn's website into Chinese. So far I have
> > almost completed the main web pages and part of the docs; the other parts
> > may be completed before 1.0.
> >
> > I have submitted the completed part as a PR:
> > https://github.com/apache/incubator-yunikorn-site/pull/108 . I hope some
> > PPMC members and committers will have time to review it and see what needs
> > to be improved. Your valuable advice is my biggest motivation to
> > continue.
> >
> > Thanks~
> >
> > --
> > CHEN XIANG  software engineer
> > Shangyuan Smart Technology. ShenYang.China
> > http://www.shangy.com
> >
>


Re: [VOTE] Release Apache YuniKorn (incubating) 0.12.2 RC2

2022-01-14 Thread Chenya Zhang
+1

- Build from source
- Run unit tests
- Run sample workloads
- Checked logs and K8s messages
- Verified the Web UI
- Verified REST endpoints
- Checked no unexpected files

Best,
Chenya

On Fri, Jan 14, 2022 at 1:00 AM Peter Bacsko  wrote:

> +1
>
> - Verified signature & hash
> - Built images with make
> - Installed to KIND cluster
> - Executed batch sleep job
> - Inspected logs
> - Checked some REST endpoints
>
> Cheers,
> Peter
>
> On Fri, Jan 14, 2022 at 7:49 AM Weiwei Yang  wrote:
>
> > +1
> >
> > Verified the following things:
> >
> >- Checked NOTICE, DISCLAIMER, README, and CHANGELOG
> >- Verified signature
> >- Build from source using make
> >- Run unit test using make test
> >- Run some sample workloads, roughly went over the logs and k8s
> messages
> >- Verified the basic UI functionalities
> >- Inspected the docker images and verified the commit SHAs
> >
> > Thanks
> >
> >
> > On Thu, Jan 13, 2022 at 4:53 PM Wilfred Spiegelenburg <
> wilfr...@apache.org
> > >
> > wrote:
> >
> > > +1
> > >
> > > Checked the following:
> > > * no unexpected binaries or files
> > > * signature confirmed
> > > * hash confirmed
> > > * build from source
> > > * startup and run the tests
> > > * deployed into kind 1.22.4 using helm charts and local build
> > > * checked REST calls: logLevel,  healthcheck, and statedump
> > > * checked Web UI
> > >
> > > On Fri, 14 Jan 2022 at 05:45, Craig Condit 
> > wrote:
> > > >
> > > > Hello everyone,
> > > >
> > > > I’d like to call a vote for releasing Apache YuniKorn (incubating)
> > > 0.12.2 RC2.
> > > >
> > > > > The release artifacts have been uploaded here:
> > > > > https://dist.apache.org/repos/dist/dev/incubator/yunikorn/0.12.2-rc2/
> > > > >
> > > > > My public key is located here:
> > > > > https://downloads.apache.org/incubator/yunikorn/KEYS
> > > > >
> > > > > JIRA issues that have been resolved in this release:
> > > > > https://issues.apache.org/jira/issues/?filter=12351270
> > > >
> > > > Since only changes to the release repo were necessary for RC2, tags
> > > remain at 0.12.2-1 in code, only the release repo has been tagged
> > 0.12.2-2.
> > > Post-release, all repos will be tagged 0.12.2 for consistency.
> > > >
> > > > Please review and vote. The vote will be open for at least 72 hours
> and
> > > closes on Tuesday, January 18 2022, 1pm PDT.
> > > >
> > > > [ ] +1 Approve
> > > > [ ] +0 No opinion
> > > > [ ] -1 Disapprove (and the reason why)
> > > >
> > > >
> > > > Thank you,
> > > > Craig
> > > >
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
> > > For additional commands, e-mail: dev-h...@yunikorn.apache.org
> > >
> > >
> >
>


Re: [ANNOUNCE] New committer: Yu Teng Chen

2022-01-12 Thread Chenya Zhang
Congrats Yu Teng, well deserved!

On Wed, Jan 12, 2022 at 4:36 PM Wilfred Spiegelenburg 
wrote:

> The Project Management Committee (PMC) for Apache YuniKorn has invited
> Yu Teng to
> become a committer and we are pleased to announce that he has accepted.
> Please join me in congratulating him.
>
> Congratulations & Welcome aboard Yu Teng!
>
> Wilfred
> on behalf of The Apache YuniKorn PPMC
>
>
>


Re: [DISCUSS] v0.12.2 release to allow K8s 1.22 and K8s 1.23 deployments

2022-01-11 Thread Chenya Zhang
+1 thanks Wilfred and Craig for driving YUNIKORN-941! Look forward to the
release!

On Tue, Jan 11, 2022 at 8:48 PM Weiwei Yang  wrote:

> Apart from the issues that have been fixed,
> can we also make sure we
>
>    - Update the e2e test matrix to include 1.22
>    - Update the user doc to include guidelines on how to upgrade
>    - Update and publish the new helm charts in the helm chart repo
>
> Can we have an umbrella JIRA to track all the remaining issues for 1.22?
> People can help out with some of the unfinished tasks.
>
> On Tue, Jan 11, 2022 at 8:44 PM Weiwei Yang  wrote:
>
> > Fantastic, thank you, Wilfred, Craig for getting this solved!
> > +1 of having a 1.22 release, to unblock people who use 1.22+ K8s
> versions.
> >
> >
> > On Tue, Jan 11, 2022 at 8:40 PM Chaoran Yu 
> > wrote:
> >
> >> +1 because it unblocks some users.
> >>
> >> On Tue, Jan 11, 2022 at 4:35 PM Sunil Govindan 
> wrote:
> >>
> >> > I think this will help our customers who are stuck with K8s 1.22 issue
> >> > before 1.0 is out.
> >> >
> >> > +1 to the proposal
> >> >
> >> > Thanks
> >> > Sunil
> >> >
> >> > On Tue, Jan 11, 2022 at 4:11 PM Wilfred Spiegelenburg <
> >> wilfr...@apache.org
> >> > >
> >> > wrote:
> >> >
> >> > > Over the last weeks Craig and I have been working on getting the
> >> > > admission controller deployment updated. These changes were
> committed
> >> > > yesterday as YUNIKORN-941. The main goal was to move away from old
> K8s
> >> > > objects and API calls and remove the scripts for certificate
> creation.
> >> > >
> >> > > With the changes committed, deployments and e2e tests against K8s 1.22
> >> > > and K8s 1.23 are no longer failing.
> >> > > I would like to propose that we release v0.12.2 with just the
> changes
> >> > > for the admission controller to allow deployments on the latest
> >> > > versions of K8s. Since Craig is the person that has been driving
> this
> >> > > code change I would like to propose him as the release manager.
> >> > >
> >> > > This is the list of changes related to YUNIKORN-941. It includes a
> >> > > number of other jiras as they are all fixed by the new deployment:
> >> > > * YUNIKORN-995
> >> > > * YUNIKORN-938
> >> > > * YUNIKORN-947
> >> > > * YUNIKORN-625
> >> > > * YUNIKORN-726
> >> > > It depends on one jira: YUNIKORN-928
> >> > >
> >> > > Wilfred
> >> > >
> >> > >
> >> > >
> >> > >
> >> >
> >>
> >
>


Re: Apache YuniKorn (Incubating) - Community Graduation Vote

2022-01-10 Thread Chenya Zhang


Another thing on my mind is our PR code check... :D

The "check warning" from Codecov is everywhere, which makes our PR reviews
inconvenient. I'm not sure whether the ASF board will randomly check our PRs and
find it less than ideal.

If there is no ticket tracking this, I could help create one and do some
investigation there.

Thanks,
Chenya

On Mon, Jan 10, 2022 at 12:49 PM Sunil Govindan  wrote:

> I created YUNIKORN-1007
> <https://issues.apache.org/jira/browse/YUNIKORN-1007>
>
> We have to look at LC20 more closely given the public artifacts that we
> have.
>
> + @Wilfred Spiegelenburg 
>
> Thanks
> Sunil
>
> On Mon, Jan 10, 2022 at 11:59 AM Chenya Zhang wrote:
>
> > Thanks Weiwei for creating the tickets! I will help to work on
> > https://issues.apache.org/jira/browse/YUNIKORN-1006 to address QU 30,
> 40,
> > 50.
> >
> > On Mon, Jan 10, 2022 at 10:21 AM Weiwei Yang  wrote:
> >
> >> I think we can add some documents to clearly address QU 30, 40, 50.
> >> I have created a task under YUNIKORN-1005
> >> <https://issues.apache.org/jira/browse/YUNIKORN-1005> to address them.
> >> Thank you Chenya, Holden for your feedback, please comment more if there
> >> is
> >> anything else outstanding.
> >>
> >> On Mon, Jan 10, 2022 at 9:45 AM Holden Karau 
> >> wrote:
> >>
> >> > For "The project provides a well-documented, secure and private channel
> >> > to report security issues, along with a documented way of responding to
> >> > them." the standard that I've seen used is to tell people to e-mail private@
> >> > when they think they might have a security related issue. I think that
> >> > would probably work well for YuniKorn too.
> >> >
> >> >
> >> > On Mon, Jan 10, 2022 at 7:04 AM Chenya Zhang <
> >> chenyazhangche...@gmail.com>
> >> > wrote:
> >> >
> >> >> Hi Weiwei,
> >> >>
> >> >> Thanks for driving this! The evaluation is quite comprehensive
> >> overall. I
> >> >> checked our Apache project maturity guidelines and noticed the below
> >> three
> >> >> items. Not sure if we already have them but they are not blockers to
> >> our
> >> >> graduation. We could think more about them along the way.
> >> >>
> >> >> QU30
> >> >>
> >> >> The project provides a well-documented, secure and private channel to
> >> >> report security issues, along with a documented way of responding to
> >> them.
> >> >>
> >> >> QU40
> >> >>
> >> >> The project puts a high priority on backwards compatibility and aims
> to
> >> >> document any incompatible changes and provide tools and documentation
> >> to
> >> >> help users transition to new features.
> >> >>
> >> >> CO50
> >> >>
> >> >> The project documents how contributors can earn more rights such as
> >> >> commit access or decision power, and applies these principles
> >> consistently.
> >> >>
> >> >>
> >> >> Thanks,
> >> >>
> >> >> Chenya
> >> >>
> >> >>
> >> >>
> >> >> On Mon, Jan 10, 2022 at 12:00 AM Weiwei Yang 
> wrote:
> >> >>
> >> >>> Hi YuniKorn community and mentors
> >> >>>
> >> >>> Based on the discussion thread [1], after 2 years time of
> incubating,
> >> it
> >> >>> is
> >> >>> considered that now is a good time to graduate YuniKorn from the ASF
> >> >>> incubator and become a top-level Apache project. We have reviewed
> the
> >> ASF
> >> >>> project maturity model [2] and provided some assessment of the
> >> project's
> >> >>> maturity based on the guidelines. Details are included as the
> >> following.
> >> >>> Please read this and share your thoughts by replying to this email,
> >> your
> >> >>> feedback will be much appreciated!!!
> >> >>>
> >> >>> *Code, License, and Copyright*
> >> >>>
> >> >>> All code is maintained on github, under Apache 2.0 license. We have
> >> >>> reviewed all the dependencies and ensured they do not bring any
> >> licens

Re: Apache YuniKorn (Incubating) - Community Graduation Vote

2022-01-10 Thread Chenya Zhang
Thanks Weiwei for creating the tickets! I will help to work on
https://issues.apache.org/jira/browse/YUNIKORN-1006 to address QU 30, 40,
50.

On Mon, Jan 10, 2022 at 10:21 AM Weiwei Yang  wrote:

> I think we can add some documents to clearly address QU 30, 40, 50.
> I have created a task under YUNIKORN-1005
> <https://issues.apache.org/jira/browse/YUNIKORN-1005> to address them.
> Thank you Chenya, Holden for your feedback, please comment more if there is
> anything else outstanding.
>
> On Mon, Jan 10, 2022 at 9:45 AM Holden Karau  wrote:
>
> > For "The project provides a well-documented, secure and private channel
> to
> > report security issues, along with a documented way of responding to
> them."
> > the standard that I've seen used is to tell people to e-mail private@
> > when they think they might have a security related issue. I think that
> > would probably work well for Yunikorn too.
> >
> >
> > On Mon, Jan 10, 2022 at 7:04 AM Chenya Zhang <
> chenyazhangche...@gmail.com>
> > wrote:
> >
> >> Hi Weiwei,
> >>
> >> Thanks for driving this! The evaluation is quite comprehensive overall.
> I
> >> checked our Apache project maturity guidelines and noticed the below
> three
> >> items. Not sure if we already have them but they are not blockers to our
> >> graduation. We could think more about them along the way.
> >>
> >> QU30
> >>
> >> The project provides a well-documented, secure and private channel to
> >> report security issues, along with a documented way of responding to
> them.
> >>
> >> QU40
> >>
> >> The project puts a high priority on backwards compatibility and aims to
> >> document any incompatible changes and provide tools and documentation to
> >> help users transition to new features.
> >>
> >> CO50
> >>
> >> The project documents how contributors can earn more rights such as
> >> commit access or decision power, and applies these principles
> consistently.
> >>
> >>
> >> Thanks,
> >>
> >> Chenya
> >>
> >>
> >>
> >> On Mon, Jan 10, 2022 at 12:00 AM Weiwei Yang  wrote:
> >>
> >>> Hi YuniKorn community and mentors
> >>>
> >>> Based on the discussion thread [1], after 2 years time of incubating,
> it
> >>> is
> >>> considered that now is a good time to graduate YuniKorn from the ASF
> >>> incubator and become a top-level Apache project. We have reviewed the
> ASF
> >>> project maturity model [2] and provided some assessment of the
> project's
> >>> maturity based on the guidelines. Details are included as the
> following.
> >>> Please read this and share your thoughts by replying to this email,
> your
> >>> feedback will be much appreciated!!!
> >>>
> >>> *Code, License, and Copyright*
> >>>
> >>> All code is maintained on github, under Apache 2.0 license. We have
> >>> reviewed all the dependencies and ensured they do not bring any license
> >>> issues. All the status files, license headers, and copyright are up to
> >>> date.
> >>>
> >>> *Release*
> >>>
> >>> The community has released 5 releases in the past 2 years, i.e., v0.8,
> >>> v0.9,
> >>> v0.10, v0.11, and v0.12. These releases were done by 5 different
> release
> >>> managers [3] and indicate the community can create releases
> >>> independently.
> >>> We have also a well-documented release process, automated tools to help
> >>> new
> >>> release managers with the process.
> >>>
> >>> *Quality*
> >>>
> >>> The community has developed a comprehensive CI/CD pipeline as a guard
> of
> >>> the code quality. The pipeline runs per-commit license check,
> code-format
> >>> check, code-coverage check, UT, and end-to-end tests. All these are
> built
> >>> as automated github actions, new contributors can easily trigger and
> view
> >>> results when submitting patches.
> >>>
> >>> *Community*
> >>>
> >>> The community has developed an easy-to-read homepage for the project
> [4],
> >>> the website hosts all the materials related to the project including
> >>> versioned documentation, user docs, developer docs, design docs,
> >>> performance docs. It provides the top-level navigation to th

Re: Apache YuniKorn (Incubating) - Community Graduation Vote

2022-01-10 Thread Chenya Zhang
Hi Weiwei,

Thanks for driving this! The evaluation is quite comprehensive overall. I
checked our Apache project maturity guidelines and noticed the below three
items. Not sure if we already have them but they are not blockers to our
graduation. We could think more about them along the way.

QU30

The project provides a well-documented, secure and private channel to
report security issues, along with a documented way of responding to them.

QU40

The project puts a high priority on backwards compatibility and aims to
document any incompatible changes and provide tools and documentation to
help users transition to new features.

CO50

The project documents how contributors can earn more rights such as commit
access or decision power, and applies these principles consistently.


Thanks,

Chenya



On Mon, Jan 10, 2022 at 12:00 AM Weiwei Yang  wrote:

> Hi YuniKorn community and mentors
>
> Based on the discussion thread [1], after 2 years of incubation, now is
> considered a good time for YuniKorn to graduate from the ASF incubator
> and become a top-level Apache project. We have reviewed the ASF
> project maturity model [2] and provided some assessment of the project's
> maturity based on the guidelines. Details are included as the following.
> Please read this and share your thoughts by replying to this email, your
> feedback will be much appreciated!!!
>
> *Code, License, and Copyright*
>
> All code is maintained on github, under Apache 2.0 license. We have
> reviewed all the dependencies and ensured they do not bring any license
> issues. All the status files, license headers, and copyright are up to
> date.
>
> *Release*
>
> The community has made 5 releases in the past 2 years, i.e., v0.8, v0.9,
> v0.10, v0.11, and v0.12. These releases were done by 5 different release
> managers [3] and indicate the community can create releases independently.
> We also have a well-documented release process and automated tools to help
> new release managers with the process.
>
> *Quality*
>
> The community has developed a comprehensive CI/CD pipeline as a guard of
> the code quality. The pipeline runs per-commit license check, code-format
> check, code-coverage check, UT, and end-to-end tests. All these are built
> as automated github actions, new contributors can easily trigger and view
> results when submitting patches.
>
> *Community*
>
> The community has developed an easy-to-read homepage for the project [4],
> the website hosts all the materials related to the project including
> versioned documentation, user docs, developer docs, design docs,
> performance docs. It provides the top-level navigation to the software
> download page, which links to all our previous releases. It also has
> pages to help new contributors onboard with the project, such as how to
> join community meetings, event links, etc.
>
> The community shows appreciation to all contributors and welcomes all kinds
> of contributions (not just for code). We have built an open, diverse
> community and gathered many people to work together. With that, we have 41
> unique code contributors and some non-code contributors as well. Many of
> them have become committers and PPMC members while working with the
> community. There were 2 new mentors, 8 new committers, 2 new PPMC from 6
> different organizations [5] added in the incubating phase. And in total,
> the project has 6 mentors, 21 PPMC, and 27 committers from at least 14
> different organizations. Community collaboration was done in a wide-public,
> open manner, we leverage regular bi-weekly/weekly community meetings for 2
> different timezones [6] and dev/user slack channels, mailing lists for
> offline discussions.
>
> *Independence*
>
> The project was initially donated by Cloudera, but with a diverse open
> source community, it has been operated as an independent project since it
> entered into ASF incubator. The committers and PPMC members are a group of
> passionate people from at least 14 different organizations, such as
> Alibaba, Apple, Cloudera, Databricks, LinkedIn, Microsoft, Snowflake, etc.
> The project's success is not depending on any single entity.
>
> I have enough reasons to believe the project has done sustainable
> development successfully in the Apache way. Again, please share your
> thoughts, all YuniKorn contributors, committers, PPMC, and mentors. Thank
> you!
>
> [1] https://lists.apache.org/thread/dno411y59g2pcy1d3kd7s3kdjz9jw65n
> [2]
> https://community.apache.org/apache-way/apache-project-maturity-model.html
>
> [3] https://yunikorn.apache.org/community/download
> [4] https://yunikorn.apache.org/
> [5] https://incubator.apache.org/projects/yunikorn.html
>
> [6]
>
> https://docs.google.com/document/d/165gzC7uhcKc5XDWiMYSRKBiPQBy2tDtXADUPuhGlUa0
>


[jira] [Created] (YUNIKORN-1004) Investigate tagging in GIT and the interaction with go modules

2022-01-08 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-1004:
--

 Summary: Investigate tagging in GIT and the interaction with go 
modules
 Key: YUNIKORN-1004
 URL: https://issues.apache.org/jira/browse/YUNIKORN-1004
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: release
Reporter: Chenya Zhang
Assignee: Chenya Zhang


Experiment with different ways to tag in Git and their interaction with Go modules.

Discussed with [~wilfreds] offline: as soon as we tag a release, that tag is
fixed and tracked from a go mod point of view. It cannot be changed. That creates
a problem for a release candidate: it needs a special tag, and that tag cannot be
the real release tag because of the voting. For example, we can use something like
a 1.0.0-1 tag for the build and in go mod, then increase to -2, -3, etc. for
each release candidate we build. Finally, we add an extra tag 1.0.0 as the official
release.
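
The numbering scheme described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the helper names are
hypothetical and are not part of any YuniKorn release tooling, and the "v"
prefix is an assumption added here because Go modules expect version tags of
the form vX.Y.Z (the description above writes the tags without it).

```python
def candidate_tag(version: str, build: int) -> str:
    """Return the tag for release candidate number `build` of `version`.

    Hypothetical helper: the "v" prefix is an assumption, see the note above.
    """
    if build < 1:
        raise ValueError("release candidate numbers start at 1")
    return f"v{version}-{build}"


def release_tag(version: str) -> str:
    """Return the extra tag that is added once a candidate passes the vote."""
    return f"v{version}"


def tags_for_release(version: str, candidates_built: int) -> list[str]:
    """List every tag created for one release, in creation order.

    Candidate tags are never moved or reused, because a tag that go mod has
    already resolved is treated as fixed. The official tag is added last,
    as an extra tag next to the accepted candidate's tag.
    """
    tags = [candidate_tag(version, n) for n in range(1, candidates_built + 1)]
    tags.append(release_tag(version))
    return tags


if __name__ == "__main__":
    # Three candidates were built before the vote passed.
    print(tags_for_release("1.0.0", 3))
```

One reason this shape is attractive: under semantic versioning, a version with
a hyphenated suffix counts as a pre-release and sorts before the plain version,
so a candidate tag should never be preferred over the official release tag.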



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: dev-h...@yunikorn.apache.org



Re: [DISCUSS] Next release: Apache YuniKorn v1.0.0

2022-01-07 Thread Chenya Zhang
Thanks Wilfred for initiating the v1.0.0 release!

I have resolved all sub-tasks under the umbrella ticket of
https://issues.apache.org/jira/browse/YUNIKORN-720. Please let me know if
you have any questions.

Meanwhile, I'm also interested in learning more about the release process.
I will check with you offline on our v1.0.0 release related tickets to see
if there is any task that I can help with.

Best,
Chenya

On Tue, Jan 4, 2022 at 3:23 PM Wilfred Spiegelenburg 
wrote:

> Mani,
>
> Yes I forgot about that one. We should add that to the list.
> The metrics work is also larger than just one change and we should add
> that too.
>
> That means we currently have the following non exhaustive list of major
> changes:
>
> * scheduler plugin: https://issues.apache.org/jira/browse/YUNIKORN-971
> * admission deployment: https://issues.apache.org/jira/browse/YUNIKORN-978
> * user and group limits:
> https://issues.apache.org/jira/browse/YUNIKORN-984
> * rest API: https://issues.apache.org/jira/browse/YUNIKORN-954
> * metrics: https://issues.apache.org/jira/browse/YUNIKORN-720
>
> Wilfred
>
> On Tue, 4 Jan 2022 at 18:21, Manikandan R  wrote:
> >
> > Wilfred, Thank you for the proposal. We should be able to cover major
> tasks
> > of YUNIKORN-984 umbrella jira as well.
> >
> > Thanks,
> > Mani
> >
> > On Tue, Jan 4, 2022 at 11:53 AM Wilfred Spiegelenburg <
> wilfr...@apache.org>
> > wrote:
> >
> > > Now that we have released v0.12.1 we can start looking forward to the
> > > next release. The new release has been created. All jiras that were
> > > targeted but not fixed have been updated to the new release target
> > > [1]. The first couple of jiras have already been marked as fixed in
> > > the new release [2].
> > >
> > > The last couple of releases we have been able to turn around in about
> > > 3 months. That seems a good time frame again for the next release. We
> > > do have some release pieces to figure out now that we understand the
> > > impact of tagging and what can and cannot be done during the cycles.
> > >
> > > > For the planning of the content I would like to propose that we
> > > > look at the following large items:
> > > * plugin version of the scheduler
> > > * separate deployment of the admission controller and scheduler to
> > > support upgrades
> > > * deprecate the old REST API and move the web UI to the new layout
> > >
> > > > Besides that, we will have to look at the version of Kubernetes we
> > > > support. Moving to a later version will add some smaller items due to
> > > > the fact that there are a number of beta objects that have been
> > > > removed and/or APIs that have graduated.
> > > >
> > > > Smaller items, especially around metrics or logging, and even bug fixes
> > > > will always show up.
> > >
> > > I am stepping up as the release manager for this release.
> > > Wilfred
> > >
> > > [1] https://issues.apache.org/jira/issues/?filter=12348416
> > > [2] https://issues.apache.org/jira/issues/?filter=12350818
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
> > > For additional commands, e-mail: dev-h...@yunikorn.apache.org
> > >
> > >
>
>


[jira] [Resolved] (YUNIKORN-720) Add and improve queue metrics throughout the scheduling cycle

2022-01-06 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang resolved YUNIKORN-720.
---
Resolution: Fixed

> Add and improve queue metrics throughout the scheduling cycle
> -
>
> Key: YUNIKORN-720
> URL: https://issues.apache.org/jira/browse/YUNIKORN-720
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - common
>    Reporter: Chenya Zhang
>Assignee: Chenya Zhang
>Priority: Critical
>
> There are still quite a few queue-level metrics missing a clear definition,
> implementation, or documentation.
> We need to improve on this so that:
>  * Users from the same queue can leverage these metrics to run their jobs
> more efficiently
>  * Admins of the cluster(s) can monitor all queues to identify any outliers
> in a timely manner



Re: YuniKorn for Streaming Use Cases

2022-01-05 Thread Chenya Zhang
** Corrections: Apache YuniKorn meetup :) **

On Wed, Jan 5, 2022 at 4:56 PM Chenya Zhang 
wrote:

> Hi Weiwei, thanks for sharing your past experience! This is a helpful
> discussion.
>
> We should set up some dedicated discussions and topic threads for
> "Streaming with Apache YuniKorn". I know a lot of folks from the industry
> would be interested. This would be a great opportunity to expand YuniKorn's
> footprints to more use case scenarios.
>
> In our next Apache Flink meetup, I could help to invite some speakers
> (please feel free to recommend any) and organize a roundtable for
> streaming-specific discussions so folks could share their experience/needs
> to identify any gaps for future improvement together.
>
> Please let me know what you think. +devs
>
> Best,
> Chenya
>
>
>
> On Wed, Jan 5, 2022 at 9:52 AM Weiwei Yang  wrote:
>
>> hi Chenya
>>
>> > As we know, streaming applications are long-running and need to secure
>> all
>> requested resources before starting to run. In most cases, they do not
>> have
>> a strong need to be queued, ordered, or preempted to wait to obtain or
>> give
>> back their resource.
>>
>> You are right if the assumption is pure streaming cases, all long-running
>> jobs, and the cluster has sufficient resources for all jobs. Maybe it is
>> fair to say it is not a day 1 challenge.
>> However, in my past experience, this is not always enough and will not be
>> enough. When we operate large-scale Flink jobs, the major issues we were
>> dealing with: resource utilization, resource contention, hot-spot,
>> isolation, etc. We used to have tens of queues per cluster and shared by
>> many users, and jobs have different priorities and high-priority jobs can
>> make room by preempting lower priority ones. We have a customized
>> node-score system in order to distribute pods more efficiently. As you
>> see,
>> resource queues, app-sorting, node-sorting, preemption, all play a role
>> here. Central job management and scheduling latency/throughput are also
>> important.
>>
>> On K8s and Cloud, it brings more challenges. I guess one thing challenging
>> and also interesting is how to do auto-scaling more efficiently. Sometimes
>> we need a strategy to warm up resources on Cloud in order to fit new jobs
>> in low latency. Most likely the scheduler can give some hints for that.
>> This will be a fun part to explore too. With all being said, I do think a
>> customized scheduler (instead of the pod-level scheduler -
>> default-k8s-scheduler) will be necessary.
>>
>> On Tue, Jan 4, 2022 at 10:18 PM Chenya Zhang wrote:
>>
>> > Hi Weiwei
>> >
>> > Thanks for sharing. I checked the video and for Alibaba's use case, they
>> > have a mixed cluster for streaming and batch applications running with
>> > Apache Flink. Our use case is different. We only use Apache Flink for
>> > stream processing in physical clusters separate from Spark for batch
>> > processing.
>> >
>> > As we know, streaming applications are long-running and need to secure
>> all
>> > requested resources before starting to run. In most cases, they do not
>> have
>> > a strong need to be queued, ordered, or preempted to wait to obtain or
>> give
>> > back their resource.
>> >
>> > I'm gathering more streaming use case requirements that could not be
>> > satisfied by K8s namespace for resource quota management or other
>> advanced
>> > scheduling needs. Will keep this thread updated.
>> >
>> > Meanwhile, happy to hear more thoughts from you!
>> >
>> > Best,
>> > Chenya
>> >
>> > On Tue, Jan 4, 2022 at 9:20 PM Weiwei Yang  wrote:
>> >
>> > > Hi Chenya
>> > >
>> > > The use case is similar, YK will play a big role there. Lots of
>> features
>> > > are relevant, such as queues, job ordering, user/group ACLs,
>> preemption,
>> > > over-subscription, and performance etc.
>> > > Some of the basic functionalities are available in YK, some more
>> needs to
>> > > be built.
>> > > Please take a look at the slides from the Alibaba Flink team, they
>> have
>> > > shared how they use YK to address their use cases.
>> > > This was presented at ApacheCon:
>> > > https://www.youtube.com/watch?v=4hghJCuZk5M
>> > >
>> > > On Tue, Jan 4, 2022 at 6:35 PM Chenya Zhang <
>> chenyazhangche...@gmail.com
>> > >
>> > > wrote:
>> > >
>> > > > Hey folks,
>> > > >
>> > > > We have some new streaming use cases with Apache Flink that could
>> > > > potentially leverage YuniKorn for resource scheduling.
>> > > >
>> > > > The initial implementation is to use K8s namespace for resource
>> quota
>> > > > management. We are investigating what could be some strong benefits
>> > > > switching to YuniKorn in streaming cases for long-running services.
>> For
>> > > > example: Job queueing, job ordering, resource reservation, user
>> groups
>> > > etc
>> > > > all seem to be more desirable for batch use cases.
>> > > >
>> > > > Any thoughts or suggestions?
>> > > >
>> > > > Thanks,
>> > > > Chenya
>> > > >
>> > >
>> >
>>
>


Re: YuniKorn for Streaming Use Cases

2022-01-05 Thread Chenya Zhang
Hi Weiwei, thanks for sharing your past experience! This is a helpful
discussion.

We should set up some dedicated discussions and topic threads for
"Streaming with Apache YuniKorn". I know a lot of folks from the industry
would be interested. This would be a great opportunity to expand YuniKorn's
footprints to more use case scenarios.

In our next Apache Flink meetup, I could help to invite some speakers
(please feel free to recommend any) and organize a roundtable for
streaming-specific discussions so folks could share their experience/needs
to identify any gaps for future improvement together.

Please let me know what you think. +devs

Best,
Chenya



On Wed, Jan 5, 2022 at 9:52 AM Weiwei Yang  wrote:

> hi Chenya
>
> > As we know, streaming applications are long-running and need to secure
> all
> requested resources before starting to run. In most cases, they do not have
> a strong need to be queued, ordered, or preempted to wait to obtain or give
> back their resource.
>
> You are right if the assumption is pure streaming cases, all long-running
> jobs, and the cluster has sufficient resources for all jobs. Maybe it is
> fair to say it is not a day 1 challenge.
> However, in my past experience, this is not always enough and will not be
> enough. When we operate large-scale Flink jobs, the major issues we were
> dealing with: resource utilization, resource contention, hot-spot,
> isolation, etc. We used to have tens of queues per cluster and shared by
> many users, and jobs have different priorities and high-priority jobs can
> make room by preempting lower priority ones. We have a customized
> node-score system in order to distribute pods more efficiently. As you see,
> resource queues, app-sorting, node-sorting, preemption, all play a role
> here. Central job management and scheduling latency/throughput are also
> important.
>
> On K8s and Cloud, it brings more challenges. I guess one thing challenging
> and also interesting is how to do auto-scaling more efficiently. Sometimes
> we need a strategy to warm up resources on Cloud in order to fit new jobs
> in low latency. Most likely the scheduler can give some hints for that.
> This will be a fun part to explore too. With all being said, I do think a
> customized scheduler (instead of the pod-level scheduler -
> default-k8s-scheduler) will be necessary.
>
> On Tue, Jan 4, 2022 at 10:18 PM Chenya Zhang 
> wrote:
>
> > Hi Weiwei
> >
> > Thanks for sharing. I checked the video and for Alibaba's use case, they
> > have a mixed cluster for streaming and batch applications running with
> > Apache Flink. Our use case is different. We only use Apache Flink for
> > stream processing in physical clusters separate from Spark for batch
> > processing.
> >
> > As we know, streaming applications are long-running and need to secure
> all
> > requested resources before starting to run. In most cases, they do not
> have
> > a strong need to be queued, ordered, or preempted to wait to obtain or
> give
> > back their resource.
> >
> > I'm gathering more streaming use case requirements that could not be
> > satisfied by K8s namespace for resource quota management or other
> advanced
> > scheduling needs. Will keep this thread updated.
> >
> > Meanwhile, happy to hear more thoughts from you!
> >
> > Best,
> > Chenya
> >
> > On Tue, Jan 4, 2022 at 9:20 PM Weiwei Yang  wrote:
> >
> > > Hi Chenya
> > >
> > > The use case is similar, YK will play a big role there. Lots of
> features
> > > are relevant, such as queues, job ordering, user/group ACLs,
> preemption,
> > > over-subscription, and performance etc.
> > > Some of the basic functionalities are available in YK, some more needs
> to
> > > be built.
> > > Please take a look at the slides from the Alibaba Flink team, they have
> > > shared how they use YK to address their use cases.
> > > This was presented at ApacheCon:
> > > https://www.youtube.com/watch?v=4hghJCuZk5M
> > >
> > > On Tue, Jan 4, 2022 at 6:35 PM Chenya Zhang <
> chenyazhangche...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hey folks,
> > > >
> > > > We have some new streaming use cases with Apache Flink that could
> > > > potentially leverage YuniKorn for resource scheduling.
> > > >
> > > > The initial implementation is to use K8s namespace for resource quota
> > > > management. We are investigating what could be some strong benefits
> > > > switching to YuniKorn in streaming cases for long-running services.
> For
> > > > example: Job queueing, job ordering, resource reservation, user
> groups
> > > etc
> > > > all seem to be more desirable for batch use cases.
> > > >
> > > > Any thoughts or suggestions?
> > > >
> > > > Thanks,
> > > > Chenya
> > > >
> > >
> >
>


Re: Form a process of YuniKorn Improvement Proposal (YIP)

2022-01-05 Thread Chenya Zhang
+1 The intention is really good. If we are free to do it as an incubating
project, we should go for it.

On Wed, Jan 5, 2022 at 4:35 PM Sunil Govindan  wrote:

> Thanks, Bowen for initiating this.
>
> As an incubating project, do we have permission to define such a process?
> And will there be any change once we become a top-level project?
>
> Also, this adds more clarity to the by-laws as well in terms of defining
> the structure and process moving forward.
> +1
>
> Thank You
> Sunil
>
> On Wed, Jan 5, 2022 at 4:32 PM Chaoran Yu  wrote:
>
> > This will be a great initiative to introduce more structure to how we
> > handle larger-scale features and improvements.
> > So far we have been doing things in an ad-hoc way, which tends to get
> less
> > effective as the community grows.
> > +1 from me.
> >
> > On Wed, Jan 5, 2022 at 4:26 PM Weiwei Yang  wrote:
> >
> > > Oops, got it wrong, you mean broader review, not ASF board.
> > > Sorry, please ignore my last comment : )
> > >
> > > On Wed, Jan 5, 2022 at 4:25 PM Weiwei Yang  wrote:
> > >
> > > > Hi Bowen
> > > >
> > > > +1
> > > > Having a formal process will definitely help the cross-org
> > communication.
> > > > Do we need the ASF board to review this? I am not sure, usually, each
> > > > project committee is able to decide what is the best for the project.
> > > >
> > > > Thanks
> > > >
> > > > On Wed, Jan 5, 2022 at 4:07 PM Bowen Li  wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > > >> I'd like to start a conversation about building a formal process for
> > > > >> YuniKorn Improvement Proposals (YIP).
> > > >>
> > > >> (X)IP is a common approach to propose, discuss, collaborate on and
> > > tackle
> > > >> major or important changes in open source projects and communities.
> > > Within
> > > >> Apache projects, there're successful examples and adoptions like
> Spark
> > > >> (SPIP), Flink (FLIP), Kafka (KIP).
> > > >>
> > > >> Similarly, a YIP will define the following parts, including but not
> > > >> limited
> > > >> to:
> > > >> - what's considered a "major change" that needs a YIP
> > > >> - what should be included in a YIP (e.g. motivation/business
> > > >> justifications, use case requirements, proposed changes, API
> changes,
> > > >> migration/compatibility, rejected alternatives, etc)
> > > >> - who should initiate or be involved in a YIP
> > > >> - end-to-end process
> > > >>
> > > >> YK community has been growing, and we've seen cases where such a
> > process
> > > >> can help to better facilitate communications, understanding, and
> > > >> collaborations within YK community.
> > > >>
> > > >> Please share your thoughts, or +1/-1. If we get a consensus this is
> > > good,
> > > >> I
> > > >> will submit a draft for YIP for broader review.
> > > >>
> > > >> Thanks,
> > > >> Bowen
> > > >>
> > > >
> > >
> >
>


Re: YuniKorn for Streaming Use Cases

2022-01-04 Thread Chenya Zhang
Hi Weiwei

Thanks for sharing. I checked the video and for Alibaba's use case, they
have a mixed cluster for streaming and batch applications running with
Apache Flink. Our use case is different. We only use Apache Flink for
stream processing in physical clusters separate from Spark for batch
processing.

As we know, streaming applications are long-running and need to secure all
requested resources before starting to run. In most cases, they do not have
a strong need to be queued, ordered, or preempted to wait to obtain or give
back their resource.

I'm gathering more streaming use case requirements that could not be
satisfied by K8s namespace for resource quota management or other advanced
scheduling needs. Will keep this thread updated.

Meanwhile, happy to hear more thoughts from you!

Best,
Chenya

On Tue, Jan 4, 2022 at 9:20 PM Weiwei Yang  wrote:

> Hi Chenya
>
> The use case is similar, YK will play a big role there. Lots of features
> are relevant, such as queues, job ordering, user/group ACLs, preemption,
> over-subscription, and performance etc.
> Some of the basic functionalities are available in YK, some more needs to
> be built.
> Please take a look at the slides from the Alibaba Flink team, they have
> shared how they use YK to address their use cases.
> This was presented at ApacheCon:
> https://www.youtube.com/watch?v=4hghJCuZk5M
>
> On Tue, Jan 4, 2022 at 6:35 PM Chenya Zhang 
> wrote:
>
> > Hey folks,
> >
> > We have some new streaming use cases with Apache Flink that could
> > potentially leverage YuniKorn for resource scheduling.
> >
> > The initial implementation is to use K8s namespace for resource quota
> > management. We are investigating what could be some strong benefits
> > switching to YuniKorn in streaming cases for long-running services. For
> > example: Job queueing, job ordering, resource reservation, user groups
> etc
> > all seem to be more desirable for batch use cases.
> >
> > Any thoughts or suggestions?
> >
> > Thanks,
> > Chenya
> >
>


YuniKorn for Streaming Use Cases

2022-01-04 Thread Chenya Zhang
Hey folks,

We have some new streaming use cases with Apache Flink that could
potentially leverage YuniKorn for resource scheduling.

The initial implementation is to use K8s namespace for resource quota
management. We are investigating what could be some strong benefits
switching to YuniKorn in streaming cases for long-running services. For
example: Job queueing, job ordering, resource reservation, user groups etc
all seem to be more desirable for batch use cases.

Any thoughts or suggestions?

Thanks,
Chenya


[jira] [Closed] (YUNIKORN-829) Produce metrics on queue-level resource utilization

2022-01-04 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang closed YUNIKORN-829.
-
Resolution: Won't Do

> Produce metrics on queue-level resource utilization
> ---
>
> Key: YUNIKORN-829
> URL: https://issues.apache.org/jira/browse/YUNIKORN-829
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: core - scheduler, shim - kubernetes
>Reporter: Chaoran Yu
>Priority: Major
>
> YuniKorn already has metrics on the resources requested/allocated for each 
> queue. But we have no visibility into how much of the allocated resources are 
> actually being used. Take Spark as an example, an under-optimized job may 
> request 1 TB of total executor memory but the actual processing logic only 
> uses 100 GB. This has the consequence that other jobs might not be able to 
> fit in the queue. Having a metric that shows the real utilization will help 
> members of a queue better understand their job characteristics and optimize 
> the jobs.
> K8s metrics server has metrics on real utilization. YK may be able to perform 
> some aggregations to arrive at the stats at the queue level. This is a 
> k8s-specific solution though.
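
The aggregation idea in the description above can be sketched as follows. This is a minimal illustration only, assuming per-pod usage samples (such as those the K8s metrics server reports) have already been fetched and that each pod is tagged with its queue. The `PodSample` type, its field names, and the example numbers are hypothetical and are not part of YuniKorn.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PodSample:
    """Hypothetical per-pod record: queue name plus allocated and used memory in GB."""
    queue: str
    allocated_gb: float
    used_gb: float


def queue_utilization(samples):
    """Sum allocated and used memory per queue and return used/allocated ratios."""
    allocated = defaultdict(float)
    used = defaultdict(float)
    for s in samples:
        allocated[s.queue] += s.allocated_gb
        used[s.queue] += s.used_gb
    # Skip queues with no allocation to avoid division by zero.
    return {q: used[q] / allocated[q] for q in allocated if allocated[q] > 0}


# Assumed sample data: a Spark queue that requested 1 TB but uses 100 GB.
samples = [
    PodSample("root.spark", 600.0, 60.0),
    PodSample("root.spark", 400.0, 40.0),
    PodSample("root.flink", 100.0, 75.0),
]
utilization = queue_utilization(samples)
```

A low ratio for a queue would point at over-requested jobs of the kind the description mentions.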



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: dev-h...@yunikorn.apache.org



Re: [ANNOUNCE] New committer: Peter Bacsko

2022-01-04 Thread Chenya Zhang
Welcome aboard, Peter!

On Tue, Jan 4, 2022 at 8:53 AM Chaoran Yu  wrote:

> Congrats Peter. Welcome aboard!
>
>
> > On Jan 4, 2022, at 08:23, Weiwei Yang  wrote:
> >
> > Hi Peter
> >
> > Thank you! Have fun!!
> >
> > On Tue, Jan 4, 2022 at 8:07 AM Peter Bacsko  wrote:
> >
> >> Hi everyone,
> >>
> >> thanks everyone for inviting me as a committer, I'm sure we'll have some
> >> good time together with YK! :)
> >>
> >> Cheers,
> >> Peter
> >>
> >> On Tue, Jan 4, 2022 at 11:37 AM Wilfred Spiegelenburg <
> wilfr...@apache.org
> >>>
> >> wrote:
> >>
> >>> The Project Management Committee (PMC) for Apache YuniKorn has invited
> >>> Peter to
> >>> become a committer and we are pleased to announce that he has accepted.
> >>> Please join me in congratulating him.
> >>>
> >>> Congratulations & Welcome aboard Peter!
> >>>
> >>> Wilfred
> >>> on behalf of The Apache YuniKorn PPMC
> >>>
> >>> -
> >>> To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
> >>> For additional commands, e-mail: dev-h...@yunikorn.apache.org
> >>>
> >>>
> >>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
> For additional commands, e-mail: dev-h...@yunikorn.apache.org
>
>


Re: [DISCUSS] Graduate YuniKorn from Apache Incubator

2022-01-04 Thread Chenya Zhang
+1 for starting the graduation process.

Please let us know of anything that we could do to help. Feel free to
assign tickets to us. cc: Weiwei, Wilfred, Sunil.

On Tue, Jan 4, 2022 at 9:37 AM Luciano Resende  wrote:

> If you haven't done so yet, please take a look at the Graduation Guide [1].
> There is also a timeline that suggests that graduation might take at least
> one month. So, we should start the process and items, but not get stuck
> on a particular time frame/month; we submit when we are ready (and the
> multiple reports and votes are done).
>
> [1] https://incubator.apache.org/guides/graduation.html
>
> On Tue, Jan 4, 2022 at 9:08 AM Sunil Govindan  wrote:
>
> > I think we can aim for Feb'22, considering we can close all the tasks
> > that we planned.
> > We had these discussions earlier, and it is great to see that we are in
> > much better shape.
> >
> > Let's go ahead with the TLP path and the work required to complete it.
> > @Wilfred Spiegelenburg and @Weiwei Yang, I referred to a few of the recent
> > TLP threads and came up with a list of items that they were focusing on.
> >
> > If there are no objections, let's go to the next step of tasks.
> >
> > Great to see this feedback across the board. Thanks, team.
> >
> > Thanks,
> > Sunil
> >
> > On Tue, Jan 4, 2022 at 8:54 AM Chaoran Yu 
> wrote:
> >
> > > +1 from me. I agree we can get started on the graduation process now.
> > >
> > >
> > > > On Jan 4, 2022, at 08:39, Weiwei Yang  wrote:
> > > >
> > > > Hi all
> > > >
> > > > Thanks for the feedback!
> > > > Aiming for the next board meeting sounds good to me. This thread is
> to
> > > > collect feedback, so please +1 if you think we are good to get
> started.
> > > > Thank you Wilfred for putting an eye on the name search and fixing
> the
> > > > whimsy.
> > > >
> > > > On Tue, Jan 4, 2022 at 2:14 AM Wilfred Spiegelenburg <
> > > wilfr...@apache.org>
> > > > wrote:
> > > >
> > > >> I think that is cutting it too close. The board meeting is in 2
> weeks.
> > > >> The incubator needs to file its report a week before that on the
> 12th.
> > > >> That leaves us with just a week to finish up everything.
> > > >> We have not done all the work yet  that we need to do
> (administrative)
> > > >> and we still need to go through the two voting rounds also.
> > > >>
> > > >> I would feel more comfortable aiming for the next meeting which is
> in
> > > >> mid February.
> > > >>
> > > >> Wilfred
> > > >>
> > > >> On Tue, 4 Jan 2022 at 17:27, Wei-Chiu Chuang 
> > > wrote:
> > > >>>
> > > >>> I think it's the time.
> > > >>> Do we want to propose it as part of the January board meeting
> report?
> > > >>> Time's running out.
> > > >>>
> > > >>> On Tue, Jan 4, 2022 at 2:08 PM Wilfred Spiegelenburg <
> > > >> wilfr...@apache.org>
> > > >>> wrote:
> > > >>>
> > > >>>> Graduation is not linked to any release. It is purely a project
> > > status.
> > > >>>> There are still a number of things that need to be completed
> before
> > we
> > > >>>> can graduate.
> > > >>>> But I think we should start the discussion and the process to
> > > graduate.
> > > >>>>
> > > >>>> One of the points I cleared up today is the podling name search
> [1].
> > > >>>> I fixed the whimsy scans that we should pass before graduation via
> > > [2].
> > > >>>>
> > > >>>> Wilfred
> > > >>>>
> > > >>>> [1] https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-193
> > > >>>> [2] https://issues.apache.org/jira/browse/YUNIKORN-856
> > > >>>>
> > > >>>> On Tue, 4 Jan 2022 at 16:45, Chenya Zhang <
> > > chenyazhangche...@gmail.com
> > > >>>
> > > >>>> wrote:
> > > >>>>>
> > > >>>>> Thanks Weiwei for initiating this discussion!
> > > >>>>>
> > > >>>>> So happy to see all the milestones achieved by YuniKorn in the
> past
> > > >> two

Re: [DISCUSS] Graduate YuniKorn from Apache Incubator

2022-01-03 Thread Chenya Zhang
Thanks Weiwei for initiating this discussion!

So happy to see all the milestones achieved by YuniKorn in the past two
years. Big Kudos to the whole community!

A few follow-up questions: do we plan to release the plug-in framework
before graduating from the Apache Incubator? Is there a potential timeline
for the 1.0 release?

Thanks,
Chenya

On Mon, Jan 3, 2022 at 9:20 PM Weiwei Yang  wrote:

> Hi, yunikorn community
>
>
> Happy new year!! I hope you all have an excellent start in 2022.
>
>
> YuniKorn started incubating in ASF on Jan/21/2020, and the project has been
> incredibly well managed and maintained by the community in almost 2 years
> under ASF now. There were 5 releases published in a good cadence, i.e v0.8,
> v0.9, v0.10, v0.11, and v0.12. The community works closely together to
> build better open-source software and is committed to success together.
> Since incubating in ASF, the project has:
>
>
>- Added 2 new mentors, 8 new committers, 2 new PPMC from 6 different
>organizations [1].
>- Published 5 releases by 5 different release managers [2]
>- Regular bi-weekly/weekly community meetings for 2 different timezones,
>meeting minutes recorded here [3]
>- Leveraged mailing list and slack channel for various offline
>discussions, i.e user/dev groups
>- The community co-operates the YouTube channel, with many demo/intro
>videos published [4]
>
> Now the project has 6 mentors, 21 PPMC, and 27 committers from 14 different
> organizations. I have enough reasons to believe the project has done
> sustainable development successfully in the Apache way. Thanks to all the
> mentors, committers, release managers, contributors, and evangelists!!! I
> have reviewed the Apache maturity doc, I don’t see anything that blocks us
> from graduating from the incubator. Therefore, I want to raise this
> discussion in the community, appreciate your feedback!!
>
>
> [1] https://incubator.apache.org/projects/yunikorn.html
>
> [2] https://yunikorn.apache.org/community/download
>
> [3]
>
> https://docs.google.com/document/d/165gzC7uhcKc5XDWiMYSRKBiPQBy2tDtXADUPuhGlUa0
>
> [4] https://www.youtube.com/channel/UCDSJ2z-lEZcjdK27tTj_hGw
>
>
> Thanks!
>


Re: Apache YuniKorn (Incubating) 0.12.1 Released

2021-12-27 Thread Chenya Zhang
Thanks Chaoran for the great efforts!

Look forward to our 1.0 release and graduation from the Apache incubator!

Best,
Chenya

On Sun, Dec 26, 2021 at 11:35 PM Weiwei Yang  wrote:

> Hi Chaoran
>
> This is awesome, thank you for getting our release out!!
>
> On Sun, Dec 26, 2021 at 10:16 PM Chaoran Yu 
> wrote:
>
> > I forgot to mention that the source code and convenience images can be
> > found at the Downloads page:
> > https://yunikorn.apache.org/community/download
> >
> > The Artifact Hub https://artifacthub.io/packages/helm/yunikorn/yunikorn
> > has been updated with the latest Helm chart.
> >
> >
> > > On Dec 26, 2021, at 22:09, Chaoran Yu  wrote:
> > >
> > > Hi all,
> > >
> > > The IPMC has voted in approval of the v0.12.1 release candidate. The
> > > voting result thread can be found at
> > > https://lists.apache.org/thread/1h3s77jo977qjf8c3w34flj6lj307yks.
> > >
> > > Now we are officially announcing that v0.12.1 is released. The release
> > > notes can be found at https://yunikorn.apache.org/release-announce/0.12.1.
> > > Thanks to all the folks who contributed to this release!
> > >
> > > Hope everyone is having a great holiday break. Let’s regroup in the new
> > year to achieve even bigger goals, among them the 1.0 release and
> > graduation from the Apache incubator!
> > >
> > > Cheers,
> > > Chaoran
> > >
> >
> >
>


[jira] [Closed] (YUNIKORN-722) Improve YuniKorn core's queue-level and scheduler metrics

2021-12-23 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang closed YUNIKORN-722.
-

> Improve YuniKorn core's queue-level and scheduler metrics
> -
>
> Key: YUNIKORN-722
> URL: https://issues.apache.org/jira/browse/YUNIKORN-722
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: core - common
>    Reporter: Chenya Zhang
>Assignee: Chenya Zhang
>Priority: Major
>
> To improve YuniKorn core's queue-level and scheduler metrics:
> - Differentiate queue-level metrics from scheduler metrics, e.g. using
> "IncQueueApplicationsAccepted"
> - Refactor related queue and scheduler metrics definitions
> - Refactor related queue and scheduler metrics operation functions
> - Update metrics naming and help messages
> - Update in-line comments and documentation






[jira] [Resolved] (YUNIKORN-718) Redefine YuniKorn core's scheduler metrics names and help messages for usability and clarity

2021-12-23 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang resolved YUNIKORN-718.
---
Resolution: Duplicate

The work is covered by https://issues.apache.org/jira/browse/YUNIKORN-717

> Redefine YuniKorn core's scheduler metrics names and help messages for 
> usability and clarity
> 
>
> Key: YUNIKORN-718
> URL: https://issues.apache.org/jira/browse/YUNIKORN-718
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: core - common
>Reporter: Chenya Zhang
>    Assignee: Chenya Zhang
>Priority: Major
>
> Some issues observed:
>  * Metric names are not meaningful
>  * Help messages lack some details
>  * Prometheus labels are not consistent
> Need to improve for usability and clarity.






[jira] [Closed] (YUNIKORN-718) Redefine YuniKorn core's scheduler metrics names and help messages for usability and clarity

2021-12-23 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang closed YUNIKORN-718.
-

> Redefine YuniKorn core's scheduler metrics names and help messages for 
> usability and clarity
> 
>
> Key: YUNIKORN-718
> URL: https://issues.apache.org/jira/browse/YUNIKORN-718
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: core - common
>Reporter: Chenya Zhang
>    Assignee: Chenya Zhang
>Priority: Major
>
> Some issues observed:
>  * Metric names are not meaningful
>  * Help messages lack some details
>  * Prometheus labels are not consistent
> Need to improve for usability and clarity.






[jira] [Resolved] (YUNIKORN-719) Refactor YuniKorn core's scheduler metrics for sorting latency

2021-12-23 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang resolved YUNIKORN-719.
---
Resolution: Duplicate

The work is covered by https://issues.apache.org/jira/browse/YUNIKORN-717

> Refactor YuniKorn core's scheduler metrics for sorting latency
> --
>
> Key: YUNIKORN-719
> URL: https://issues.apache.org/jira/browse/YUNIKORN-719
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: core - common
>    Reporter: Chenya Zhang
>Assignee: Chenya Zhang
>Priority: Major
>
> Defining "nodeSortingLatency", "appSortingLatency", and "queueSortingLatency" 
> to initialize scheduler metrics is redundant in code.
>  * It can be combined into "sortingLatency"
>  * It can use a "prometheus.CounterVec" with different Prometheus labels






[jira] [Closed] (YUNIKORN-719) Refactor YuniKorn core's scheduler metrics for sorting latency

2021-12-23 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang closed YUNIKORN-719.
-

> Refactor YuniKorn core's scheduler metrics for sorting latency
> --
>
> Key: YUNIKORN-719
> URL: https://issues.apache.org/jira/browse/YUNIKORN-719
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: core - common
>    Reporter: Chenya Zhang
>Assignee: Chenya Zhang
>Priority: Major
>
> Defining "nodeSortingLatency", "appSortingLatency", and "queueSortingLatency" 
> to initialize scheduler metrics is redundant in code.
>  * It can be combined into "sortingLatency"
>  * It can use a "prometheus.CounterVec" with different Prometheus labels






Re: Observability of actually cpu/memory usage

2021-12-21 Thread Chenya Zhang
From metrics server's documentation,

Don't use Metrics Server when you need:
- Non-Kubernetes clusters
- An accurate source of resource usage metrics
- Horizontal autoscaling based on other resources than CPU/Memory

I think they have some concerns about metrics accuracy. We may need to
understand what the possible risks are here.

For example, if a user is trying to tune an application but gets
conflicting information in different runs, it could be confusing for them.
If there is a good range of consistency, or if any potential areas of
inaccuracy can be documented, it would be a helpful source of
information for application tuning.
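
One way to put a number on run-to-run consistency is sketched below. It is a rough illustration under assumptions: the readings are hypothetical peak memory values for the same job across repeated runs, and `relative_spread` is an invented helper, not something metrics server or YuniKorn provides.

```python
import statistics


def relative_spread(samples):
    """Return (max - min) / mean for a list of usage samples from repeated runs."""
    mean = statistics.mean(samples)
    if mean == 0:
        return 0.0
    return (max(samples) - min(samples)) / mean


# Hypothetical peak memory readings (GB) for the same job across four runs.
runs_gb = [48.0, 50.0, 52.0, 50.0]
spread = relative_spread(runs_gb)
```

A small spread would suggest the readings are stable enough to guide tuning; a large one would suggest they need to be documented as approximate.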


On Tue, Dec 21, 2021 at 3:19 PM Weiwei Yang  wrote:

> K8s dashboard did some integration with metrics-server, maybe we can
> investigate and see how that was done.
> Essentially we just need to pull these metrics somewhere.
>
> On Tue, Dec 21, 2021 at 2:42 PM Chaoran Yu 
> wrote:
>
> > Previously when doing research on this topic, I saw that the metrics-server
> > documentation says: "*Metrics Server is not meant for non-autoscaling
> > purposes. For example, don't use it to forward metrics to monitoring
> > solutions, or as a source of monitoring solution metrics. In such cases
> > please collect metrics from Kubelet /metrics/resource endpoint directly*."
> > But the Kubelet APIs
> > <https://github.com/kubernetes/kubernetes/blob/v1.21.5/pkg/kubelet/server/server.go#L236>
> > that the statement refers to are not documented, meaning they are hidden APIs
> > that can change or be deprecated at any future Kubernetes release.
> > Integrating with these APIs doesn't sound promising. But besides Kubelet,
> > the actual utilization info of workloads is not readily available anywhere
> > else. We'll need to explore other ideas.
> >
> > On Tue, Dec 21, 2021 at 12:51 PM Weiwei Yang  wrote:
> >
> > > Thank you Bowen to raise this up, this is an interesting topic. Bear
> with
> > > me this long reply : )
> > >
> > > Like Wilfred mentioned, YK doesn't know about the actual used resources
> > in
> > > terms of CPU and memory for each pod, or application, at least not
> > today. I
> > > understand the requirements about tracking this info in order to give
> > users
> > > some feedback or even recommendations on how to tune their jobs more
> > > properly. It would be good to have something in our view as "Allocated"
> > vs
> > > "Used" for each app/queue. We could further introduce some penalties if
> > > people keep over-requesting resources.
> > >
> > > However, most likely we will need to do this outside of YK. The major
> > > reason is that all data YK consumes comes from the api-server, backed by
> > > etcd. None of these metrics will be stored in etcd, as per the design of
> > > metrics-server. Second, YK doesn't have any per-node agent running that
> > > we can use to collect actual resource usage; we still need to leverage a
> > > 3rd party tool to do so. Maybe we can do some integration with
> > > metrics-server, aggregating app/queue used info from those fragmented
> > > metrics, and then plug that into our yunikorn-web UI. We have the
> > > flexibility to do this I believe, which could be an option.
> > >
> > > On Mon, Dec 20, 2021 at 10:28 PM Wilfred Spiegelenburg <
> > > wilfr...@apache.org>
> > > wrote:
> > >
> > > > Hi Bowen,
> > > >
> > > > Maybe a strange question but what do you consider "actually
> > > > used" resources? Anything the scheduler sees is used. The scheduler
> has
> > > no
> > > > information on what the container really occupies: it asked for 100GB
> > but
> > > > it only allocated 50GB etc. If you need that YuniKorn cannot help
> you.
> > If
> > > > it is just a looking at allocation over time YuniKorn is capable of
> > > giving
> > > > you the information.
> > > >
> > > > Second point to make is that normally applications do not provide any
> > > > information on what they expect to use before they use it. Let's
> take a
> > > > Spark application. The driver creates pods as it needs new executors.
> > The
> > > > Spark config drives those requests and the limitations. The scheduler
> > > only
> > > > sees the pods that are really requested. It does not know, and should
> > not
> > > > know, if that is limited by what is configured or that the job uses
> > only
> > > > part or more than what is configured.
> > > >
> > > > The only time the scheduler would have any idea about a "maximum" is
> > > when a
> > > > gang request is made. For gang scheduling we can track if the gang
> > > > request is completely used or not. We could add metrics for it on an
> > > > application. We can also track the number of containers allocated for
> > an
> > > > application or queue, the time from start to finish for containers
> etc.
> > > We
> > > > could even track the maximum resource allocation for an application
> or
> > a
> > > > queue over a time interval. Prometheus should 

[jira] [Resolved] (YUNIKORN-722) Improve YuniKorn core's queue-level and scheduler metrics

2021-12-21 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang resolved YUNIKORN-722.
---
Resolution: Duplicate

https://issues.apache.org/jira/browse/YUNIKORN-721

> Improve YuniKorn core's queue-level and scheduler metrics
> -
>
> Key: YUNIKORN-722
> URL: https://issues.apache.org/jira/browse/YUNIKORN-722
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: core - common
>    Reporter: Chenya Zhang
>Assignee: Chenya Zhang
>Priority: Major
>
> To improve YuniKorn core's queue-level and scheduler metrics:
> - Differentiate queue-level metrics from scheduler metrics, e.g. using
> "IncQueueApplicationsAccepted"
> - Refactor related queue and scheduler metrics definitions
> - Refactor related queue and scheduler metrics operation functions
> - Update metrics naming and help messages
> - Update in-line comments and documentation






Re: [VOTE] Release Apache YuniKorn (incubating) 0.12.1

2021-12-15 Thread Chenya Zhang
+1

Verified the following:
- Built from source
- Docker image
- Signature
- Installed with Helm Charts
- REST API, Web UI
- Triggered simple jobs and checked scheduling, events, logs
- Unit tests
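
The hash check that reviewers report in this thread can be sketched as follows. This is a minimal example, assuming the published checksum is a SHA-256 hex string. The temporary file stands in for a downloaded release artifact, and the helper names are illustrative. GPG signature verification is a separate step and is not covered here.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare the file's digest with the published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demonstration with a temporary file standing in for a downloaded artifact.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example artifact contents")
    artifact_path = tmp.name

published = hashlib.sha256(b"example artifact contents").hexdigest()
ok = matches_checksum(artifact_path, published.upper() + "\n")
tampered = matches_checksum(artifact_path, "0" * 64)
os.remove(artifact_path)
```

Reading in chunks keeps memory use flat even for large release tarballs.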

On Wed, Dec 15, 2021 at 10:13 AM Peter Bacsko  wrote:

> +1 (non-binding)
>
> Verified the following:
> * Signature
> * Hash
> * Built image locally
> * Installed YK with helm on Minikube
> * Run sleep batch job
> * Checked some REST endpoints (nodes/queues/apps/statedump/loglevel)
> * Checked YK logs
>
> BR,
> Peter
>
> On Wed, Dec 15, 2021 at 6:58 PM Craig Condit 
> wrote:
>
> > +1.
> >
> > Verified the following:
> >
> >
> > - Verified SHA256 hash
> > - Verified GPG signature
> > - Ran unit and e2e tests
> > - Built docker images
> > - Installed via helm chart into local Minikube cluster
> > - Tested basic functionality including gang scheduling
> >
> >
> > Craig
> >
> >
> > > On Dec 14, 2021, at 1:46 PM, Chaoran Yu 
> wrote:
> > >
> > > Hi all,
> > >
> > > I'd like to call a vote for a release candidate for Apache YuniKorn
> > > (incubating) 0.12.1 release.
> > >
> > > The release artifacts have been uploaded to
> > > https://dist.apache.org/repos/dist/dev/incubator/yunikorn/0.12.1/
> > >
> > > My public key is located here
> > > https://dist.apache.org/repos/dist/release/incubator/yunikorn/KEYS
> > >
> > > The release has been tagged with "v0.12.1" in all our git repositories.
> > >
> > > The JIRA issues that have been resolved in this release can be found
> > here  >.
> > >
> > > Please review and vote. The vote will be open for at least 72 hours and
> > > closes on Friday, December 17 2021, 1pm PDT.
> > >
> > > [ ] +1 approve
> > > [ ] +0 no opinion
> > > [ ] -1 disapprove (and the reason why)
> > >
> > >
> > > Thank you,
> > > Chaoran
> > >
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
> > For additional commands, e-mail: dev-h...@yunikorn.apache.org
> >
> >
>


Fwd: [DISCUSSION] SPIP: Support Volcano/Alternative Schedulers Proposal

2021-11-30 Thread Chenya Zhang
FYI. Volcano folks are proposing more native integration with Spark. It
would be great if we also get involved in the discussion.

-- Forwarded message -
From: Yikun Jiang 
Date: Tue, Nov 30, 2021 at 12:53 AM
Subject: [DISCUSSION] SPIP: Support Volcano/Alternative Schedulers Proposal
To: dev 


Hey everyone,

I'd like to start a discussion on "Support Volcano/Alternative Schedulers
Proposal".

This SPIP is proposed to make spark k8s schedulers provide more YARN like
features (such as queues and minimum resources before scheduling jobs) that
many folks want on Kubernetes.

The goal of this SPIP is to improve current spark k8s scheduler
implementations, add the ability of batch scheduling and support volcano as
one of implementations.

Design doc:
https://docs.google.com/document/d/1xgQGRpaHQX6-QH_J9YV2C2Dh6RpXefUpLM7KGkzL6Fg
JIRA: https://issues.apache.org/jira/browse/SPARK-36057
Part of PRs:
Ability to create resources https://github.com/apache/spark/pull/34599
Add PodGroupFeatureStep: https://github.com/apache/spark/pull/34456

Regards,
Yikun


Re: Prepare for YuniKorn Nov 18th Meetup

2021-11-18 Thread Chenya Zhang
Thanks Weiwei and Sunil! Friendly reminder on our YuniKorn meetup at 4:30pm
PST today. Zoom link to join:
https://cloudera.zoom.us/j/96818279633

On Sun, Nov 14, 2021 at 9:31 PM Weiwei Yang  wrote:

> Hi Sunil, thank you!
>
> On Sun, Nov 14, 2021 at 7:31 AM Sunil Govindan  wrote:
>
> > Hello Folks,
> >
> > Here is our Twitter feed for the upcoming meetup.
> > https://twitter.com/YuniKorn_Sched/status/1459906107279757316
> >
> > Thanks
> > Sunil
> >
> > On Fri, Oct 29, 2021 at 5:42 PM Weiwei Yang  wrote:
> >
> >> Awesome, thank you Chenya!
> >>
> >> On Fri, Oct 29, 2021 at 4:39 PM Chenya Zhang <
> chenyazhangche...@gmail.com>
> >> wrote:
> >>
> >>> + Our LinkedIn event posted:
> >>> https://www.linkedin.com/events/6859992516305932288/
> >>>
> >>> On Fri, Oct 29, 2021 at 3:00 PM Weiwei Yang  wrote:
> >>>
> >>> > Thanks. Info updated to:
> http://yunikorn.apache.org/community/sessions
> >>> > Please spread this out via twitter, linkedin, etc.
> >>> >
> >>> > On Fri, Oct 29, 2021 at 8:53 AM Chenya Zhang <
> >>> chenyazhangche...@gmail.com>
> >>> > wrote:
> >>> >
> >>> > > Hey devs,
> >>> > >
> >>> > > This is the blurb for our meetup. Thanks to Wilfred for providing the details
> >>> > > <https://docs.google.com/document/d/1-NP0J22-Gp3cZ_hfKyA9htXJw7tlk-BmljF-7CBJg44/edit?usp=sharing>!
> >>> > > Feedback welcome! Sending it out today.
> >>> > >
> >>> > > *A Big Milestone for Apache YuniKorn (Incubating)*
> >>> > >
> >>> > > *Time*
> >>> > > Nov. 18th, 4:30 - 6:00 pm PST
> >>> > >
> >>> > > *Location*
> >>> > > Online (TBD)
> >>> > >
> >>> > > *Topic*
> >>> > > Apache YuniKorn (Incubating) has released v0.11 earlier this year with a
> >>> > > number of new features and improvements like Gang scheduling, REST API
> >>> > > enhancements and Kubernetes 1.19 support. In a month, we are planning the
> >>> > > major v1.0.0 release with Kubernetes 1.20 & 1.21 support, improved node
> >>> > > sorting and numerous small fixes & enhancements! In this meetup, we will
> >>> > > deep dive into the implementation of Gang scheduling behind the use of
> >>> > > temporary placeholder pods on Kubernetes, significant performance
> >>> > > improvement with simplified scheduler core code and a new node sorting
> >>> > > algorithm, and a future roadmap of leveraging Kubernetes plugin
> >>> > > architecture.
> >>> > >
> >>> > > On Sun, Oct 24, 2021 at 7:12 PM Chenya Zhang <
> >>> > chenyazhangche...@gmail.com>
> >>> > > wrote:
> >>> > >
> >>> > >> Sounds good, thanks Sunil!
> >>> > >>
> >>> > >> On Sun, Oct 24, 2021 at 4:21 PM Sunil Govindan  >
> >>> > wrote:
> >>> > >>
> >>> > >>> Awesome. Thanks, Chenya.
> >>> > >>>
> >>> > >>> I think we are good with the initial task lists. We need more
> >>> > >>> participation, and I will reach to other meetup groups to send
> this
> >>> > >>> invite from their members as well.
> >>> > >>>
> >>> > >>> Thanks
> >>> > >>> Sunil
> >>> > >>>
> >>> > >>> On Sun, Oct 24, 2021 at 3:20 PM Chenya Zhang <
> >>> > >>> chenyazhangche...@gmail.com> wrote:
> >>> > >>>
> >>> > >>>> Hi devs,
> >>> > >>>>
> >>> > >>>> I created an umbrella ticket to hold necessary work for our
> >>> external
> >>> > >>>> meetup setup & broadcast. The upcoming one, as we discussed, is
> >>> Nov
> >>> > 18th.
> >>> > >>>>
> >>> > >>>> => https://issues.apache.org/jira/browse/YUNIKORN-912
> >>> > >>>>
> >>> > >>>> Please take a look. Esp. Wilfred, Weiwei, Sunil, and Chaoran:
> Let
> >>> me
> >>> > >>>> know if you have questions or suggestions. Feel free to create
> >>> more
> >>> > tasks.
> >>> > >>>>
> >>> > >>>> We are also calling for more volunteers. Please reply to this
> >>> email if
> >>> > >>>> you are interested, thanks!
> >>> > >>>>
> >>> > >>>> Best,
> >>> > >>>> Chenya
> >>> > >>>>
> >>> > >>>>
> >>> > >>>>
> >>> >
> >>>
> >>
>


Re: [ANNOUNCE] New committer: Craig Condit

2021-11-14 Thread Chenya Zhang
Congrats Craig! Well deserved!

On Sun, Nov 14, 2021 at 6:54 PM Manikandan R  wrote:

> Congratulations Craig.
>
> Thanks,
> Mani
>
> On Mon, Nov 15, 2021 at 8:21 AM Sunil Govindan  wrote:
>
> > Congratulations Craig, welcome aboard!
> >
> > Thanks
> > Sunil
> >
> > On Sun, Nov 14, 2021 at 4:44 PM Wilfred Spiegelenburg <
> wilfr...@apache.org
> > >
> > wrote:
> >
> > > The Project Management Committee (PMC) for Apache YuniKorn has invited
> > > Craig to
> > > become a committer and we are pleased to announce that he has accepted.
> > > Please join me in congratulating him.
> > >
> > > Congratulations & Welcome aboard Craig!
> > >
> > > Wilfred
> > > on behalf of The Apache YuniKorn PPMC
> > >
> >
>


Re: YK Perf benchmark

2021-11-07 Thread Chenya Zhang
Great effort, Tingyao and Yuteng!

You guys can write a tech blog to share with the community. We will all
help you to spread it. It would also be nice if you can present the
benchmark in a YuniKorn Meetup early 2022.

Best,
Chenya


On Sun, Nov 7, 2021 at 3:17 PM Wilfred Spiegelenburg 
wrote:

> Cannot wait to see the result. Lots has changed since the last time we
> published this.
>
> Wilfred
>
> On Sat, 6 Nov 2021 at 04:05, Sunil Govindan  wrote:
>
> > This is great. Thank you Tingyao and Yuteng.
> > This will be helping to showcase the performance advantage of YuniKorn in
> > K8s.
> >
> > Thanks
> > Sunil
> >
> > On Thu, Nov 4, 2021 at 11:14 PM Chaoran Yu 
> > wrote:
> >
> > > Thanks so much Tingyao and Yuteng! These results will certainly help
> > > promote YuniKorn to newcomers and those who are hesitant in adopting
> it.
> > >
> > > On Thu, Nov 4, 2021 at 10:45 PM Weiwei Yang  wrote:
> > >
> > > > Hi all
> > > >
> > > > As you know or may not know, in the last 2 months, Tingyao and Yuteng
> > are
> > > > helping to perform a new round of perf-bench marking based on the
> > latest
> > > > codebase. They've done this from scratch, set up a simulated env
> using
> > > > their lab's hardware, setup up metrics servers to observe and
> discover
> > > perf
> > > > bottleneck, tuning the perf to a reasonably good result. I really
> > > > appreciate their efforts to get this done.
> > > >
> > > > Now, they are summarizing the result and preparing to update the YK
> > > > website with a bunch of good docs, including perf result, and
> benchmark
> > > > tutorial. This will be done before the next release time frame.
> > > >
> > > > Again, thank you Tingyao, Yuteng, this is a great example of
> community
> > > > collaboration! Well done!
> > > >
> > > > Thanks
> > > > Weiwei
> > > >
> > >
> >
>


Re: Discussion with K8s Sig Team

2021-11-04 Thread Chenya Zhang
btw. We have the YK Meetup on Nov 18th 4:30 - 6:00pm. Hope it will not be
in conflict with the K8s Sig call.

On Thu, Nov 4, 2021 at 5:33 PM Sunil Govindan  wrote:

> Thanks, Chaoran.
>
>
> On Thu, Nov 4, 2021 at 5:23 PM Chaoran Yu  wrote:
>
> > Sure, let's use my list as the agenda and finalize next Wednesday
> >
> > On Thu, Nov 4, 2021 at 3:58 PM Sunil Govindan  wrote:
> >
> > > Hi Weiwei
> > >
> > > Yes. I suggest meeting next Wed inside of the YK community.
> > > Based on that we can take a call whether to join on 18th.
> > >
> > > Thanks, Chaoran, for summing it up. We could use that as the agenda. Is
> > > that fine?
> > >
> > > Thanks
> > > Sunil
> > >
> > > On Thu, Nov 4, 2021 at 3:21 PM Weiwei Yang  wrote:
> > >
> > > > I think Chaoran covered most of the questions we discussed yesterday.
> > > > Sunil, are you suggesting to meet next Wed inside of the YK community
> > > > before setting up a call with sig-scheduling?
> > > > I noticed the next sig-scheduling meeting is Nov 18, so we can probably
> > > > try to set up a call sometime earlier than that with a smaller group of
> > > > people.
> > > >
> > > > On Thu, Nov 4, 2021 at 1:02 PM Chaoran Yu 
> > > wrote:
> > > >
> > > > > Hi Sunil,
> > > > >
> > > > > Next Wednesday at the same time works for me. From what I understand in
> > > > > the meeting yesterday, we want to initiate a conversation with
> > > > > sig-scheduling without committing to any actions just yet.
> > > > >
> > > > > The information we want to gather from them includes the following:
> > > > >
> > > > > * What's the overall process and timeline when contributing a plugin?
> > > > > * Besides contributing code directly in the plugins repo, can the code
> > > > > live in an outside repo (e.g. Apache repos)? Any other options?
> > > > > * The scheduler-plugins repo doesn't align with Kubernetes itself in
> > > > > terms of release cadence, nor does it match the latest k8s version. How
> > > > > is it maintained?
> > > > > * How does the default scheduler in the core K8s repo incorporate
> > > > > community-contributed plugins? What's the process of promoting a plugin?
> > > > >
> > > > > We can finalize the agenda when we meet next week
> > > > >
> > > > > On Thu, Nov 4, 2021 at 12:21 PM Sunil Govindan 
> > > > wrote:
> > > > >
> > > > > > Hi Folks,
> > > > > >
> > > > > > Thanks for joining our weekly sync call.
> > > > > >
> > > > > > We have an action item to discuss with the SIG team soon. Could we
> > > > > > finalize the meeting agenda within the YuniKorn community itself
> > > > > > before we have that call?
> > > > > > We can have one more YuniKorn community call next week to conclude
> > > > > > before starting the discussion with K8s. I could help schedule this at
> > > > > > the same time as yesterday (3:00 pm PST) if there are no objections.
> > > > > >
> > > > > > Please share your thoughts.
> > > > > >
> > > > > > Thanks
> > > > > > Sunil
> > > > > >
> > > > >
> > > >
> > >
> >
>


[jira] [Resolved] (YUNIKORN-918) Test YouTube Live to broadcast YuniKorn Meetup in Nov 2021

2021-11-01 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang resolved YUNIKORN-918.
---
Resolution: Later

> Test YouTube Live to broadcast YuniKorn Meetup in Nov 2021
> --
>
> Key: YUNIKORN-918
> URL: https://issues.apache.org/jira/browse/YUNIKORN-918
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: community
>    Reporter: Chenya Zhang
>Assignee: Chenya Zhang
>Priority: Major
>
> Follow [https://www.youtube.com/] to create a live broadcast and test its 
> functionality including connectivity, video quality, recording, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: dev-h...@yunikorn.apache.org



[jira] [Resolved] (YUNIKORN-915) Create LinkedIn Event for YuniKorn Meetup in Nov 2021

2021-11-01 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang resolved YUNIKORN-915.
---
Resolution: Fixed

> Create LinkedIn Event for YuniKorn Meetup in Nov 2021
> -
>
> Key: YUNIKORN-915
> URL: https://issues.apache.org/jira/browse/YUNIKORN-915
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: community
>    Reporter: Chenya Zhang
>Assignee: Chenya Zhang
>Priority: Major
>
> Follow [https://www.linkedin.com/mynetwork/network-manager/events/] to create 
> an event.
> Share the event to yunikorn_dev@ list to invite more people to spread it in 
> their professional network.






[jira] [Closed] (YUNIKORN-917) Update website to include YuniKorn Meetup contents

2021-11-01 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang closed YUNIKORN-917.
-

> Update website to include YuniKorn Meetup contents
> --
>
> Key: YUNIKORN-917
> URL: https://issues.apache.org/jira/browse/YUNIKORN-917
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: community
>    Reporter: Chenya Zhang
>Assignee: Weiwei Yang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>
> We can consolidate the current content in
> [https://yunikorn.apache.org/community/sessions/] and rebuild it with the
> following sections:
> Coming Meetup session
>  - speaker
>  - abstract
>  - date and time
> Past sessions, conf talks
>   - a list of past session recordings
> And then add a section somewhere in the homepage for the "upcoming meetup".






[jira] [Closed] (YUNIKORN-914) Prepare an abstract of talk for YuniKorn meetup in Nov 2021

2021-11-01 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang closed YUNIKORN-914.
-

> Prepare an abstract of talk for YuniKorn meetup in Nov 2021
> ---
>
> Key: YUNIKORN-914
> URL: https://issues.apache.org/jira/browse/YUNIKORN-914
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: community
>    Reporter: Chenya Zhang
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Fix For: 1.0.0
>
>
> An abstract of 200 - 300 words for the talk, which could cover features
> in the latest release, Gang scheduling, bin packing, node sorting
> improvement, etc.
> It will be posted on the YuniKorn website, LinkedIn, Twitter, etc. to
> broadcast the upcoming event.






[jira] [Closed] (YUNIKORN-913) Invite speaker and set date for YuniKorn Meetup in Nov 2021

2021-11-01 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang closed YUNIKORN-913.
-

> Invite speaker and set date for YuniKorn Meetup in Nov 2021
> ---
>
> Key: YUNIKORN-913
> URL: https://issues.apache.org/jira/browse/YUNIKORN-913
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: community
>    Reporter: Chenya Zhang
>Assignee: Chenya Zhang
>Priority: Major
>
> Send an email to the yunikorn_dev@ list to call for speakers and confirm the
> date/time.
> Note that for the monthly meetups, we could alternate between a PST morning
> slot and a PST evening slot. This will help to cover all time zones as we
> have folks from Budapest, India, China, East Asia, and Australia.
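
To see why the slot needs to alternate, the Nov 18th, 4:30 pm Pacific start
time discussed on the list can be converted to local time for the regions
named above. This is only a rough sketch: the IANA zone names are
representative picks for each region (an assumption, not something stated in
the ticket), and the helper name local_start_times is made up for
illustration.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Meetup start as discussed on the list: Nov 18th 2021, 4:30 pm Pacific time.
MEETUP_START = datetime(2021, 11, 18, 16, 30, tzinfo=ZoneInfo("America/Los_Angeles"))

# Representative IANA zones for the regions named in the ticket (assumed).
ZONES = {
    "Budapest": "Europe/Budapest",
    "India": "Asia/Kolkata",
    "China": "Asia/Shanghai",
    "Australia": "Australia/Sydney",
}


def local_start_times(start, zones):
    """Return {region: start converted to that region's local time}."""
    return {region: start.astimezone(ZoneInfo(name)) for region, name in zones.items()}


if __name__ == "__main__":
    for region, local in local_start_times(MEETUP_START, ZONES).items():
        print(f"{region:10s} {local:%a %Y-%m-%d %H:%M %Z}")
```

With these assumed zones the evening slot falls in the small hours for
Budapest but in the morning for India, China, and Australia, which is the
motivation for alternating with a PST morning slot.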






Re: Prepare for YuniKorn Nov 18th Meetup

2021-10-29 Thread Chenya Zhang
+ Our LinkedIn event posted:
https://www.linkedin.com/events/6859992516305932288/

On Fri, Oct 29, 2021 at 3:00 PM Weiwei Yang  wrote:

> Thanks. Info updated to: http://yunikorn.apache.org/community/sessions
> Please spread this out via twitter, linkedin, etc.
>
> On Fri, Oct 29, 2021 at 8:53 AM Chenya Zhang 
> wrote:
>
> > Hey devs,
> >
> > This is the blurb for our meetup. Thanks to Wilfred for providing the
> > details
> > <https://docs.google.com/document/d/1-NP0J22-Gp3cZ_hfKyA9htXJw7tlk-BmljF-7CBJg44/edit?usp=sharing>!
> >
> > Feedback welcome! Sending it out today.
> >
> > *A Big Milestone for Apache YuniKorn (Incubating)*
> > *Time*
> > Nov. 18th, 4:30 - 6:00 pm PST
> > *Location*
> > Online (TBD)
> > *Topic*
> > Apache YuniKorn (Incubating) released v0.11 earlier this year with a
> > number of new features and improvements like Gang scheduling, REST API
> > enhancements and Kubernetes 1.19 support. In a month, we are planning the
> > major v1.0.0 release with Kubernetes 1.20 & 1.21 support, improved node
> > sorting and numerous small fixes & enhancements! In this meetup, we will
> > deep dive into the implementation of Gang scheduling using temporary
> > placeholder pods on Kubernetes, the significant performance improvement
> > with simplified scheduler core code and a new node sorting algorithm, and
> > a future roadmap of leveraging the Kubernetes plugin architecture.
> >
> > On Sun, Oct 24, 2021 at 7:12 PM Chenya Zhang <
> chenyazhangche...@gmail.com>
> > wrote:
> >
> >> Sounds good, thanks Sunil!
> >>
> >> On Sun, Oct 24, 2021 at 4:21 PM Sunil Govindan 
> wrote:
> >>
> >>> Awesome. Thanks, Chenya.
> >>>
> >>> I think we are good with the initial task lists. We need more
> >>> participation, and I will reach out to other meetup groups to send this
> >>> invite to their members as well.
> >>>
> >>> Thanks
> >>> Sunil
> >>>
> >>> On Sun, Oct 24, 2021 at 3:20 PM Chenya Zhang <
> >>> chenyazhangche...@gmail.com> wrote:
> >>>
> >>>> Hi devs,
> >>>>
> >>>> I created an umbrella ticket to hold necessary work for our external
> >>>> meetup setup & broadcast. The upcoming one, as we discussed, is Nov 18th.
> >>>>
> >>>> => https://issues.apache.org/jira/browse/YUNIKORN-912
> >>>>
> >>>> Please take a look. Esp. Wilfred, Weiwei, Sunil, and Chaoran: Let me
> >>>> know if you have questions or suggestions. Feel free to create more tasks.
> >>>>
> >>>> We are also calling for more volunteers. Please reply to this email if
> >>>> you are interested, thanks!
> >>>>
> >>>> Best,
> >>>> Chenya
> >>>>
> >>>>
> >>>>
>


Re: Prepare for YuniKorn Nov 18th Meetup

2021-10-29 Thread Chenya Zhang
Hey devs,

This is the blurb for our meetup. Thanks to Wilfred for providing the details
<https://docs.google.com/document/d/1-NP0J22-Gp3cZ_hfKyA9htXJw7tlk-BmljF-7CBJg44/edit?usp=sharing>!

Feedback welcome! Sending it out today.

*A Big Milestone for Apache YuniKorn (Incubating)*
*Time*
Nov. 18th, 4:30 - 6:00 pm PST
*Location*
Online (TBD)
*Topic*
Apache YuniKorn (Incubating) released v0.11 earlier this year with a
number of new features and improvements like Gang scheduling, REST API
enhancements and Kubernetes 1.19 support. In a month, we are planning the
major v1.0.0 release with Kubernetes 1.20 & 1.21 support, improved node
sorting and numerous small fixes & enhancements! In this meetup, we will
deep dive into the implementation of Gang scheduling using temporary
placeholder pods on Kubernetes, the significant performance improvement
with simplified scheduler core code and a new node sorting algorithm, and
a future roadmap of leveraging the Kubernetes plugin architecture.

On Sun, Oct 24, 2021 at 7:12 PM Chenya Zhang 
wrote:

> Sounds good, thanks Sunil!
>
> On Sun, Oct 24, 2021 at 4:21 PM Sunil Govindan  wrote:
>
>> Awesome. Thanks, Chenya.
>>
>> I think we are good with the initial task lists. We need more
>> participation, and I will reach out to other meetup groups to send this
>> invite to their members as well.
>>
>> Thanks
>> Sunil
>>
>> On Sun, Oct 24, 2021 at 3:20 PM Chenya Zhang 
>> wrote:
>>
>>> Hi devs,
>>>
>>> I created an umbrella ticket to hold necessary work for our external
>>> meetup setup & broadcast. The upcoming one, as we discussed, is Nov 18th.
>>>
>>> => https://issues.apache.org/jira/browse/YUNIKORN-912
>>>
>>> Please take a look. Esp. Wilfred, Weiwei, Sunil, and Chaoran: Let me
>>> know if you have questions or suggestions. Feel free to create more tasks.
>>>
>>> We are also calling for more volunteers. Please reply to this email if
>>> you are interested, thanks!
>>>
>>> Best,
>>> Chenya
>>>
>>>
>>>


Re: Prepare for YuniKorn Nov 18th Meetup

2021-10-24 Thread Chenya Zhang
Sounds good, thanks Sunil!

On Sun, Oct 24, 2021 at 4:21 PM Sunil Govindan  wrote:

> Awesome. Thanks, Chenya.
>
> I think we are good with the initial task lists. We need more
> participation, and I will reach out to other meetup groups to send this
> invite to their members as well.
>
> Thanks
> Sunil
>
> On Sun, Oct 24, 2021 at 3:20 PM Chenya Zhang 
> wrote:
>
>> Hi devs,
>>
>> I created an umbrella ticket to hold necessary work for our external
>> meetup setup & broadcast. The upcoming one, as we discussed, is Nov 18th.
>>
>> => https://issues.apache.org/jira/browse/YUNIKORN-912
>>
>> Please take a look. Esp. Wilfred, Weiwei, Sunil, and Chaoran: Let me know
>> if you have questions or suggestions. Feel free to create more tasks.
>>
>> We are also calling for more volunteers. Please reply to this email if
>> you are interested, thanks!
>>
>> Best,
>> Chenya
>>
>>
>>


Prepare for YuniKorn Nov 18th Meetup

2021-10-24 Thread Chenya Zhang
Hi devs,

I created an umbrella ticket to hold necessary work for our external meetup
setup & broadcast. The upcoming one, as we discussed, is Nov 18th.

=> https://issues.apache.org/jira/browse/YUNIKORN-912

Please take a look. Esp. Wilfred, Weiwei, Sunil, and Chaoran: Let me know
if you have questions or suggestions. Feel free to create more tasks.

We are also calling for more volunteers. Please reply to this email if you
are interested, thanks!

Best,
Chenya


[jira] [Created] (YUNIKORN-919) Create YuniKorn Youtube channel and upload past videos

2021-10-24 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-919:
-

 Summary: Create YuniKorn Youtube channel and upload past videos
 Key: YUNIKORN-919
 URL: https://issues.apache.org/jira/browse/YUNIKORN-919
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: community
Reporter: Chenya Zhang


Follow [https://www.youtube.com/] to create a dedicated Youtube channel for 
YuniKorn.

Upload all related past talks, meetups, tutorials to the channel.

Share the channel to yunikorn_dev@ list to invite more people to spread it in 
their network.






[jira] [Created] (YUNIKORN-918) Test YouTube Live to broadcast YuniKorn Meetup in Nov 2021

2021-10-24 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-918:
-

 Summary: Test YouTube Live to broadcast YuniKorn Meetup in Nov 2021
 Key: YUNIKORN-918
 URL: https://issues.apache.org/jira/browse/YUNIKORN-918
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: community
Reporter: Chenya Zhang
Assignee: Chenya Zhang


Follow [https://www.youtube.com/] to create a live broadcast and test its 
functionality including connectivity, video quality, recording, etc.






[jira] [Created] (YUNIKORN-917) Update website for YuniKorn Meetup in Nov 2021

2021-10-24 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-917:
-

 Summary: Update website for YuniKorn Meetup in Nov 2021
 Key: YUNIKORN-917
 URL: https://issues.apache.org/jira/browse/YUNIKORN-917
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: community
Reporter: Chenya Zhang
Assignee: Weiwei Yang


We can consolidate the current content in
[https://yunikorn.apache.org/community/sessions/] and rebuild it with the
following sections:

Coming Meetup session
 - speaker
 - abstract
 - date and time

Past sessions, conf talks
  - a list of past session recordings

And then add a section somewhere in the homepage for the "upcoming meetup".






[jira] [Created] (YUNIKORN-916) Create Twitter Post for YuniKorn Meetup in Nov 2021

2021-10-24 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-916:
-

 Summary: Create Twitter Post for YuniKorn Meetup in Nov 2021
 Key: YUNIKORN-916
 URL: https://issues.apache.org/jira/browse/YUNIKORN-916
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: community
Reporter: Chenya Zhang
Assignee: Sunil G


Use the "YuniKorn twitter handle" to create a post.

Share the post to yunikorn_dev@ list to invite more people to spread it in 
their network.






[jira] [Created] (YUNIKORN-915) Create LinkedIn Event for YuniKorn Meetup in Nov 2021

2021-10-24 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-915:
-

 Summary: Create LinkedIn Event for YuniKorn Meetup in Nov 2021
 Key: YUNIKORN-915
 URL: https://issues.apache.org/jira/browse/YUNIKORN-915
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: community
Reporter: Chenya Zhang
Assignee: Chenya Zhang


Follow [https://www.linkedin.com/mynetwork/network-manager/events/] to create 
an event.

Share the event to yunikorn_dev@ list to invite more people to spread it in 
their professional network.






[jira] [Created] (YUNIKORN-914) Prepare an abstract of talk for YuniKorn meetup in Nov 2021

2021-10-24 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-914:
-

 Summary: Prepare an abstract of talk for YuniKorn meetup in Nov 
2021
 Key: YUNIKORN-914
 URL: https://issues.apache.org/jira/browse/YUNIKORN-914
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: community
Reporter: Chenya Zhang
Assignee: Wilfred Spiegelenburg


An abstract of 200 - 300 words for the talk, which could cover features in
the latest release, Gang scheduling, bin packing, node sorting improvement, etc.

It will be posted on the YuniKorn website, LinkedIn, Twitter, etc. to
broadcast the upcoming event.






[jira] [Resolved] (YUNIKORN-913) Invite speaker and set date for YuniKorn Meetup on Nov 2021

2021-10-24 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang resolved YUNIKORN-913.
---
Resolution: Fixed

> Invite speaker and set date for YuniKorn Meetup on Nov 2021
> ---
>
> Key: YUNIKORN-913
> URL: https://issues.apache.org/jira/browse/YUNIKORN-913
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: community
>    Reporter: Chenya Zhang
>Assignee: Chenya Zhang
>Priority: Major
>
> Send an email to the yunikorn_dev@ list to call for speakers and confirm the
> date/time.
> Note that for the monthly meetups, we could alternate between a PST morning
> slot and a PST evening slot. This will help to cover all time zones as we
> have folks from Budapest, India, China, East Asia, and Australia.






[jira] [Created] (YUNIKORN-913) [Umbrella] Apache YuniKorn External Meetup on 11/18/2021

2021-10-24 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-913:
-

 Summary: [Umbrella] Apache YuniKorn External Meetup on 11/18/2021
 Key: YUNIKORN-913
 URL: https://issues.apache.org/jira/browse/YUNIKORN-913
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: community
Reporter: Chenya Zhang
Assignee: Chenya Zhang


This is the umbrella ticket to hold all related work of Apache YuniKorn 
External Meetup on 11/18/2021. 






[jira] [Created] (YUNIKORN-912) [Umbrella] Apache YuniKorn External Meetups

2021-10-24 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-912:
-

 Summary: [Umbrella] Apache YuniKorn External Meetups
 Key: YUNIKORN-912
 URL: https://issues.apache.org/jira/browse/YUNIKORN-912
 Project: Apache YuniKorn
  Issue Type: Improvement
  Components: community
Reporter: Chenya Zhang
Assignee: Chenya Zhang


This is the umbrella ticket to hold the related work of all Apache YuniKorn 
external meetups.






Re: Broadcast YuniKorn to the Industry!

2021-10-22 Thread Chenya Zhang
+1 need a short description from Wilfred, thanks!

Meanwhile, I'm thinking about adding a new section "Meetup" under
"Community" on our website and also adding a section on our homepage for
the next event.

Do we have a process that can be shared to update our website?

Thanks,
Chenya

On Fri, Oct 22, 2021 at 11:42 AM Weiwei Yang  wrote:

> ACK'd, sounds great.
> We need to have a shareable message to advertise this. @Wilfred Spiegelenburg,
> could you pls share a short description of the session so we can put that in
> a message + plannedDate + howToJoin and start to spread it?
>
> On Fri, Oct 22, 2021 at 7:17 AM Chenya Zhang 
> wrote:
>
> > Wonderful, thanks Sunil and Chaoran for the input!
> >
> > Our first speaker will be Wilfred sharing about features in the latest
> > release, Gang scheduling, bin packing, node sorting improvement, etc.
> >
> > We will post an event from YuniKorn website, use LinkedIn meetup & posts
> > to attract more attendees, and YouTube for a live broadcast.
> >
> > If everyone agrees, we will finalize our 1st external meetup this year on
> > Nov 18th 4:30 - 6:00pm PST.
> >
> > Look forward to more ideas from everyone, and let’s get ready & excited!
> >
> > Cheers,
> > Chenya
> >
> >
> > On Thu, Oct 21, 2021 at 11:57 PM Chaoran Yu 
> > wrote:
> >
> > > Thanks Sunil for the input. Then let's do Nov 18th for the first event.
> > > 4:30-6:00pm PST should be a good start. We can gauge interest and then
> > > change time to cater to multiple time zones for alternating
> > > occurrences.
> > >
> > >
> > > On Mon, Oct 18, 2021 at 7:42 AM Sunil Govindan 
> > wrote:
> > >
> > > > Thanks, Chaoran. Yes, I second this.
> > > > Monthly events could be organized from LinkedIn also. Meetup is not
> > > > much used, I think.
> > > >
> > > > Let's start with November 18th, as you suggested. And each alternate
> > > > month, we can do one on PST morning and then during the evening.
> > > > This will help to cover all time zones as we have folks from Budapest,
> > > > India, China, East Asia, and Australia.
> > > >
> > > > On Fri, Oct 15, 2021 at 8:02 AM Chenya Zhang <
> > > chenyazhangche...@gmail.com>
> > > > wrote:
> > > >
> > > > > Sure, that sounds great!
> > > > >
> > > > > If we tentatively circle Nov 18th 4:30 - 6:00pm PST for our 1st event,
> > > > > how does this slot fit into your schedules?
> > > > >
> > > > > Besides speaker talks, we could also have panel or roundtable
> > > > > discussions.
> > > > >
> > > > > To broadcast YuniKorn to more people, we could also find out if there
> > > > > are other highly popular meetups in the industry and do a joint talk or
> > > > > discussion.
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > Cheers,
> > > > > Chenya
> > > > >
> > > > > On Thu, Oct 14, 2021 at 10:03 PM Chaoran Yu <
> yuchaoran2...@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Great, that works too. Chenya, do you want to schedule such a
> > > > > > monthly meeting starting November? The first one can be during the
> > > > > > week before Thanksgiving. Once we finalize the schedule, we can think
> > > > > > about speakers and topics.
> > > > > >
> > > > > > On Thu, Oct 14, 2021 at 9:49 PM Weiwei Yang 
> > wrote:
> > > > > >
> > > > > >> I am not aware of that sort of sponsorship. We can at least set up
> > > > > >> virtual meetings via Zoom.
> > > > > >> A monthly recurring meeting might be a good start : )
> > > > > >>
> > > > > >> On Thu, Oct 14, 2021 at 9:27 PM Chaoran Yu <
> > yuchaoran2...@gmail.com
> > > >
> > > > > >> wrote:
> > > > > >>
> > > > > >> > I second this proposal!
> > > > > >> >
> > > > > >> > Instead of a one-time event, we can do a recurring meetup
> > (monthly
> > > > or
> > > > > >> > bi-monthly) and invite YuniKorn develop

Re: [DISCUSS] release v1.0.0 planning

2021-10-22 Thread Chenya Zhang
Great to see that we are going to have a new 1.0.0 release!

I could help Chaoran and shadow this release. Would be interested in
becoming our next release manager but happy to take a fight when it comes.
:D

Will follow up on this thread for v1.0.0.

Best,
Chenya

On Fri, Oct 22, 2021 at 11:37 AM Weiwei Yang  wrote:

> Hi Craig
>
> I think officially supporting 3 major K8s versions should be enough, but we
> need to update to 1.19, 1.20, and 1.21.
> 1.18 is EoL per https://endoflife.date/kubernetes . If we continue our
> release cadence, most likely we can move 1 major release at a time and that
> aligns with the K8s releases.
>
> On Fri, Oct 22, 2021 at 9:29 AM Craig Condit 
> wrote:
>
> > Looking forward to 1.0 release as well, and thanks Chaoran for
> > volunteering to manage the release.
> >
> > Should we also update our supported version matrix to include 1.18, 1.19,
> > 1.20, and 1.21? Updating the e2e test matrix should be a one-line change,
> > and as part of the rebuild against 1.20, I verified that 1.21 is
> > functional.
> >
> > Craig
> >
> > > On Oct 22, 2021, at 2:15 AM, Weiwei Yang  wrote:
> > >
> > > Sounds great, thank you Chaoran!
> > >
> > > On Thu, Oct 21, 2021 at 11:51 PM Chaoran Yu 
> > wrote:
> > >
> > >> Hey guys,
> > >>
> > >> I volunteer to be the release manager this time, if nobody else has
> > >> volunteered already.
> > >>
> > >> The proposed timeline sounds right to me. It will allow for ample time
> > >> for stabilization and verification.
> > >>
> > >> Chaoran
> > >>
> > >>
> > >> On Thu, Oct 21, 2021 at 11:48 PM Weiwei Yang  wrote:
> > >>
> > >>> Hi Wilfred
> > >>>
> > >>> Thanks. Sounds good to me.
> > >>> Does anyone want to be the release manager for 1.0? The apache way
> > >>> encourages more people to get involved in the release process.
> > >>>
> > >>> On Thu, Oct 21, 2021 at 10:45 PM Wilfred Spiegelenburg <
> > >>> wilfr...@apache.org>
> > >>> wrote:
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>> We have been making big steps since our last release. Some major changes
> > >>>> have gone in and some are almost ready. The changes include:
> > >>>> - REST API updates
> > >>>> - new node storage and sorting
> > >>>> - upgrade to a later K8s version as a build dependency
> > >>>> - scheduler interface change
> > >>>>
> > >>>> Work is ongoing on generating a new set of performance figures. This
> > >>>> includes documenting how to run our performance tests so we can repeat
> > >>>> them when we want to.
> > >>>>
> > >>>> Based on all this work I would like to propose a 1.0.0 release to be ready
> > >>>> for a vote by the incubator PMC by the start of December 2021. This will
> > >>>> give us some time to get the last fixes in and stabilise the release. A
> > >>>> release for us is still a multistep project:
> > >>>> - fork and prepare the release
> > >>>> - vote in the project
> > >>>> - vote in the incubator PMC
> > >>>> Looking back at the last release cycles that means we should have a build
> > >>>> ready for voting by 22 November on the dev list. Which would mean that we
> > >>>> fork the release at the latest in the second week of November.
> > >>>>
> > >>>> Please let me know if the timeline is too ambitious or not ambitious
> > >>>> enough.
> > >>>>
> > >>>> Wilfred
> > >>>>
> > >>>
> > >>
> >
> >
> >
> >
>


Re: Broadcast YuniKorn to the Industry!

2021-10-22 Thread Chenya Zhang
Wonderful, thanks Sunil and Chaoran for the input!

Our first speaker will be Wilfred sharing about features in the latest
release, Gang scheduling, bin packing, node sorting improvement, etc.

We will post an event from YuniKorn website, use LinkedIn meetup & posts to
attract more attendees, and YouTube for a live broadcast.

If everyone agrees, we will finalize our 1st external meetup this year on
Nov 18th 4:30 - 6:00pm PST.

Look forward to more ideas from everyone, and let’s get ready & excited!

Cheers,
Chenya


On Thu, Oct 21, 2021 at 11:57 PM Chaoran Yu  wrote:

> Thanks Sunil for the input. Then let's do Nov 18th for the first event.
> 4:30-6:00pm PST should be a good start. We can gauge interest and then
> change time to cater to multiple time zones for alternating occurrences.
>
>
> On Mon, Oct 18, 2021 at 7:42 AM Sunil Govindan  wrote:
>
> > Thanks, Chaoran. Yes, I second this.
> > Monthly events could be organized from LinkedIn also. Meetup is not much
> > used, I think.
> >
> > Let's start with November 18th, as you suggested. And each alternate
> > month, we can do one on PST morning and then during the evening.
> > This will help to cover all time zones as we have folks from Budapest,
> > India, China, East Asia, and Australia.
> >
> > On Fri, Oct 15, 2021 at 8:02 AM Chenya Zhang <
> chenyazhangche...@gmail.com>
> > wrote:
> >
> > > Sure, that sounds great!
> > >
> > > If we tentatively circle Nov 18th 4:30 - 6:00pm PST for our 1st event, how
> > > does this slot fit into your schedules?
> > >
> > > Besides speaker talks, we could also have panel or roundtable
> > > discussions.
> > >
> > > To broadcast YuniKorn to more people, we could also find out if there are
> > > other highly popular meetups in the industry and do a joint talk or
> > > discussion.
> > >
> > > Thoughts?
> > >
> > > Cheers,
> > > Chenya
> > >
> > > On Thu, Oct 14, 2021 at 10:03 PM Chaoran Yu 
> > > wrote:
> > >
> > > > Great, that works too. Chenya, do you want to schedule such a monthly
> > > > meeting starting November? The first one can be during the week before
> > > > Thanksgiving. Once we finalize the schedule, we can think about speakers
> > > > and topics.
> > > >
> > > > On Thu, Oct 14, 2021 at 9:49 PM Weiwei Yang  wrote:
> > > >
> > > >> I am not aware of that sort of sponsorship. We can at least set up
> > > >> virtual meetings via Zoom.
> > > >> A monthly recurring meeting might be a good start : )
> > > >>
> > > >> On Thu, Oct 14, 2021 at 9:27 PM Chaoran Yu  >
> > > >> wrote:
> > > >>
> > > >> > I second this proposal!
> > > >> >
> > > >> > Instead of a one-time event, we can do a recurring meetup (monthly or
> > > >> > bi-monthly) and invite YuniKorn developers and users to talk about
> > > >> > their experience and insights. A recurring event is well suited for
> > > >> > cultivating a community. I took a look at Meetup.com. They charge a
> > > >> > monthly fee for organizing a meetup group (see attached). Does ASF have
> > > >> > any programs that can sponsor such an effort? Or we can explore other
> > > >> > options as well.
> > > >> > Chaoran
> > > >> >
> > > >> >
> > > >> > On Tue, Oct 12, 2021 at 6:22 PM Weiwei Yang 
> > wrote:
> > > >> >
> > > >> >> Hi Chenya
> > > >> >>
> > > >> >> Thanks. This is a great idea, definitely +1.
> > > >> >> A few suggestions:
> > > >> >>
> > > >> >>- We can call for volunteers for the 1st event
> > > >> >>- Once we have that decided, we need to spread this to a wider group
> > > >> >>of audiences, via email, twitter, LinkedIn, or ASF, etc
> > > >> >>- We need to publish the agenda & schedule of the next meeting on our
> > > >> >>web-site
> > > >> >>- We need to log the event recording onto the yunikorn youtube channel
> > > >> >><https://www.youtube.com/channel/UCDSJ2z-l

Re: Broadcast YuniKorn to the Industry!

2021-10-15 Thread Chenya Zhang
Sure, that sounds great!

If we tentatively circle Nov 18th, 4:30 - 6:00pm PST, for our 1st event, how
does this slot fit into your schedules?

Besides speaker talks, we could also have panel or roundtable discussions.

To broadcast YuniKorn to more people, we could also find out whether there are
other highly popular meetups in the industry and do a joint talk or discussion.

Thoughts?

Cheers,
Chenya

On Thu, Oct 14, 2021 at 10:03 PM Chaoran Yu  wrote:

> Great, that works too. Chenya, do you want to schedule such a monthly
> meeting starting November? The first one can be during the week before
> Thanksgiving. Once we finalize the schedule, we can think about speakers
> and topics.
>
> On Thu, Oct 14, 2021 at 9:49 PM Weiwei Yang  wrote:
>
>> I am not aware of that sort of sponsorship. We can at least setup virtual
>> meetings via Zoom.
>> A monthly recurring meeting might be a good start : )
>>
>> On Thu, Oct 14, 2021 at 9:27 PM Chaoran Yu 
>> wrote:
>>
>> > I second this proposal!
>> >
>> > Instead of a one-time event, we can do a recurring meetup (monthly or
>> > bi-monthly) and invite YuniKorn developers and users to talk about their
>> > experience and insights. A recurring event is well suited for
>> cultivating a
>> > community. I took a look at Meetup.com. They charge a monthly fee for
>> > organizing a meetup group (see attached). Does ASF have any programs
>> that
>> > can sponsor such an effort? Or we can explore other options as well.
>> >
>> > Chaoran
>> >
>> >
>> > On Tue, Oct 12, 2021 at 6:22 PM Weiwei Yang  wrote:
>> >
>> >> Hi Chenya
>> >>
>> >> Thanks. This is a great idea, definitely +1.
>> >> A few suggestions:
>> >>
>> >>- We can call for volunteers for the 1st event
>> >>- Once we have that decided, we need to spread this to a wider
>> group of
>> >>audiences, via email, twitter, LinkedIn, or ASF, etc
>> >>- We need to publish the agenda & schedule of the next meeting on
>> our
>> >>web-site
>> >>- We need to log the event recording onto the yunikorn youtube
>> channel
>> >><https://www.youtube.com/channel/UCDSJ2z-lEZcjdK27tTj_hGw>
>> >>- Can we create a website page to track all these events?
>> >>
>> >> Let's start with calling for speakers : )
>> >>
>> >> On Tue, Oct 12, 2021 at 5:57 PM Chenya Zhang <
>> chenyazhangche...@gmail.com
>> >> >
>> >> wrote:
>> >>
>> >> > Hi YuniKorn devs,
>> >> >
>> >> > We have achieved a lot together in the past year on major releases
>> >> > including exciting new features, important bug fixes, and multiple
>> >> > improvements!
>> >> >
>> >> > It could be a wonderful time for us to further broadcast YuniKorn to
>> the
>> >> > industry and welcome more people onboard.
>> >> >
>> >> > What could be some interesting topics and the best event time for
>> you?
>> >> >
>> >> > Some starters here:
>> >> > - Apache YuniKorn Gang Scheduling and Bin Packing?
>> >> > - Spark on K8s Autoscaling and Cost Efficiency?
>> >> > - Flink Stream Processing with YuniKorn?
>> >> > - TensorFlow Model Training with YuniKorn?
>> >> >
>> >> > Your thoughts? :)
>> >> >
>> >> > Best,
>> >> > Chenya
>> >> >
>> >> >
>> >> >
>> >>
>> >
>> > -
>> > To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
>> > For additional commands, e-mail: dev-h...@yunikorn.apache.org
>>
>


Broadcast YuniKorn to the Industry!

2021-10-12 Thread Chenya Zhang
Hi YuniKorn devs,

We have achieved a lot together in the past year on major releases
including exciting new features, important bug fixes, and multiple
improvements!

It could be a wonderful time for us to further broadcast YuniKorn to the
industry and welcome more people onboard.

What could be some interesting topics and the best event time for you?

Some starters here:
- Apache YuniKorn Gang Scheduling and Bin Packing?
- Spark on K8s Autoscaling and Cost Efficiency?
- Flink Stream Processing with YuniKorn?
- TensorFlow Model Training with YuniKorn?

Your thoughts? :)

Best,
Chenya


Re: Design review request: (YUNIKORN-337) interface message complexity

2021-09-28 Thread Chenya Zhang
+1 on splitting the large messages sent between the shim and the core and
processing them in parallel. Looking forward to the performance
improvement!

On Tue, Sep 28, 2021 at 1:16 AM Wilfred Spiegelenburg 
wrote:

> Oops: something went wrong with this sentence:
>
> * k8shim: the proxy is pushed down all the way into the applications and
> > nodes. The tasks (allocations) leverage the context to call the api. We
> > thus have a widespread impact in the cache code as the call is not only
> > direct from the cache but the objects are generated in
>
>
> * k8shim: the proxy is pushed down all the way into the applications and
> nodes. The tasks (allocations) leverage the context to call the api. We
> thus have a widespread impact in the cache code as the call is not only
> direct from the cache context but from the objects that are part of the
> cache.
>
> On Tue, 28 Sept 2021 at 18:11, Wilfred Spiegelenburg 
> wrote:
>
> > Hi Weiwei,
> >
> > Mani and I had started thinking about this work a while ago. This is just
> > the start to make sure we have the basics covered.
> > There is further detail that needs to be worked out; that is exactly why
> > the document was created and the mail sent.
> >
> > On Tue, 28 Sept 2021 at 10:07, Weiwei Yang  wrote:
> >
> >> Hi Mani
> >>
> >> Thank you for starting this. Really nice to see proposals like this to
> >> further improve the efficiency and readability.
> >>
> >> As this is going to change the scheduler-interface pretty dramatically,
> I
> >> think we need to bring this to all dev's attention. Please help to
> review
> >> the doc and share your thoughts. Doc:
> >>
> >>
> https://docs.google.com/document/d/1qNGz7JgA5ArfFy5gLC_uhCsP-w9ii-pJoV-Fbh01awU/edit
> >> .
> >>
> >> Starting from some high-level questions
> >>
> >>1. Is this planned for 1.0.0 release? 1.0.0 will be released sometime
> >> in
> >>Nov or Dec this year.
> >>
> >
> > Yes updated the target release in the jira
> >
> >
> >>2. Have we done PoC to verify the size of this change? I assume we
> will
> >>need changes in both repos.
> >>
> >
> > There will be  changes in 3 repos:
> > * scheduler interface
> > * core
> > * k8shim
> >
> > The impact on each repository is different. Splitting this into the
> > interface change and message changes.
> > Interface changes:
> > * scheduler interface: as per the doc
> > * core: the si.UpdateRequest objects are unpacked and packed in the
> > context and RMProxy. The interface is implemented in the RMProxy only.
> > There should be no impact of the interface change outside of that code.
> > * k8shim: the proxy is pushed down all the way into the applications and
> > nodes. The tasks (allocations) leverage the context to call the api. We
> > thus have a widespread impact in the cache code as the call is not only
> > direct from the cache but the objects are generated in
> >
> > Message changes:
> > * core: the individual messages are unwrapped and passed down into the
> > partition; at that point everything is converted into core objects. The
> > conversion routines will need updating but that is it. The "si" objects
> > were factored out as part of the cache removal.
> > * k8shim: all code uses the si_helper for the object creation except for
> > Allocations. That also seems to have filtered into the app manager.
> >
> >3. Do we need an umbrella JIRA and a feature branch for this work?
> >>
> >
> > YUNIKORN-337 is the umbrella, changed it
> >
> >
> >>4. We need to cover the doc changes as well, such as
> >>http://yunikorn.apache.org/docs/next/design/scheduler_core_design
> >>
> >
> > Jira is logged and added as a task
> >
> >
> >> Apart from that, I have one more ask for this. Today, we have a callback
> >> interface: "ResourceManagerCallback". The shim registers a callback to
> the
> >> core via:
> >>
> >>
> https://github.com/apache/incubator-yunikorn-k8shim/blob/2507b8b68b1c385e67ac2afcd88626e890d2e268/pkg/shim/scheduler.go#L227
> >> .
> >> Is it possible to change this to an event handling model? I have some
> >> thoughts, could we do something like this:
> >> https://play.golang.org/p/Kibbq4QtRUf. This should work for both local
> vs
> >> gRPC communications (currently we are using the local, aka inner-process
> >> communication mode). The gRPC mode just requires us to implement a
> >> client-side lib to receive gRPC messages from the core, and trigger
> event
> >> handling based on the messages. This way I think it is clearer than the
> >> current approach. Would this make sense? I have also commented in the
> >> design doc.
> >>
> >
> > The gRPC interface has a separate definition.
> > For using gRPC there has to be a server in both the k8shim and the core.
> > The core can send messages to the shim and the other way around. The
> server
> > code cannot be used to send a new message, just a response to an incoming
> > message. The server and client code are generated using the "service
> > Scheduler" section with the rpc tags. So yes we 

[jira] [Created] (YUNIKORN-858) Pass node labels as node attributes for node sorting

2021-09-22 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-858:
-

 Summary: Pass node labels as node attributes for node sorting
 Key: YUNIKORN-858
 URL: https://issues.apache.org/jira/browse/YUNIKORN-858
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: shim - kubernetes
Reporter: Chenya Zhang
Assignee: Chenya Zhang


Node labels need to be passed into "YuniKorn Core" via "YuniKorn Scheduler 
Interface" (YK SI) as node attributes. It then can be considered during node 
sorting.

In particular, to keep the protobuf value type of node attributes still string, 
we plan to marshal node labels (map[string]string) into a JSON string before 
passing to YK SI.
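
A minimal sketch of the marshalling step described above, using only the Go
standard library. The attribute key "example/node-labels" and the helper names
labelsToAttribute and attributeToLabels are hypothetical and are not taken from
the YuniKorn scheduler interface; the point is only to show that a
map[string]string can travel as a single string value and be decoded again on
the core side.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nodeLabelsAttribute is a hypothetical attribute key used for illustration only.
const nodeLabelsAttribute = "example/node-labels"

// labelsToAttribute encodes the node labels as one JSON string so that the
// attribute map can keep string values.
func labelsToAttribute(labels map[string]string) (string, error) {
	if labels == nil {
		// Encode a missing label set as an empty object rather than "null".
		labels = map[string]string{}
	}
	b, err := json.Marshal(labels)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// attributeToLabels decodes the JSON string back into a label map on the
// receiving side.
func attributeToLabels(value string) (map[string]string, error) {
	labels := map[string]string{}
	if err := json.Unmarshal([]byte(value), &labels); err != nil {
		return nil, err
	}
	return labels, nil
}

func main() {
	labels := map[string]string{"zone": "us-west-2a", "disk": "ssd"}
	value, err := labelsToAttribute(labels)
	if err != nil {
		panic(err)
	}
	// The attribute map stays map[string]string; the labels are one entry in it.
	attributes := map[string]string{nodeLabelsAttribute: value}
	fmt.Println(attributes[nodeLabelsAttribute])
}
```

The trade-off of this approach is that the core has to unmarshal the string
before it can use individual labels during node sorting.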



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: dev-h...@yunikorn.apache.org



[jira] [Created] (YUNIKORN-853) Update the value type of node attribute to support Any type

2021-09-17 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-853:
-

 Summary: Update the value type of node attribute to support Any 
type
 Key: YUNIKORN-853
 URL: https://issues.apache.org/jira/browse/YUNIKORN-853
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: scheduler-interface
Reporter: Chenya Zhang
Assignee: Chenya Zhang


Node labels need to be passed into YuniKorn as node attributes for node 
sorting. The type of node labels is a map of string to string. 

The current value type of node attribute is string and cannot support a map. It 
can be updated to google.protobuf.Any: 
https://developers.google.com/protocol-buffers/docs/proto3#any






[jira] [Created] (YUNIKORN-830) Node sorting need to support "preferredDuringScheduling" node affinity

2021-08-27 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-830:
-

 Summary: Node sorting need to support "preferredDuringScheduling" 
node affinity
 Key: YUNIKORN-830
 URL: https://issues.apache.org/jira/browse/YUNIKORN-830
 Project: Apache YuniKorn
  Issue Type: New Feature
  Components: core - scheduler
Reporter: Chenya Zhang
Assignee: Chenya Zhang


YuniKorn scheduler is able to allocate pods based on the node affinity policy 
"requiredDuringScheduling", but not "preferredDuringScheduling" yet.

YuniKorn currently does a full node sorting every time. After 
[https://github.com/apache/incubator-yunikorn-core/pull/307], we will do 
sorting incrementally with an ordered B-tree. It can make scheduling different 
pods to preferred nodes a bit complicated.

Discussed a potential solution with [~cheersyang] and [~yuchaoran2011]:
 * Parse the node labels in the shim and send them to the core over the SI (via 
node attributes).
 * Parse the pod node-affinity preference in the shim and send that to the core.
 * Implement the preference in GetSchedulableNodeIterator(). Today we directly 
retrieve nodes from the B-tree in ascending order. We would still loop over the 
nodes once, but keep two lists, one of them for preferred nodes. When we iterate 
nodes in the scheduling cycle, we iterate the preferred list first.
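
The two-list idea in the last bullet can be sketched roughly as follows. This
is an illustration under stated assumptions, not the code of
GetSchedulableNodeIterator(): the node type, splitByPreference, matches and
iterationOrder are hypothetical names, the input slice stands in for the nodes
already returned in sorted order, and a "preference" is simplified to a set of
labels that must all match.

```go
package main

import "fmt"

// node is a simplified, hypothetical stand-in for the core's node object.
type node struct {
	name   string
	labels map[string]string
}

// matches reports whether the node labels contain every preferred label.
// An empty preference matches nothing, so the sorted order is left untouched.
func matches(labels, want map[string]string) bool {
	if len(want) == 0 {
		return false
	}
	for k, v := range want {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// splitByPreference walks the already-sorted nodes once and keeps two lists:
// the preferred nodes and the rest. The relative order inside each list is
// the order of the input.
func splitByPreference(sorted []node, preferred map[string]string) (pref, rest []node) {
	for _, n := range sorted {
		if matches(n.labels, preferred) {
			pref = append(pref, n)
		} else {
			rest = append(rest, n)
		}
	}
	return pref, rest
}

// iterationOrder returns the node names in the order a scheduling cycle would
// visit them: preferred nodes first, then all remaining nodes.
func iterationOrder(sorted []node, preferred map[string]string) []string {
	pref, rest := splitByPreference(sorted, preferred)
	names := make([]string, 0, len(sorted))
	for _, n := range pref {
		names = append(names, n.name)
	}
	for _, n := range rest {
		names = append(names, n.name)
	}
	return names
}

func main() {
	sorted := []node{
		{name: "n1", labels: map[string]string{"zone": "a"}},
		{name: "n2", labels: map[string]string{"zone": "b"}},
		{name: "n3", labels: map[string]string{"zone": "b"}},
	}
	fmt.Println(iterationOrder(sorted, map[string]string{"zone": "b"}))
}
```

Because the split is a single pass over the sorted nodes, the sort order is
preserved within the preferred list and within the remainder.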

Adding [~wilfreds] [~kmarton] for more discussion too.






Re: [ANNOUNCE] Release v0.11.0 available for download

2021-08-26 Thread Chenya Zhang
Exciting and thanks Kinga!


On Wed, Aug 25, 2021 at 10:44 PM Weiwei Yang  wrote:

> Congrats!
> Thanks for making this happen Kinga!
>
> On Tue, Aug 24, 2021 at 1:54 AM Julia Kinga Marton
>  wrote:
>
> > Hi, All,
> >
> > I am happy to announce that we have now updated the website and the new
> > release v0.11.0 is available for download.
> >
> > The release announcement is located here:
> > https://yunikorn.apache.org/release-announce/0.11.0
> >
> > Download of the source code and convenience images is part of the
> downloads
> > page here:
> > https://yunikorn.apache.org/community/download
> >
> > The corresponding helm chart is in the Artifact HUB:
> > https://artifacthub.io/packages/helm/yunikorn/yunikorn
> >
> > Thank you all for your work on this release!
> >
> > Regards,
> > Kinga
> >
>


[jira] [Created] (YUNIKORN-828) Add YuniKorn core's queue-level capacity metrics

2021-08-25 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-828:
-

 Summary: Add YuniKorn core's queue-level capacity metrics
 Key: YUNIKORN-828
 URL: https://issues.apache.org/jira/browse/YUNIKORN-828
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: core - common
Reporter: Chenya Zhang
Assignee: Chenya Zhang


Queue-level capacity metrics are not implemented in code.

Users need to adjust the capacity threshold manually from any monitoring 
dashboard if their queue capacity changes.

It is also hard for users to evaluate and demonstrate historical usage trends; 
doing so takes a considerable amount of manual work.






Re: [VOTE] Release Apache YuniKorn (incubating) 0.11.0 RC2

2021-08-03 Thread Chenya Zhang
Thanks Kinga and everyone!

+1 approve

+1, I ran into the same issue that Wilfred mentioned.

Verified:
* Built images from the source
* Ran unit tests
* Ran the scheduler in K8s 1.17 and 1.20. Verified that regular scheduling
works and gang scheduling works for Spark.
* Verified UI and REST APIs

Best,
Chenya

On Mon, Aug 2, 2021 at 9:52 PM Wilfred Spiegelenburg 
wrote:

> +1 binding
>
> One remark: we cannot use the link to the JIRA issues resolved as given
> above. Any link to the "issues.apache/YUNIKORN/versions/..." requires
> you to log into jira.
> We should not expect people that are just looking at an announcement or
> using the software to authenticate to jira to see the changes.
> We should have a URL that does not require authentication like:
> https://issues.apache.org/jira/issues/?filter=12350521 as part of the
> message and website.
>
> Besides that remark:
> * checked signature and hash
> * checked all the licenses
> * built from scratch
> * ran the unit tests
> * started scheduler and web interface in a K8s cluster
>
> Wilfred
>
> On Tue, 3 Aug 2021 at 05:05, Julia Kinga Marton 
> wrote:
>
> > Hi all,
> >
> > Even though the first RC passed this vote here, the IPMC found some
> > licensing issues (missing license headers), which we fixed in this second
> > RC.
> >
> > I'd like to call a vote for the second release candidate for Apache
> > YuniKorn (incubating) 0.11.0 release.
> >
> > The release artifacts are uploaded to:
> > https://dist.apache.org/repos/dist/dev/incubator/yunikorn/0.11/
> >
> > My public key is located here:
> > https://dist.apache.org/repos/dist/release/incubator/yunikorn/KEYS
> >
> > The release has been tagged with "v0.11.0" in all our git repositories.
> >
> > The JIRA issues that have been resolved in this release can be found on
> > this link:
> > https://issues.apache.org/jira/projects/YUNIKORN/versions/12350025
> >
> > Please review and vote. The vote will be open for at least 72 hours and
> > closes on Thursday, 5 August 2021, 12:00 PST.
> >
> > [ ] +1 approve
> > [ ] +0 no opinion
> > [ ] -1 disapprove (and the reason why)
> >
> >
> > Thank you,
> > Kinga
> >
>


[jira] [Resolved] (YUNIKORN-714) Refactor YuniKorn core's scheduler metrics for application submission

2021-06-30 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang resolved YUNIKORN-714.
---
Resolution: Fixed

PR: https://github.com/apache/incubator-yunikorn-core/pull/281

> Refactor YuniKorn core's scheduler metrics for application submission
> -
>
> Key: YUNIKORN-714
> URL: https://issues.apache.org/jira/browse/YUNIKORN-714
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: core - common
>    Reporter: Chenya Zhang
>Assignee: Chenya Zhang
>Priority: Major
>  Labels: pull-request-available
>
> Defining "totalApplicationsAccepted" and "totalApplicationsRejected" to 
> initialize scheduler metrics is redundant in code.
>  * It can be combined into one "applicationSubmissions" metrics
>  * It can use a "prometheus.CounterVec" with different Prometheus labels






[jira] [Resolved] (YUNIKORN-713) Align YuniKorn core's scheduler metrics with queue metrics for total applications accepted

2021-06-30 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang resolved YUNIKORN-713.
---
Resolution: Fixed

PR: https://github.com/apache/incubator-yunikorn-core/pull/280

> Align YuniKorn core's scheduler metrics with queue metrics for total 
> applications accepted
> --
>
> Key: YUNIKORN-713
> URL: https://issues.apache.org/jira/browse/YUNIKORN-713
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: core - common
>Reporter: Chenya Zhang
>    Assignee: Chenya Zhang
>Priority: Minor
>  Labels: pull-request-available
>
> The metrics naming and operations are sometimes confusing in the code due to 
> not aligning with each other on similar concepts.
>  * Replace "totalApplicationsAdded" with "totalApplicationsAccepted"
>  * Update related metrics operation functions






[jira] [Created] (YUNIKORN-722) Refactor YuniKorn core's queue-level resource metrics

2021-06-21 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-722:
-

 Summary: Refactor YuniKorn core's queue-level resource metrics
 Key: YUNIKORN-722
 URL: https://issues.apache.org/jira/browse/YUNIKORN-722
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: core - common
Reporter: Chenya Zhang
Assignee: Chenya Zhang


To make YuniKorn core's queue resource related metrics and functions more 
meaningful:
 * Refactor and add related metrics operation functions
 * Use meaningful metrics naming and help messages 
 * Update in-line comments and documentation






[jira] [Created] (YUNIKORN-721) Refactor YuniKorn core's queue-level application metrics

2021-06-21 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-721:
-

 Summary: Refactor YuniKorn core's queue-level application metrics
 Key: YUNIKORN-721
 URL: https://issues.apache.org/jira/browse/YUNIKORN-721
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: core - common
Reporter: Chenya Zhang
Assignee: Chenya Zhang


To make YuniKorn core's queue "app_metrics" and related functions more 
meaningful:
 * Differentiate queue-level metrics from scheduler metrics, e.g. using 
"IncQueueTotalApplicationsAccepted"
 * Refactor related functions
 * Update in-line comments and documentation
 * Use meaningful metrics naming and help messages 






[jira] [Created] (YUNIKORN-720) Add and improve queue metrics throughout the scheduling cycle

2021-06-21 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-720:
-

 Summary: Add and improve queue metrics throughout the scheduling 
cycle
 Key: YUNIKORN-720
 URL: https://issues.apache.org/jira/browse/YUNIKORN-720
 Project: Apache YuniKorn
  Issue Type: Improvement
  Components: core - common
Reporter: Chenya Zhang
Assignee: Chenya Zhang


Add and improve queue metrics throughout the scheduling cycle






[jira] [Created] (YUNIKORN-719) Refactor YuniKorn core's scheduler metrics for sorting latency

2021-06-21 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-719:
-

 Summary: Refactor YuniKorn core's scheduler metrics for sorting 
latency
 Key: YUNIKORN-719
 URL: https://issues.apache.org/jira/browse/YUNIKORN-719
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: core - scheduler
Reporter: Chenya Zhang
Assignee: Chenya Zhang


Defining "nodeSortingLatency", "appSortingLatency", and "queueSortingLatency" 
to initialize scheduler metrics is redundant in code.
 * It can be combined into "sortingLatency"
 * It can use a "prometheus.CounterVec" with different Prometheus labels






[jira] [Created] (YUNIKORN-718) Redefine YuniKorn core's scheduler metrics names and help messages for usability and clarity

2021-06-21 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-718:
-

 Summary: Redefine YuniKorn core's scheduler metrics names and help 
messages for usability and clarity
 Key: YUNIKORN-718
 URL: https://issues.apache.org/jira/browse/YUNIKORN-718
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: core - scheduler
Reporter: Chenya Zhang
Assignee: Chenya Zhang


Some issues observed:
 * Metric names are not meaningful
 * Help messages lack some details
 * Prometheus labels are not consistent

These need to be improved for usability and clarity.






[jira] [Created] (YUNIKORN-717) Refactor YuniKorn core's scheduler metrics for node status

2021-06-21 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-717:
-

 Summary: Refactor YuniKorn core's scheduler metrics for node status
 Key: YUNIKORN-717
 URL: https://issues.apache.org/jira/browse/YUNIKORN-717
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: core - scheduler
Reporter: Chenya Zhang
Assignee: Chenya Zhang


Defining "totalNodeActive" and "totalNodeFailed" to initialize scheduler 
metrics is redundant in code.
 * It can be combined into "nodeStatus"
 * It can use a "prometheus.CounterVec" with different Prometheus labels






[jira] [Created] (YUNIKORN-716) Refactor YuniKorn core's scheduler metrics for application running and completed

2021-06-21 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-716:
-

 Summary: Refactor YuniKorn core's scheduler metrics for 
application running and completed
 Key: YUNIKORN-716
 URL: https://issues.apache.org/jira/browse/YUNIKORN-716
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: core - scheduler
Reporter: Chenya Zhang
Assignee: Chenya Zhang


Defining "totalApplicationRunning" and "totalApplicationCompleted" to 
initialize scheduler metrics is redundant in code.
 * It can be combined into one "application" metric
 * It can use a "prometheus.CounterVec" with different Prometheus labels






[jira] [Created] (YUNIKORN-715) Refactor YuniKorn core's scheduler metrics for container allocation

2021-06-21 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-715:
-

 Summary: Refactor YuniKorn core's scheduler metrics for container 
allocation
 Key: YUNIKORN-715
 URL: https://issues.apache.org/jira/browse/YUNIKORN-715
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: core - scheduler
Reporter: Chenya Zhang
Assignee: Chenya Zhang


Defining "allocatedContainers", "rejectedContainers", "schedulingErrors" and 
"releasedContainers" to initialize scheduler metrics is redundant in code.
 * It can be combined into "containerAllocation"
 * It can use a "prometheus.CounterVec" with different Prometheus labels
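
To illustrate the shape of the proposed refactoring, here is a stdlib-only
sketch in which one labeled counter replaces four separate counters. The
labeledCounter type is a hypothetical stand-in, not the Prometheus client: in
the real code the same role would presumably be played by a
prometheus.CounterVec created with prometheus.NewCounterVec and incremented
through WithLabelValues(...).Inc(). The label values "allocated", "rejected",
"error" and "released" are assumptions chosen for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothetical label values for the single "containerAllocation" metric.
const (
	stateAllocated = "allocated"
	stateRejected  = "rejected"
	stateError     = "error"
	stateReleased  = "released"
)

// labeledCounter is a stdlib-only stand-in for a counter vector: one metric
// with one time series per label value.
type labeledCounter struct {
	mu     sync.Mutex
	counts map[string]uint64
}

func newLabeledCounter() *labeledCounter {
	return &labeledCounter{counts: make(map[string]uint64)}
}

// Inc adds one to the series identified by the given state label value.
func (c *labeledCounter) Inc(state string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[state]++
}

// Value returns the current count for the given state label value.
func (c *labeledCounter) Value(state string) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[state]
}

func main() {
	// One metric object instead of allocatedContainers, rejectedContainers,
	// schedulingErrors and releasedContainers.
	containerAllocation := newLabeledCounter()
	containerAllocation.Inc(stateAllocated)
	containerAllocation.Inc(stateAllocated)
	containerAllocation.Inc(stateRejected)
	containerAllocation.Inc(stateReleased)

	for _, s := range []string{stateAllocated, stateRejected, stateError, stateReleased} {
		fmt.Println(s, containerAllocation.Value(s))
	}
}
```

With this shape, adding a new outcome means adding a label value rather than
defining and registering another metric.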






[jira] [Created] (YUNIKORN-714) Refactor YuniKorn core's scheduler metrics for application submission

2021-06-21 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-714:
-

 Summary: Refactor YuniKorn core's scheduler metrics for 
application submission
 Key: YUNIKORN-714
 URL: https://issues.apache.org/jira/browse/YUNIKORN-714
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: core - scheduler
Reporter: Chenya Zhang
Assignee: Chenya Zhang


Defining "totalApplicationsAccepted" and "totalApplicationsRejected" to 
initialize scheduler metrics is redundant in code.

It can be combined into "applicationSubmissions" using a 
"prometheus.CounterVec" and different Prometheus labels.






[jira] [Created] (YUNIKORN-713) Align YuniKorn core's scheduler metrics with queue metrics for total applications accepted

2021-06-21 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-713:
-

 Summary: Align YuniKorn core's scheduler metrics with queue 
metrics for total applications accepted
 Key: YUNIKORN-713
 URL: https://issues.apache.org/jira/browse/YUNIKORN-713
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: core - scheduler
Reporter: Chenya Zhang
Assignee: Chenya Zhang


Metric names and operations in the code are sometimes confusing because 
similar concepts are not named consistently.
 * Replace "totalApplicationsAdded" with "totalApplicationsAccepted"
 * Update related metrics operation functions






[jira] [Reopened] (YUNIKORN-3) Add scheduling metrics throughout the scheduling cycle

2021-06-21 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang reopened YUNIKORN-3:
-

Need more improvement on the scheduling metrics part. Creating subtasks.

> Add scheduling metrics throughout the scheduling cycle
> --
>
> Key: YUNIKORN-3
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - common
>Reporter: Wilfred Spiegelenburg
>Assignee: Chenya Zhang
>Priority: Critical
> Fix For: 0.11
>
>
> The current metrics collection is limited to a small number of collection 
> points.
> We need to add metric collection throughout the process.
> See PR89 for details.






[jira] [Closed] (YUNIKORN-647) Add new metrics to monitor pending applications: "long_pending_app"

2021-05-25 Thread Chenya Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenya Zhang closed YUNIKORN-647.
-
Resolution: Invalid

> Add new metrics to monitor pending applications: "long_pending_app"
> ---
>
> Key: YUNIKORN-647
> URL: https://issues.apache.org/jira/browse/YUNIKORN-647
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: core - common
>Reporter: Chenya Zhang
>Assignee: Chenya Zhang
>Priority: Major
>
> Based on our observation, if there is one application pending for more than a 
> threshold (e.g. 10 minutes), the scheduler is likely down.
> We would like to capture it for more timely alerting.






[jira] [Created] (YUNIKORN-647) Add new metrics to monitor pending applications: "long_pending_app"

2021-04-20 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-647:
-

 Summary: Add new metrics to monitor pending applications: 
"long_pending_app"
 Key: YUNIKORN-647
 URL: https://issues.apache.org/jira/browse/YUNIKORN-647
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: core - common
Reporter: Chenya Zhang
Assignee: Chenya Zhang
 Fix For: 0.11


Based on our observation, if there is one application pending for more than a 
threshold (e.g. 10 minutes), the scheduler is likely down.

We would like to capture it for more timely alerting.






[jira] [Created] (YUNIKORN-646) Metrics missing implementation: "allocating_latency_seconds"

2021-04-20 Thread Chenya Zhang (Jira)
Chenya Zhang created YUNIKORN-646:
-

 Summary: Metrics missing implementation: 
"allocating_latency_seconds"
 Key: YUNIKORN-646
 URL: https://issues.apache.org/jira/browse/YUNIKORN-646
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: core - common
Reporter: Chenya Zhang
 Fix For: 0.11


Observation: Container allocating latency stays at 0. The number of allocation 
attempts fluctuates normally.

Root cause analysis: This metric is either not fully implemented or its 
implementation was lost in recent releases. For example, 
{{ObserveSchedulingLatency()}} is currently not called when allocating 
containers. 
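
As a rough sketch of what the missing call site could look like, the Go snippet
below records the elapsed time of every allocation attempt. Everything in it is
hypothetical: latencyRecorder stands in for the metric behind
"allocating_latency_seconds", and tryAllocate stands in for the allocation
path. It does not reproduce the signature of ObserveSchedulingLatency() in the
core; it only shows the defer-based timing pattern.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// latencyRecorder is a hypothetical stand-in for a latency metric. It only
// keeps the number of observations and their total duration.
type latencyRecorder struct {
	mu    sync.Mutex
	count int
	total time.Duration
}

// Observe records one latency sample.
func (r *latencyRecorder) Observe(d time.Duration) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.count++
	r.total += d
}

// observeSince records the time elapsed since start. It is meant to be
// deferred at the top of the function being measured.
func (r *latencyRecorder) observeSince(start time.Time) {
	r.Observe(time.Since(start))
}

// tryAllocate is a placeholder for one allocation attempt. The deferred call
// evaluates time.Now() immediately, so the sample covers the whole attempt
// on every return path.
func tryAllocate(r *latencyRecorder, work func()) {
	defer r.observeSince(time.Now())
	work()
}

func main() {
	recorder := &latencyRecorder{}
	tryAllocate(recorder, func() { time.Sleep(5 * time.Millisecond) })
	tryAllocate(recorder, func() {})
	fmt.Printf("observations=%d total=%s\n", recorder.count, recorder.total)
}
```

If the observation is skipped on the allocation path, the recorder never
receives a sample, which would match the latency staying at 0 while the number
of attempts keeps changing.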


