Developers at Bull have been discussing a SLURM license management project with SchedMD.

The first phase of this project would provide initial support for licenses for accounting purposes within SLURM. This would involve introducing new license tables in the accounting information. The current thought is to define a system_license_table along with a new cluster_license_table for each cluster. Initially these tables will be populated using sacctmgr. A database query for a cluster would return both the information specific to that cluster and the associated system license information for the licenses available to it. This approach makes license information available to assoc_mgr and slurmctld. It also lays the groundwork for managing licenses within associations, which would give a SLURM administrator the ability to restrict license usage by user or account.

Phase two of this development effort introduces the integration of SLURM with license managers. Developing this capability involves defining the interfaces to the license managers and the communication protocols between SLURM and the license managers. At this point, specific details of this effort have not been defined.

We are glad to see the interest the SLURM development community has recently expressed in this topic. If you have already begun developing modifications to SLURM license management, we are willing to collaborate in this effort. Please continue posting your ideas and concerns in this forum.
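As a rough illustration of the first-phase lookup described above, here is a minimal Python sketch. It is not the SLURM schema: the field names (`total`, `allowed`), the license and cluster names, and the function `licenses_for_cluster` are all hypothetical, chosen only to show a per-cluster query that also returns the associated system-wide license record.

```python
from dataclasses import dataclass


@dataclass
class SystemLicense:
    """One row of a hypothetical system_license_table."""
    name: str
    total: int  # assumed field: licenses that exist system-wide


@dataclass
class ClusterLicense:
    """One row of a hypothetical cluster_license_table."""
    cluster: str
    name: str
    allowed: int  # assumed field: licenses this cluster may use


def licenses_for_cluster(cluster, system_table, cluster_table):
    """Return (cluster row, system row) pairs for one cluster."""
    system_by_name = {lic.name: lic for lic in system_table}
    pairs = []
    for row in cluster_table:
        # Keep only this cluster's rows that have a matching system license.
        if row.cluster == cluster and row.name in system_by_name:
            pairs.append((row, system_by_name[row.name]))
    return pairs


# Illustrative data only; the names are made up.
system_table = [SystemLicense("feature_a", 100), SystemLicense("feature_b", 20)]
cluster_table = [
    ClusterLicense("alpha", "feature_a", 40),
    ClusterLicense("beta", "feature_a", 60),
    ClusterLicense("alpha", "feature_b", 20),
]
alpha_rows = licenses_for_cluster("alpha", system_table, cluster_table)
```

In this sketch a single call yields both the cluster-specific limits and the system-wide totals, which is the kind of combined view that would then be handed to components such as assoc_mgr and slurmctld.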
Best Regards,
Bill Brophy
Bull Worldwide Information Systems

-----Gary Brown <[email protected]> wrote: -----
To: "slurm-dev" <[email protected]>
From: Gary Brown <[email protected]>
Date: 07/05/2013 03:07PM
Subject: [slurm-dev] Re: slurm integration with FlexLM license manager

The first paragraph of the email to which you responded contains the race condition description. I will reproduce it here with some modifications to make it clearer.

We considered your suggested solution, but, unfortunately, it does not get rid of *the race condition that exists between the time the scheduler is told how many licenses are available and when the job actually checks them out, during which time external user(s) check out sufficient licenses that the job cannot receive the licenses it requested of the scheduler.* When this happens and the job attempts to check out its required licenses from FlexLM, the FlexLM license server denies the job the requested licenses, at which point the job fails.

Is that clearer?

Gary D. Brown

On Wed, Jul 3, 2013 at 5:13 PM, Hongjia Cao <[email protected]> wrote:

I agree that the solution performs poorly, since the option file must be re-read each time a job allocates or deallocates licenses. But I don't see the race condition. Could you please explain it?

On Wed, 2013-07-03 at 10:49 -0700, Gary Brown wrote:

We considered your suggested solution, but, unfortunately, it does not get rid of the race condition that exists between the time the scheduler is told how many licenses are available and when the job actually checks them out, during which time an external user can check out a license. When this happens and the job attempts to check out its required licenses, the FlexLM license server denies the job the requested licenses, at which point the job fails.

No matter how fast one tries to keep the job scheduler updated, the race condition exists (especially if the job does not need the licenses until three hours after it started!)
and when the user's job aborts after sitting in the queue for two weeks, the user is very irate and does not care why the job did not get the licenses.

I have found no way around this race condition dilemma, which means Flexera Software would have to modify the FlexLM license manager to adopt a new two-step model, which, of course, would mean all the ISVs must modify their software products to use the two-step model. But the two-step model means users would need fewer licenses, which no ISV is willing to allow; hence, no ISV will adopt the two-step model, and Flexera Software has said it will not implement the two-step model due to "lack of demand".

The other problem with your suggested solution is that it does not scale.

Gary D. Brown

On Tue, Jul 2, 2013 at 7:22 PM, 曹宏嘉 <[email protected]> wrote:

I thought about the reservation/commit model you mentioned. Indeed, FlexLM has very simple support for license reservation by project. My idea is as follows:

1. In the vendor option file, reserve the number of licenses (features, in FlexLM terms) configured in SLURM. For example, create a project name with a random string (so that a user cannot easily guess it and use it to check out licenses), and reserve the proper number of licenses. This ensures that SLURM has the configured licenses.

2. On job resource allocation, create a project name according to the job id, and reserve the allocated licenses to the project. The application must set the environment variable LM_PROJECT to the project name to check out licenses.

3. On job resource deallocation, the licenses reserved to the job's project are taken back (the reservation in the vendor option file is deleted and lmreread is executed).

This is not a very good approach because a user may cheat by guessing LM_PROJECT environment variables.
It could work if the users are all well behaved.

-----Original Message-----
From: "Gary Brown" <[email protected]>
Sent: 2013-07-03 00:06:15 (Wednesday)
To: slurm-dev <[email protected]>
Cc:
Subject: [slurm-dev] Re: slurm integration with FlexLM license manager

Three years ago I tried to work with Flexera Software (FlexLM) to resolve race conditions that arose between a scheduler and FlexLM because the FlexLM license manager was also serving licenses to external users; i.e., the scheduler was not the only one trying to obtain licenses.

I proposed a "reservation/commit" model similar to that used by the credit card industry to handle charges, where a retail establishment will obtain an "authorization" for a specific amount, which the credit card system "reserves" against a customer's credit limit, and then when the retail establishment "settles" the charge, the reserved amount is actually added to the customer's credit card balance and the "authorization" deleted. This would properly handle the situation where a scheduler "reserves" licenses through FlexLM and then a running job actually "checks out" the reserved licenses.

Despite the company and product names, Flexera was completely inflexible and would not do anything in this direction since its customers, the Independent Software Vendors (ISVs), would actually sell fewer software licenses under this model, which is what users actually want, and Flexera's customers would take a very dim view of Flexera if it implemented this model.
No logic (cloud model also needs this), cajoling, or begging would get Flexera to budge.

I do not know if Flexera has done anything to resolve the issue of race conditions between when a scheduler tries to schedule licenses and when a job actually checks the licenses out, during which interval an external user checks out licenses unbeknownst to the scheduler, but I suspect they have done nothing.

If anyone hears of anything different, I, for one, would be happy to know.

Gary D. Brown

On Tue, Jul 2, 2013 at 8:38 AM, David Bigagli <[email protected]> wrote:

Indeed, currently there is no integration between Flexlm and SLURM, but some ideas are being passed around about what to do about it. I am one of the original designers and developers of Platform License Scheduler.

The item 1) you mentioned is certainly the first step, but consider that even that may not be easy: just imagine an electronic design application that is running in the cluster, with jobs checking in and out hundreds of features per second. It is important to choose which features are to be managed by the scheduler, and each has to be a 'well behaved' one, meaning the behavior of the application from a license perspective has to be well known. One of the difficulties is to understand how the application uses the licenses, as you observed in item 2).

The only way to get license information out of Flexlm is indeed lmstat, which could be quite slow if the license servers are handling many applications. There is no other supported interface; a possible alternative could be parsing the lmgrd log file.

/David

On Tue, Jul 2, 2013 at 2:57 PM, Hongjia Cao <[email protected]> wrote:

I don't think there is integration with FlexLM in SLURM.
There is a simple form of license management in SLURM that works by counting the licenses used.

I am also considering the interaction between SLURM and FlexLM, but I have no good result yet. The difficulty is that FlexLM has no open API (except for a command line tool, lmutil). And the functionality provided by FlexLM is not enough for SLURM to fully control the licenses. For now, I think the following issues should be addressed:

1. Keep the license count in SLURM consistent with FlexLM. There may be applications run outside of SLURM which may check out licenses. And a job may request the wrong number of licenses (intentionally or unintentionally).

2. Force a job to release the licenses on job termination, even if there are job processes not killed. With LS-DYNA I have run into the case that after the application completes, the licenses are not released for a long time (even with no job processes left). LS-DYNA does not use FlexLM for license control and I am not sure whether this could happen for FlexLM managed applications.

To handle various applications and license managers, a license plug-in should be introduced. But the interface of the plug-in is not clear yet.

I'd like to know if anyone has experience with SLURM integration with FlexLM or other license managers. Any requirements or considerations would also be welcomed.

On Mon, 2013-07-01 at 17:17 -0700, Eva Hocks wrote:

The documentation announced the integration since 2.4. I am running slurm 2.4.3.

Could anyone please point me to where I can find how to configure the FlexLM license manager integration with slurm?

Thanks
Eva
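For readers following the thread, the race condition Gary Brown describes, and the two-step "reservation/commit" model he proposed, can be sketched with a toy model in Python. This is only an illustration under stated assumptions: `LicensePool` and its methods are hypothetical and do not correspond to any FlexLM or SLURM interface, and the thread itself notes that FlexLM does not offer such a two-step model.

```python
class LicensePool:
    """Toy stand-in for a license server; NOT FlexLM's real interface."""

    def __init__(self, total):
        self.total = total
        self.checked_out = 0
        self.reserved = {}  # holder -> licenses held but not yet checked out

    def available(self):
        return self.total - self.checked_out - sum(self.reserved.values())

    def checkout(self, count):
        """One-step checkout, as an external user or an unreserved job does."""
        if count > self.available():
            return False
        self.checked_out += count
        return True

    def reserve(self, holder, count):
        """Hypothetical step 1: hold licenses for a job at scheduling time."""
        if count > self.available():
            return False
        self.reserved[holder] = self.reserved.get(holder, 0) + count
        return True

    def commit(self, holder):
        """Hypothetical step 2: turn the held licenses into a real checkout."""
        count = self.reserved.pop(holder, 0)
        self.checked_out += count
        return count > 0


def run_without_reservation():
    pool = LicensePool(total=4)
    scheduler_view = pool.available()     # scheduler is told 4 are free
    job_started = scheduler_view >= 3     # so a job needing 3 is started
    pool.checkout(2)                      # external user takes 2 in the meantime
    job_got_licenses = pool.checkout(3)   # the job's later checkout is denied
    return job_started, job_got_licenses


def run_with_reservation():
    pool = LicensePool(total=4)
    job_started = pool.reserve("job-1", 3)    # scheduler reserves at start time
    external_ok = pool.checkout(2)            # external user is denied: 1 free
    job_got_licenses = pool.commit("job-1")   # the job's checkout succeeds
    return job_started, external_ok, job_got_licenses
```

In the first function the scheduler's view of free licenses is stale by the time the job checks out, so the job fails. In the second, the external request is the one refused, because the licenses were held between scheduling and checkout.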
