We considered your suggested solution, but, unfortunately, it does not get rid of the race condition that exists between the time the scheduler is told how many licenses are available and when the job actually checks them out, during which time an external user can check out a license. When this happens and the job attempts to check out its required licenses, the FlexLM license server denies the job the requested licenses, at which point the job fails.
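To make the window concrete, here is a minimal toy sketch of the sequence described above. It is not the FlexLM API: `LicenseServer`, `available`, `checkout`, and `run_scenario` are hypothetical names invented for illustration, and the counts are made up.

```python
class LicenseServer:
    """Toy stand-in for a license server (hypothetical, not FlexLM)."""

    def __init__(self, total):
        self.total = total
        self.in_use = 0

    def available(self):
        # What a scheduler would learn by polling the server.
        return self.total - self.in_use

    def checkout(self, count):
        # Grant the request only if enough licenses are free right now.
        if count > self.available():
            return False
        self.in_use += count
        return True


def run_scenario():
    server = LicenseServer(total=4)
    needed = 4

    # Time of check: the scheduler sees 4 free licenses and starts the job.
    job_started = server.available() >= needed

    # In the gap, an external user (unknown to the scheduler) takes one.
    external_ok = server.checkout(1)

    # Time of use: the job finally asks for its licenses and is denied.
    job_ok = server.checkout(needed)
    return job_started, external_ok, job_ok


if __name__ == "__main__":
    print(run_scenario())
```

However often the scheduler polls, the gap between the check and the checkout never shrinks to zero, so the denial in the last step remains possible.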
No matter how fast one tries to keep the job scheduler updated, the race condition exists (especially if the job does not need the licenses until three hours after it started!). When a user's job aborts after sitting in the queue for two weeks, the user is very irate and does not care why the job did not get the licenses.

I have found no way around this race condition, which means Flexera Software would have to modify the FlexLM license manager to adopt a new two-step model, which, of course, would mean all the ISVs must modify their software products to use it. But the two-step model means users would need fewer licenses, which no ISV is willing to allow; hence, no ISV will adopt it, and Flexera Software has said it will not implement it due to "lack of demand".

The other problem with your suggested solution is that it does not scale.

Gary D. Brown

On Tue, Jul 2, 2013 at 7:22 PM, 曹宏嘉 <[email protected]> wrote:

> I thought about the reservation/commit model you mentioned. Indeed
> FlexLM has very simple support for license reservation by project. My idea
> is as follows:
>
> 1. In the vendor option file, reserve the number of licenses (features, in
> terms of FlexLM) configured in SLURM. For example, create a project name
> with a random string (so that a user cannot easily guess it and use it to
> check out licenses), and reserve the proper number of licenses. This ensures
> that SLURM has the configured licenses.
>
> 2. On job resource allocation, create a project name according to the job
> id, and reserve the allocated licenses to the project. The application
> must set the environment variable LM_PROJECT to the project name to check
> out licenses.
>
> 3. On job resource deallocation, the licenses reserved to the project of
> the job are taken back (the reservation in the vendor option file is
> deleted and lmreread is executed).
>
> This is not a very good approach because a user may cheat by guessing
> LM_PROJECT environment variables. It could work if the users are all well
> behaved.
>
> -----Original Message-----
> *From:* "Gary Brown" <[email protected]>
> *Sent:* 2013-07-03 00:06:15 (Wednesday)
> *To:* slurm-dev <[email protected]>
> *Cc:*
> *Subject:* [slurm-dev] Re: slurm integration with FlexLM license manager
>
>
> Three years ago I tried to work with Flexera Software (FlexLM) to resolve
> race conditions that arose between a scheduler and FlexLM because the
> FlexLM license manager was also serving licenses to external users; i.e.,
> the scheduler was not the only one trying to obtain licenses.
> I proposed a "reservation/commit" model similar to that used by the credit
> card industry to handle charges, where a retail establishment will obtain an
> "authorization" for a specific amount, which the credit card system
> "reserves" against a customer's credit limit, and then when the retail
> establishment "settles" the charge, the reserved amount is actually added
> to the customer's credit card balance and the "authorization" deleted.
> This would properly handle the situation where a scheduler "reserves"
> licenses through FlexLM and then a running job actually "checks out" the
> reserved licenses.
> Despite the company and product names, Flexera was completely inflexible
> and would not do anything in this direction since its customers, the
> Independent Software Vendors (ISVs), would actually sell fewer software
> licenses under this model, which is what users actually want, and
> Flexera's customers would take a very dim view of Flexera if it implemented
> this model. No logic (cloud model also needs this), cajoling, or begging
> would get Flexera to budge.
> I do not know if Flexera has done anything to resolve the issue of race
> conditions between when a scheduler tries to schedule licenses and when a
> job actually checks the licenses out, during which interval an external user
> checks out licenses unbeknownst to the scheduler, but I suspect they have
> done nothing.
> If anyone hears of anything different, I, for one, would be happy to know.
>
> Gary D. Brown
>
> On Tue, Jul 2, 2013 at 8:38 AM, David Bigagli <[email protected]> wrote:
>
>> Indeed, currently there is no integration between FlexLM and SLURM, but
>> some ideas are being passed around about what to do about it. I am one of
>> the original designers and developers of Platform License Scheduler.
>>
>> The item 1) you mentioned is certainly the first step, but consider that
>> even that may not be easy: just imagine an electronic design application
>> that is running in the cluster, with jobs checking in and out hundreds of
>> features per second. It is important to choose which features have to be
>> managed by the scheduler, and each has to be a 'well behaved' one, meaning
>> the behavior of the application from a license perspective has to be well
>> known. One of the difficulties is to understand how the application uses
>> the licenses, as you observed in item 2).
>>
>> The only way to get license information out of FlexLM is indeed lmstat,
>> which could be quite slow if the license servers are handling many
>> applications. There is no other supported interface; a possible
>> alternative could be parsing the lmgrd log file.
>>
>>
>> */David*
>>
>>
>> On Tue, Jul 2, 2013 at 2:57 PM, Hongjia Cao <[email protected]> wrote:
>>
>>>
>>> I don't think there is integration with FlexLM in SLURM. There is
>>> simple license management in SLURM by counting the licenses used.
>>>
>>> I am also considering the interaction between SLURM and FlexLM, but I
>>> have no good result yet. The difficulty is that FlexLM has no open API
>>> (except for a command line tool lmutil).
>>> And the functions provided by FlexLM are not enough for SLURM to totally
>>> control the licenses. For now, I think the following issues should be
>>> addressed:
>>>
>>> 1. Keep the license count in SLURM consistent with FlexLM. There may be
>>> applications run outside of SLURM which may check out licenses. And a job
>>> may request the wrong number of licenses (intentionally or
>>> unintentionally).
>>>
>>> 2. Force a job to release the licenses on job termination, even if there
>>> are job processes not killed. With LS-DYNA I have run into the case that
>>> after the application completes, the licenses are not released for a
>>> long time (even without job processes left). LS-DYNA is not
>>> using FlexLM for license control and I am not sure whether this could
>>> happen for FlexLM managed applications.
>>>
>>> To handle various applications and license managers, a license
>>> plug-in should be introduced. But the interface of the plug-in is not
>>> clear yet.
>>>
>>> I'd like to know if anyone has experience with SLURM integration with
>>> FlexLM or other license managers. Any requirements or considerations
>>> would also be welcomed.
>>>
>>>
>>> On Mon, 2013-07-01 at 17:17 -0700, Eva Hocks wrote:
>>> >
>>> >
>>> > The documentation announced the integration since 2.4. I am running
>>> > slurm 2.4.3.
>>> >
>>> > Could anyone please point me to where I can find how to configure the
>>> > FlexLM license manager integration with slurm?
>>> >
>>> >
>>> > Thanks
>>> > Eva
>>>
>>>
>>
>

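For readers following the thread, the "reservation/commit" (two-step) model proposed above can be sketched roughly as follows. This is only an illustration of the idea under stated assumptions: FlexLM does not offer this interface, and `TwoStepLicensePool`, `reserve`, `checkout`, `release`, and the holder names are all hypothetical.

```python
class TwoStepLicensePool:
    """Toy reserve-then-checkout license pool (hypothetical, not FlexLM)."""

    def __init__(self, total):
        self.total = total
        self.reserved = {}     # holder -> licenses reserved, not yet checked out
        self.checked_out = {}  # holder -> licenses actually checked out

    def free(self):
        # Licenses that are neither reserved nor checked out.
        held = sum(self.reserved.values()) + sum(self.checked_out.values())
        return self.total - held

    def reserve(self, holder, count):
        # Step 1: the scheduler sets licenses aside for a job.
        if count > self.free():
            return False
        self.reserved[holder] = self.reserved.get(holder, 0) + count
        return True

    def checkout(self, holder, count):
        # Step 2: draw from the holder's own reservation first, then from
        # the free pool. Holders with no reservation see only the free pool.
        held = self.reserved.get(holder, 0)
        from_reservation = min(held, count)
        if count - from_reservation > self.free():
            return False
        if from_reservation == held:
            self.reserved.pop(holder, None)
        else:
            self.reserved[holder] = held - from_reservation
        self.checked_out[holder] = self.checked_out.get(holder, 0) + count
        return True

    def release(self, holder):
        # Return everything the holder has, reserved or checked out.
        self.reserved.pop(holder, None)
        self.checked_out.pop(holder, None)


if __name__ == "__main__":
    pool = TwoStepLicensePool(total=4)
    print(pool.reserve("job-1", 4))      # scheduler reserves at allocation
    print(pool.checkout("external", 1))  # external user is refused
    print(pool.checkout("job-1", 4))     # job checks out hours later
```

Because the reservation is held by the license pool itself, the external request is refused at the time it is made, instead of the job being refused later.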