I agree that the solution performs poorly, since the option file must be
re-read each time a job allocates or deallocates licenses.  But
I don't see the race condition. Could you please explain it?


On Wed, 2013-07-03 at 10:49 -0700, Gary Brown wrote:
> We considered your suggested solution, but, unfortunately, it does not
> get rid of the race condition that exists between the time the
> scheduler is told how many licenses are available and when the job
> actually checks them out, during which time an external user can check
> out a license.  When this happens and the job attempts to check out
> its required licenses, the FlexLM license server denies the job the
> requested licenses, at which point the job fails.
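
The window described above can be illustrated with a small, deterministic simulation. This is only a sketch: `LicenseServer` is a toy stand-in invented for illustration, not the FlexLM API, and the counts (4 licenses, 1 external checkout) are arbitrary assumptions.

```python
class LicenseServer:
    """Toy license pool (hypothetical; not the FlexLM interface)."""

    def __init__(self, total):
        self.total = total
        self.in_use = 0

    def available(self):
        return self.total - self.in_use

    def checkout(self, count):
        """Grant `count` licenses if enough are free; return True/False."""
        if count > self.available():
            return False
        self.in_use += count
        return True


def run_scenario():
    server = LicenseServer(total=4)
    job_needs = 4

    # 1. The scheduler polls the server and decides the job can start.
    job_started = server.available() >= job_needs

    # 2. Before the job reaches its checkout call, an external user
    #    (unknown to the scheduler) checks out one license.
    external_ok = server.checkout(1)

    # 3. The job now attempts its own checkout and is denied.
    job_ok = server.checkout(job_needs)
    return job_started, external_ok, job_ok


if __name__ == "__main__":
    print(run_scenario())
```

However short the gap between steps 1 and 3 is made, step 2 can still fall inside it, which is the point made in the message.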
> 
> No matter how fast one tries to keep the job scheduler updated, the
> race condition exists (especially if the job does not need the
> licenses until three hours after it started!) and when the user's job
> aborts and the job was in the queue for two weeks, the user is very
> irate and does not care why the job did not get the licenses.
> 
> I have found no way around this race condition dilemma, which means
> Flexera Software would have to modify the FlexLM license manager to
> adopt a new two-step model, which, of course, would mean all the ISVs
> must modify their software products to use the two-step model.  But
> the two-step model means users would need fewer licenses, which no ISV
> is willing to allow; hence, no ISV will adopt the two-step model, and
> Flexera Software has said it will not implement the two-step model due
> to "lack of demand".
> 
> The other problem with your suggested solution is it does not scale.
> 
> Gary D. Brown
> 
> 
> 
> 
>  
> On Tue, Jul 2, 2013 at 7:22 PM, 曹宏嘉 <[email protected]> wrote:
>         I thought about the reservation/commit model you mentioned.
>         Indeed FlexLM has a very simple support of license reservation
>         by project. My idea is as follows:
>         
>         1. In the vendor option file, reserve the number of licenses
>         (features, in FlexLM terms) configured in SLURM. For
>         example, create a project name with a random string (so that a
>         user cannot easily guess it and use it to check out licenses), and
>         reserve the proper number of licenses. This ensures that SLURM
>         has the configured licenses.
>         
>         2. On job resource allocation, create a project name according
>         to the job id, and reserve the allocated licenses for the
>         project. The application must set the environment variable
>         LM_PROJECT to the project name to check out licenses.
>         
>         3. On job resource deallocation, the licenses reserved for the
>         project of the job are taken back (the reservation in the vendor
>         option file is deleted and lmreread is executed).
>         
>         This is not a very good approach because a user may cheat by
>         guessing the LM_PROJECT value of another job. It could work
>         if the users are all well behaved.
>         
>         
>                 -----Original Message-----
>                 From: "Gary Brown" <[email protected]>
>                 Sent: 2013-07-03 00:06:15 (Wednesday)
>                 To: slurm-dev <[email protected]>
>                 Cc: 
>                 Subject: [slurm-dev] Re: slurm integration with FlexLM
>                 license manager
>                 
>                 
>                 Three years ago I tried to work with Flexera Software
>                 (FlexLM) to resolve race conditions that arose between
>                 a scheduler and FlexLM because the FlexLM license
>                 manager was also serving licenses to external users;
>                 i.e., the scheduler was not the only one trying to
>                 obtain licenses.
>                 I proposed a "reservation/commit" model similar to
>                 that used by the credit card industry to handle
>                 charges where a retail establishment will obtain an
>                 "authorization" for a specific amount, which the
>                 credit card system "reserves" against a customer's
>                 credit limit, and then when the retail establishment
>                 "settles" the charge, the reserved amount is actually
>                 added to the customer's credit card balance and the
>                 "authorization" deleted.  This would properly handle
>                 the situation where a scheduler "reserves" licenses
>                 through FlexLM and then a running job actually "checks
>                 out" the reserved licenses.
>                 Despite the company and product names, Flexera was
>                 completely inflexible and would not do anything in
>                 this direction since its customers, the Independent
>                 Software Vendors (ISVs), would actually sell fewer
>                 software licenses under this model, which is what
>                 users actually want, and Flexera's customers would
>                 take a very dim view of Flexera if it implemented this
>                 model.  No logic (cloud model also needs this),
>                 cajoling, or begging would get Flexera to budge.
>                 I do not know if Flexera has done anything to resolve
>                 the issue of race conditions between when a scheduler
>                 tries to schedule licenses and when a job actually
>                 checks the licenses out during which interval an
>                 external user checks out licenses unbeknownst to the
>                 scheduler, but I suspect they have done nothing.
>                 If anyone hears of anything different, I, for one,
>                 would be happy to know.
>                  
>                 Gary D. Brown
>                 
>                 
>                 On Tue, Jul 2, 2013 at 8:38 AM, David Bigagli
>                 <[email protected]> wrote:
>                         Indeed, there is currently no integration
>                         between FlexLM and SLURM, but some ideas about
>                         what to do are being passed around. I am
>                         one of the original designers and developers
>                         of Platform License Scheduler.
>                         
>                         
>                         The item 1) you mentioned is certainly the
>                         first step, but consider that even that may not
>                         be easy: just imagine an electronic design
>                         application that is running in the cluster, with
>                         jobs checking hundreds of features in and out
>                         per second. It is important to choose which
>                         features have to be managed by the scheduler,
>                         and each has to be a 'well behaved' one, meaning
>                         the behavior of the application from a license
>                         perspective has to be well known. One of the
>                         difficulties is understanding how the
>                         application uses the licenses, as you observed
>                         in item 2).
>                         
>                         
>                         The only way to get license information out of
>                         FlexLM is indeed lmstat, which could be quite
>                         slow if the license servers are handling many
>                         applications. There is no other supported
>                         interface; a possible alternative could be
>                         parsing the lmgrd log file.
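
Pulling per-feature counts out of lmstat output might look like the sketch below. It assumes summary lines of the form `Users of <feature>:  (Total of N licenses issued;  Total of M licenses in use)`; the exact wording may differ between versions, and the sample text and feature names are made up for illustration rather than captured from a real server.

```python
import re

# Assumed shape of an lmstat per-feature summary line (not verified
# against any particular FlexLM release).
USAGE_RE = re.compile(
    r"Users of (?P<feature>\S+):\s+"
    r"\(Total of (?P<issued>\d+) licenses? issued;\s+"
    r"Total of (?P<in_use>\d+) licenses? in use\)"
)


def parse_lmstat(text):
    """Return {feature: (issued, in_use)} for every summary line found."""
    usage = {}
    for match in USAGE_RE.finditer(text):
        usage[match.group("feature")] = (
            int(match.group("issued")),
            int(match.group("in_use")),
        )
    return usage


# Invented sample input, used only to exercise the parser.
SAMPLE = """\
Users of solver:  (Total of 10 licenses issued;  Total of 3 licenses in use)
Users of mesher:  (Total of 1 license issued;  Total of 0 licenses in use)
"""

if __name__ == "__main__":
    for feature, (issued, in_use) in parse_lmstat(SAMPLE).items():
        print(feature, "free:", issued - in_use)
```

A scheduler polling this way still sees only a snapshot, so the counts can be stale by the time a job checks out its licenses.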
>                         
>                         
>                         
>                         
>                         /David
>                         
>                         
>                         
>                         On Tue, Jul 2, 2013 at 2:57 PM, Hongjia Cao
>                         <[email protected]> wrote:
>                                 
>                                 I don't think there is integration
>                                 with FlexLM in SLURM. SLURM has
>                                 simple license management that works by
>                                 counting the licenses used.
>                                 
>                                 I am also considering the interaction
>                                 between SLURM and FlexLM, but I
>                                 have no good results yet. The
>                                 difficulty is that FlexLM has no open
>                                 API (except for the command line tool
>                                 lmutil), and the functionality provided
>                                 by FlexLM is not enough for SLURM to
>                                 fully control the licenses. For
>                                 now, I think the following issues
>                                 should be addressed:
>                                 
>                                 1. Keep the license count in SLURM
>                                 consistent with FlexLM. There may be
>                                 applications run outside of SLURM that
>                                 check out licenses, and a job
>                                 may request the wrong number of licenses
>                                 (intentionally or unintentionally).
>                                 
>                                 2. Force a job to release the licenses
>                                 on job termination, even if there
>                                 are job processes not killed. With
>                                 LS-DYNA I have run into the case that
>                                 after the application completes, the
>                                 licenses are not released for
>                                 a long time (even with no job
>                                 processes left). LS-DYNA does not
>                                 use FlexLM for license control, and I
>                                 am not sure whether this could
>                                 happen for FlexLM managed
>                                 applications.
>                                 
>                                 To handle various applications and
>                                 license managers, a license
>                                 plug-in should be introduced. But the
>                                 interface of the plug-in is not
>                                 clear yet.
>                                 
>                                 I'd like to know if anyone has
>                                 experience with SLURM integration
>                                 with
>                                 FlexLM or other license managers. Any
>                                 requirements or considerations
>                                 would also be welcome.
>                                 
>                                 
>                                 On Mon, 2013-07-01 at 17:17 -0700, Eva
>                                 Hocks wrote:
>                                 >
>                                 >
>                                 > The documentation announced the
>                                 integration since 2.4. I am running
>                                 > slurm 2.4.3.
>                                 >
>                                 > Could anyone please point me to
>                                 where I can find how to configure the
>                                 > FlexLM license manager integration
>                                 with slurm?
>                                 >
>                                 >
>                                 > Thanks
>                                 > Eva
>                                 
>                         
>                         
>                 
>                 
> 
> 
