On 07.12.2011 at 13:28, William Hay wrote:

> On 7 December 2011 11:24, Reuti <re...@staff.uni-marburg.de> wrote:
>> On 07.12.2011 at 09:33, William Hay wrote:
>> 
>>> On 6 December 2011 16:44, Reuti <re...@staff.uni-marburg.de> wrote:
>>>> On 06.12.2011 at 17:32, William Hay wrote:
>>>> 
>>>>> On 6 December 2011 16:03, Reuti <re...@staff.uni-marburg.de> wrote:
>>>>>> On 06.12.2011 at 17:01, William Hay wrote:
>>>>>> 
>>>>>>> On 6 December 2011 13:10, Reuti <re...@staff.uni-marburg.de> wrote:
>>>>>>>> On 06.12.2011 at 12:16, William Hay wrote:
>>>>>>>> 
>>>>>>>>> On 6 December 2011 09:48, Reuti <re...@staff.uni-marburg.de> wrote:
>>>>>>>>>> Hi,
>>>>>>>>>> 
>>>>>>>>>> On 06.12.2011 at 10:04, William Hay wrote:
>>>>>>>>>> 
>>>>>>>>>>> 128 > 2147483648
>>>>>>>>>>> 
>>>>>>>>>>> A user has submitted a job requesting 128 slots:
>>>>>>>>>>> qstat -j producing the following output:
>>>>>>>>>>> parallel environment:  qlc-[1ABCDEFGHIJTWKLMNOPX] range: 128
>>>>>>>> 
>>>>>>>> How are you submitting this? You can use wildcards, but not a regular
>>>>>>>> expression.
>>> Not according to the qsub manpage, which implies only a literal pe_name,
>>> i.e. wildcards are verboten.
>> 
>> Well, `man qsub` refers to `man sge_types` and there under "parallel_env"
>> wildcards are mentioned.
>> 
>> 
>>>>>>> <snip>
>>>> 
>>>> Whoa, then I wonder whether this is an undocumented feature, as it's not
>>>> possible for me to enter such a PE name on the command line in 6.2u5.
>>> I believe it is as I can't enter it at the qsub command line either.
>>> I fix it up in the jsv.  I believe you can qalter the PE to include
>>> square brackets though which works.
>> 
>> It behaves differently from `qsub`, i.e. with a different error message, but
>> in the end it doesn't work either.
>> 
>> $ qalter -pe "smp[abc]" 12 3590
>> denied: parallel environment "smp[abc]" does not exist
>> 
>> Sure, it should be rejected, not searched for.
>> 
>> 
>>>   The man page says the -pe option
>>> accepts a parallel_environment and the commentary implies that this is
>>> a pe_name.
>> 
>> In `man sge_types` it says "pe_name := object_name", which is defined as:
>> 
>>   object_name
>>       An object name is a sequence of up to 512 ASCII string characters
>>       except "\n", "\t", "\r", " ", "/", ":", "´", "\", "[", "]", "{", "}",
>>       "|", "(", ")", "@", "%", "," or the " character itself.
>> 
>> 
>>>  However while qsub enforces this (excluding square
>>> brackets) the scheduler interprets it as a wc_pe_name
>>> which permits them.
>> 
>> But you entered them with a trick; at the very least it should all work
>> consistently.
>> 
>> - a JSV allows you to enter a "wc_pe_name", which is not allowed for `qsub`
>>   or `qalter`
>> - `qalter` takes the [ or ] in the specified PE literally; it should be
>>   rejected as with `qsub`, or both should handle it in the same way
>> - the documentation should be more consistent. A "wc_pe_name" is defined in
>>   `man sge_types`, but nowhere is it allowed to be entered (besides the JSV
>>   trick)
>> - "wc_pe_name" allows a wildcard * anywhere (defined under "pattern"), but
>>   `qsub` and `qalter` allow it only at the end of the string.
>> 
>> 
>>>  The ? and * characters are valid in either
>>> pe_name and wc_pe_name but with different meanings (literal in the
>>> first, metacharacters in the second).
>> 
>> Where is it used literally? In both cases * is used to match zero or more
>> characters. But for a PE only at the end of the string.
> Well by implication when pe_name is equated to object type neither of
> which define a special meaning to any characters.  I read the -pe
> switch as taking a pe_name since it passes a parameter of that name to
> the JSV.

The JSV should test the parameter according to the constraints for a PE name, 
which is obviously not happening.
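
A minimal sketch of what such a check could look like, written in Python rather than against any real JSV binding. The function name `is_valid_object_name` is made up for illustration, the forbidden character set is copied from the `object_name` definition in `man sge_types` quoted above, and rejecting the empty string is my own assumption. It shows what one might expect to be enforced, not what Grid Engine does internally.

```python
# Hypothetical helper; not part of Grid Engine or its JSV libraries.
# Forbidden characters as listed for "object_name" in `man sge_types`.
FORBIDDEN = set('\n\t\r /:´\\[]{}|()@%,"')
MAX_LEN = 512


def is_valid_object_name(name: str) -> bool:
    """Return True if name satisfies the quoted object_name constraints."""
    # Assumption: an empty name is not a valid object name.
    if not name or len(name) > MAX_LEN:
        return False
    # The man page speaks of ASCII characters, so reject anything else.
    if not name.isascii():
        return False
    return not any(ch in FORBIDDEN for ch in name)


print(is_valid_object_name("smp"))                         # True
print(is_valid_object_name("smp[abc]"))                    # False
print(is_valid_object_name("qlc-[1ABCDEFGHIJTWKLMNOPX]"))  # False
```

Note that * and ? are not in the forbidden list, so a name like "smp*" passes this check; whether they are then treated literally or as wildcards is exactly the inconsistency discussed above.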


>>> However that isn't the problem.  The wildcard PE works for every other
>>> job.  The oddity is that this job doesn't get a reservation when other
>>> jobs with lower priority and less restrictive resource requirements
>>> (and similar wildcard PEs) do and that qalter claims these PEs have
>>> "only" 2 billion slots which is apparently not enough for a 128 slot
>>> job.  While I suppose it is legal for use of an undocumented feature
>>> to cause demons to fly out of my nose I doubt that is what is
>>> happening here.
>> 
>> Can you try to create a copy of the job with `qresub` and change the
>> resource requests of the copy, such as a time limit? Any change?
> It's not my job; as such I'll need to get the user's permission to
> impersonate them.  I tried resub as a manager and it complains that
> I'm not a member of the relevant project.

Hehe, at first I thought `qresub -w n <job_id>` would do it, but it's not an 
option to `qalter` and "-w n" has no effect for `qsub`. Looks like a different 
bug/feature. It should accept the job, as you can be added to the project later 
on. As it stands you can submit jobs, get removed from the project, and the jobs 
will still be scheduled. Membership is only tested at submission time, never 
again later on.

https://arc.liv.ac.uk/trac/SGE/ticket/1384

But it's a different issue.


>> The error happened already in the past, but the discussion threads closed
>> before they spotted the real cause:
>> 
>> http://arc.liv.ac.uk/pipermail/gridengine-users/2009-February/thread.html#23435
>> 
>> (search for 2147483648).
> Well there is a slight distinction.  That bug report appears to be
> about pe's with only -2147483648 slots available while mine is

Aha, yes, right. I missed that. The positive range should only go up to one 
less, i.e. 2147483647 for a signed 32 bit value. Or maybe you get a 64 bit 
output of an original 32 bit signed value, hence it's the same issue in the end.
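
To make the arithmetic concrete, here is a small Python sketch. It only illustrates one possible way a signed 32 bit value could end up being printed as 2147483648 (widening to 64 bit before negating). That this is what actually happens inside the scheduler is a guess on my side, not something confirmed by the source.

```python
import ctypes

INT32_MAX = 2**31 - 1   # 2147483647, largest signed 32 bit value
INT32_MIN = -2**31      # -2147483648, smallest signed 32 bit value

# 2147483648 does not fit: stored in a signed 32 bit integer it wraps around.
wrapped = ctypes.c_int32(2147483648).value
print(wrapped)                      # -2147483648

# Widened to 64 bit first, the negated minimum is representable again.
widened = -ctypes.c_int64(INT32_MIN).value
print(widened)                      # 2147483648

# The comparison quoted at the top of the thread is false either way.
print(128 > 2147483648)             # False
```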

-- Reuti


> complaining about a positive number of slots available which makes
> even less sense.
>> 
>> -- Reuti
>> 
> 


_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
