Hi,

I don't keep track of the bugs fixed by other forks (Univa Grid Engine
or Son of Grid Engine), and thus I can't provide support for other
forks (even for my paying clients). One reason is that Open Grid
Scheduler is independent of other efforts and thus we should not be
copycats and dig into their changes, and the other reason is that it
is just too much work if I need to maintain yet another fork.

I believe Univa's "ge-8.0.0alpha*" went out in 2011, so you can go to
its changelog and see if it included the vmem bug fix.

- For Open Grid Scheduler, the memory bug was fixed almost a year ago:

    RH-2010-09-09-0: Bugfix:      vmem error workaround

- And coding for the hwloc enhancement was mostly done in early April.
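
For intuition about the "wraps at 4GB" symptom reported further down in
this thread, here is a minimal Python sketch. It assumes the recorded
maxvmem is simply the true byte count truncated to an unsigned 32-bit
value. That is an assumption for illustration only, not a description of
the actual execd code, and the helper names (wrap32, candidates) are
hypothetical. Under this assumption no recorded value could reach 4GB,
so readings between 4GB and 8GB would suggest a different kind of
truncation.

```python
GIB = 1024 ** 3          # bytes in one GiB
WRAP = 1 << 32           # an unsigned 32-bit counter wraps at 4 GiB


def wrap32(nbytes):
    """Value an unsigned 32-bit field would hold for a true byte count.

    Assumption: plain truncation modulo 2**32 (hypothetical model).
    """
    return nbytes % WRAP


def candidates(reported, max_gib=32):
    """List the true sizes (up to max_gib GiB) that would wrap to `reported`."""
    limit = max_gib * GIB
    sizes = []
    value = reported
    while value <= limit:
        sizes.append(value)
        value += WRAP
    return sizes


if __name__ == "__main__":
    true_size = 15 * GIB
    seen = wrap32(true_size)
    print("true size:", true_size // GIB, "GiB; recorded:", seen // GIB, "GiB")
    print("possible true sizes (GiB):",
          [v // GIB for v in candidates(seen, max_gib=16)])
```

If the largest recorded values cluster just below a power-of-two
boundary while the jobs are known to use more, truncation of this sort
is a plausible explanation, but only the changelog or the source of the
installed execd can confirm it.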

It is a bit hard for me to go back to code that I worked on a long
time ago. While there are fewer changes going into the Open Grid
Scheduler, most of the fixed bugs were encountered by our real users,
and we know what we need to fix and what is important to add.

Rayson



On Mon, Aug 8, 2011 at 1:33 PM, William Deegan
<[email protected]> wrote:
> Rayson,
>
> I initially installed this grid from the binaries here:
> http://bioteam.net/dag/gridengine-courtesy-binaries/
>
> The ge-8.0.0alpha*
>
> How can I tell definitively if I have this problem on my install?
> I am seeing some maxvmem values over 4GB. But none over 8GB, which is not
> expected.
> I would expect to see some at 15GB and higher.
>
> Does this stat get the max for the process tree launched by gridengine?
>
> Thanks,
> Bill
>
>
> On 8/2/2011 9:58 AM, Rayson Ho wrote:
>>
>> It's a bug introduced by another bug fix in SGE 6.2u5, and Oracle was
>> the first to fix the bug in Oracle Grid Engine. Then we added a
>> workaround in SGE 6.2u5p1 in Open Grid Scheduler, and Son of Grid
>> Engine copied it. I think Univa also fixed the bug at some point, as
>> the fix was copied by Son of Grid Engine (and dropped the workaround).
>> OGS will just stick with the workaround as we don't like the
>> workaround or the fix...
>>
>> You will just need to upgrade your SGE 6.2u5 cluster with a patched
>> SGE execd - either compile execd yourself or in fact you can get it
>> from the hwloc drop-in upgrade package:
>>
>> http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html
>>
>> Rayson
>>
>>
>> On Tue, Aug 2, 2011 at 8:15 AM, Jesse Becker <[email protected]> wrote:
>>>
>>> On Mon, Aug 01, 2011 at 07:41:41PM -0400, William Deegan wrote:
>>>>
>>>> Should the maxvmem column in the accounting file be the true max memory
>>>> footprint of the running process? (and children?)
>>>
>>> I've seen problems with 6.2u5 in the accounting records.  It appears to
>>> "wrap" at 4GB, which probably indicates a 32/64 bit issue.  I think
>>> there's information about it in the mailing list.
>>>
>>> I'm not sure about child processes.
>>>
>>> --
>>>
>
>

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
