On 15.07.2011 at 22:32, [email protected] wrote:

> We're running SGE 6.2u5 (Sun) under Linux (CentOS5.6 x86_64) and we're having
> a very odd problem when specifying an h_vmem value.
> 
> We use environment modules[1] to load default settings and allow the selection
> of different packages. Only when h_vmem is specified, we get errors very early
> in the environment modules initialization.
> 
> The actual h_vmem value and the job are immaterial. For example, if the
> job is a shell script that consists of the command "date", it succeeds
> when h_vmem is not set, but fails when h_vmem is set to 1G (or 15G).
> 
> SGE correctly dispatches the job to a compute node, and the job runs--but
> the environment isn't initialized correctly. For something as trivial as
> "date", the job works, but for scientific processing that depends on
> the environment modules initialization, jobs fail.
> 
> The first part of the STDERR is very odd; it is an error message like:
> 
>       id: cannot find name for group ID 40193

SGE assigns an additional group ID to each job to track its usage. It looks 
like the definition of a particular module's settings tries to resolve all the 
group IDs it finds attached to the process.

Does this error appear only for failing jobs, or for all of them? (I would 
assume the latter.)
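
To check this from inside a job, something like the following could be run
both with and without h_vmem. It is only a sketch: it lists the group IDs
attached to the process and reports those that have no entry in the group
database, which is the condition `id` complains about. The helper name
`unresolved_gids` is made up for this example and is not part of SGE or of
environment modules.

```python
import grp
import os


def unresolved_gids(gids, lookup=grp.getgrgid):
    """Return the GIDs in `gids` for which `lookup` finds no group entry."""
    missing = []
    for gid in gids:
        try:
            lookup(gid)
        except KeyError:
            # grp.getgrgid raises KeyError when the GID has no name
            missing.append(gid)
    return missing


if __name__ == "__main__":
    # Primary GID plus all supplementary GIDs of this process
    gids = sorted(set(os.getgroups()) | {os.getgid()})
    print("group IDs attached to this process:", gids)
    print("group IDs without a name:", unresolved_gids(gids))
```

If the additional group ID shows up as unnamed in both cases, the `id`
message is unrelated to h_vmem and the real failure is elsewhere.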

==

If h_vmem is set, some applications need h_stack set too. Often a value of 
128M or 256M is enough to let the application run again.
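
To see which limits the job actually runs under, a small script like the one
below could be submitted once without h_vmem, once with it, and once with both
(for example with `-l h_vmem=1G,h_stack=128M`). This is a sketch that only
prints the soft and hard limits for address space and stack as the process
sees them; the helper `format_limit` is a name invented for this example.

```python
import resource


def format_limit(value):
    """Render an rlimit value, showing RLIM_INFINITY as 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return str(value)


if __name__ == "__main__":
    # RLIMIT_AS: address space (virtual memory); RLIMIT_STACK: stack size
    for name in ("RLIMIT_AS", "RLIMIT_STACK"):
        soft, hard = resource.getrlimit(getattr(resource, name))
        print(name, "soft =", format_limit(soft), "hard =", format_limit(hard))
```

Comparing the output of the three runs should show whether the stack limit
changes when h_vmem is requested.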

-- Reuti


> where the group ID number varies, always above 40000. That's the group_id
> range assigned in the qmaster.
> 
> None of the shell initialization scripts use the GID, and they all succeed
> when "h_vmem" is not specified.
> 
> 
> Any suggestions of the next step in troubleshooting this issue?
> 
> Thanks,
> 
> Mark
> 
>       [1] http://modules.sourceforge.net/
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users

