Hi,

On 06.03.2016 at 18:04, Yuri Burmachenko wrote:

> Hello to distinguished forum members,
>  
> Recently we have found that something is wrong with SGE Job IDs – they are 
> getting reset very fast: 6-7 times in a month.
> We do not run nearly that many jobs in such a short period of time.
>  
> We use JobId (via qacct) as a primary key for different home-made analytic 
> tools, and this very quick jobId switch impairs the reliability of the tools.
>  
> This started after a full electricity shutdown, during which we halted 
> all our systems, including the SGE master/shadow and its execution hosts.

To elaborate on this: when it suddenly jumps to 99999999, what was the 
highest JOB_ID recorded in the accounting file before that skip?
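
One way to answer that question is to scan the accounting file for points
where the job number drops sharply. The following is only a rough sketch,
not a tested tool. It assumes the colon-separated layout described in
accounting(5), with job_number as the 6th field and end_time as the 11th.
The function name and the min_drop threshold are made up for illustration.
Records are written when a job finishes, so long-running jobs that end
after a wrap can show up as extra, spurious resets.

```python
# Sketch: report apparent job-ID resets in an SGE accounting file.
# Assumed layout (accounting(5)): colon-separated records, with
# job_number at 0-based index 5 and end_time at 0-based index 10.

JOB_NUMBER_FIELD = 5
END_TIME_FIELD = 10


def find_job_id_resets(lines, min_drop=1000):
    """Return a list of (highest_id_before, first_id_after, end_time).

    A reset is reported when a job number is at least `min_drop` below
    the highest number seen since the previous reset. `min_drop` is a
    hypothetical tuning knob that ignores ordinary out-of-order records.
    """
    resets = []
    highest = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split(":")
        if len(fields) <= END_TIME_FIELD:
            continue  # skip truncated records
        try:
            job_id = int(fields[JOB_NUMBER_FIELD])
            end_time = int(fields[END_TIME_FIELD])
        except ValueError:
            continue  # skip records with non-numeric fields
        if highest is not None and highest - job_id >= min_drop:
            resets.append((highest, job_id, end_time))
            highest = job_id  # start tracking the new sequence
        elif highest is None or job_id > highest:
            highest = job_id
    return resets
```

It could be called as find_job_id_resets(open(path)), where path would
typically be $SGE_ROOT/default/common/accounting (an assumption about the
cell name and layout on your installation). The first value of each tuple
is the highest job number seen before the skip.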

-- Reuti


> Perhaps something sets $SGE_ROOT/default/spool/qmaster/jobseqnum to “9999999” 
> and then something (related or not) restarts SGE setting that jobid.
>  
> Any tips or advice on where to look for the root cause will be greatly 
> appreciated.
> Thank You.
>  
>  
>  
> Yuri Burmachenko | Sr. Engineer | IT | Mellanox Technologies Ltd.
> Work: +972 74 7236386 | Cell +972 54 7542188 |Fax: +972 4 959 3245
>  
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users

