Please keep the list posted.

On 08.04.2013 at 21:06, Ehud Barnea wrote:

> These 2 lines are there twice probably because of a mistake on my part. I 
> copied them incorrectly (they come just after the first "page" when 
> redirecting the output of the command to a txt file and opening it with nano).
> 
> The jobs were created by themselves. The output I gave is of 3 jobs that I 
> did not create. The 3rd one was created right after the first 2 finished (and 
> it was created along with another 30 jobs).
> The job IDs increased one by one, as they usually do when submitting 
> jobs. No job ID appeared twice.
> The only commands I ran were
> qsub (several times)
> qalter (no wrapper, just qalter -q <job id>)

`qresub` is a symbolic link to `qalter`. If you can reproduce this, it looks 
like `qalter` is misinterpreting the intended attribute change. Which exact 
version of SGE are you using?

But: if the job had been resubmitted by accident, no "version:" line would 
show up at all, because the job was never modified.
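
To compare the records mechanically, something like the following might help. 
It is only a sketch under assumptions: it expects the `qstat -j` output saved 
as text in the layout quoted in this thread, the names `parse_qstat_j` and 
`summarize` are made up for illustration, and the second record in `SAMPLE` is 
deliberately trimmed (its "version:" line removed) to show the no-version case.

```python
import re

# A record separator is a line consisting only of "=" characters.
SEPARATOR = re.compile(r"^=+\s*$")
# A field line looks like "key:   value"; the key may contain "-" or spaces
# (e.g. "job-array tasks"). The value may be empty.
FIELD = re.compile(r"^([A-Za-z_][\w\- ]*?):\s*(.*)$")


def parse_qstat_j(text):
    """Split saved `qstat -j` style text into one dict per job record."""
    jobs = []
    current = None
    for line in text.splitlines():
        if SEPARATOR.match(line):
            current = None  # the next field line starts a new record
            continue
        match = FIELD.match(line)
        if match is None:
            continue  # blanks and lines that are not "key: value" are skipped
        if current is None:
            current = {"fields": {}, "repeated": []}
            jobs.append(current)
        key, value = match.group(1).strip(), match.group(2).strip()
        if key in current["fields"]:
            current["repeated"].append(key)  # key seen twice in one record
        else:
            current["fields"][key] = value
    return jobs


def summarize(job):
    """Pick out the fields discussed in this thread."""
    fields = job["fields"]
    return {
        "job_number": fields.get("job_number"),
        "submission_time": fields.get("submission_time"),
        "version": fields.get("version"),  # None if no "version:" line exists
        "repeated": job["repeated"],
    }


# Shortened sample; the second record is trimmed for illustration only.
SAMPLE = """\
==============================================================
job_number:                 8698176
submission_time:            Mon Apr  8 15:55:10 2013
stderr_path_list:           logs
mail_list:                  barneaeh@sge01
stderr_path_list:           logs
mail_list:                  barneaeh@sge01
version:                    1
job-array tasks:            1-246:1
==============================================================
job_number:                 8698182
submission_time:            Mon Apr  8 16:00:28 2013
job-array tasks:            1-246:1
"""

if __name__ == "__main__":
    for job in parse_qstat_j(SAMPLE):
        print(summarize(job))
```

A record whose summary has `version` set to `None` was never modified, while a 
non-empty `repeated` list points at lines that occur twice within one record.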

-- Reuti

PS: Was "/fastspace/users/barneaeh/Dropbox/SGE/srv/6/evaluat$" edited, or 
does it really end in a "$"?


> On Mon, Apr 8, 2013 at 6:22 PM, Reuti <[email protected]> wrote:
> Hi,
> 
> On 08.04.2013 at 15:11, Ehud Barnea wrote:
> 
> > Thanks for looking at it. I wasn't sure whether to submit to the users 
> > group or just you.
> > Anyway, I ran the same thing again and the problem occurred again.
> > The job that I moved (with qalter) finished quickly so I couldn't check 
> > its version, but when it finished it spawned another 3 new jobs (with new 
> > job ids and all of them with version 1).
> > After these 3 finished another 30 jobs were spawned (also with a different 
> > job id and all with version 1).
> 
> Was the job number increased directly from the original job id, so that you 
> ended up with the same job id twice in the system, or were new ones created 
> after any later-submitted job? Is there a `qalter` wrapper in the way? And 
> did you edit the output below, or is:
> 
> > stderr_path_list:           logs
> > mail_list:                  barneaeh@sge01
> > stderr_path_list:           logs
> > mail_list:                  barneaeh@sge01
> 
> really there twice?
> 
> -- Reuti
> 
> 
> > After doing qalter the only commands I ran were qstat. Also, at first I 
> > created 5 job arrays and only did qalter on 1 of them, so I am certain that 
> > I did not accidentally create all these jobs.
> >
> > I supply here the output of qstat -j. The first 2 belong to the first batch 
> > of 3 jobs and the last one is from one of the 30 jobs that spawned later:
> > (it probably doesn't matter, but it sits in a Dropbox folder; Dropbox 
> > isn't active, so it's just a normal folder)
> >
> > ==============================================================
> > job_number:                 8698176
> > exec_file:                  job_scripts/8698176
> > submission_time:            Mon Apr  8 15:55:10 2013
> > owner:                      barneaeh
> > uid:                        52647
> > group:                      obs
> > gid:                        1009
> > sge_o_home:                 /storage/users/barneaeh
> > sge_o_log_name:             barneaeh
> > sge_o_path:                 /fastspace/users/barneaeh/PCL-1.6.0/bin:/fastspace/$
> > sge_o_shell:                /bin/tcsh
> > sge_o_workdir:              /fastspace/users/barneaeh/Dropbox/SGE/srv/6/evaluat$
> > sge_o_host:                 sge01
> > account:                    sge
> > cwd:                        /fastspace/users/barneaeh/Dropbox/SGE/srv/6/evaluat$
> > path_aliases:               /tmp_mnt/ * * /
> > stderr_path_list:           logs
> > mail_list:                  barneaeh@sge01
> > stderr_path_list:           logs
> > mail_list:                  barneaeh@sge01
> > notify:                     FALSE
> > job_name:                   run.sh
> > stdout_path_list:           logs
> > jobshare:                   0
> > hard_queue_list:            intel_all.q
> > shell_list:                 /bin/sh
> > env_list:
> > script_file:                run.sh
> > version:                    1
> > job-array tasks:            1-246:1
> > ==============================================================
> > job_number:                 8698175
> > exec_file:                  job_scripts/8698175
> > submission_time:            Mon Apr  8 15:55:10 2013
> > owner:                      barneaeh
> > uid:                        52647
> > group:                      obs
> > gid:                        1009
> > sge_o_home:                 /storage/users/barneaeh
> > sge_o_log_name:             barneaeh
> > sge_o_path:                 /fastspace/users/barneaeh/PCL-1.6.0/bin:/fastspace/$
> > sge_o_shell:                /bin/tcsh
> > sge_o_workdir:              /fastspace/users/barneaeh/Dropbox/SGE/srv/6/evaluat$
> > sge_o_host:                 sge01
> > account:                    sge
> > cwd:                        /fastspace/users/barneaeh/Dropbox/SGE/srv/6/evaluat$
> > path_aliases:               /tmp_mnt/ * * /
> > stderr_path_list:           logs
> > mail_list:                  barneaeh@sge01
> > stderr_path_list:           logs
> > mail_list:                  barneaeh@sge01
> > notify:                     FALSE
> > job_name:                   run.sh
> > stdout_path_list:           logs
> > jobshare:                   0
> > hard_queue_list:            intel_all.q
> > shell_list:                 /bin/sh
> > env_list:
> > script_file:                run.sh
> > version:                    1
> > job-array tasks:            1-246:1
> > ==============================================================
> > job_number:                 8698182
> > exec_file:                  job_scripts/8698182
> > submission_time:            Mon Apr  8 16:00:28 2013
> > owner:                      barneaeh
> > uid:                        52647
> > group:                      obs
> > gid:                        1009
> > sge_o_home:                 /storage/users/barneaeh
> > sge_o_log_name:             barneaeh
> > sge_o_path:                 /fastspace/users/barneaeh/PCL-1.6.0/bin:/fastspace/$
> > sge_o_shell:                /bin/tcsh
> > sge_o_workdir:              /fastspace/users/barneaeh/Dropbox/SGE/srv/6/evaluat$
> > sge_o_host:                 sge01
> > account:                    sge
> > cwd:                        /fastspace/users/barneaeh/Dropbox/SGE/srv/6/evaluat$
> > path_aliases:               /tmp_mnt/ * * /
> > stderr_path_list:           logs
> > mail_list:                  barneaeh@sge01
> > stderr_path_list:           logs
> > mail_list:                  barneaeh@sge01
> > notify:                     FALSE
> > job_name:                   run.sh
> > stdout_path_list:           logs
> > jobshare:                   0
> > hard_queue_list:            intel_all.q
> > shell_list:                 /bin/sh
> > env_list:
> > script_file:                run.sh
> > version:                    1
> > job-array tasks:            1-246:1
> >
> >
> > On Mon, Apr 8, 2013 at 3:43 PM, Reuti <[email protected]> wrote:
> > On 08.04.2013 at 10:21, Semi wrote:
> >
> > > Any ideas about this user's question?
> > >
> > > I am working with job arrays and encountered something weird. At first I 
> > > sent several job arrays to obs.q. Then I took one of the job arrays (that 
> > > hadn't started executing any task) and sent it to intel_all.q (using 
> > > qalter).
> > > After that the specific tasks started running, however, the same jobArray 
> > > was duplicated about 20-30 times. Now qstat shows me:
> > > 8698022 0.50500 run.sh     barneaeh     qw    04/08/2013 10:25:23     1 1-245:1
> > > 8698023 0.50500 run.sh     barneaeh     qw    04/08/2013 10:25:23     1 1-245:1
> > > 8698024 0.50500 run.sh     barneaeh     qw    04/08/2013 10:25:23     1 1-245:1
> > > 8698025 0.50500 run.sh     barneaeh     qw    04/08/2013 10:25:23     1 1-245:1
> > > 8698026 0.50500 run.sh     barneaeh     qw    04/08/2013 10:25:23     1 1-245:1
> > > 8698027 0.50500 run.sh     barneaeh     qw    04/08/2013 10:25:23     1 1-245:1
> > > 8698028 0.50500 run.sh     barneaeh     qw    04/08/2013 10:25:24     1 1-245:1
> > > 8698030 0.50500 run.sh     barneaeh     qw    04/08/2013 10:25:27     1 1-245:1
> > > 8698031 0.50500 run.sh     barneaeh     qw    04/08/2013 10:25:27     1 1-245:1
> > > 8698032 0.50500 run.sh     barneaeh     qw    04/08/2013 10:25:27     1 1-245:1
> >
> > `qalter` doesn't change the submission time; it only increases the value of 
> > "version:" in `qstat -j <job_id>`. But the jobs above have different 
> > submission times. Was this the time `qalter` was issued?
> >
> > Do all of them show "version: 1" in `qstat -j <job_id>`?
> >
> > -- Reuti
> >
> >
> > > This jobArray was the only one with 245 tasks, so I am sure it was 
> > > duplicated...
> > > Now it seems that the job array is run again and again.
> > >
> > > _______________________________________________
> > > users mailing list
> > > [email protected]
> > > https://gridengine.org/mailman/listinfo/users
> >
> >
> 
> 

