On 09.04.2013 at 08:54, Semi wrote:

> SGE version 6.08

Well, this is too old by far to make any reliable statement about it. I don't recall hearing about a problem with SGE in this respect. Can you try it with a test installation of a newer version of SGE to see whether it was fixed?

-- Reuti

> On 08-Apr-13 23:52, Reuti wrote:
>> Please keep the list posted.
>>
>> On 08.04.2013 at 21:06, Ehud Barnea wrote:
>>
>>> These 2 lines are there twice probably because of a mistake on my behalf. I
>>> copied them incorrectly (they are just after the first "page", when
>>> redirecting the output of the command to a txt file and opening it with
>>> nano).
>>>
>>> The jobs were created by themselves. The output I gave is of 3 jobs that I
>>> did not create. The 3rd one was created right after the first 2 finished
>>> (and it was created along with another 30 jobs).
>>> The job IDs were increased one by one, as they usually do when submitting
>>> jobs. Job IDs did not appear twice.
>>> The only commands I ran were
>>> qsub (several times)
>>> qalter (no wrapper, just qalter -q <job id>)
>>
>> `qresub` is a symbolic link to `qalter`. If you can reproduce this, it seems
>> to be a wrong interpretation of the intended change of attributes by
>> `qalter`. Which version of SGE are you using in detail?
>>
>> But: if the job is resubmitted by accident, then there is no "version:"
>> showing up at all, as it wasn't modified.
>>
>> -- Reuti
>>
>> PS: The "/fastspace/users/barneaeh/Dropbox/SGE/srv/6/evaluat$" is edited, or
>> does it really end in a "$"?
>>
>>
>>> On Mon, Apr 8, 2013 at 6:22 PM, Reuti <[email protected]> wrote:
>>> Hi,
>>>
>>> On 08.04.2013 at 15:11, Ehud Barnea wrote:
>>>
>>> > Thanks for looking at it. I wasn't sure whether to submit to the users
>>> > group or just you.
>>> > Anyway, I ran the same thing again and the problem occurred again.
>>> > The job that I moved (with qalter) finished quickly so I couldn't check
>>> > its version, but when it finished it spawned another 3 new jobs (with
>>> > new job ids and all of them with version 1).
>>> > After these 3 finished another 30 jobs were spawned (also with a
>>> > different job id and all with version 1).
>>>
>>> Was the job number increased directly from the original job id and you
>>> ended up in having the same job id twice in the system, or were new ones
>>> created after any later submitted job? Any `qalter`-wrapper in the way? You
>>> edited the output below, or is:
>>>
>>> > stderr_path_list: logs
>>> > mail_list: barneaeh@sge01
>>> > stderr_path_list: logs
>>> > mail_list: barneaeh@sge01
>>>
>>> really there twice?
>>>
>>> -- Reuti
>>>
>>>
>>> > After doing qalter the only commands I ran were qstat, also at first I
>>> > created 5 job arrays and only did qalter on 1 of them, so I am certain
>>> > that I did not accidentally create all these jobs.
>>> >
>>> > I supply here the output of qstat -j.
>>> > The first 2 belong to the first
>>> > batch of 3 jobs and the last one is of one of the 30 jobs that spawned
>>> > later:
>>> > (it probably doesn't matter, but it sits on a Dropbox folder, but dropbox
>>> > isn't active, so it's just a normal folder)
>>> >
>>> > ==============================================================
>>> > job_number: 8698176
>>> > exec_file: job_scripts/8698176
>>> > submission_time: Mon Apr 8 15:55:10 2013
>>> > owner: barneaeh
>>> > uid: 52647
>>> > group: obs
>>> > gid: 1009
>>> > sge_o_home: /storage/users/barneaeh
>>> > sge_o_log_name: barneaeh
>>> > sge_o_path:
>>> > /fastspace/users/barneaeh/PCL-1.6.0/bin:/fastspace/$
>>> > sge_o_shell: /bin/tcsh
>>> > sge_o_workdir:
>>> > /fastspace/users/barneaeh/Dropbox/SGE/srv/6/evaluat$
>>> > sge_o_host: sge01
>>> > account: sge
>>> > cwd:
>>> > /fastspace/users/barneaeh/Dropbox/SGE/srv/6/evaluat$
>>> > path_aliases: /tmp_mnt/ * * /
>>> > stderr_path_list: logs
>>> > mail_list: barneaeh@sge01
>>> > stderr_path_list: logs
>>> > mail_list: barneaeh@sge01
>>> > notify: FALSE
>>> > job_name: run.sh
>>> > stdout_path_list: logs
>>> > jobshare: 0
>>> > hard_queue_list: intel_all.q
>>> > shell_list: /bin/sh
>>> > env_list:
>>> > script_file: run.sh
>>> > version: 1
>>> > job-array tasks: 1-246:1
>>> > ==============================================================
>>> > job_number: 8698175
>>> > exec_file: job_scripts/8698175
>>> > submission_time: Mon Apr 8 15:55:10 2013
>>> > owner: barneaeh
>>> > uid: 52647
>>> > group: obs
>>> > gid: 1009
>>> > sge_o_home: /storage/users/barneaeh
>>> > sge_o_log_name: barneaeh
>>> > sge_o_path:
>>> > /fastspace/users/barneaeh/PCL-1.6.0/bin:/fastspace/$
>>> > sge_o_shell: /bin/tcsh
>>> > sge_o_workdir:
>>> > /fastspace/users/barneaeh/Dropbox/SGE/srv/6/evaluat$
>>> > sge_o_host: sge01
>>> > account: sge
>>> > cwd:
>>> > /fastspace/users/barneaeh/Dropbox/SGE/srv/6/evaluat$
>>> > path_aliases: /tmp_mnt/ * * /
>>> > stderr_path_list: logs
>>> > mail_list: barneaeh@sge01
>>> > stderr_path_list: logs
>>> > mail_list: barneaeh@sge01
>>> > notify: FALSE
>>> > job_name: run.sh
>>> > stdout_path_list: logs
>>> > jobshare: 0
>>> > hard_queue_list: intel_all.q
>>> > shell_list: /bin/sh
>>> > env_list:
>>> > script_file: run.sh
>>> > version: 1
>>> > job-array tasks: 1-246:1
>>> > ==============================================================
>>> > job_number: 8698182
>>> > exec_file: job_scripts/8698182
>>> > submission_time: Mon Apr 8 16:00:28 2013
>>> > owner: barneaeh
>>> > uid: 52647
>>> > group: obs
>>> > gid: 1009
>>> > sge_o_home: /storage/users/barneaeh
>>> > sge_o_log_name: barneaeh
>>> > sge_o_path:
>>> > /fastspace/users/barneaeh/PCL-1.6.0/bin:/fastspace/$
>>> > sge_o_shell: /bin/tcsh
>>> > sge_o_workdir:
>>> > /fastspace/users/barneaeh/Dropbox/SGE/srv/6/evaluat$
>>> > sge_o_host: sge01
>>> > account: sge
>>> > cwd:
>>> > /fastspace/users/barneaeh/Dropbox/SGE/srv/6/evaluat$
>>> > path_aliases: /tmp_mnt/ * * /
>>> > stderr_path_list: logs
>>> > mail_list: barneaeh@sge01
>>> > stderr_path_list: logs
>>> > mail_list: barneaeh@sge01
>>> > notify: FALSE
>>> > job_name: run.sh
>>> > stdout_path_list: logs
>>> > jobshare: 0
>>> > hard_queue_list: intel_all.q
>>> > shell_list: /bin/sh
>>> > env_list:
>>> > script_file: run.sh
>>> > version: 1
>>> > job-array tasks: 1-246:1
>>> >
>>> >
>>> > On Mon, Apr 8, 2013 at 3:43 PM, Reuti <[email protected]> wrote:
>>> > On 08.04.2013 at 10:21, Semi wrote:
>>> >
>>> > > Any ideas about user's question?
>>> > >
>>> > > I am working with job arrays and encountered something weird. At first
>>> > > I sent several job arrays to obs.q. Then I took one of the job arrays
>>> > > (that didn't start executing any task) and sent it to intel_all.q
>>> > > (using qalter).
>>> > > After that the specific tasks started running, however, the same
>>> > > jobArray was duplicated about 20-30 times.
>>> > > Now qstat shows me:
>>> > > 8698022 0.50500 run.sh barneaeh qw 04/08/2013 10:25:23
>>> > > 1 1-245:1
>>> > > 8698023 0.50500 run.sh barneaeh qw 04/08/2013 10:25:23
>>> > > 1 1-245:1
>>> > > 8698024 0.50500 run.sh barneaeh qw 04/08/2013 10:25:23
>>> > > 1 1-245:1
>>> > > 8698025 0.50500 run.sh barneaeh qw 04/08/2013 10:25:23
>>> > > 1 1-245:1
>>> > > 8698026 0.50500 run.sh barneaeh qw 04/08/2013 10:25:23
>>> > > 1 1-245:1
>>> > > 8698027 0.50500 run.sh barneaeh qw 04/08/2013 10:25:23
>>> > > 1 1-245:1
>>> > > 8698028 0.50500 run.sh barneaeh qw 04/08/2013 10:25:24
>>> > > 1 1-245:1
>>> > > 8698030 0.50500 run.sh barneaeh qw 04/08/2013 10:25:27
>>> > > 1 1-245:1
>>> > > 8698031 0.50500 run.sh barneaeh qw 04/08/2013 10:25:27
>>> > > 1 1-245:1
>>> > > 8698032 0.50500 run.sh barneaeh qw 04/08/2013 10:25:27
>>> > > 1 1-245:1
>>> >
>>> > `qalter` doesn't change the submission time (only increasing the value of
>>> > "version:" in `qstat -j <job_id>`). But above they have different
>>> > submission times. Was this the time `qalter` was issued?
>>> >
>>> > All have a value of "version: 1" in `qstat -j <job_id>`?
>>> >
>>> > -- Reuti
>>> >
>>> >
>>> > > This jobArray was the only one with 245, so I am sure it was
>>> > > duplicated...
>>> > > Now it seems that the job array is run again and again.
>>> > >
>>> > > _______________________________________________
>>> > > users mailing list
>>> > > [email protected]
>>> > > https://gridengine.org/mailman/listinfo/users
>>> >
>>>
>>
>
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
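
The check discussed in this thread (comparing "submission_time:" and "version:" across the jobs shown by `qstat -j`) can be sketched as a small script. This is only a rough illustration, not an SGE tool: the function names `parse_qstat_j` and `summarize` are made up for this example, and the script assumes the output has the simple "key: value" layout with "====" separator lines seen in the pasted excerpts. The sample text is a shortened copy of two of the jobs quoted above.

```python
import re

# Assumed layout: jobs are separated by lines made only of "=" characters,
# and each attribute is a "key: value" line (as in the excerpts in this thread).
SEPARATOR = re.compile(r"^=+\s*$")
FIELD = re.compile(r"^([A-Za-z_][\w\- ]*?):\s*(.*)$")


def parse_qstat_j(text):
    """Split `qstat -j` style text into one dict of attributes per job."""
    jobs, current = [], {}
    for line in text.splitlines():
        if SEPARATOR.match(line):
            if current:
                jobs.append(current)
            current = {}
            continue
        match = FIELD.match(line)
        if match:
            key, value = match.group(1).strip(), match.group(2).strip()
            # Keep the first occurrence if a key shows up twice in the paste.
            current.setdefault(key, value)
    if current:
        jobs.append(current)
    return jobs


def summarize(jobs):
    """Return (job_number, submission_time, version) per job; None if absent."""
    return [
        (job.get("job_number"), job.get("submission_time"), job.get("version"))
        for job in jobs
    ]


# Shortened sample taken from the output quoted in this thread.
SAMPLE = """\
==============================================================
job_number: 8698176
submission_time: Mon Apr 8 15:55:10 2013
version: 1
job-array tasks: 1-246:1
==============================================================
job_number: 8698182
submission_time: Mon Apr 8 16:00:28 2013
version: 1
job-array tasks: 1-246:1
"""

if __name__ == "__main__":
    for number, submitted, version in summarize(parse_qstat_j(SAMPLE)):
        print(number, submitted, version)
```

Listing the three fields side by side makes it easier to see whether the extra jobs carry their own submission times, which is the point raised above about `qalter` not changing the submission time.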
