On 3/28/2012 5:53 AM, Reuti wrote:
On 27.03.2012 at 23:27, Hung-sheng Tsao wrote:

Maybe just copy the /opt/gridengine from one of the compute nodes, and add this machine as a submit host from the frontend.

It may be good to add an explanation: to me it looks like the original poster installed a separate SGE cluster on just one machine, including the qmaster daemon, and hence it's just running locally, which explains the job ID of 1.
Sorry, but if one just copies /opt/gridengine from a compute node, then it will have the full directory tree, including /opt/gridengine/default/common and /opt/gridengine/bin.
Yes, there is also default/spool, which one could delete; the daemon should not run! Of course, one will need the home directories, UIDs, etc. from the Rocks frontend. IMHO, it is much simpler than installing a new version of SGE. Of course, if the submit host is not running the same CentOS/RedHat as the compute nodes, that is another story.
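For example, a minimal sketch of the copy approach (compute-0-0 as the source node and the Rocks default path /opt/gridengine are assumptions; it also assumes the compute node is reachable from the new host, otherwise stage the copy via the frontend):

# on the new submit host: copy the SGE installation from a compute node
rsync -a compute-0-0:/opt/gridengine/ /opt/gridengine/
# the spool area belongs to that node's execd and is not needed here
rm -rf /opt/gridengine/default/spool
# do not start any daemons; only the client commands (qsub, qstat) are needed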
regards
To add a submit host to an existing cluster, it isn't necessary to have any daemon running on it, and installing a different version of SGE will most likely not work either, as the internal protocol changes between releases. I suggest the following:

- Stop the daemons you started on the new submit host
- Remove the compilation you did
- Share the users from the existing cluster by NIS/LDAP (unless you want to define them all by hand on the new machine too)
- Mount /home from the existing cluster
- Mount /usr/sge or /opt/grid, or wherever you have SGE installed in the existing cluster
- Add the machine in question as a submit host in the original cluster
- Source $SGE_ROOT/default/common/settings.sh during login on the submit machine

Then you should be able to submit jobs from this machine. As there is no built-in file staging in SGE, it's most common to share /home.
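In commands, a minimal sketch of these steps (the frontend hostname, the submit node name, and the /opt/gridengine path are placeholders; it assumes the frontend exports /home and the SGE directory via NFS, and the mounts would normally go into /etc/fstab):

# on the new submit host: stop whatever daemons the local install started
# (init script names vary between installs; stopping the processes directly also works)
pkill sge_qmaster
pkill sge_execd

# mount the shared areas from the existing cluster
mount frontend:/home /home
mount frontend:/opt/gridengine /opt/gridengine

# on the head node: register the new machine as a submit host
qconf -as submitnode.local

# on the new submit host, at login: pick up the cluster's environment
. /opt/gridengine/default/common/settings.sh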
==

Nevertheless, it could be done to have a separate single-machine cluster (with a different version of SGE) and use file staging (which you have to implement on your own), but it's too much overhead for adding just this particular machine IMO. It is a suitable setup for combining clusters by the use of a transfer queue, though. I did it once and used the job context to name the files which have to be copied back and forth, and then copied them on my own in a starter method.

-- Reuti

LT
Sent from my iPhone

On Mar 27, 2012, at 4:36 PM, Robert Chase <[email protected]> wrote:

Hello,

A number of years ago, our group created a Rocks cluster consisting of a head node, a data node, and eight execution nodes. The eight execution nodes can only be accessed by the head node. My goal is to add a submit node to the existing cluster. I have downloaded GE2011.11 and compiled it from source without errors. When I try the command:

qsub simple.sh

I get the error:

Unable to run job: warning: root your job is not allowed to run in any queue

When I look at qstat I get:

job-ID  prior    name       user  state  submit/start at      queue  slots  ja-task-ID
---------------------------------------------------------------------------------------
     1  0.55500  simple.sh  root  qw     03/27/2012 09:41:11             1

I have added the new submit node to the list of submit nodes on the head node using the command qconf -as. When I run qconf -ss on the new submit node, I see the head node, the data node, and the new submit node. When I run qconf -ss on the head node, I see the head node, the data node, the new submit node, and all eight execution nodes. When I run qhost on the new submit node, I get:

HOSTNAME  ARCH  NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global    -     -     -     -       -       -       -

Other posts have asked about the output of qconf -sq all.q...

[root@HEADNODE jobs]# qconf -sq all.q
qname                 all.q
hostlist              @allhosts
seq_no                0
load_thresholds       np_load_avg=1.75
suspend_thresholds    NONE
nsuspend              1
suspend_interval      00:05:00
priority              0
min_cpu_interval      00:05:00
processors            UNDEFINED
qtype                 BATCH INTERACTIVE
ckpt_list             NONE
pe_list               make mpi mpich multicore orte
rerun                 FALSE
slots                 1,[compute-0-0.local=16],[compute-0-1.local=16], \
                      [compute-0-2.local=16],[compute-0-3.local=16], \
                      [compute-0-4.local=16],[compute-0-6.local=16], \
                      [compute-0-7.local=16]
tmpdir                /tmp
shell                 /bin/csh
prolog                NONE
epilog                NONE
shell_start_mode      posix_compliant
starter_method        NONE
suspend_method        NONE
resume_method         NONE
terminate_method      NONE
notify                00:00:60
owner_list            NONE
user_lists            NONE
xuser_lists           NONE
subordinate_list      NONE
complex_values        NONE
projects              NONE
xprojects             NONE
calendar              NONE
initial_state         default
s_rt                  INFINITY
h_rt                  INFINITY
s_cpu                 INFINITY
h_cpu                 INFINITY
s_fsize               INFINITY
h_fsize               INFINITY
s_data                INFINITY
h_data                INFINITY
s_stack               INFINITY
h_stack               INFINITY
s_core                INFINITY
h_core                INFINITY
s_rss                 INFINITY
h_rss                 INFINITY
s_vmem                INFINITY
h_vmem                INFINITY

[root@SUBMITNODE jobs]# qconf -sq all.q
qname                 all.q
hostlist              @allhosts
seq_no                0
load_thresholds       np_load_avg=1.75
suspend_thresholds    NONE
nsuspend              1
suspend_interval      00:05:00
priority              0
min_cpu_interval      00:05:00
processors            UNDEFINED
qtype                 BATCH INTERACTIVE
ckpt_list             NONE
pe_list               make
rerun                 FALSE
slots                 1
tmpdir                /tmp
shell                 /bin/csh
prolog                NONE
epilog                NONE
shell_start_mode      posix_compliant
starter_method        NONE
suspend_method        NONE
resume_method         NONE
terminate_method      NONE
notify                00:00:60
owner_list            NONE
user_lists            NONE
xuser_lists           NONE
subordinate_list      NONE
complex_values        NONE
projects              NONE
xprojects             NONE
calendar              NONE
initial_state         default
s_rt                  INFINITY
h_rt                  INFINITY
s_cpu                 INFINITY
h_cpu                 INFINITY
s_fsize               INFINITY
h_fsize               INFINITY
s_data                INFINITY
h_data                INFINITY
s_stack               INFINITY
h_stack               INFINITY
s_core                INFINITY
h_core                INFINITY
s_rss                 INFINITY
h_rss                 INFINITY
s_vmem                INFINITY
h_vmem                INFINITY

I would like to know how to get qsub working.

Thanks,
-Robert Paul Chase
Channing Labs
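The differing qconf -sq output above is the tell-tale sign: the submit node is answering from its own freshly installed qmaster, not from the cluster's. A quick way to check which qmaster a machine's client commands talk to (assuming the default cell name "default"):

# the act_qmaster file names the qmaster this installation connects to
cat $SGE_ROOT/default/common/act_qmaster
# $SGE_ROOT itself shows which installation the client commands are using
echo $SGE_ROOT

On the misconfigured submit node these will point at the locally compiled GE2011.11 tree and at the submit node itself; after the steps above, both should refer to the existing cluster.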
--
Hung-Sheng Tsao, Ph.D.
Founder & Principal
HopBit GridComputing LLC
cell: 9734950840
http://laotsao.blogspot.com/
http://laotsao.wordpress.com/
http://blogs.oracle.com/hstsao/
