Hi,

On 11.05.2012 at 20:15, Hung-Sheng Tsao Ph.D. wrote:
> Not sure what the issue is here, see in-line.
>
> On 5/11/2012 8:11 AM, iqtcub wrote:
>> Hi,
>>
>> Following up on the thread with the same subject
>> ( http://thread.gmane.org/gmane.comp.clustering.opengridengine.user/894/ ).
>>
>> We're using SGE 6.2u5. Our setup is 2 machines (it's a testing cluster)
>> with 2 cores per machine.
>> - qsub -q v20z.q -pe smp 1 script.sub
>> - wait until the job runs
>> - qsub -q v20z.q -pe smp 1 script.sub

Why are you requesting a PE, as it's only a serial job? There is:

https://blogs.oracle.com/sgrell/entry/grid_engine_scheduler_hacks_least

which describes how to set up a round_robin or fill_up distribution of jobs. But this works for serial jobs only, not for parallel ones, unless you request $pe_slots in the PE definition like you do below. Hence it should work for you in this special case.
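Just for reference, what that blog post boils down to is the two scheduler settings below (which you in fact already have in the configuration you posted), plus submitting the job without a PE request when it is serial. A sketch only, reusing the queue and script names from your example:

# In the scheduler configuration (edit with qconf -msconf), set:
#
#   queue_sort_method   load
#   load_formula        slots
#
# Sorting queues by load with "slots" as the load formula makes the host
# with the fewest occupied slots come first, so successive serial jobs
# are spread across the hosts.

# Plain serial submission, no PE needed:
qsub -q v20z.q script.sub

# If the job must run inside the PE: with one slot and allocation_rule
# $pe_slots it stays on a single host, i.e. it is placed like a serial job:
qsub -q v20z.q -pe smp 1 script.sub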
-- Reuti

>> Each job enters a different node.
> This is not OK for you?
>> However if we do:
>> - for i in 1 2; do qsub -q v20z.q -pe smp 1 script.sub; done
>>
>> Then both jobs enter the same node.
> This is OK for you?
>>
>> Our scheduling conf is as follows:
>>
>> ----------------------------
>> algorithm                         default
>> schedule_interval                 0:0:15
>> maxujobs                          0
>> queue_sort_method                 load
>> job_load_adjustments              NONE
>> load_adjustment_decay_time        0:7:30
>> load_formula                      slots
>> schedd_job_info                   true
>> flush_submit_sec                  0
>> flush_finish_sec                  0
>> params                            MONITOR=1
>> reprioritize_interval             0:0:0
>> halftime                          168
>> usage_weight_list                 cpu=1.000000,mem=0.000000,io=0.000000
>> compensation_factor               2.000000
>> weight_user                       0.250000
>> weight_project                    0.250000
>> weight_department                 0.250000
>> weight_job                        0.250000
>> weight_tickets_functional         0
>> weight_tickets_share              1000000
>> share_override_tickets            TRUE
>> share_functional_shares           FALSE
>> max_functional_jobs_to_schedule   200
>> report_pjob_tickets               TRUE
>> max_pending_tasks_per_job         50
>> halflife_decay_list               none
>> policy_hierarchy                  OS
>> weight_ticket                     0.890000
>> weight_waiting_time               0.000000
>> weight_deadline                   3600000.000000
>> weight_urgency                    0.100000
>> weight_priority                   0.010000
>> max_reservation                   50
>> default_duration                  9999:00:00
>> --------------------------------------------
>>
>> The smp PE config is:
>> pe_name             smp
>> slots               999
>> user_lists          NONE
>> xuser_lists         NONE
>> start_proc_args     /bin/true
>> stop_proc_args      /bin/true
>> allocation_rule     $pe_slots
>> control_slaves      FALSE
>> job_is_first_task   TRUE
>> urgency_slots       min
>> accounting_summary  TRUE
>> -------------------------
>>
>> The config on both nodes is like this:
>> hostname          v20z-03
>> load_scaling      NONE
>> complex_values    mem_free=7891.796875M,slots=2
>> load_values       arch=lx24-amd64,num_proc=2,mem_total=7935.984375M, \
>>                   swap_total=4095.992188M,virtual_total=12031.976562M, \
>>                   h_fsize=9.7G,load_avg=0.180000,load_short=0.080000, \
>>                   load_medium=0.180000,load_long=0.090000, \
>>                   mem_free=7830.246094M,swap_free=4095.992188M, \
>>                   virtual_free=11926.238281M,mem_used=105.738281M, \
>>                   swap_used=0.000000M,virtual_used=105.738281M, \
>>                   cpu=0.000000,m_topology=SCSC,m_topology_inuse=SCSC, \
>>                   m_socket=2,m_core=2,np_load_avg=0.090000, \
>>                   np_load_short=0.040000,np_load_medium=0.090000, \
>>                   np_load_long=0.045000
>> processors        2
>> user_lists        NONE
>> xuser_lists       NONE
>> projects          NONE
>> xprojects         NONE
>> usage_scaling     cpu=12.300000
>> report_variables  NONE
>> -------------------------------
>>
>> The queue config:
>> qname                 v20z.q
>> hostlist              @v20z
>> seq_no                0
>> load_thresholds       np_load_avg=1.75
>> suspend_thresholds    NONE
>> nsuspend              1
>> suspend_interval      00:01:00
>> priority              0
>> min_cpu_interval      00:01:00
>> processors            UNDEFINED
>> qtype                 BATCH INTERACTIVE
>> ckpt_list             BLCR
>> pe_list               make smp
>> rerun                 FALSE
>> slots                 2
>> tmpdir                /scratch
>> shell                 /bin/csh
>> prolog                NONE
>> epilog                NONE
>> shell_start_mode      posix_compliant
>> starter_method        NONE
>> suspend_method        NONE
>> resume_method         NONE
>> terminate_method      NONE
>> notify                00:00:60
>> owner_list            NONE
>> user_lists            NONE
>> xuser_lists           NONE
>> subordinate_list      NONE
>> complex_values        split=2
>> projects              NONE
>> xprojects             NONE
>> calendar              NONE
>> initial_state         default
>> s_rt                  INFINITY
>> h_rt                  INFINITY
>> s_cpu                 INFINITY
>> h_cpu                 INFINITY
>> s_fsize               INFINITY
>> h_fsize               INFINITY
>> s_data                INFINITY
>> h_data                INFINITY
>> s_stack               INFINITY
>> h_stack               INFINITY
>> s_core                INFINITY
>> h_core                INFINITY
>> s_rss                 INFINITY
>> h_rss                 INFINITY
>> s_vmem                INFINITY
>> h_vmem                INFINITY
>> -----------------------------------
>>
>> From what I understood, it's possible that this method is broken, am I right?
>>
>> I've also tried the scheduler configuration in the following links, with the
>> same result:
>> http://article.gmane.org/gmane.comp.clustering.opengridengine.user/1037
>> http://wiki.gridengine.info/wiki/index.php/StephansBlog
>>
>> Thanks in advance!

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users