Hi All,

 I have a user-level benchmark that creates its threads like this:
 for (i = 0; i < nthreads; i++)
      (void) thr_create(NULL, 0, testaes, (void *)0,
                            THR_NEW_LWP, &tid);

 When I run this benchmark with nthreads == ncpus, each thread is
 scheduled on a separate CPU. The system is a Niagara 2
 with 128 CPUs/strands.

 However, for a kernel-module benchmark that creates its threads like this:
 for (i = 0; i < nthreads; i++)
    (void) thread_create(NULL, 0, &process_aes, (void *)i, 0, &p0,
                                TS_RUN, minclsyspri);

 the scheduling is very uneven: CPUs 64-127 never had any thread
 scheduled on them, and the distribution across CPUs 0-63 is
 also uneven.

 I assume the scheduling behavior is different for system threads,
 which do not have an LWP. But is this not suboptimal? Is the
 assumption that kernel subsystems that need a large number of
 threads do their own CPU binding/scheduling to ensure even distribution?
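
 To make that last question concrete, here is a minimal sketch of the
 kind of do-it-yourself placement I mean: a plain round-robin policy
 that picks a target CPU for each worker. pick_cpu() and
 count_placement() are hypothetical names, not Solaris interfaces, and
 the sketch assumes CPU ids are contiguous from 0 to ncpus - 1. The
 actual binding step is left out on purpose.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a Solaris interface): choose a target CPU
 * for worker `i` by plain round-robin. Assumes CPU ids are contiguous,
 * 0 .. ncpus - 1, which may not hold on a real system.
 */
int
pick_cpu(int i, int ncpus)
{
	assert(i >= 0 && ncpus > 0);
	return (i % ncpus);
}

/*
 * Count how many of `nthreads` workers the policy above would place
 * on each CPU. `per_cpu` must have room for `ncpus` counters.
 */
void
count_placement(int nthreads, int ncpus, int *per_cpu)
{
	int i;

	for (i = 0; i < ncpus; i++)
		per_cpu[i] = 0;
	for (i = 0; i < nthreads; i++)
		per_cpu[pick_cpu(i, ncpus)]++;
}
```

 With nthreads == ncpus this gives exactly one worker per CPU, which is
 the distribution I see for the user-level case.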

Thanks,
-Krishna

_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
