Thanks Dave for your inputs :-)

On Mon, Mar 26, 2012 at 4:44 PM, Davidlohr Bueso <[email protected]> wrote:
> On Mon, 2012-03-26 at 16:28 +0530, Mahesh Gondi wrote:
> > On Mon, Mar 26, 2012 at 3:38 PM, Davidlohr Bueso <[email protected]> wrote:
> > > On Mon, 2012-03-26 at 02:56 -0300, Felipe Astroza Araya wrote:
> > > > Good approach. It's like a stack (LIFO) of sched_connections, BUT
> > > > I'd prefer a linked list, because it's simpler. You could use
> > > > just a "free list" and not two arrays (stack and queue). When a
> > > > connection is closed, its sched_connection is returned to the
> > > > "free list" (head).
> > >
> > > It seems to me this is overkill and the optimization is too small.
> > > We can always maintain a global variable with the size of the
> > > current capacity, though in a threaded context that would require
> > > locking, which might lead to contention. Another alternative is to
> > > keep a bitmap instead of a new free_in_queue array. The bitmap
> > > would have a size of work_capacity; each time a slot is occupied,
> > > the corresponding bit is set. Bitmaps are O(1) as well, and the
> > > overhead is just 1 bit per slot.
> >
> > In the case of bitmaps, O(1) holds when checking the value of a
> > particular position. But finding a free bit is O(n/t),
> > t = sizeof(underlying datatype) ==> O(n), with very, very low
> > constants.
>
> No! The whole point of bitmaps is speed and simplicity: we don't
> iterate over the map sequentially; we use bit operations instead. For
> example, to set the bit at position worker_capacity:
>
>     unsigned long *p = ((unsigned long *)worker_capacity_addr) +
>                        worker_capacity / BITS_PER_LONG;
>
>     *p |= 1UL << (worker_capacity % BITS_PER_LONG);
>
> where BITS_PER_LONG is 32 or 64 depending on the architecture.

While looking for a free spot, though, the search is for the first free
bit (0), which could be in any of the worker_capacity / BITS_PER_LONG
longs. So shouldn't this be O(worker_capacity / BITS_PER_LONG) in the
worst case?
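To make that worst case concrete, here is a minimal sketch (not Monkey's actual code; find_first_free and the map layout are made up for illustration, and it assumes a GCC-style __builtin_ctzl) of a word-at-a-time scan for the first clear bit. It touches at most capacity / BITS_PER_LONG words, one comparison per word:

```c
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */

#define BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

/* Return the index of the first clear (free) bit in a bitmap of
 * 'capacity' bits, or -1 when every slot is taken.  The loop runs at
 * most capacity / BITS_PER_LONG times; within a word, the free bit is
 * located with a single count-trailing-zeros instruction. */
static long find_first_free(const unsigned long *map, size_t capacity)
{
    size_t nwords = (capacity + BITS_PER_LONG - 1) / BITS_PER_LONG;

    for (size_t w = 0; w < nwords; w++) {
        unsigned long inv = ~map[w];   /* free bits become 1 */
        if (inv == 0)
            continue;                  /* this word is completely full */
        long bit = (long)(w * BITS_PER_LONG) + __builtin_ctzl(inv);
        return bit < (long)capacity ? bit : -1;
    }
    return -1;
}
```

So the search is linear in the number of words rather than the number of bits, which is where the n/BITS_PER_LONG constant factor comes from.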
> > Since we're not working with a very large worker_capacity, these
> > low-valued constants will come into play and prove very helpful.
>
> Correct, which is why I think that optimizations here aren't very
> useful: O(n) is just fine when dealing with small ranges.

Sometimes one has to hate asymptotics :P

Dave, what are your thoughts on using the sched_conn pointer (*ptr)
instead of the socket file descriptor (fd) in the epoll_event's data?
What modifications would that require?

> > > > Another issue in mk_scheduler is mk_sched_get_connection(). This
> > > > function is called from mk_conn_write() and
> > > > mk_sched_remove_client(). mk_sched_get_connection()'s complexity
> > > > is O(work_capacity), and it is used at least twice in a
> > > > connection's life when it could be avoided completely.
> > > > epoll_wait returns an event array, and Monkey uses the socket fd
> > > > as the epoll_event data. That's the wrong decision! The
> > > > epoll_event data should be the sched_connection and NOT the
> > > > socket fd. It's possible to improve this, but it needs hard work.
> > > >
> > > > On 26-03-2012, at 1:22, Eduardo Silva wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > thanks for the patch. Looking with valgrind it seems to be
> > > > > optimized a little bit, screenshot here:
> > > > >
> > > > > http://edsiper.linuxchile.cl/sched_optimization_001.png
> > > > >
> > > > > Without the optimization mk_sched_register() takes 0.40 for
> > > > > 5000 calls; the same situation with the optimized code takes
> > > > > 0.36. It's an improvement.
> > > > >
> > > > > Dave, Zeus and Max, what do you think about the patch?
> > > > >
> > > > > cheers,
> > > > >
> > > > > On Sun, Mar 25, 2012 at 9:43 PM, Mahesh Gondi
> > > > > <[email protected]> wrote:
> > > > > > Hi all,
> > > > > >
> > > > > > I made some changes to mk_scheduler.c. First I will explain
> > > > > > in brief what I did before the results.
> > > > > > In mk_scheduler.c, mk_sched_register_client serves the
> > > > > > purpose of adding new client requests to the worker thread
> > > > > > queue (everything discussed here happens in the thread
> > > > > > context). Adding was done by iterating over the queue looking
> > > > > > for an available spot to insert into. When the load on the
> > > > > > server is near max, this insertion cost rises to
> > > > > > O(work_capacity).
> > > > > >
> > > > > > Instead, I maintained the free spots on the queue (the list
> > > > > > of client requests received) in a simple array of size
> > > > > > (work_capacity + 1), with each element pointing to an index
> > > > > > in the queue (the first element keeps a count of the number
> > > > > > of free spots available). The array (arr) contains the free
> > > > > > spots, as pointed to by the index values stored at positions
> > > > > > 1 to arr[0]. Insertion now takes only constant time, which
> > > > > > makes running Monkey a bit cheaper. Similar modifications are
> > > > > > in progress and should help Monkey run faster and faster. :)
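For readers following along, the scheme quoted above can be sketched like this (a hypothetical standalone version with made-up names, not the actual patch; error handling omitted for brevity). Both taking and returning a slot are O(1):

```c
#include <stdlib.h>

/* Free-spot stack for a queue of 'capacity' slots.  arr[0] holds the
 * number of free slots; arr[1..arr[0]] hold the free queue indices. */
struct free_spots {
    int capacity;
    int *arr;            /* size capacity + 1 */
};

static struct free_spots *spots_create(int capacity)
{
    struct free_spots *fs = malloc(sizeof(*fs));
    fs->capacity = capacity;
    fs->arr = malloc((capacity + 1) * sizeof(int));
    fs->arr[0] = capacity;              /* initially everything is free */
    for (int i = 1; i <= capacity; i++)
        fs->arr[i] = i - 1;             /* queue indices 0..capacity-1 */
    return fs;
}

/* Take a free slot in O(1); returns -1 when the queue is full. */
static int spots_take(struct free_spots *fs)
{
    if (fs->arr[0] == 0)
        return -1;
    return fs->arr[fs->arr[0]--];
}

/* Return a slot to the pool in O(1). */
static void spots_release(struct free_spots *fs, int slot)
{
    fs->arr[++fs->arr[0]] = slot;
}
```

This is essentially the stack (LIFO) of free indices that Felipe compares to a free list above: the same effect, with an array instead of linked nodes.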
> > > > > > Below are the results.
> > > > > >
> > > > > > Output I got from running "siege -c 300 -t 30S 127.0.0.1:2001":
> > > > > >
> > > > > > // WITH CONSTANT TIME INSERTION
> > > > > > Transactions:              18051 hits
> > > > > > Availability:             100.00 %
> > > > > > Elapsed time:              29.96 secs
> > > > > > Data transferred:          23.48 MB
> > > > > > Response time:              0.00 secs
> > > > > > Transaction rate:         602.50 trans/sec
> > > > > > Throughput:                 0.78 MB/sec
> > > > > > Concurrency:                2.30
> > > > > > Successful transactions:   18051
> > > > > > Failed transactions:           0
> > > > > > Longest transaction:        0.23
> > > > > > Shortest transaction:       0.00
> > > > > >
> > > > > > ============================================
> > > > > >
> > > > > > // EARLIER
> > > > > > Transactions:              17711 hits
> > > > > > Availability:             100.00 %
> > > > > > Elapsed time:              30.01 secs
> > > > > > Data transferred:          23.04 MB
> > > > > > Response time:              0.00 secs
> > > > > > Transaction rate:         590.17 trans/sec
> > > > > > Throughput:                 0.77 MB/sec
> > > > > > Concurrency:                1.18
> > > > > > Successful transactions:   17711
> > > > > > Failed transactions:           0
> > > > > > Longest transaction:        0.17
> > > > > > Shortest transaction:       0.00
> > > > > >
> > > > > > I took the output for each case just after a fresh restart.
> > > > > > The reason for only ~600 trans/sec is that it was run on an
> > > > > > EC2 t1.small instance.
> > > > > > Thanks & Regards,
> > > > > > mahesh gondi
> > > > >
> > > > > --
> > > > > Eduardo Silva
> > > > > http://edsiper.linuxchile.cl
> > > > > http://www.monkey-project.com
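On the epoll question above: epoll_event.data is already a union (ptr, fd, u32, u64), so a connection pointer can be stored in place of the fd when registering. A minimal sketch (struct sched_conn is hypothetical here, not Monkey's actual structure):

```c
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>

/* Hypothetical per-connection state -- NOT Monkey's actual struct. */
struct sched_conn {
    int socket_fd;
    int status;
};

/* epoll_event.data is a union, so we can register the connection
 * pointer instead of the raw fd.  epoll_wait() then hands the pointer
 * straight back, making an O(work_capacity) fd-to-connection lookup
 * like mk_sched_get_connection() unnecessary. */
static void fill_event(struct epoll_event *ev, struct sched_conn *conn,
                       uint32_t events)
{
    memset(ev, 0, sizeof(*ev));
    ev->events = events;
    ev->data.ptr = conn;   /* instead of: ev->data.fd = conn->socket_fd */
}
```

A call site would fill the event this way before epoll_ctl(EPOLL_CTL_ADD) and cast event.data.ptr back after epoll_wait(). The "hard work" Felipe mentions is presumably that every existing consumer of data.fd would have to be converted at the same time, since the two fields share storage.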
_______________________________________________
Monkey mailing list
[email protected]
http://lists.monkey-project.com/listinfo/monkey
