On Mon, 2012-03-26 at 16:28 +0530, Mahesh Gondi wrote:
> 
> 
> On Mon, Mar 26, 2012 at 3:38 PM, Davidlohr Bueso <[email protected]> wrote:
>         On Mon, 2012-03-26 at 02:56 -0300, Felipe Astroza Araya wrote:
>         > Good approach. It's like a stack (LIFO) of sched_connections,
>         > BUT I'd prefer a linked list, because it's simpler. You could
>         > use just a "free list" and not two arrays (stack and queue).
>         > When a connection is closed, its sched_connection is returned
>         > to the head of the "free list".
>         >
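To make the free list concrete, here is a rough sketch (the "next" link
is an assumed extra field on sched_connection, and the function names
are hypothetical, not anything in the tree today):

/* Sketch of the free list: closed connections are pushed back on the
 * head, new ones pop from it. */
struct sched_connection {
    int socket;
    struct sched_connection *next;   /* assumed free-list link */
};

static struct sched_connection *free_head;

/* Take a free slot from the head, O(1); NULL when exhausted. */
static struct sched_connection *sched_conn_get(void)
{
    struct sched_connection *sc = free_head;

    if (sc)
        free_head = sc->next;
    return sc;
}

/* Return a closed connection to the head, O(1). */
static void sched_conn_put(struct sched_connection *sc)
{
    sc->next = free_head;
    free_head = sc;
}

Both operations are O(1) and need no auxiliary arrays.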
>         
>         
>         It seems to me this is overkill and the optimization is too
>         small. We could maintain a global variable with the current
>         capacity, but in a threaded context that would require
>         locking, which might lead to contention. Another alternative
>         is to keep a bitmap instead of a new free_in_queue array. The
>         bitmap would have a size of work_capacity; each time a slot
>         is occupied, the corresponding bit is set. Bitmaps are O(1)
>         as well, and the overhead is just 1 bit per slot.
> 
> In the case of bitmaps, O(1) holds only if I am checking the value of
> a particular position. Finding a free bit is O(n/t), where t is the
> size of the underlying datatype, which is still O(n), albeit with
> very, very low constants.

No! The whole point of bitmaps is speed and simplicity: we don't iterate
over the map bit by bit, we use bit operations on whole words instead.
For example, to set the bit at position N:

unsigned long *p = ((unsigned long *)worker_capacity_addr) +
                   N / BITS_PER_LONG;

*p |= 1UL << (N % BITS_PER_LONG);

and to clear it again when the slot is freed:

*p &= ~(1UL << (N % BITS_PER_LONG));

where BITS_PER_LONG is 32 or 64 depending on the architecture.

> Since we're not working with a very large worker_capacity, these
> low-valued constants will come into play and prove very helpful.

Correct, which is why I think that optimizations here aren't very useful
- O(n) is just fine when dealing with small ranges.
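
For completeness, the word-at-a-time search being described would look
roughly like this (a sketch only: find_free_slot and the bitmap layout
are hypothetical, and __builtin_ctzl is a GCC builtin):

#define BITS_PER_LONG ((int) (sizeof(unsigned long) * 8))

/* Scan the bitmap one word at a time; any word that is not all-ones
 * contains a free slot, and __builtin_ctzl() locates its first zero
 * bit in a single instruction. Returns -1 when every slot is taken. */
static int find_free_slot(const unsigned long *map, int capacity)
{
    int i, nwords = (capacity + BITS_PER_LONG - 1) / BITS_PER_LONG;

    for (i = 0; i < nwords; i++) {
        if (map[i] != ~0UL) {
            int slot = i * BITS_PER_LONG + __builtin_ctzl(~map[i]);
            return (slot < capacity) ? slot : -1;
        }
    }
    return -1;
}

This is still O(n / BITS_PER_LONG) in the worst case, as Mahesh notes,
but with 32 or 64 slots checked per iteration the constants are tiny.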

> 
>         
>         > Another issue in mk_scheduler is mk_sched_get_connection().
>         > This function is called from mk_conn_write() and
>         > mk_sched_remove_client(). mk_sched_get_connection()'s
>         > complexity is O(work_capacity), and it is called at least
>         > twice in a connection's lifetime when it could be avoided
>         > completely. epoll_wait returns an event array and Monkey
>         > uses the socket fd as the epoll_event data. That's the
>         > wrong decision! The epoll_event data should be the
>         > sched_connection and NOT the socket fd. It's possible to
>         > improve this, but it needs hard work.
>         >
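The epoll API supports what Felipe describes directly: epoll_event's
data field is a union, so the registration can carry a pointer instead
of an fd. A minimal sketch (register_conn is a hypothetical name, and
the event flags are only an example):

#include <sys/epoll.h>

struct sched_connection;            /* Monkey's per-connection slot */

/* Store the sched_connection pointer in the epoll event instead of
 * the raw socket fd. */
static int register_conn(int efd, int sock, struct sched_connection *sc)
{
    struct epoll_event ev;

    ev.events = EPOLLIN;            /* just an example flag set */
    ev.data.ptr = sc;               /* instead of ev.data.fd = sock */
    return epoll_ctl(efd, EPOLL_CTL_ADD, sock, &ev);
}

/* The event loop then gets the connection back for free:
 *
 *     n = epoll_wait(efd, events, max_events, timeout);
 *     for (i = 0; i < n; i++) {
 *         struct sched_connection *sc = events[i].data.ptr;
 *         ...                       // no O(work_capacity) lookup
 *     }
 */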
>         > On 26-03-2012, at 1:22, Eduardo Silva wrote:
>         >
>         > > Hi,
>         > >
>         > > thanks for the patch. Looking with valgrind, it seems to
>         > > be optimized a little bit; screenshot here:
>         > >
>         > > http://edsiper.linuxchile.cl/sched_optimization_001.png
>         > >
>         > > Without the optimization mk_sched_register() takes 0.40
>         > > for 5000 calls; the same code, optimized, takes 0.36.
>         > > It's an improvement.
>         > >
>         > > Dave, Zeus and Max, what do you think about the patch?
>         > >
>         > > cheers,
>         > >
>         > >
>         > > On Sun, Mar 25, 2012 at 9:43 PM, Mahesh Gondi
>         > > <[email protected]> wrote:
>         > >> Hi all,
>         > >>
>         > >> I made some changes to mk_scheduler.c. First I will
>         > >> explain briefly what I did, then show the results.
>         > >>
>         > >> In mk_scheduler.c, mk_sched_register_client serves the
>         > >> purpose of adding new client requests to the worker
>         > >> thread queue (everything discussed here happens in the
>         > >> thread context). Adding was done by iterating over the
>         > >> queue looking for an available spot to insert into. When
>         > >> the load on the server is near max, this insertion cost
>         > >> rises to O(work_capacity).
>         > >>
>         > >> Instead, I maintain the free spots on the queue (the
>         > >> list of client requests received) in a simple array of
>         > >> size (work_capacity + 1), with each element pointing to
>         > >> an index in the queue; the first element keeps a count
>         > >> of the number of free spots available. The array (arr)
>         > >> contains free spots at the index values stored at
>         > >> positions 1 to arr[0], as sketched below. Insertion now
>         > >> takes only constant time, which helps Monkey run a bit
>         > >> cheaper. Similar modifications are in progress and
>         > >> should help Monkey run faster and faster. :)
>         > >>
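For reference, the array described above works as a stack of free
indices. A rough sketch of the two O(1) operations (function names are
hypothetical, not the actual patch):

/* arr has work_capacity + 1 elements: arr[0] holds the count of free
 * spots and arr[1..arr[0]] hold the free queue indices. */

/* Pop a free spot, O(1); returns -1 when the queue is full. */
static int take_free_spot(int *arr)
{
    if (arr[0] == 0)
        return -1;
    return arr[arr[0]--];
}

/* Push a freed spot back, O(1), when the connection closes. */
static void return_free_spot(int *arr, int idx)
{
    arr[++arr[0]] = idx;
}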
>         > >> Below are the results.
>         > >>
>         > >> Output I got from running "siege -c 300 -t 30S
>         > >> 127.0.0.1:2001":
>         > >>
>         > >> //WITH CONSTANT TIME INSERTION
>         > >> Transactions:                  18051 hits
>         > >> Availability:                 100.00 %
>         > >> Elapsed time:                  29.96 secs
>         > >> Data transferred:              23.48 MB
>         > >> Response time:                  0.00 secs
>         > >> Transaction rate:             602.50 trans/sec
>         > >> Throughput:                   0.78 MB/sec
>         > >> Concurrency:                    2.30
>         > >> Successful transactions:       18051
>         > >> Failed transactions:               0
>         > >> Longest transaction:            0.23
>         > >> Shortest transaction:           0.00
>         > >>
>         > >> ============================================
>         > >>
>         > >> //EARLIER
>         > >> Transactions:                  17711 hits
>         > >> Availability:                 100.00 %
>         > >> Elapsed time:                  30.01 secs
>         > >> Data transferred:              23.04 MB
>         > >> Response time:                  0.00 secs
>         > >> Transaction rate:             590.17 trans/sec
>         > >> Throughput:                     0.77 MB/sec
>         > >> Concurrency:                    1.18
>         > >> Successful transactions:       17711
>         > >> Failed transactions:               0
>         > >> Longest transaction:            0.17
>         > >> Shortest transaction:           0.00
>         > >>
>         > >> I took the output for each case just after a fresh
>         > >> restart. The reason for only ~600 trans/sec is that it
>         > >> was run on an EC2 t1.small instance.
>         > >>
>         > >> Thanks & Regards,
>         > >> mahesh gondi
>         > >>
>         > >
>         > >
>         > >
>         > > --
>         > > Eduardo Silva
>         > > http://edsiper.linuxchile.cl
>         > > http://www.monkey-project.com
> 


_______________________________________________
Monkey mailing list
[email protected]
http://lists.monkey-project.com/listinfo/monkey
