Sorry guys, I didn't realise there was one until you told me; maybe add a link to it on the wiki. The pull request has been sent.

On Nov 17, 3:18 pm, Kaylor Mail <[email protected]> wrote:
> Can you send a pull request?
>
> On Nov 16, 2011, at 8:56 PM, Michael Lyons <[email protected]> wrote:
>> Great news Corey, after a little bit of playing around I found what seems to be a possible solution.
>>
>> I reworked the code in MsmqLoadBalancer so that after a number of failures to contact a worker it pauses the thread for a second and resets the failure count back to zero. With that change the load balancer's CPU usage dropped to around 7%.
>>
>> It worked perfectly in the situation where a worker was busy and another worker process was started to alleviate the queue backlog, without the load balancer trying to hog the system.
>>
>> My code for the change to MsmqLoadBalancer.HandleStandardMessage can be found here: http://pastebin.com/0PbC6ecB
>>
>> On Nov 17, 12:56 am, Corey Kaylor <[email protected]> wrote:
>>> Ok, I'll take a look when I get into the office. I may suggest changes to make and have you try them out. I have run into similar issues with Rhino Queues being too eager in peeking messages in the past.
>>>
>>> On Wed, Nov 16, 2011 at 1:00 AM, Michael Lyons <[email protected]> wrote:
>>>> Sorry about that last message, for some reason it lost its formatting.
>>>>
>>>> On Nov 16, 6:44 pm, Michael Lyons <[email protected]> wrote:
>>>>> Strangely enough I'm going to be testing load balancing across physical servers next week, as I provisioned another server last week for the staging environment to test this out.
>>>>>
>>>>> In our case the workers get tied up contacting website services which can sometimes be really slow (up to 120 seconds), causing the load balancer's queue to grow.
>>>>> My idea with the load balancer was that I could spin up a new worker process when the queue becomes too large, which is what I can do currently and it works perfectly; it's just that the load balancer is consuming more resources than it needs to while the machine is otherwise not under any stress.
>>>>>
>>>>> I've just done some quick profiling, and all the activity seems to be called from AbstractMsmqListener.PeekMessageOnBackgroundThread. It spends 53% of its time in calls to MsmqLoadBalancer.HandlePeekedMessage and its children, with the remaining 47% in AbstractMsmqListener.TryPeek and its children.
>>>>>
>>>>> So over a total period of 4 minutes, RSB consumed 183 seconds out of 240 seconds of CPU time, excluding my app's time, which I think is a bit excessive, particularly since it peeked at 226,130 messages. Shouldn't the load balancer pause for a second if it failed to get in contact with any of the workers, instead of just blindly retrying?
>>>>>
>>>>> Here are the top offenders in CSV format; if you want I can email you a full CSV (it's actually tab delimited) or a PDF.
>>>>> Total Time with children (ms), Average Time with children (ms), Total for self (ms), Average for self (ms), Calls, Method name
>>>>> 183366, 0.8, 11384, 0.1, 226122, Rhino.ServiceBus.LoadBalancer.MsmqLoadBalancer.HandlePeekedMessage(Rhino.ServiceBus.Msmq.OpenedQueue, System.Messaging.Message)
>>>>> 160651, 0.7, 4743, 0, 226130, Rhino.ServiceBus.Msmq.AbstractMsmqListener.TryPeek(Rhino.ServiceBus.Msmq.OpenedQueue, System.Messaging.Message&)
>>>>> 155724, 0.7, 155724, 0.7, 226130, Rhino.ServiceBus.Msmq.OpenedQueue.Peek(System.TimeSpan)
>>>>> 134270, 0.6, 134138, 0.6, 226125, Rhino.ServiceBus.Msmq.OpenedQueue.TryGetMessageFromQueue(System.String)
>>>>> 29787, 0.2, 1159, 0, 180430, Rhino.ServiceBus.LoadBalancer.MsmqLoadBalancer.HandleStandardMessage(Rhino.ServiceBus.Msmq.OpenedQueue, System.Messaging.Message)
>>>>> 28569, 0.2, 28546, 0.2, 180430, Rhino.ServiceBus.Msmq.OpenedQueue.Send(System.Messaging.Message)
>>>>> 7759, 0, 1312, 0, 180432, Rhino.ServiceBus.LoadBalancer.MsmqLoadBalancer.PersistEndpoint(Rhino.ServiceBus.Msmq.OpenedQueue, System.Messaging.Message)
>>>>> 3825, 0, 3825, 0, 180432, Rhino.ServiceBus.Msmq.MsmqUtil.GetQueueUri(System.Messaging.MessageQueue)
>>>>> 2622, 0, 2622, 0, 180431, Rhino.ServiceBus.DataStructures.Set`1.Add(T)
>>>>>
>>>>> On Nov 16, 4:39 pm, Corey Kaylor <[email protected]> wrote:
>>>>>> I am happy to take any form of contribution you can offer.
>>>>>>
>>>>>> By adding additional worker endpoints I mean:
>>>>>>
>>>>>> Load Balancer 1, 5 threads, deployed to MachineA
>>>>>> Worker endpoint 1, configured to send to Machine1\queue1.readyforwork, 5 threads, deployed to NewMachineB
>>>>>> Worker endpoint 2, configured to send to Machine1\queue1.readyforwork, 5 threads, deployed to NewMachineC
>>>>>>
>>>>>> Load balancing, although completely *possible* to run on one machine, was designed to distribute load to multiple machines.
>>>>>> You're not gaining any benefits from load balancing when there is only one worker sending ready-for-work messages to the load balancer. You would be better off in this case just having two endpoints without load balancing.
>>>>>>
>>>>>> On Tue, Nov 15, 2011 at 10:28 PM, Michael Lyons <[email protected]> wrote:
>>>>>>> I've run the EQATEC profiler against the code, and when the load balancer process is under load it records no activity between snapshots, indicating it is sitting in RSB code.
>>>>>>>
>>>>>>> I'd be happy to spot-profile RSB in my app and point out where the high CPU is coming from, but I'm assuming you already have a fair idea.
>>>>>>>
>>>>>>> What do you mean by adding additional worker endpoints? Can you point me to an example?
>>>>>>>
>>>>>>> On Nov 16, 3:40 pm, Corey Kaylor <[email protected]> wrote:
>>>>>>>> I would try changing the thread counts on the consumers and the load balancer, and possibly add additional worker endpoint(s).
>>>>>>>>
>>>>>>>> Ayende in previous conversations has recommended thread counts equal to the number of cores on the machine. I have found that isn't always a perfect recipe, so in our case we have run load tests while changing the thread configuration for each machine.
>>>>>>>>
>>>>>>>> When changing the thread counts on each test run, try to observe which specific process is utilizing the most CPU.
>>>>>>>>
>>>>>>>> There may be places to optimize for sure, but it sounds to me like threads are competing for priority.
>>>>>>>>
>>>>>>>> On Tue, Nov 15, 2011 at 9:24 PM, Michael Lyons <[email protected]> wrote:
>>>>>>>>> Yes you're correct, it's a staging environment where we do our testing before releasing into production.
>>>>>>>>>
>>>>>>>>> That's pretty much the situation.
>>>>>>>>> Here are the xml configurations for the 2 load balancers:
>>>>>>>>>
>>>>>>>>> <loadBalancer threadCount="5"
>>>>>>>>>     endpoint="msmq://localhost/notifier.loadbalancer"
>>>>>>>>>     readyForWorkEndpoint="msmq://localhost/notifier.loadbalancer.acceptingwork"
>>>>>>>>> />
>>>>>>>>>
>>>>>>>>> <loadBalancer threadCount="5"
>>>>>>>>>     endpoint="msmq://localhost/processor.loadbalancer"
>>>>>>>>>     readyForWorkEndpoint="msmq://localhost/processor.loadbalancer.acceptingwork"
>>>>>>>>> />
>>>>>>>>>
>>>>>>>>> The consumers' xml configuration is:
>>>>>>>>>
>>>>>>>>> <bus threadCount="20"
>>>>>>>>>     loadBalancerEndpoint="msmq://localhost/processor.loadbalancer.acceptingwork"
>>>>>>>>>     numberOfRetries="5"
>>>>>>>>>     endpoint="msmq://localhost/processor"
>>>>>>>>> />
>>>>>>>>>
>>>>>>>>> <bus threadCount="20"
>>>>>>>>>     loadBalancerEndpoint="msmq://localhost/notifier.loadbalancer.acceptingwork"
>>>>>>>>>     numberOfRetries="5"
>>>>>>>>>     endpoint="msmq://localhost/notifier"
>>>>>>>>> />
>>>>>>>>>
>>>>>>>>> On Nov 16, 3:13 pm, Corey Kaylor <[email protected]> wrote:
>>>>>>>>>> To summarize your setup:
>>>>>>>>>>
>>>>>>>>>> Load Balancer 1, configured for messages belonging to NamespaceA, with 5 threads, deployed to MachineA\queue1
>>>>>>>>>> 1 worker endpoint sending ready for work to MachineA\queue1.readyforwork, configured with 20 threads, deployed to MachineA
>>>>>>>>>>
>>>>>>>>>> Load Balancer 2, configured for messages belonging to NamespaceB, with 5 threads, deployed to MachineA\queue2
>>>>>>>>>> 1 worker endpoint sending ready for work to MachineA\queue2.readyforwork, configured with 20 threads, deployed to MachineA
>>>>>>>>>>
>>>>>>>>>> I assumed by staging server you mean a staging environment that is configured similarly to the above but with different machine specs, as you've stated.
>>>>>>>>>>
>>>>>>>>>> Is this correct?
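As background for the exchange above: the ready-for-work handshake that these configurations wire up (workers announce readiness on the `acceptingwork` endpoint, and the balancer hands out one message per ready worker) can be modelled roughly like this. This is a toy Python sketch for illustration only; the class and method names are invented and are not RSB's API.

```python
from collections import deque

class ToyLoadBalancer:
    """Toy model of a ready-for-work load balancer (illustrative only)."""

    def __init__(self):
        self.pending = deque()   # messages waiting to be dispatched
        self.ready = deque()     # workers that announced "ready for work"

    def ready_for_work(self, worker):
        # A worker signals it can take one more message.
        self.ready.append(worker)
        self._dispatch()

    def accept(self, message):
        # A new message arrives on the balancer's endpoint.
        self.pending.append(message)
        self._dispatch()

    def _dispatch(self):
        # Hand out messages only while a worker is ready; with no ready
        # workers, messages simply queue up -- the situation in which the
        # balancer should back off rather than spin on the CPU.
        while self.pending and self.ready:
            worker = self.ready.popleft()
            worker(self.pending.popleft())

# With one ready worker and three messages, exactly one message is
# dispatched; the other two wait for further "ready" announcements.
handled = []
lb = ToyLoadBalancer()
lb.ready_for_work(handled.append)
for m in ["a", "b", "c"]:
    lb.accept(m)
print(handled, list(lb.pending))  # ['a'] ['b', 'c']
```

This also illustrates Corey's point earlier in the thread: with a single worker, the balancer only ever forwards to one place, so the ready-for-work round trip adds overhead without distributing anything.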
>>>>>>>>>> On Tue, Nov 15, 2011 at 8:12 PM, Michael Lyons <[email protected]> wrote:
>>>>>>>>>>> The load balancers are configured with the readyForWorkEndpoint attribute on the loadBalancer xml element.
>>>>>>>>>>>
>>>>>>>>>>> The system is a quad-core 2.83GHz Core 2 Duo; on the staging server, which is running an older single-core 2.8GHz Xeon (Dell 2650) with hyper-threading, it sits at about 80%, and in production it sits between 40% and 80% on a quad-core 2.8GHz Xeon (Dell R210) where it is allocated 2 cores.
>>>>>>>>>>>
>>>>>>>>>>> Forgot to mention that RSB is version 2.2.
>>>>>>>>>>>
>>>>>>>>>>> On Nov 16, 1:17 pm, Corey Kaylor <[email protected]> wrote:
>>>>>>>>>>>> Also, how many cores are on the load balancer machine? There shouldn't be that much demand on the CPU, but having said that, it really depends on the circumstances and environment.
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Nov 15, 2011 at 7:15 PM, Corey Kaylor <[email protected]> wrote:
>>>>>>>>>>>>> Is each load balancer configured with a ready-for-work uri?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 14, 2011 at 5:06 PM, Michael Lyons <[email protected]> wrote:
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> read more »
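The fix described at the top of the thread (after a number of failures to contact a worker, pause the thread for a second and reset the failure count) can be sketched as follows. This is a minimal Python illustration of the back-off technique, not the actual C# change from the pastebin link; all names here are invented for the sketch.

```python
import time

# Minimal sketch, assuming: try_dispatch() returns True when a message
# was handed to a worker and False when no worker could be reached.
# After MAX_FAILURES consecutive misses, sleep and reset the counter
# instead of immediately peeking the queue again.

MAX_FAILURES = 5
BACKOFF_SECONDS = 1.0

def dispatch_loop(try_dispatch, should_stop, sleep=time.sleep):
    failures = 0
    while not should_stop():
        if try_dispatch():
            failures = 0                  # made progress: clear the counter
        else:
            failures += 1
            if failures >= MAX_FAILURES:  # stop spinning on the CPU
                sleep(BACKOFF_SECONDS)
                failures = 0              # reset, as in the described change

# Example: 12 straight misses then one success produce two back-off
# sleeps instead of 12 back-to-back retries burning CPU time.
if __name__ == "__main__":
    attempts = iter([False] * 12 + [True])
    sleeps = []
    done = []
    def try_dispatch():
        ok = next(attempts)
        if ok:
            done.append(True)
        return ok
    dispatch_loop(try_dispatch, lambda: bool(done),
                  sleep=lambda s: sleeps.append(s))
    print(len(sleeps))  # prints 2
```

The key property, matching the ~7% CPU figure reported above, is that a saturated worker pool costs at most one dispatch attempt per MAX_FAILURES misses plus a one-second sleep, rather than an unbounded peek loop.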
--
You received this message because you are subscribed to the Google Groups "Rhino Tools Dev" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/rhino-tools-dev?hl=en.
