Venkat, please check whether this is bounded by contention in a map or something similar... maybe that's why it slows down when we have multiple endpoints.
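To make that suspicion concrete: if every in-flight request is registered in and removed from one shared ConcurrentHashMap (the pattern a callback pool tends to follow), the quickest sanity check is to hammer a single map from a growing number of threads and watch whether total throughput keeps scaling. Below is a rough, self-contained sketch of such a check; the class and numbers are hypothetical, not the actual LB code.

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

/**
 * Rough micro-check for the "one shared map" suspicion: N threads keep
 * registering and removing entries in a single ConcurrentHashMap and we
 * compare total throughput as N grows. Purely illustrative.
 */
public class SharedMapContentionCheck {

    public static void main(String[] args) throws Exception {
        for (int threads : new int[]{1, 2, 4, 8, 16, 32}) {
            System.out.printf("threads=%2d  ops/sec=%,d%n", threads, run(threads));
        }
    }

    static long run(int threads) throws Exception {
        ConcurrentHashMap<Long, Object> callbacks = new ConcurrentHashMap<>();
        LongAdder ops = new LongAdder();                  // low-contention counter
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(2);

        for (int t = 0; t < threads; t++) {
            final long base = t * 1_000_000_000L;         // disjoint key range per thread
            pool.submit(() -> {
                long id = base;
                while (System.nanoTime() < deadline) {
                    callbacks.put(id, Boolean.TRUE);      // "register callback"
                    callbacks.remove(id);                 // "response arrived"
                    id++;
                    ops.increment();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return ops.sum() / 2;                             // 2-second measurement window
    }
}

If ops/sec stops scaling (or regresses) as the thread count grows, the shared structure is a plausible suspect; on the real LB, JFR's lock and monitor contention events are the more direct way to see where threads actually block.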
On Thu, Aug 11, 2016 at 2:43 AM, Venkat Raman <[email protected]> wrote:
> Hi Isuru,
> Please find the attached benchmark results from today's run. I've used only one OutboundEP for Nginx, LB and i-server. LB with the HealthCheck and Timeout features is performing close to i-server. So, based on this, is the number of connections causing the overhead?
> In the previous tests too, only one OutboundEP was used when benchmarking i-server, versus 5 OutboundEPs for Nginx and GW-LB.
> Even the JFR results of i-server and LB are similar. The contention and latency seen in i-server are also present in LB.
> Also, please find the attached JFR files.
> Thanks,
> Venkat.
> On Wed, Aug 10, 2016 at 2:09 PM, Venkat Raman <[email protected]> wrote:
>> Hi Isuru,
>> Please find the attached results with and without the disruptor enabled.
>> Thanks,
>> Venkat.
>> On Wed, Aug 10, 2016 at 9:23 AM, Venkat Raman <[email protected]> wrote:
>>> Sure Kasun. Will try to find it.
>>> Thanks,
>>> Venkat.
>>> On Aug 10, 2016 5:30 AM, "Kasun Indrasiri" <[email protected]> wrote:
>>>> Hi Venkat,
>>>> The drop in performance of LB compared to the GW Framework seems to be way too much. I think we can't afford to lose nearly 50% of throughput because of the LB components. Let's try to identify the bottlenecks related to the LB code.
>>>> On Tue, Aug 9, 2016 at 4:18 PM, Venkat Raman <[email protected]> wrote:
>>>>> Hi Isuru & Kasun,
>>>>> Please find the attached results document. As discussed, I created a new VM for benchmarking.
>>>>> It seems the TPS of 20,000 (from yesterday's results), even at the higher concurrency levels, is not accurate. Sorry for the confusion caused. Most of the time the endpoints were marked as unHealthy and a direct error response was returned by the LB Mediator, which resulted in the high TPS. I ran the benchmark multiple times since yesterday and was never able to reproduce that result.
>>>>> In this test, to avoid such cases, a higher unHealthyRetries count has been configured.
>>>>> Also, I've benchmarked the performance of GW-FMW using i-server with this simple <https://github.com/Venkat2811/product-http-load-balancer/blob/master/performance-benchmark/gw-framework/router.iflow> configuration. It is a simple route, without even if-else conditions.
>>>>> As you can see, it is performing 2X faster than LB.
>>>>> Next steps would be to do a memory benchmark and plot graphs with these values. Once the repo, documentation and blog are ready, I'll be using JFR to identify bottlenecks and fine-tune LB's performance.
>>>>> Looking forward to your feedback on this.
>>>>> Thanks,
>>>>> Venkat.
>>>>> On Tue, Aug 9, 2016 at 1:13 AM, Venkat Raman <[email protected]> wrote:
>>>>>> Sure Kasun, I'll do a perf benchmark between i-server and LB in a new VM as discussed.
>>>>>> Thanks,
>>>>>> Venkat.
>>>>>> On Aug 9, 2016 12:55 AM, "Kasun Indrasiri" <[email protected]> wrote:
>>>>>>> - Compare GW framework perf vs LB (need to identify if there is any perf impact from the LB related code).
>>>>>>> - Identify the reason for the apparent perf bottleneck at high concurrency.
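Side note on the unHealthy-marking issue described in Venkat's Aug 9 mail above: when endpoints drop out of rotation, the LB answers with cheap error responses and the measured TPS is inflated. An unHealthyRetries-style threshold guards against that by requiring several consecutive failures before an endpoint is marked unhealthy. A minimal sketch of the idea, with hypothetical names rather than the actual LB implementation:

import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical sketch of an unHealthyRetries-style threshold: an endpoint
 * is only marked unhealthy after N consecutive failures, so a few transient
 * errors under load do not take it out of rotation. Names are illustrative,
 * not the LB's real classes.
 */
class EndpointHealth {

    private final int unHealthyRetries;                        // e.g. 20 (hypothetical value)
    private final AtomicInteger consecutiveFailures = new AtomicInteger();
    private volatile boolean healthy = true;

    EndpointHealth(int unHealthyRetries) {
        this.unHealthyRetries = unHealthyRetries;
    }

    void onSuccess() {
        consecutiveFailures.set(0);                            // any success resets the streak
        healthy = true;
    }

    void onFailure() {
        if (consecutiveFailures.incrementAndGet() >= unHealthyRetries) {
            healthy = false;                                   // take endpoint out of rotation
        }
    }

    boolean isHealthy() {
        return healthy;
    }
}

With too low a threshold, a short burst of timeouts during a load test takes every endpoint out of rotation, which is exactly the misleading 20,000 TPS symptom reported earlier.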
>>>>>>> On Mon, Aug 8, 2016 at 10:55 AM, Venkat Raman <[email protected]> wrote:
>>>>>>>> Hi Kasun,
>>>>>>>> Please find the latest results after Saturday's code review.
>>>>>>>> Thanks,
>>>>>>>> Venkat.
>>>>>>>> On Mon, Aug 8, 2016 at 10:06 AM, Venkat Raman <[email protected]> wrote:
>>>>>>>>> Hi Isuru,
>>>>>>>>> Good morning. Please find the 11th week's progress.
>>>>>>>>> 1) Had code reviews and made a few suggested corrections.
>>>>>>>>> 2) Did some groundwork for using JFR.
>>>>>>>>> Will continue working on performance tuning.
>>>>>>>>> @Kasun - Tomorrow is August 9th. Can we have the demo?
>>>>>>>>> Thanks,
>>>>>>>>> Venkat.
>>>>>>>>> On Sat, Aug 6, 2016 at 10:16 PM, Venkat Raman <[email protected]> wrote:
>>>>>>>>>> Hi Isuru,
>>>>>>>>>> Here are the findings from today's review:
>>>>>>>>>> 1) Change CallMediatorMap from ConcurrentHashMap to HashMap
>>>>>>>>>> 2) Remove the unnecessary synchronized block when checking areAllEndpointsUnhealthy()
>>>>>>>>>> 3) Rename LoadBalancerCallMediator to LBEndpointsCallMediator
>>>>>>>>>> 4) Send a PR adding a getUri() method to gateway-framework
>>>>>>>>>> 5) Use Java Flight Recorder while benchmarking to identify bottlenecks
>>>>>>>>>> Venkat.
>>>>>>>>>> On Fri, Aug 5, 2016 at 1:43 PM, Venkat Raman <[email protected]> wrote:
>>>>>>>>>>> Hi Isuru,
>>>>>>>>>>> Please find the attached latest benchmark without synchronization, the CallbackPool and health checks.
>>>>>>>>>>> Throughput is just 1000 times faster than my current implementation.
>>>>>>>>>>> It is still dropping drastically because of some other reason.
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Venkat.
>>>>>>>>>>> On Fri, Aug 5, 2016 at 10:09 AM, Venkat Raman <[email protected]> wrote:
>>>>>>>>>>>> Hi Isuru,
>>>>>>>>>>>> FYI
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Venkat.
>>>>>>>>>>>> ---------- Forwarded message ----------
>>>>>>>>>>>> From: Venkat Raman <[email protected]>
>>>>>>>>>>>> Date: Thu, Aug 4, 2016 at 10:39 AM
>>>>>>>>>>>> Subject: Re: GSoC Project: HTTP Load Balancer on Top of WSO2 Gateway Discussion
>>>>>>>>>>>> To: Isuru Ranawaka <[email protected]>, Kasun Indrasiri <[email protected]>
>>>>>>>>>>>> Cc: DEV <[email protected]>, Senduran Balasubramaniyam <[email protected]>
>>>>>>>>>>>> Hi Isuru & Kasun,
>>>>>>>>>>>> Please find the attached result document (raw-engine-transport.xlsx). I've done a test with the raw engine transport without any BE. It is performing great and is close to the Netty-based BE!
>>>>>>>>>>>> The problem is with LB only.
>>>>>>>>>>>> My guess is that the CallbackPool (a ConcurrentHashMap) that we use to determine timeouts is the bottleneck. I'll disable the CallbackPool, run the benchmark again and update you on that.
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Venkat.
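On the CallbackPool suspicion above: assuming the pool's job is to fire a timeout when a backend response never arrives (the exact LB internals aren't shown in this thread), one common alternative to tracking every in-flight request in a single shared map is to give each request its own scheduled timeout task. A rough sketch using the standard ScheduledExecutorService; the names are illustrative, not the real LB types.

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical alternative to a map-based CallbackPool: each in-flight
 * request schedules its own timeout task instead of being tracked in one
 * shared ConcurrentHashMap. Sketch only; Callback is a stand-in type.
 */
class TimeoutScheduler {

    interface Callback {
        void onTimeout();
    }

    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "lb-timeout-timer");
                t.setDaemon(true);
                return t;
            });

    /** Register a callback; returns a handle to cancel on normal completion. */
    ScheduledFuture<?> register(Callback cb, long timeoutMillis) {
        return timer.schedule(cb::onTimeout, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Call when the backend responded in time. */
    void complete(ScheduledFuture<?> handle) {
        handle.cancel(false);   // drop the pending timeout task
    }
}

Netty's HashedWheelTimer is another option often used for large numbers of short-lived timeouts; either way, the intent is to keep a single, globally shared structure off the per-request hot path.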
>>>>>>>>>>>> On Thu, Aug 4, 2016 at 9:48 AM, Venkat Raman <[email protected]> wrote:
>>>>>>>>>>>>> Okay Isuru.
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Venkat.
>>>>>>>>>>>>> On Thu, Aug 4, 2016 at 9:42 AM, Isuru Ranawaka <[email protected]> wrote:
>>>>>>>>>>>>>> Hi Venkat,
>>>>>>>>>>>>>> Yes we can. Let's have a call today around 9.30 p.m.
>>>>>>>>>>>>>> On Thu, Aug 4, 2016 at 9:34 AM, Venkat Raman <[email protected]> wrote:
>>>>>>>>>>>>>>> Hi Isuru,
>>>>>>>>>>>>>>> Good morning. Last night I spoke with Kasun about the latest benchmark results. Even without any locking, performance is not good beyond a concurrency of 5000.
>>>>>>>>>>>>>>> As you have benchmarked up to a concurrency of 3000, we would both like to benchmark the raw carbon-transport up to a concurrency of 10,000 and 100,000 requests, so that we get a better idea of this.
>>>>>>>>>>>>>>> How do we do that? Will a simple response from the engine suffice? Can I use the LB to send a simple response directly, without doing any mediation?
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Venkat.
>>>>>>>>>>>>>>> On Thu, Aug 4, 2016 at 12:19 AM, Venkat Raman <[email protected]> wrote:
>>>>>>>>>>>>>>>> Hi Isuru,
>>>>>>>>>>>>>>>> Please find the attached benchmark results. As discussed, I've disabled health-checking; in one test I removed the synchronized block and used an AtomicInteger, and in another I ran without any kind of lock or atomic integers at all.
>>>>>>>>>>>>>>>> Throughput and latency results are positive, but beyond a concurrency level of 5000 they are not that good. So even if we use a read-write lock or a StampedLock, we will only get a small performance gain.
>>>>>>>>>>>>>>>> I feel that if we can benchmark against integration-server up to 10000 concurrent connections we'll get a better idea. Is that okay?
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Venkat.
>>>>>>>>>>>>>>>> On Tue, Aug 2, 2016 at 9:39 PM, Venkat Raman <[email protected]> wrote:
>>>>>>>>>>>>>>>>> Hi Isuru & Kasun,
>>>>>>>>>>>>>>>>> Please find the findings from today's code review.
>>>>>>>>>>>>>>>>> 1) Locking in the getNextLBOutboundEndpoint() method of the algorithm implementation is causing overhead. We have to find a way to handle communication between threads efficiently so as to reduce the locking overhead. [A lock-free sketch follows after this list.]
>>>>>>>>>>>>>>>>> 2) Code repo freeze by August 15th for the sake of GSoC. If we can find a way to overcome the locking overhead before August 15th, those changes will be added to the code repo; otherwise they will be added after GSoC.
>>>>>>>>>>>>>>>>> 3) TPS, latency and memory graphs to be added.
>>>>>>>>>>>>>>>>> 4) Blog post and PDF documentation.
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Venkat.
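Finding (1) is the classic round-robin hot spot: if getNextLBOutboundEndpoint() guards its position counter with a lock or a synchronized block, every worker thread serializes on that one monitor. Below is a minimal lock-free sketch of the same idea, assuming a fixed endpoint list and leaving health filtering out; class and method names are hypothetical, not the LB's actual algorithm code.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Lock-free round-robin selection: one atomic counter, no synchronized
 * block on the request path. Assumes the endpoint list does not change
 * after construction; health checks are omitted for brevity.
 */
class RoundRobinSelector<E> {

    private final List<E> endpoints;                       // immutable snapshot
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinSelector(List<E> endpoints) {
        this.endpoints = Collections.unmodifiableList(new ArrayList<>(endpoints));
    }

    E nextEndpoint() {
        // getAndIncrement is a single atomic fetch-add; floorMod keeps the
        // index valid even after the int counter wraps around to negative.
        int idx = Math.floorMod(counter.getAndIncrement(), endpoints.size());
        return endpoints.get(idx);
    }
}

An AtomicInteger still bounces one cache line between cores, so it is not free, but it is far cheaper than a contended monitor and simpler than a read-write or StampedLock; if it still shows up at very high concurrency, per-thread counters or a thread-local starting offset would be the usual next step.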
>>>>>>>>>>>>>>>>> On Mon, Aug 1, 2016 at 9:25 AM, Venkat Raman <[email protected]> wrote:
>>>>>>>>>>>>>>>>>> Hi Isuru,
>>>>>>>>>>>>>>>>>> Good morning. Please find the 10th week's progress.
>>>>>>>>>>>>>>>>>> 1) Had a discussion with Kasun.
>>>>>>>>>>>>>>>>>> 2) As suggested, did performance benchmarking using a Netty BE, and it turns out that our LB is beating Nginx up to a concurrency level of 6000, after which it is not performing well. I've attached the results.
>>>>>>>>>>>>>>>>>> I've started a new thread as the conversation arrangement was not good in the previous one.
>>>>>>>>>>>>>>>>>> It would be great if we could have a code review, Isuru. Based on your feedback I'll be able to make the changes and we can do the benchmarking again. Can we do it today at 9:30 PM? We have only 2 full weeks left; the last week will be for documentation.
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Venkat.
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Best Regards
>>>>>>>>>>>>>> Isuru Ranawaka
>>>>>>>>>>>>>> M: +94714629880
>>>>>>>>>>>>>> Blog : http://isurur.blogspot.com/
--
Kasun Indrasiri
Director, Integration Technologies
WSO2, Inc.; http://wso2.com
lean.enterprise.middleware
cell: +1 650 450 2293
Blog : http://kasunpanorama.blogspot.com/
