Hi Graeme,

I created a simpler scenario than the one the SIP stress testing uses. In
each scenario two subscribers just try to register to the IMS and do not
make any calls to each other. I run this scenario for 15000 pairs of
subscribers (30000 subscribers), with the REGISTER requests distributed
over 1 minute. It seems that the Sprout node is the bottleneck: most of the
failed messages return 503 (Service Unavailable) and some return 408
(Request Timeout). I have added resources to Sprout (4 CPUs and 8 GB of
memory), so I don't believe resources are the issue.
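
For reference, the kind of sipp invocation I mean is roughly the following
(the target address, file names and rates here are just placeholders; it is
the uniform pause inside the scenario that actually spreads the REGISTERs
over the minute):

  # illustrative invocation - addresses, file names and rates are placeholders
  sipp <pcscf-or-sprout-address>:5060 -s <home-domain> \
       -sf register_only.xml -inf users.csv \
       -t t1 -m 15000 -l 15000 -r 500 \
       -trace_err -trace_stat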

Does Sprout somehow expose the latency measurements that drive the
throttling? We would like to take a look at them.
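
For example, is there a statistics hook on the Sprout node we should be
polling directly? I am guessing at something like the following (the tool
and statistic names below are assumptions on my side, please correct me):

  # guessed commands - I have not confirmed the tool or statistic names
  cw_stat sprout latency_us
  cw_stat sprout queue_size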



*Here is the XML file:*


<scenario name="Call Load Test">

  <User variables="my_dn,peer_dn,call_repeat" />
  <nop hide="true">
    <action>
      <!-- Get my and peer's DN -->
      <assignstr assign_to="my_dn" value="[field0]" />
      <!-- field1 is my_auth, but we can't store it in a variable -->
      <assignstr assign_to="peer_dn" value="[field2]" />
      <!-- field3 is peer_auth, but we can't store it in a variable -->
      <assign assign_to="reg_repeat" value="0"/>
      <assign assign_to="call_repeat" value="0"/>
    </action>
  </nop>
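  <!-- Spread each pair's initial REGISTER uniformly over the first 60 seconds -->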

  <pause distribution="uniform" min="0" max="60000" />

  <send>
    <![CDATA[

      REGISTER sip:[$my_dn]@[service] SIP/2.0
      Via: SIP/2.0/[transport] [local_ip]:[local_port];rport;branch=[branch]-[$my_dn]-[$reg_repeat]
      Route: <sip:[service];transport=[transport];lr>
      Max-Forwards: 70
      From: <sip:[$my_dn]@[service]>;tag=[pid]SIPpTag00[call_number]
      To: <sip:[$my_dn]@[service]>
      Call-ID: [$my_dn]///[call_id]
      CSeq: [cseq] REGISTER
      User-Agent: Accession 4.0.0.0
      Supported: outbound, path
      Contact: <sip:[$my_dn]@[local_ip]:[local_port];transport=[transport];ob>;+sip.ice;reg-id=1;+sip.instance="<urn:uuid:00000000-0000-0000-0000-000000000001>"
      Expires: 3600
      Allow: PRACK, INVITE, ACK, BYE, CANCEL, UPDATE, SUBSCRIBE, NOTIFY, REFER, MESSAGE, OPTIONS
      Content-Length: 0

    ]]>
  </send>
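  <!-- Expect the 401 challenge; auth="true" captures it so the retried REGISTER can carry valid digest credentials -->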

  <recv response="401" auth="true">
    <action>
      <add assign_to="reg_repeat" value="1" />
    </action>
  </recv>
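  <!-- Retry the REGISTER with credentials; [field1] injects my_dn's authentication material from the users file -->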

  <send>
    <![CDATA[

      REGISTER sip:[$my_dn]@[service] SIP/2.0
      Via: SIP/2.0/[transport] [local_ip]:[local_port];rport;branch=[branch]-[$my_dn]-[$reg_repeat]
      Route: <sip:[service];transport=[transport];lr>
      Max-Forwards: 70
      From: <sip:[$my_dn]@[service]>;tag=[pid]SIPpTag00[call_number]
      To: <sip:[$my_dn]@[service]>
      Call-ID: [$my_dn]///[call_id]
      CSeq: [cseq] REGISTER
      User-Agent: Accession 4.0.0.0
      Supported: outbound, path
      Contact: <sip:[$my_dn]@[local_ip]:[local_port];transport=[transport];ob>;+sip.ice;reg-id=1;+sip.instance="<urn:uuid:00000000-0000-0000-0000-000000000001>"
      Expires: 3600
      [field1]
      Allow: PRACK, INVITE, ACK, BYE, CANCEL, UPDATE, SUBSCRIBE, NOTIFY, REFER, MESSAGE, OPTIONS
      Content-Length: 0

    ]]>
  </send>

  <recv response="200">
    <action>
      <ereg regexp="rport=([^;]*);.*received=([^;]*);" search_in="hdr" header="Via:" assign_to="dummy" />
      <add assign_to="reg_repeat" value="1" />
    </action>
  </recv>
  <Reference variables="dummy" />
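  <!-- Repeat the same register / challenge / authenticated re-register flow for the peer subscriber ([field2]/[field3]) -->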

  <send>
    <![CDATA[

      REGISTER sip:[$peer_dn]@[service] SIP/2.0
      Via: SIP/2.0/[transport] [local_ip]:[local_port];rport;branch=[branch]-[$peer_dn]-[$reg_repeat]
      Route: <sip:[service];transport=[transport];lr>
      Max-Forwards: 70
      From: <sip:[$peer_dn]@[service]>;tag=[pid]SIPpTag00[call_number]
      To: <sip:[$peer_dn]@[service]>
      Call-ID: [$peer_dn]///[call_id]
      CSeq: [cseq] REGISTER
      User-Agent: Accession 4.0.0.0
      Supported: outbound, path
      Contact: <sip:[$peer_dn]@[local_ip]:[local_port];transport=[transport];ob>;+sip.ice;reg-id=1;+sip.instance="<urn:uuid:00000000-0000-0000-0000-000000000001>"
      Expires: 3600
      Allow: PRACK, INVITE, ACK, BYE, CANCEL, UPDATE, SUBSCRIBE, NOTIFY, REFER, MESSAGE, OPTIONS
      Content-Length: 0

    ]]>
  </send>

  <recv response="401" auth="true">
    <action>
      <add assign_to="reg_repeat" value="1" />
    </action>
  </recv>

  <send>
    <![CDATA[

      REGISTER sip:[$peer_dn]@[service] SIP/2.0
      Via: SIP/2.0/[transport] [local_ip]:[local_port];rport;branch=[branch]-[$peer_dn]-[$reg_repeat]
      Route: <sip:[service];transport=[transport];lr>
      Max-Forwards: 70
      From: <sip:[$peer_dn]@[service]>;tag=[pid]SIPpTag00[call_number]
      To: <sip:[$peer_dn]@[service]>
      Call-ID: [$peer_dn]///[call_id]
      CSeq: [cseq] REGISTER
      User-Agent: Accession 4.0.0.0
      Supported: outbound, path
      Contact: <sip:[$peer_dn]@[local_ip]:[local_port];transport=[transport];ob>;+sip.ice;reg-id=1;+sip.instance="<urn:uuid:00000000-0000-0000-0000-000000000001>"
      Expires: 3600
      [field3]
      Allow: PRACK, INVITE, ACK, BYE, CANCEL, UPDATE, SUBSCRIBE, NOTIFY, REFER, MESSAGE, OPTIONS
      Content-Length: 0

    ]]>
  </send>

  <recv response="200">
    <action>
      <add assign_to="reg_repeat" value="1" />
    </action>
  </recv>

</scenario>
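
For completeness, each line of the users file injected into this scenario
looks roughly like the following (directory numbers and password made up);
[field0]/[field2] carry the two DNs and [field1]/[field3] carry the SIPp
authentication entries mentioned in the comments above:

  2010000001;[authentication username=2010000001@<home-domain> password=secret];2010000002;[authentication username=2010000002@<home-domain> password=secret]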


Best Regards,
Michael Katsoulis



2016-09-16 21:25 GMT+03:00 Graeme Robertson (projectclearwater.org) <gra...@projectclearwater.org>:

> Hi Michael,
>
>
>
> Can you tell me more about your scenario? It sounds like you’re not using
> the clearwater-sip-stress package, or at least not in exactly the form we
> package up. If you’re not using the clearwater-sip-stress package then
> please can you send details of your stress scenario?
>
>
>
> Depending on how powerful your Sprout node is, I would expect 15000 calls
> per second to be towards the upper limit of its performance powers.
> However, if the CPU is not particularly high then that would suggest that
> Sprout’s throttling controls might require further tuning. Do you know what
> return code the “unexpected messages” have? 503s indicate that there is
> overload somewhere. Sprout does adjust its throttling controls to match the
> load it's able to process, but that process is not immediate, and we
> recommend building stress up gradually rather than immediately firing 15000
> calls per second into the system – for more information on that, see
> http://www.projectclearwater.org/clearwater-performance-and-our-load-monitor/.
>
>
>
> One final thought I had was that the node you’re running stress on might
> be overloaded. If the stress node is not responding to messages in a timely
> fashion then that will generate timeouts and unexpected messages.
>
>
>
> Thanks,
> Graeme
>
>
>
> *From:* Clearwater [mailto:clearwater-boun...@lists.projectclearwater.org]
> *On Behalf Of *Michael Katsoulis
> *Sent:* 16 September 2016 15:16
> *To:* clearwater@lists.projectclearwater.org
> *Subject:* Re: [Project Clearwater] Performance limit measurement
>
>
>
> Hi Graeme,
>
>
>
> thanks a lot for your response.
>
>
>
> In our scenario we are using the Stress node to generate 15000 calls in 60
> seconds. The number of
>
> unsuccessful calls varies from ~500 to ~5000 even in subsequent
> repetitions of the same scenario.
>
> According to Wireshark, the failures happen because Sprout does not send
> the correct responses in time
>
> and so we get "time-outs" and "unexpected messages" in the Stress node.
>
> The Sprout node has sufficient CPU and memory resources.
>
> What could be the reason for this instability in our deployment?
>
>
>
> Thank you in advance,
>
> Michael Katsoulis
>
>
> 2016-09-16 16:14 GMT+03:00 Graeme Robertson (projectclearwater.org) <gra...@projectclearwater.org>:
>
> Hi Michael,
>
>
>
> How many successes and failures are you seeing? We primarily use the
> clearwater-sip-stress package to check we haven’t introduced crashes under
> load, and to check we haven’t significantly regressed the performance of
> Project Clearwater. Unfortunately clearwater-sip-stress is not reliable
> enough to generate completely accurate performance numbers for Project
> Clearwater (and we don’t accurately measure Project Clearwater performance
> or provide numbers). We tend to see around 1% failures when running
> clearwater-sip-stress. If your failure numbers are fluctuating at around 1%
> then this is probably down to the test scripts not being completely
> reliable, and you won’t have actually hit the deployment’s limit until you
> start seeing more failures than this.
>
>
>
> Thanks,
>
> Graeme
>
>
>
>
>
> *From:* Clearwater [mailto:clearwater-boun...@lists.projectclearwater.org]
> *On Behalf Of *Michael Katsoulis
> *Sent:* 16 September 2016 10:17
> *To:* Clearwater@lists.projectclearwater.org
> *Subject:* [Project Clearwater] Performance limit measurement
>
>
>
> Hi all,
>
>
>
> we are running stress tests against our Clearwater deployment using the SIP
> Stress node.
>
> We have noticed that the results are not consistent, as the number of
> successful calls changes across repetitions of the same test scenario.
>
>
>
> We have tried to increase the values of max_tokens, init_token_rate,
> min_token_rate and
>
> target_latency_us but we did not observe any difference.
>
>
>
> What is the proposed way to discover the deployment's limit on how many
> requests per second can
>
> be served?
>
>
>
> Thanks in advance,
>
> Michael Katsoulis
>
>
_______________________________________________
Clearwater mailing list
Clearwater@lists.projectclearwater.org
http://lists.projectclearwater.org/mailman/listinfo/clearwater_lists.projectclearwater.org
