Re: [asterisk-users] Any thoughts on Asterisk 16.17.0 outputting FRACK refcount related messages

2021-11-17 Thread Telium Technical Support
I don't *think* it would purely volume related.  We have 16.17 deployments
with very large loads running without issue, and we also run 16.17 against
load simulators without issue.

 

In each case you have to traceback to find the cause of the problem.  For
example, a bad SBC which does not fully adhere to the SIP protocol could be
confusing PJSIP, causing timeouts, etc.  Our engineers spent a few days
tracing such a problem before we shipped a high volume system to a customer
earlier this year.  (I have NOT traced through your logs below, so I'm not
saying that is your problem)

 

There are some helpful posts here to help you trace, in the archives of this
list.  (In fact I see one responding to a question you asked 2 months ago).
Did that yield nothing?  (The response was dead on in terms of how to
diagnose).  If you share the result of that diagnosis you might get some
helpful answers.

 

Dave

 

 

From: asterisk-users [mailto:asterisk-users-boun...@lists.digium.com] On
Behalf Of Dan Cropp
Sent: Thursday, September 23, 2021 12:59 PM
To: 'asterisk-users@lists.digium.com' 
Subject: [asterisk-users] Any thoughts on Asterisk 16.17.0 outputting FRACK
refcount related messages

 

We have an extremely busy/large customer.  They run fine most of the time,
but periodically asterisk will output FRACK refcount related messages.  It
doesn't seem to be related to the volume, because it's not breaking during
their peak times.

 

When this happens, the system becomes unstable and they have to restart to
get things resolved.

To give an idea of the instability, we have seen INVITE/Trying responses in
SIP messaging logs.

We tell Asterisk to answer via AMI, but Asterisk never sends the OK (even 24
seconds later it hasn't sent).

Eventually the other send CANCEL of the call.

 

 

We've now captured 4 different days where something like the following
occurs.

1) Is there a good way to tell if this may be fixed in Asterisk 16.20.0
(short of upgrading)?

2) Would this be something I should submit as an asterisk issue?
Unfortunately, site is so busy capturing the debug will be very difficult
(if not impossible) due to amount of data.

 

 

[09/23 14:43:45.095] ERROR[34763][C-1d7f] frame.c: Excessive refcount
10 reached on ao2 object 0x5637ed3742b8

[09/23 14:43:45.095] ERROR[34763][C-1d7f] frame.c: FRACK!, Failed
assertion Excessive refcount 10 reached on ao2 object 0x5637ed3742b8 (0)

[09/23 14:43:45.095] ERROR[29830] frame.c: Excessive refcount 10 reached
on ao2 object 0x5637ed3742b8

[09/23 14:43:45.095] ERROR[29832][C-1920] frame.c: Excessive refcount
10 reached on ao2 object 0x5637ed3742b8

[09/23 14:43:45.095] ERROR[29830] frame.c: FRACK!, Failed assertion
Excessive refcount 10 reached on ao2 object 0x5637ed3742b8 (0)

[09/23 14:43:45.095] ERROR[29832][C-1920] frame.c: FRACK!, Failed
assertion Excessive refcount 10 reached on ao2 object 0x5637ed3742b8 (0)

[09/23 14:43:45.095] ERROR[32973] frame.c: Excessive refcount 10 reached
on ao2 object 0x5637ed3742b8

[09/23 14:43:45.095] ERROR[32973] frame.c: FRACK!, Failed assertion
Excessive refcount 10 reached on ao2 object 0x5637ed3742b8 (0)

[09/23 14:43:45.095] ERROR[3248] frame.c: Excessive refcount 10 reached
on ao2 object 0x5637ed3742b8

[09/23 14:43:45.095] ERROR[3248] frame.c: FRACK!, Failed assertion Excessive
refcount 10 reached on ao2 object 0x5637ed3742b8 (0)

[09/23 14:43:45.095] WARNING[2798][C-0093] channel.c: Exceptionally long
queue length queuing to
CBAnn/IS__8b9c6719-ca29-4c1b-ac87-75e8c6fe7074-0062;1

[09/23 14:43:45.095] ERROR[12979] frame.c: Excessive refcount 10 reached
on ao2 object 0x5637ed3742b8

[09/23 14:43:45.095] ERROR[12979] frame.c: FRACK!, Failed assertion
Excessive refcount 10 reached on ao2 object 0x5637ed3742b8 (0)

[09/23 14:43:45.095] WARNING[21123][C-1157] channel.c: Exceptionally
long queue length queuing to
CBAnn/IS__a652155f-b1fb-4c31-83e5-09ffa2107979-10de;1

[09/23 14:43:45.096] ERROR[29830] : Got 8 backtrace records

# 0: /usr/sbin/asterisk(__ao2_ref+0x209) [0x5637ebef1519]

# 1: /usr/sbin/asterisk(ast_frdup+0x1e2) [0x5637ebf96612]

# 2: /usr/sbin/asterisk(ast_bridge_channel_queue_frame+0x61)
[0x5637ebf1cfe1]

# 3: /usr/lib/asterisk/modules/bridge_softmix.so(+0x40af) [0x7fc15362f0af]

# 4: /usr/lib/asterisk/modules/bridge_softmix.so(+0x560a) [0x7fc15363060a]

# 5: /usr/sbin/asterisk(+0x1db41f) [0x5637ec06041f]

# 6: /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7fc1e9ebf6db]

# 7: /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fc1e93f971f]

 

[09/23 14:43:45.097] WARNING[36475][C-0172] channel.c: Exceptionally
long voice queue length queuing to
CBAnn/IS__64bc075a-1ba4-4ad8-ba48-f0aea6ca6bab-1ed3;1

[09/23 14:43:45.098] ERROR[12979] : Got 8 backtrace records

# 0: /usr/sbin/asterisk(__ao2_ref+0x209) [0x5637ebef1519]

# 1: /usr/sbin/asterisk(ast_frdup+0x1e2) [0x5637ebf96612]

# 2: /usr/sbin/asterisk(ast_bridge_channel_queue_frame

Re: [asterisk-users] Any thoughts on Asterisk 16.17.0 outputting FRACK refcount related messages

2021-09-28 Thread Joshua C. Colp
On Tue, Sep 28, 2021 at 11:44 AM Dan Cropp  wrote:

> Thank you Joshua.
>
>
>
> I enabled better debugging and we’re running tests inhouse trying to
> replicate issue.  This morning, I see the following happened during our
> internal load test.
>
>
>
> Would this indicate system is too busy or a potential issue?
>

It's possible, yes.


-- 
Joshua C. Colp
Asterisk Technical Lead
Sangoma Technologies
Check us out at www.sangoma.com and www.asterisk.org
-- 
_
-- Bandwidth and Colocation Provided by http://www.api-digital.com --

Check out the new Asterisk community forum at: https://community.asterisk.org/

New to Asterisk? Start here:
  https://wiki.asterisk.org/wiki/display/AST/Getting+Started

asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-users

Re: [asterisk-users] Any thoughts on Asterisk 16.17.0 outputting FRACK refcount related messages

2021-09-28 Thread Dan Cropp
] libc.so.6 clone.S:97 clone()

[09/28 09:31:55.638] WARNING[2872][C-0189] channel.c: Exceptionally long 
voice queue length queuing to 
CBAnn/IS__b6b99750-c9aa-420c-9e9f-fa41e9a99fa9-000d;1
[09/28 09:31:55.706] WARNING[3092][C-01ca] channel.c: Exceptionally long 
voice queue length queuing to 
CBAnn/IS__43f4512f-4009-448f-9ccf-4d3ada975f74-0054;1
[09/28 09:31:55.764] WARNING[3231][C-0161] channel.c: Exceptionally long 
voice queue length queuing to 
CBAnn/IS__57f2e3a7-6c86-423f-b35f-25048ef68591-007d;1
[09/28 09:31:55.771] ERROR[3208] : Got 8 backtrace records
# 0: [0x55d428652177] asterisk utils.c:2454 __ast_assert_failed()
# 1: [0x55d4284bb742] asterisk astobj2.c:588 __ao2_ref()
# 2: [0x7faa79aa75c0] bridge_softmix.so bridge_softmix.c:257 
softmix_process_write_audio()
# 3: [0x7faa79aabef7] bridge_softmix.so bridge_softmix.c:1934 
softmix_mixing_loop()
# 4: [0x7faa79aac54d] bridge_softmix.so bridge_softmix.c:2047 
softmix_mixing_thread()
# 5: [0x55d42864ef9b] asterisk utils.c:1299 dummy_start()
# 6: [0x7fab064b96db] libpthread.so.0 pthread_create.c:463 start_thread()
# 7: [0x7fab056ab71f] libc.so.6 clone.S:97 clone()


From: asterisk-users  On Behalf Of 
Joshua C. Colp
Sent: Thursday, September 23, 2021 12:03 PM
To: Asterisk Users Mailing List - Non-Commercial Discussion 

Subject: Re: [asterisk-users] Any thoughts on Asterisk 16.17.0 outputting FRACK 
refcount related messages

On Thu, Sep 23, 2021 at 1:59 PM Dan Cropp 
mailto:d...@amtelco.com>> wrote:
We have an extremely busy/large customer.  They run fine most of the time, but 
periodically asterisk will output FRACK refcount related messages.  It doesn’t 
seem to be related to the volume, because it’s not breaking during their peak 
times.

When this happens, the system becomes unstable and they have to restart to get 
things resolved.
To give an idea of the instability, we have seen INVITE/Trying responses in SIP 
messaging logs.
We tell Asterisk to answer via AMI, but Asterisk never sends the OK (even 24 
seconds later it hasn’t sent).
Eventually the other send CANCEL of the call.


We’ve now captured 4 different days where something like the following occurs.
1) Is there a good way to tell if this may be fixed in Asterisk 16.20.0 (short 
of upgrading)?

You could examine the changes between the two and see if any issues seem 
relevant.

2) Would this be something I should submit as an asterisk issue?  
Unfortunately, site is so busy capturing the debug will be very difficult (if 
not impossible) due to amount of data.

Without a backtrace, or debug logs it'd be really hard to look at. The message 
just means that an object was referenced a lot. Why that is - who knows. It 
could be that the system can't keep up for some reason, or there's an 
off-nominal path, or something is deadlocked, or something else.

--
Joshua C. Colp
Asterisk Technical Lead
Sangoma Technologies
Check us out at www.sangoma.com<http://www.sangoma.com> and 
www.asterisk.org<http://www.asterisk.org>

This email is intended only for the use of the party to which it is addressed 
and may contain information that is privileged, confidential, or protected by 
law. If you are not the intended recipient you are hereby notified that any 
dissemination, copying or distribution of this email or its contents is 
strictly prohibited. If you have received this message in error, please notify 
us immediately by replying to the message and deleting it from your computer.
-- 
_
-- Bandwidth and Colocation Provided by http://www.api-digital.com --

Check out the new Asterisk community forum at: https://community.asterisk.org/

New to Asterisk? Start here:
  https://wiki.asterisk.org/wiki/display/AST/Getting+Started

asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-users

Re: [asterisk-users] Any thoughts on Asterisk 16.17.0 outputting FRACK refcount related messages

2021-09-23 Thread Joshua C. Colp
On Thu, Sep 23, 2021 at 1:59 PM Dan Cropp  wrote:

> We have an extremely busy/large customer.  They run fine most of the time,
> but periodically asterisk will output FRACK refcount related messages.  It
> doesn’t seem to be related to the volume, because it’s not breaking during
> their peak times.
>
>
>
> When this happens, the system becomes unstable and they have to restart to
> get things resolved.
>
> To give an idea of the instability, we have seen INVITE/Trying responses
> in SIP messaging logs.
>
> We tell Asterisk to answer via AMI, but Asterisk never sends the OK (even
> 24 seconds later it hasn’t sent).
>
> Eventually the other send CANCEL of the call.
>
>
>
>
>
> We’ve now captured 4 different days where something like the following
> occurs.
>
> 1) Is there a good way to tell if this may be fixed in Asterisk 16.20.0
> (short of upgrading)?
>

You could examine the changes between the two and see if any issues seem
relevant.


> 2) Would this be something I should submit as an asterisk issue?
> Unfortunately, site is so busy capturing the debug will be very difficult
> (if not impossible) due to amount of data.
>

Without a backtrace, or debug logs it'd be really hard to look at. The
message just means that an object was referenced a lot. Why that is - who
knows. It could be that the system can't keep up for some reason, or
there's an off-nominal path, or something is deadlocked, or something else.

-- 
Joshua C. Colp
Asterisk Technical Lead
Sangoma Technologies
Check us out at www.sangoma.com and www.asterisk.org
-- 
_
-- Bandwidth and Colocation Provided by http://www.api-digital.com --

Check out the new Asterisk community forum at: https://community.asterisk.org/

New to Asterisk? Start here:
  https://wiki.asterisk.org/wiki/display/AST/Getting+Started

asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-users

[asterisk-users] Any thoughts on Asterisk 16.17.0 outputting FRACK refcount related messages

2021-09-23 Thread Dan Cropp
We have an extremely busy/large customer.  They run fine most of the time, but 
periodically asterisk will output FRACK refcount related messages.  It doesn't 
seem to be related to the volume, because it's not breaking during their peak 
times.

When this happens, the system becomes unstable and they have to restart to get 
things resolved.
To give an idea of the instability, we have seen INVITE/Trying responses in SIP 
messaging logs.
We tell Asterisk to answer via AMI, but Asterisk never sends the OK (even 24 
seconds later it hasn't sent).
Eventually the other send CANCEL of the call.


We've now captured 4 different days where something like the following occurs.
1) Is there a good way to tell if this may be fixed in Asterisk 16.20.0 (short 
of upgrading)?
2) Would this be something I should submit as an asterisk issue?  
Unfortunately, site is so busy capturing the debug will be very difficult (if 
not impossible) due to amount of data.


[09/23 14:43:45.095] ERROR[34763][C-1d7f] frame.c: Excessive refcount 
10 reached on ao2 object 0x5637ed3742b8
[09/23 14:43:45.095] ERROR[34763][C-1d7f] frame.c: FRACK!, Failed assertion 
Excessive refcount 10 reached on ao2 object 0x5637ed3742b8 (0)
[09/23 14:43:45.095] ERROR[29830] frame.c: Excessive refcount 10 reached on 
ao2 object 0x5637ed3742b8
[09/23 14:43:45.095] ERROR[29832][C-1920] frame.c: Excessive refcount 
10 reached on ao2 object 0x5637ed3742b8
[09/23 14:43:45.095] ERROR[29830] frame.c: FRACK!, Failed assertion Excessive 
refcount 10 reached on ao2 object 0x5637ed3742b8 (0)
[09/23 14:43:45.095] ERROR[29832][C-1920] frame.c: FRACK!, Failed assertion 
Excessive refcount 10 reached on ao2 object 0x5637ed3742b8 (0)
[09/23 14:43:45.095] ERROR[32973] frame.c: Excessive refcount 10 reached on 
ao2 object 0x5637ed3742b8
[09/23 14:43:45.095] ERROR[32973] frame.c: FRACK!, Failed assertion Excessive 
refcount 10 reached on ao2 object 0x5637ed3742b8 (0)
[09/23 14:43:45.095] ERROR[3248] frame.c: Excessive refcount 10 reached on 
ao2 object 0x5637ed3742b8
[09/23 14:43:45.095] ERROR[3248] frame.c: FRACK!, Failed assertion Excessive 
refcount 10 reached on ao2 object 0x5637ed3742b8 (0)
[09/23 14:43:45.095] WARNING[2798][C-0093] channel.c: Exceptionally long 
queue length queuing to 
CBAnn/IS__8b9c6719-ca29-4c1b-ac87-75e8c6fe7074-0062;1
[09/23 14:43:45.095] ERROR[12979] frame.c: Excessive refcount 10 reached on 
ao2 object 0x5637ed3742b8
[09/23 14:43:45.095] ERROR[12979] frame.c: FRACK!, Failed assertion Excessive 
refcount 10 reached on ao2 object 0x5637ed3742b8 (0)
[09/23 14:43:45.095] WARNING[21123][C-1157] channel.c: Exceptionally long 
queue length queuing to 
CBAnn/IS__a652155f-b1fb-4c31-83e5-09ffa2107979-10de;1
[09/23 14:43:45.096] ERROR[29830] : Got 8 backtrace records
# 0: /usr/sbin/asterisk(__ao2_ref+0x209) [0x5637ebef1519]
# 1: /usr/sbin/asterisk(ast_frdup+0x1e2) [0x5637ebf96612]
# 2: /usr/sbin/asterisk(ast_bridge_channel_queue_frame+0x61) [0x5637ebf1cfe1]
# 3: /usr/lib/asterisk/modules/bridge_softmix.so(+0x40af) [0x7fc15362f0af]
# 4: /usr/lib/asterisk/modules/bridge_softmix.so(+0x560a) [0x7fc15363060a]
# 5: /usr/sbin/asterisk(+0x1db41f) [0x5637ec06041f]
# 6: /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7fc1e9ebf6db]
# 7: /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fc1e93f971f]

[09/23 14:43:45.097] WARNING[36475][C-0172] channel.c: Exceptionally long 
voice queue length queuing to 
CBAnn/IS__64bc075a-1ba4-4ad8-ba48-f0aea6ca6bab-1ed3;1
[09/23 14:43:45.098] ERROR[12979] : Got 8 backtrace records
# 0: /usr/sbin/asterisk(__ao2_ref+0x209) [0x5637ebef1519]
# 1: /usr/sbin/asterisk(ast_frdup+0x1e2) [0x5637ebf96612]
# 2: /usr/sbin/asterisk(ast_bridge_channel_queue_frame+0x61) [0x5637ebf1cfe1]
# 3: /usr/lib/asterisk/modules/bridge_softmix.so(+0x40af) [0x7fc15362f0af]
# 4: /usr/lib/asterisk/modules/bridge_softmix.so(+0x560a) [0x7fc15363060a]
# 5: /usr/sbin/asterisk(+0x1db41f) [0x5637ec06041f]
# 6: /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7fc1e9ebf6db]
# 7: /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fc1e93f971f]

[09/23 14:43:45.098] ERROR[34763][C-1d7f] : Got 19 backtrace records
# 0: /usr/sbin/asterisk(__ao2_ref+0x209) [0x5637ebef1519]
# 1: /usr/sbin/asterisk(ast_frdup+0x1e2) [0x5637ebf96612]
# 2: /usr/sbin/asterisk(+0xa9517) [0x5637ebf2e517]
# 3: /usr/sbin/asterisk(ast_unreal_write_stream+0x15a) [0x5637ebf6ce1a]
# 4: /usr/sbin/asterisk(ast_write_stream+0xd2e) [0x5637ebf4551e]
# 5: /usr/sbin/asterisk(bridge_channel_internal_join+0xe6d) [0x5637ebf2238d]
# 6: /usr/sbin/asterisk(ast_bridge_join+0x3d5) [0x5637ebf05ce5]
# 7: /usr/lib/asterisk/modules/app_confbridge.so(+0xff35) [0x7fc05c258f35]
# 8: /usr/sbin/asterisk(pbx_exec+0xd2) [0x5637ebfdcf92]
# 9: /usr/lib/asterisk/modules/res_agi.so(+0xbea5) [0x7fc059269ea5]
#10: /usr/lib/asterisk/modules/res_agi.so(+0xd29b) [0x7fc05926b29b]
#11: /usr/lib/asterisk/modules/res_agi.so(+0xf7b0) [0x7fc05926d7b0]
#12: