Re: [asterisk-users] CEL entries over ODBC several hours late (Vinicius Fontes)

2015-12-11 Thread Stefan Viljoen
Hi Vinicius

Thanks for replying.

>Sorry for the probably obvious question, but it's better to cover all
bases.

>The DBMS is running on the same box as Asterisk is? If that's the case then
maybe the DBMS is using too much CPU and starving Asterisk?

I don't think so - I think I have a locking issue in the DB specifically
with my stored procedure -and- Asterisk trying to access the CEL table
simultaneously.

Load averages appear normal even if the DB is running in the background. The
autodestruct problem occurs whenever the CEL table (InnoDB, so should NOT be
locked for inserts while being queried in MySQL "nolock" equivalent, but it
IS) is busy being read by the SP and Asterisk tries to write into it.

This then back-flows to the Asterisk and delays the H extension execution to
such a degree that handsets get locked tight until the CEL table lock held
by the SP is released and Asterisk can write to it again over ODBC.

I understand that CEL logging and call hangup should be an atomic operation
(billing related) from there though the question about why it apparently
(from observed effects) is a synchronous operation - but considering it must
be atomic, it does make sense I suppose to write CELs synchronously with
call teardown.

It just requires that you keep the CEL table in MariaDB (or whatever) in an
"un-locked" state at all times - which makes it pretty hard since to have
any practical use out of the CEL table you should be able to query it
without locking it - which InnoDB in MariaDB is supposd to provide, but
clearly does not in my type of setup here.

Anyway, thanks for the reply.

Regards

Stefan


-- 
_
-- Bandwidth and Colocation Provided by http://www.api-digital.com --
New to Asterisk? Join us for a live introductory webinar every Thurs:
   http://www.asterisk.org/hello

asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-users


Re: [asterisk-users] CEL entries over ODBC several hours late (Vinicius Fontes)

2015-12-11 Thread Vinicius Fontes
Since the issue seems to be table locking, why not take a shot with
PostgreSQL? It's way better and more robust than MySQL/MariaDB.

You should be able to create an additional DSN and output CEL to both
PostgreSQL and MariaDB.

2015-12-11 8:59 GMT-02:00 Stefan Viljoen :

> Hi Vinicius
>
> Thanks for replying.
>
> >Sorry for the probably obvious question, but it's better to cover all
> bases.
>
> >The DBMS is running on the same box as Asterisk is? If that's the case
> then
> maybe the DBMS is using too much CPU and starving Asterisk?
>
> I don't think so - I think I have a locking issue in the DB specifically
> with my stored procedure -and- Asterisk trying to access the CEL table
> simultaneously.
>
> Load averages appear normal even if the DB is running in the background.
> The
> autodestruct problem occurs whenever the CEL table (InnoDB, so should NOT
> be
> locked for inserts while being queried in MySQL "nolock" equivalent, but it
> IS) is busy being read by the SP and Asterisk tries to write into it.
>
> This then back-flows to the Asterisk and delays the H extension execution
> to
> such a degree that handsets get locked tight until the CEL table lock held
> by the SP is released and Asterisk can write to it again over ODBC.
>
> I understand that CEL logging and call hangup should be an atomic operation
> (billing related) from there though the question about why it apparently
> (from observed effects) is a synchronous operation - but considering it
> must
> be atomic, it does make sense I suppose to write CELs synchronously with
> call teardown.
>
> It just requires that you keep the CEL table in MariaDB (or whatever) in an
> "un-locked" state at all times - which makes it pretty hard since to have
> any practical use out of the CEL table you should be able to query it
> without locking it - which InnoDB in MariaDB is supposd to provide, but
> clearly does not in my type of setup here.
>
> Anyway, thanks for the reply.
>
> Regards
>
> Stefan
>
>
> --
> _
> -- Bandwidth and Colocation Provided by http://www.api-digital.com --
> New to Asterisk? Join us for a live introductory webinar every Thurs:
>http://www.asterisk.org/hello
>
> asterisk-users mailing list
> To UNSUBSCRIBE or update options visit:
>http://lists.digium.com/mailman/listinfo/asterisk-users
>
-- 
_
-- Bandwidth and Colocation Provided by http://www.api-digital.com --
New to Asterisk? Join us for a live introductory webinar every Thurs:
   http://www.asterisk.org/hello

asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-users

Re: [asterisk-users] CEL entries over ODBC several hours late (Matthew Jordan)

2015-12-11 Thread Matthew Jordan
On Thu, Dec 10, 2015 at 8:57 AM, Stefan Viljoen 
wrote:

> Hi Matthew
>
> Thank you very much for the reply.
>
> I must have something seriously wrong somewhere else then - I retested now
> and the "apparent" effect is as  I describe but your info definitely
> contradicts that. But you're obviously correct.
>
> One more question - I've noted that if I run a combination of queries in
> the
> CEL backing DB (MariaDB) and the CEL table is locked, this severely affects
> the Asterisk instance - thousands of occurrences of
>
> chan_sip.c:4057 __sip_autodestruct: Autodestruct on dialog
> '6a9f5d3543b619655e07c81437373a32@172.17.12.3:5060' with owner
> SIP/3034-000207c8 in place (Method: BYE). Rescheduling destruction for
> 1
> ms
>
> appear in the CLI and users complain that if they hang up then they cannot
> make another call on the same SIP handset for several minutes.
>
> This is obviously because the dialplan gets delayed in the H extension, and
> cannot write to the CEL table, waiting for the MariaDB instance to clear
> the
> locks so it can write again. The above apparently comes from a watchdog
> process that watches how "fast" the H extension is and if it takes "too
> long" it forces the channels closed.
>
> Is this assumption correct?
>
> Addtionally, it seems that the writing of CEL in Asterisk is NOT async?
> E.g.
> it appears the thread that was running the conversation ALSO does the CEL
> writing / pushing to the CEL core as you describe in a synchronous manner.
>
> For 1.8, is this correct that CELs are synchronous, and do newer Asterisk
> versions do it async?
>
>
That's actually a bit surprising. Assuming you are using CELGenUserEvent in
the 'h' extension, I would not expect that to block the channel when
writing out to the database. The act of creating the CEL event will lock
the channel briefly, but the actual CEL event is queued up onto a message
bus and dispatched to another thread. I'd have to see the output of a gdb
backtrace or 'core show locks' to know why that is impacting the channels.

It should be asynchronous everywhere - in 1.8+. While the implementation of
the message bus changed in Asterisk 12, that doesn't change the nature of
how it is dispatched. As I said, a gdb backtrace or 'core show locks' would
show who the culprit it.


E. g. my point being if there are major DB issues, it is quite a bit
> kryptonite to have that "spill back" into Asterisk and start blocking off
> users from calling out - wouldn't it be much better to simply have failed
> CEL writes just die in a distal thread instead of the main call thread for
> the channels running on that handset.
>
> Or am I completely misunderstanding things?
>
> Anyway, thanks for the reply. :)
>
> Kind regards,
>
> > Hi guys
> >
> > I'm running 1.8.32.3 with CEL logging over ODBC to MariaDB 5.5.41 on the
> > same Centos 7 machine.
> >
> > I've noticed that the CDR entries made are all in-time, e. g. the call
> will
> > take place and the CDR entry is immediately written into the CDR table in
> > the MariaDB database.
> >
> > However, CEL events for that CDR (which I need to process for a realtime
> > display feature in my dialer software) are always several hours after the
> > fact. E. g. I will make a call at 09:30, see the call immediate pop up in
> > the MariaDB CDR table, but only at about 15:15 that afternoon will I see
> > that call's CEL events come into the CEL table, from Asterisk I have
> > examined the `show processlist` in MariaDB exhaustively to establish this
> > fact.
> >
> > The system doesn't appear loaded, load average is about 1.1 - it's a
> > quad-coare HT Intel Xeon E3-1225 with 8GB of DRAM running on an SSD for
> > main
> > storage.
> >
> > The system processes about 30 000 calls every 8 hour day, and services 90
> > SIP phones.
> >
> > I can stop and restart the MariaDB instance for several minutes, when I
> > restart it it immediately picks up on the "slow" CELs from where it was
> > interrupted - more evidence that Asterisk is very slowly streaming the
> CELs
> > through. I thought MariaDB was the bottleneck, but apparently not?
> >
> > If I make test inserts from a script into the CEL table, all of them
> > complete so quickly a time indication does not even register for the
> query
> > in MariaDB. Simple test queries on the CEL table are also instant, not
> even
> > counting in the internal MariaDB query duration timer.
> >
> > Can anybody explain why this is that the CELs asterisk emits over ODBC
> are
> > so delayed? Are CELs intended NOT to be realtime?
> >
> > So, logically, Asterisk appears to be caching CELs to the tune of
> hundreds
> > of thousands of them at any given time - meaning if it is stopped (either
> > killed, or core stop gracefully'ed, or just "core stop now")  potentially
> > hundreds of thousands of CELs will just evaporate irretrivably.
> >
> > What can I do to mitigate this extremely slow populating of CELs over
> ODBC?
> >
> >
> Asterisk does 

Re: [asterisk-users] CEL entries over ODBC several hours late (Matthew Jordan)

2015-12-10 Thread Stefan Viljoen
Hi Matthew

Thank you very much for the reply.

I must have something seriously wrong somewhere else then - I retested now
and the "apparent" effect is as  I describe but your info definitely
contradicts that. But you're obviously correct.

One more question - I've noted that if I run a combination of queries in the
CEL backing DB (MariaDB) and the CEL table is locked, this severely affects
the Asterisk instance - thousands of occurrences of

chan_sip.c:4057 __sip_autodestruct: Autodestruct on dialog
'6a9f5d3543b619655e07c81437373a32@172.17.12.3:5060' with owner
SIP/3034-000207c8 in place (Method: BYE). Rescheduling destruction for 1
ms

appear in the CLI and users complain that if they hang up then they cannot
make another call on the same SIP handset for several minutes.

This is obviously because the dialplan gets delayed in the H extension, and
cannot write to the CEL table, waiting for the MariaDB instance to clear the
locks so it can write again. The above apparently comes from a watchdog
process that watches how "fast" the H extension is and if it takes "too
long" it forces the channels closed.

Is this assumption correct?

Addtionally, it seems that the writing of CEL in Asterisk is NOT async? E.g.
it appears the thread that was running the conversation ALSO does the CEL
writing / pushing to the CEL core as you describe in a synchronous manner.

For 1.8, is this correct that CELs are synchronous, and do newer Asterisk
versions do it async?

E. g. my point being if there are major DB issues, it is quite a bit
kryptonite to have that "spill back" into Asterisk and start blocking off
users from calling out - wouldn't it be much better to simply have failed
CEL writes just die in a distal thread instead of the main call thread for
the channels running on that handset.

Or am I completely misunderstanding things?

Anyway, thanks for the reply. :)

Kind regards,

> Hi guys
>
> I'm running 1.8.32.3 with CEL logging over ODBC to MariaDB 5.5.41 on the
> same Centos 7 machine.
>
> I've noticed that the CDR entries made are all in-time, e. g. the call
will
> take place and the CDR entry is immediately written into the CDR table in
> the MariaDB database.
>
> However, CEL events for that CDR (which I need to process for a realtime
> display feature in my dialer software) are always several hours after the
> fact. E. g. I will make a call at 09:30, see the call immediate pop up in
> the MariaDB CDR table, but only at about 15:15 that afternoon will I see
> that call's CEL events come into the CEL table, from Asterisk I have
> examined the `show processlist` in MariaDB exhaustively to establish this
> fact.
>
> The system doesn't appear loaded, load average is about 1.1 - it's a
> quad-coare HT Intel Xeon E3-1225 with 8GB of DRAM running on an SSD for
> main
> storage.
>
> The system processes about 30 000 calls every 8 hour day, and services 90
> SIP phones.
>
> I can stop and restart the MariaDB instance for several minutes, when I
> restart it it immediately picks up on the "slow" CELs from where it was
> interrupted - more evidence that Asterisk is very slowly streaming the
CELs
> through. I thought MariaDB was the bottleneck, but apparently not?
>
> If I make test inserts from a script into the CEL table, all of them
> complete so quickly a time indication does not even register for the query
> in MariaDB. Simple test queries on the CEL table are also instant, not
even
> counting in the internal MariaDB query duration timer.
>
> Can anybody explain why this is that the CELs asterisk emits over ODBC are
> so delayed? Are CELs intended NOT to be realtime?
>
> So, logically, Asterisk appears to be caching CELs to the tune of hundreds
> of thousands of them at any given time - meaning if it is stopped (either
> killed, or core stop gracefully'ed, or just "core stop now")  potentially
> hundreds of thousands of CELs will just evaporate irretrivably.
>
> What can I do to mitigate this extremely slow populating of CELs over
ODBC?
>
>
Asterisk does not buffer CEL entries. If anything, it pushes the entries
out to ODBC much more aggressively than what you would get with CDRs.

An event is generated in Asterisk that corresponds to the CEL entry. That
entry is pushed over a message bus (the 'event' message bus in 1.8 - 11;
'stasis' in 12+) and is picked up by the CEL core. The events are
immediately sent to the registered backends, who also immediately write it
out to the backend they support. In the case of ODBC, this immediately does
an INSERT into the appropriate table.

In Asterisk 1.8, you can look for a verbose level 11 message that will show
when this occurs:

ast_verb(11, "[%s]\n", ast_str_buffer(sql));

In later versions, this was turned into a debug level 3 message (as
anything over a verbose 5/debug 5 was cleaned up).

If you see that message, then that will tell you when Asterisk *believes*
it has written the CEL entry. If that doesn't show up in the database, then
it is either in the ODBC 

Re: [asterisk-users] CEL entries over ODBC several hours late (Matthew Jordan)

2015-12-10 Thread Vinicius Fontes
Sorry for the probably obvious question, but it's better to cover all bases.

The DBMS is running on the same box as Asterisk is? If that's the case then
maybe the DBMS is using too much CPU and starving Asterisk?

2015-12-10 12:57 GMT-02:00 Stefan Viljoen :

> Hi Matthew
>
> Thank you very much for the reply.
>
> I must have something seriously wrong somewhere else then - I retested now
> and the "apparent" effect is as  I describe but your info definitely
> contradicts that. But you're obviously correct.
>
> One more question - I've noted that if I run a combination of queries in
> the
> CEL backing DB (MariaDB) and the CEL table is locked, this severely affects
> the Asterisk instance - thousands of occurrences of
>
> chan_sip.c:4057 __sip_autodestruct: Autodestruct on dialog
> '6a9f5d3543b619655e07c81437373a32@172.17.12.3:5060' with owner
> SIP/3034-000207c8 in place (Method: BYE). Rescheduling destruction for
> 1
> ms
>
> appear in the CLI and users complain that if they hang up then they cannot
> make another call on the same SIP handset for several minutes.
>
> This is obviously because the dialplan gets delayed in the H extension, and
> cannot write to the CEL table, waiting for the MariaDB instance to clear
> the
> locks so it can write again. The above apparently comes from a watchdog
> process that watches how "fast" the H extension is and if it takes "too
> long" it forces the channels closed.
>
> Is this assumption correct?
>
> Addtionally, it seems that the writing of CEL in Asterisk is NOT async?
> E.g.
> it appears the thread that was running the conversation ALSO does the CEL
> writing / pushing to the CEL core as you describe in a synchronous manner.
>
> For 1.8, is this correct that CELs are synchronous, and do newer Asterisk
> versions do it async?
>
> E. g. my point being if there are major DB issues, it is quite a bit
> kryptonite to have that "spill back" into Asterisk and start blocking off
> users from calling out - wouldn't it be much better to simply have failed
> CEL writes just die in a distal thread instead of the main call thread for
> the channels running on that handset.
>
> Or am I completely misunderstanding things?
>
> Anyway, thanks for the reply. :)
>
> Kind regards,
>
> > Hi guys
> >
> > I'm running 1.8.32.3 with CEL logging over ODBC to MariaDB 5.5.41 on the
> > same Centos 7 machine.
> >
> > I've noticed that the CDR entries made are all in-time, e. g. the call
> will
> > take place and the CDR entry is immediately written into the CDR table in
> > the MariaDB database.
> >
> > However, CEL events for that CDR (which I need to process for a realtime
> > display feature in my dialer software) are always several hours after the
> > fact. E. g. I will make a call at 09:30, see the call immediate pop up in
> > the MariaDB CDR table, but only at about 15:15 that afternoon will I see
> > that call's CEL events come into the CEL table, from Asterisk I have
> > examined the `show processlist` in MariaDB exhaustively to establish this
> > fact.
> >
> > The system doesn't appear loaded, load average is about 1.1 - it's a
> > quad-coare HT Intel Xeon E3-1225 with 8GB of DRAM running on an SSD for
> > main
> > storage.
> >
> > The system processes about 30 000 calls every 8 hour day, and services 90
> > SIP phones.
> >
> > I can stop and restart the MariaDB instance for several minutes, when I
> > restart it it immediately picks up on the "slow" CELs from where it was
> > interrupted - more evidence that Asterisk is very slowly streaming the
> CELs
> > through. I thought MariaDB was the bottleneck, but apparently not?
> >
> > If I make test inserts from a script into the CEL table, all of them
> > complete so quickly a time indication does not even register for the
> query
> > in MariaDB. Simple test queries on the CEL table are also instant, not
> even
> > counting in the internal MariaDB query duration timer.
> >
> > Can anybody explain why this is that the CELs asterisk emits over ODBC
> are
> > so delayed? Are CELs intended NOT to be realtime?
> >
> > So, logically, Asterisk appears to be caching CELs to the tune of
> hundreds
> > of thousands of them at any given time - meaning if it is stopped (either
> > killed, or core stop gracefully'ed, or just "core stop now")  potentially
> > hundreds of thousands of CELs will just evaporate irretrivably.
> >
> > What can I do to mitigate this extremely slow populating of CELs over
> ODBC?
> >
> >
> Asterisk does not buffer CEL entries. If anything, it pushes the entries
> out to ODBC much more aggressively than what you would get with CDRs.
>
> An event is generated in Asterisk that corresponds to the CEL entry. That
> entry is pushed over a message bus (the 'event' message bus in 1.8 - 11;
> 'stasis' in 12+) and is picked up by the CEL core. The events are
> immediately sent to the registered backends, who also immediately write it
> out to the backend they support. In the case of ODBC, this 

[asterisk-users] CEL entries over ODBC several hours late

2015-12-09 Thread Stefan Viljoen
Hi guys

I'm running 1.8.32.3 with CEL logging over ODBC to MariaDB 5.5.41 on the
same Centos 7 machine.

I've noticed that the CDR entries made are all in-time, e. g. the call will
take place and the CDR entry is immediately written into the CDR table in
the MariaDB database.

However, CEL events for that CDR (which I need to process for a realtime
display feature in my dialer software) are always several hours after the
fact. E. g. I will make a call at 09:30, see the call immediate pop up in
the MariaDB CDR table, but only at about 15:15 that afternoon will I see
that call's CEL events come into the CEL table, from Asterisk I have
examined the `show processlist` in MariaDB exhaustively to establish this
fact.

The system doesn't appear loaded, load average is about 1.1 - it's a
quad-coare HT Intel Xeon E3-1225 with 8GB of DRAM running on an SSD for main
storage.

The system processes about 30 000 calls every 8 hour day, and services 90
SIP phones.

I can stop and restart the MariaDB instance for several minutes, when I
restart it it immediately picks up on the "slow" CELs from where it was
interrupted - more evidence that Asterisk is very slowly streaming the CELs
through. I thought MariaDB was the bottleneck, but apparently not?

If I make test inserts from a script into the CEL table, all of them
complete so quickly a time indication does not even register for the query
in MariaDB. Simple test queries on the CEL table are also instant, not even
counting in the internal MariaDB query duration timer.

Can anybody explain why this is that the CELs asterisk emits over ODBC are
so delayed? Are CELs intended NOT to be realtime?

So, logically, Asterisk appears to be caching CELs to the tune of hundreds
of thousands of them at any given time - meaning if it is stopped (either
killed, or core stop gracefully'ed, or just "core stop now")  potentially
hundreds of thousands of CELs will just evaporate irretrivably.

What can I do to mitigate this extremely slow populating of CELs over ODBC?

Thanks


-- 
_
-- Bandwidth and Colocation Provided by http://www.api-digital.com --
New to Asterisk? Join us for a live introductory webinar every Thurs:
   http://www.asterisk.org/hello

asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-users


Re: [asterisk-users] CEL entries over ODBC several hours late

2015-12-09 Thread Matthew Jordan
On Wed, Dec 9, 2015 at 6:32 AM, Stefan Viljoen 
wrote:

> Hi guys
>
> I'm running 1.8.32.3 with CEL logging over ODBC to MariaDB 5.5.41 on the
> same Centos 7 machine.
>
> I've noticed that the CDR entries made are all in-time, e. g. the call will
> take place and the CDR entry is immediately written into the CDR table in
> the MariaDB database.
>
> However, CEL events for that CDR (which I need to process for a realtime
> display feature in my dialer software) are always several hours after the
> fact. E. g. I will make a call at 09:30, see the call immediate pop up in
> the MariaDB CDR table, but only at about 15:15 that afternoon will I see
> that call's CEL events come into the CEL table, from Asterisk I have
> examined the `show processlist` in MariaDB exhaustively to establish this
> fact.
>
> The system doesn't appear loaded, load average is about 1.1 - it's a
> quad-coare HT Intel Xeon E3-1225 with 8GB of DRAM running on an SSD for
> main
> storage.
>
> The system processes about 30 000 calls every 8 hour day, and services 90
> SIP phones.
>
> I can stop and restart the MariaDB instance for several minutes, when I
> restart it it immediately picks up on the "slow" CELs from where it was
> interrupted - more evidence that Asterisk is very slowly streaming the CELs
> through. I thought MariaDB was the bottleneck, but apparently not?
>
> If I make test inserts from a script into the CEL table, all of them
> complete so quickly a time indication does not even register for the query
> in MariaDB. Simple test queries on the CEL table are also instant, not even
> counting in the internal MariaDB query duration timer.
>
> Can anybody explain why this is that the CELs asterisk emits over ODBC are
> so delayed? Are CELs intended NOT to be realtime?
>
> So, logically, Asterisk appears to be caching CELs to the tune of hundreds
> of thousands of them at any given time - meaning if it is stopped (either
> killed, or core stop gracefully'ed, or just "core stop now")  potentially
> hundreds of thousands of CELs will just evaporate irretrivably.
>
> What can I do to mitigate this extremely slow populating of CELs over ODBC?
>
>
Asterisk does not buffer CEL entries. If anything, it pushes the entries
out to ODBC much more aggressively than what you would get with CDRs.

An event is generated in Asterisk that corresponds to the CEL entry. That
entry is pushed over a message bus (the 'event' message bus in 1.8 - 11;
'stasis' in 12+) and is picked up by the CEL core. The events are
immediately sent to the registered backends, who also immediately write it
out to the backend they support. In the case of ODBC, this immediately does
an INSERT into the appropriate table.

In Asterisk 1.8, you can look for a verbose level 11 message that will show
when this occurs:

ast_verb(11, "[%s]\n", ast_str_buffer(sql));

In later versions, this was turned into a debug level 3 message (as
anything over a verbose 5/debug 5 was cleaned up).

If you see that message, then that will tell you when Asterisk *believes*
it has written the CEL entry. If that doesn't show up in the database, then
it is either in the ODBC driver or the Maria database.

If you don't see that message, then something is preventing those events
from getting delivered inside of Asterisk, which would only occur if you
had some other serious call related issues occurring.

Matt

-- 
Matthew Jordan
Digium, Inc. | Director of Technology
445 Jan Davis Drive NW - Huntsville, AL 35806 - USA
Check us out at: http://digium.com & http://asterisk.org
-- 
_
-- Bandwidth and Colocation Provided by http://www.api-digital.com --
New to Asterisk? Join us for a live introductory webinar every Thurs:
   http://www.asterisk.org/hello

asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-users