I believe this should be re-opened. There's no harm in this patch other than the trivial cost of an additional pointer. The CDR is not touched in ast_channel_free, so there's no reason why it can't be called before the CDR data is released. Reducing time in locks -- without requiring user intervention (in this case, enabling batch mode in cdr.conf) -- should always be a design goal.
--Chris On Wed, May 02, 2007 at 07:46:29AM -0500, [EMAIL PROTECTED] waxed: > > The following issue has been CLOSED. > ====================================================================== > http://bugs.digium.com/view.php?id=9612 > ====================================================================== > Reported By: one47 > Assigned To: > ====================================================================== > Project: SIP > Issue ID: 9612 > Category: General > Reproducibility: always > Severity: minor > Priority: normal > Status: closed > Asterisk Version: 1.4.3 > SVN Branch (only for SVN checkouts, not tarball releases): N/A > SVN Revision (number only!): > Disclaimer on File?: Yes > Resolution: no change required > Fixed in Version: > ====================================================================== > Date Submitted: 04-27-2007 04:46 CDT > Last Modified: 05-02-2007 07:46 CDT > ====================================================================== > Summary: I believe this fixes a BAD BAD BAD Scenario > Description: > I discussed a BAD BAD BAD case with Olle in June last year. It is low > priority because it happens during a hangup, so all that really happens is > that a final ACK is ignored. I think I worked out what is happening, so I > am including my original verbose/debug log, and a possible patch. > ====================================================================== > > ---------------------------------------------------------------------- > Corydon76 - 04-27-07 07:52 > ---------------------------------------------------------------------- > If you have a slow database, I'd suggest that you rewrite your CDR driver > to queue messages to be written via a separate thread, instead of posting > them directly. Changing the order of deallocation within the channel > driver can also be BAD BAD BAD, as it changes assumptions about when events > occur. > > ---------------------------------------------------------------------- > one47 - 04-27-07 08:21 > ---------------------------------------------------------------------- > The attached trace seems to hold a lock for a significant part of 1 second > (WAY too long in VoIP terms), and is using purely Asterisk-basic CDR code - > 1 copy of cdr record sent to text file, and one sent to cdr_pgsql. > > Also, the call to "ast_cdr_end" is unchanged, so the contents of the CDR > record will be unchanged by the patch. > > It is only the act of writing it to disk that is moved. The move is > similarly very small, as it just moves the "CDR push and free" to the other > side of the call to "chan->tech->hangup(chan)" so that the channel-proper > is finished with and the locks are cleared before some potentially blocking > disk activity and SQL preparation is required. > > I appreciate what you say abbout changing the order of deallocation, but > the CDR record pointer is deallocated (set to NULL) in exactly the same > place after this patch. We just take a copy of the data so we can write it > very slightly later. > > The trace shows 2 attempts to get a channel lock (one from the SIP > protocol thread, and one from the devstate thread) while the hangup process > is holding that lock and updating the CDR. > > ---------------------------------------------------------------------- > oej - 04-27-07 09:07 > ---------------------------------------------------------------------- > one47: Please change cdr.conf so we can test with the cdr caching - that > propably should not lock for a long time and will cause the BAD BAD BAD > messages to disappear if your theory is right. We need this feedback. > > ---------------------------------------------------------------------- > one47 - 04-27-07 10:03 > ---------------------------------------------------------------------- > Hi, > > I could not find any reference to CDR caching, even in SVN trunk, so > instead I tested as follows using a 1.2.18 system: > > 1) CDR to .txt and PGSQL, database idle, and revently VACUUM'ed for > performance. > Problem occurred occasionally. > 2) CDR to .txt only > Could not reproduce problem > 3) CDR to .txt and PGSQL, database made artificially busy. > Problem occurred on nearly every BUSY call. > > Applied patch as supplied, then retest: > > 4) CDR to .txt and PGSQL, database idle, and revently VACUUM'ed for > performance. > Could not reproduce problem > 5) CDR to .txt only > Could not reproduce problem > 6) CDR to .txt and PGSQL, database made artificially busy. > Could not reproduce problem > > I am happy to try CDR caching or any other tests if you can explain how. I > could perhaps add log entries that time the ast_hangup process to verify > where the time goes? > > ---------------------------------------------------------------------- > oej - 04-30-07 02:06 > ---------------------------------------------------------------------- > It's called "batch" mode, sorry. > > cdr.conf.sample: > ---- > ; Define the CDR batch mode, where instead of posting the CDR at the end > of > ; every call, the data will be stored in a buffer to help alleviate load > on the > ; asterisk server. Default is "no". > ; > ; WARNING WARNING WARNING > ; Use of batch mode may result in data loss after unsafe asterisk > termination > ; ie. software crash, power failure, kill -9, etc. > ; WARNING WARNING WARNING > ; > ;batch=no > > ---------------------------------------------------------------------- > one47 - 04-30-07 03:54 > ---------------------------------------------------------------------- > I gave this a fairly quick test with good results. I tried creating big > batches and causing quite high call-volume without seeing any obvious > problems. > > There is of course the downside in fixing it this way, in that your CDR > records are written after a delay. > > ---------------------------------------------------------------------- > cmaj - 04-30-07 14:29 > ---------------------------------------------------------------------- > If you set the batch size = 1, then that should get those CDRs submitted > fairly quickly, at the cost of more threads. But they're cheap threads. > > ---------------------------------------------------------------------- > one47 - 04-30-07 16:27 > ---------------------------------------------------------------------- > Yes, batch=1 does help, and is approximately equivalent to the attached > patch. The difference being that this patch does it without an extra thread > being created for almost every CDR write, which is beneficial, even if it > is only a small benefit, and it also means that the _default_ case is > holding channel locks for less time... > > ---------------------------------------------------------------------- > oej - 05-02-07 04:48 > ---------------------------------------------------------------------- > Corydon76: Since this can be handled by the cdr batching, do you agree with > me that this bug is resolved or do you see some value in the proposed > patch? I don't really have much insights into the CDR subsystem. > > ---------------------------------------------------------------------- > Corydon76 - 05-02-07 07:46 > ---------------------------------------------------------------------- > No change required, as this can be fixed with batching. > > Issue History > Date Modified Username Field Change > ====================================================================== > 04-27-07 04:46 one47 New Issue > 04-27-07 04:46 one47 Issue Monitored: one47 > 04-27-07 04:47 one47 File Added: badbadbad.txt > 04-27-07 04:47 one47 Asterisk Version => 1.4.3 > 04-27-07 04:47 one47 SVN Branch (only for SVN checkou => N/A > > > 04-27-07 04:47 one47 Disclaimer on File? => Yes > 04-27-07 04:47 one47 File Added: badbadbad_1.2.18.patch > > > 04-27-07 04:47 one47 File Added: badbadbad_1.4.trunk.patch > > > 04-27-07 07:52 Corydon76 Note Added: 0062741 > 04-27-07 08:21 one47 Note Added: 0062745 > 04-27-07 09:07 oej Note Added: 0062754 > 04-27-07 09:11 oej Status new => feedback > 04-27-07 10:03 one47 Note Added: 0062774 > 04-30-07 02:06 oej Note Added: 0062848 > 04-30-07 03:54 one47 Note Added: 0062853 > 04-30-07 14:25 cmaj Issue Monitored: cmaj > 04-30-07 14:29 cmaj Note Added: 0062879 > 04-30-07 16:27 one47 Note Added: 0062884 > 05-02-07 04:48 oej Note Added: 0062953 > 05-02-07 07:46 Corydon76 Note Added: 0062959 > 05-02-07 07:46 Corydon76 Status feedback => closed > 05-02-07 07:46 Corydon76 Resolution open => no change > required > ====================================================================== _______________________________________________ --Bandwidth and Colocation provided by Easynews.com -- asterisk-dev mailing list To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-dev
