I have fixed the remaining "return 1"s, and also added a little more logging for when a query retries the connection and still fails (retry count exceeded).
Find attached the latest patch file (tested to apply against master head). Hopefully this is good enough for you to commit.

On 5 March 2014 21:42, Daniel-Constantin Mierla <mico...@gmail.com> wrote:

> Hello,
>
> can you make the patch for master branch? I just backported two patches
> that were in master branch but not yet in 4.1.
>
> With this occasion, can you review if the other 'return 1' expose the same
> issue? I noticed another one in db_cassa_delete and in db_cassa_query.
>
> Thanks,
> Daniel
>
> On 05/03/14 03:57, jay binks wrote:
>
> Just noticed the same thing in db_cassa_delete..
> patch updated to fix both
>
> Jay
>
> On 5 March 2014 12:52, jay binks <jaybi...@gmail.com> wrote:
>
>> Hi All,
>>
>> so I've done what Carlos suggested and swapped out my dialog db to MySQL
>> rather than Cassandra.
>> All worked 100% as you would expect.
>>
>> Right, so the issue is db_cassandra.
>>
>> I started testing and going through the code.
>>
>> I found I had these lines, which was interesting & concerning.
>> update_dialog_dbinfo_unsafe(): could not add another dialog to db
>> I had been ignoring them, because the dialog was in the DB and I figured
>> I would come back and figure that out later.
>>
>> but this seems to have been key to this whole thing.
>>
>> It ends up that in dlg_db_handler.c dialog_dbf.insert was getting a 1 back
>> from db_cassandra on the insert and a 0 back from mysql... WTF.. ok.
>>
>> so I trace into db_cassa_insert, which calls db_cassa_modify ..
>> around line 1210 I can see this ..
>>
>> CON_CASSA(_h)->con->batch_mutate(CFMap, oac::ConsistencyLevel::ONE);
>> return 1;
>>
>> wrapped in a try / catch block..
>> seems db_cassandra wants to return 1 for success but kamailio (or the
>> dialog module at least) expects 0 for success.
>>
>> so I change that to be return 0, and re-test.
>> everything works as expected, "could not add another dialog to db"
>> stops coming up on my console,
>> and dialogs are removed when calls hang up.
>>
>> seems this 1 thing is enough to screw dialogs in cassandra (and who
>> knows what else).
>> This is the reason for my email though: if we simply change that to 0,
>> what else may break !??
>>
>> however http://www.asipto.com/pub/kamailio-devel-guide/#c09f_insert clearly
>> states that "0 if everything is OK"
>> so this is clearly a bug that needs fixing.
>>
>> Can I get someone with more experience to test this for me and possibly
>> apply the attached patch !?
>>
>> Jay
>>
>> On 25 February 2014 05:58, Daniel-Constantin Mierla <mico...@gmail.com>
>> wrote:
>> >
>> > Hello,
>> >
>> > I pushed some patches to the master branch in order to remove the
>> > dialog from its associated profiles when it gets in terminated state. I
>> > encountered such issue (not that) recently, but I haven't gotten the time
>> > to get to it before.
>> >
>> > Then, the second patch is to not add dialogs in profiles when loading
>> > from database and the state is terminated (5).
>> >
>> > Here are the links to the patches:
>> >
>> > - http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=edf61acb57ed5e8ee0ca9ec1f796e43ce993be48
>> > - http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=9b88eb7ee2d243882383a44f601baa21fd679cd5
>> >
>> > Should be straightforward to cherry pick to 4.1 (even 4.0 I expect). If
>> > you test and all goes fine, I will backport -- here I had no time for real
>> > testing.
>> >
>> > I plan also to not add the dialogs in memory for state terminated, but
>> > destroy them at db load time. But this needs a bit of a review, to be sure
>> > that all necessary callbacks are executed.
>> >
>> > On the other hand, if the dialogs are not removed from db, might be an
>> > issue with the database driver (cassandra in this case, which is rather new
>> > module). Do you get any syslog errors from kamailio or database server? I
>> > expect that people would have reported such issue for other database
>> > engines so far. Still it might be an issue, just that was not noticed...
>> >
>> > Cheers,
>> > Daniel
>> >
>> > On 24/02/14 11:19, jay binks wrote:
>> >
>> > So poking round the code for the dialog module....
>> > I'm not sure what I'm missing here.
>> >
>> > get_profile_size doesn't care about the state of a dialog... so you get
>> > ALL dialogs that are in the hash table.
>> > (which is interesting if you want to use dialog module to enforce
>> > channel limits etc)
>> >
>> > So you go... OK... kamailio only expects to have "ACTIVE" dialogs in
>> > the hash table... kewl..
>> > let's assume that to be the case.
>> >
>> > but then in dlg_db_handler.c, load_dialog_info_from_db loads all
>> > dialogs from the DB, regardless of state.
>> > so all dialogs in the DB (ones that didn't get deleted yet... but were
>> > in state 5) get re-created in kamailio
>> > upon startup.
>> >
>> > what this means is...
>> > (assume starting with empty DB)
>> >
>> > I start kamailio, make some calls... they get synced to the DB.
>> > I end the calls, kamailio removes them from the dialog module's internal
>> > hash, but the sync to DB hasn't happened yet.
>> >
>> > I kill kamailio (or crash .. whatever).... restart kamailio and it
>> > re-loads all those dialogs
>> > and thinks they are still active calls.
>> >
>> > I'm SURE I'm missing something here, because it seems to be VERY common
>> > to use dialogs for channel limiting..
>> > maybe not so much using cassandra db behind the scenes, but as of yet
>> > ... I'm still yet to find anything that makes me think this is db_cassandra
>> > misbehaving.
>> >
>> > if I'm wrong, please point me in the right direction.
>> >
>> > Jay
>> >
>> > On 24 February 2014 17:54, jay binks <jaybi...@gmail.com> wrote:
>> >>
>> >> Am I REALLY the only person who has ever run into this !?
>> >>
>> >> On 19 February 2014 14:08, jay binks <jaybi...@gmail.com> wrote:
>> >>>
>> >>> Hi all, I'm using the dialog module with db_cassandra backend..
>> >>> I don't believe this issue is related to cassandra, but it's worth
>> >>> mentioning anyways.
>> >>>
>> >>> so... I run kamailio, make calls, see dialogs in the DB..
>> >>> and I can use "kamctl mi dlg_list" and see that dialogs go away when
>> >>> I hang up a call..
>> >>>
>> >>> When I query the DB backend, I still see the queries, but they have a
>> >>> state of 5.
>> >>> I initially thought this was a bug, but it seems dialogs in state 5
>> >>> get cleaned up after a period.
>> >>> so I moved on.
>> >>>
>> >>> now, let's restart kamailio..
>> >>> kamailio loads all dialogs on startup; after kamailio starts I call
>> >>> "kamctl mi dlg_list" again, and it shows all my dialogs from the DB. They
>> >>> DO show as "State 5"
>> >>> but for some reason, these dialogs appear to stick around for a long
>> >>> time, and the bigger issue it causes me is that my channel limiting (using
>> >>> get_profile_size) seems to consider these dialogs (in state 5) as being
>> >>> active calls.
>> >>>
>> >>> Please someone point me in the right direction... :)
>> >>>
>> >>> what am I doing wrong ?
>> >>> (or is this a bug somewhere)
>> >>>
>> >>> Sincerely
>> >>>
>> >>> Jay
>> >>
>> >> --
>> >> Sincerely
>> >>
>> >> Jay
>> >
>> > --
>> > Sincerely
>> >
>> > Jay
>> >
>> > _______________________________________________
>> > sr-dev mailing list
>> > sr-...@lists.sip-router.org
>> > http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
>> >
>> > --
>> > Daniel-Constantin Mierla - http://www.asipto.com
>> > http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
>>
>> --
>> Sincerely
>>
>> Jay
>
> --
> Sincerely
>
> Jay
>
> --
> Daniel-Constantin Mierla - http://www.asipto.com
> http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda

--
Sincerely

Jay
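For anyone skimming the thread, here is a minimal sketch of the return-code mismatch described above. It is not the real Kamailio DB API: `insert_fn`, `insert_ok_zero`, `insert_ok_one` and `caller_sees_success` are hypothetical stand-ins, and I am assuming the caller treats any non-zero result as a failure, which is what the "could not add another dialog to db" message suggests.

```cpp
#include <iostream>

// Hypothetical stand-in for a DB driver's insert callback (not the real API).
using insert_fn = int (*)();

// Driver that follows the documented convention: 0 if everything is OK.
int insert_ok_zero() { return 0; }

// Driver that behaves like db_cassandra before the patch: 1 on success.
int insert_ok_one() { return 1; }

// Caller in the style described for dlg_db_handler.c (assumption): any
// non-zero result from the driver is handled as an error.
bool caller_sees_success(insert_fn insert) {
    if (insert() != 0) {
        std::cerr << "could not add another dialog to db\n";
        return false;
    }
    return true;
}
```

With these stand-ins, `caller_sees_success(insert_ok_zero)` is true, while `caller_sees_success(insert_ok_one)` logs the error and is false even though the row was written, which matches the symptom of dialogs being present in the DB but treated as failed inserts.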
diff --git a/modules/db_cassandra/dbcassa_base.cpp b/modules/db_cassandra/dbcassa_base.cpp
index e9d3a32..285fe16 100644
--- a/modules/db_cassandra/dbcassa_base.cpp
+++ b/modules/db_cassandra/dbcassa_base.cpp
@@ -561,7 +561,7 @@ ColumnVecPtr cassa_translate_query(const db1_con_t* _h, const db_key_t* _k,
 			}
 			dbcassa_reconnect(CON_CASSA(_h));
 		} while(cassa_auto_reconnect && retr++ < cassa_retries);
-
+		LM_ERR("Failed to connect, retries exceeded.\n");
 	} catch (const oac::InvalidRequestException ir) {
 		LM_ERR("Failed Invalid query request: %s\n", ir.why.c_str());
 	} catch (const at::TException &tx) {
@@ -914,7 +914,7 @@ int db_cassa_query(const db1_con_t* _h, const db_key_t* _k, const db_op_t* _op,
 done:
 	*_r = db_res;
 	LM_DBG("Exited with success\n");
-	return 1;
+	return 0;
 
 error:
 	if(db_res)
@@ -1060,14 +1060,14 @@ int db_cassa_modify(const db1_con_t* _h, const db_key_t* _k, const db_val_t* _v,
 			if(CON_CASSA(_h)->con) {
 				try{
 					CON_CASSA(_h)->con->batch_mutate(CFMap, oac::ConsistencyLevel::ONE);
-					return 1;
+					return 0;
 				} catch (const att::TTransportException &tx) {
 					LM_ERR("Failed to query: %s\n", tx.what());
 				}
 			}
 			dbcassa_reconnect(CON_CASSA(_h));
 		} while (cassa_auto_reconnect && retr++ < cassa_retries);
-
+		LM_ERR("Failed to connect, retries exceeded.\n");
 	} catch (const oac::InvalidRequestException ir) {
 		LM_ERR("Failed Invalid query request: %s\n", ir.why.c_str());
 	} catch (const at::TException &tx) {
@@ -1188,13 +1188,14 @@ int db_cassa_delete(const db1_con_t* _h, const db_key_t* _k, const db_op_t* _o,
 				if(CON_CASSA(_h)->con) {
 					try {
 						cassa_client->remove(row_key, cp, (int64_t)time(0), oac::ConsistencyLevel::ONE);
-						return 1;
+						return 0;
 					} catch (const att::TTransportException &tx) {
 						LM_ERR("Failed to query: %s\n", tx.what());
 					}
 				}
 				dbcassa_reconnect(CON_CASSA(_h));
 			} while(cassa_auto_reconnect && retr++ < cassa_retries);
+			LM_ERR("Failed to connect, retries exceeded.\n");
 		} else {
 
 			if(!seckey_len) {
@@ -1247,7 +1248,7 @@ int db_cassa_delete(const db1_con_t* _h, const db_key_t* _k, const db_op_t* _o,
 				if(CON_CASSA(_h)->con) {
 					try {
 						cassa_client->batch_mutate(CFMap, oac::ConsistencyLevel::ONE);
-						return 1;
+						return 0;
 					} catch (const att::TTransportException &tx) {
 						LM_ERR("Failed to query: %s\n", tx.what());
 					}
@@ -1255,7 +1256,7 @@ int db_cassa_delete(const db1_con_t* _h, const db_key_t* _k, const db_op_t* _o,
 				dbcassa_reconnect(CON_CASSA(_h));
 			} while(cassa_auto_reconnect && retr++ < cassa_retries);
 		}
-		return 1;
+		LM_ERR("Failed to connect, retries exceeded.\n");
 	} catch (const oac::InvalidRequestException ir) {
 		LM_ERR("Invalid query: %s\n", ir.why.c_str());
 	} catch (const at::TException &tx) {
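To make the retry behaviour in the patch easier to reason about, here is a rough standalone sketch of the same do/while shape. `run_with_retries` is a hypothetical helper, not something in db_cassandra; the callbacks stand in for the Thrift call and `dbcassa_reconnect`, and the -1 error return is my assumption about what the caller should eventually see.

```cpp
#include <functional>
#include <iostream>

// Hypothetical helper (not part of db_cassandra). Runs `attempt` until it
// succeeds or the retry budget is used up, mirroring the loop in the patch.
// Returns 0 on success and -1 (assumed error code) once retries are exhausted.
int run_with_retries(const std::function<bool()>& attempt,
                     const std::function<void()>& reconnect,
                     bool auto_reconnect, int max_retries) {
    int retr = 0;
    do {
        if (attempt()) {
            return 0;  // success follows the "0 if everything is OK" convention
        }
        reconnect();   // stand-in for dbcassa_reconnect()
    } while (auto_reconnect && retr++ < max_retries);

    // Only reached when every attempt failed, so the log line is unambiguous.
    std::cerr << "Failed to connect, retries exceeded.\n";
    return -1;
}
```

Note that with this loop shape the operation is tried once plus up to `max_retries` more times, and only once when `auto_reconnect` is off.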
_______________________________________________ SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list sr-users@lists.sip-router.org http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users