Hi Daniel, After changing PROC_NOCHLDINIT for PROC_RPC in dmq.c no more errors!
One thing I'm curious, I do still see in the debug logs: Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:423]: usrloc_dmq_execute_action(): Received DMQ_UPDATE. Update contact info... Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:75]: add_contact(): aor: 1063362 Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:76]: add_contact(): ci->ruid: uloc-2-5e542c06-34ac-1 Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:77]: add_contact(): aorhash: 928261667 Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:83]: add_contact(): Found contact Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: usrloc [ucontact.c:1697]: update_ucontact(): exists callback for type= UL_CONTACT_UPDATE Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: usrloc [ul_callback.h:83]: run_ul_callbacks(): contact=0x7fc404bc6d20, callback type 2/15, id 0 entered Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:800]: dmq_ul_cb_contact(): Callback from usrloc with type=2 Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:826]: dmq_ul_cb_contact(): Contact received from DMQ... skip Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: usrloc [ucontact.c:1009]: db_update_ucontact_ruid(): ruid:uloc-2-5e542c06-34ac-1 Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: usrloc [ucontact.c:1138]: db_update_ucontact_ruid(): contact:sip:[email protected]:61183 Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: <core> [db_query.c:72]: db_do_submit_query(): submitted query: update `location` set `expires`='2020-02-24 17:43:29',`q`=-1.000000 ,`cseq`=33578,`flags`=0,`cflags`=0 ,`user_agent`='Linksys/SPA941-5.1.8',`received`=NULL,`path`=NULL,`socket`='udp: 138.99.136.180:5060',`methods`=4767,`last_modified`='2020-02-24 17:42:29',`callid`='[email protected]',`instance`= NULL,`reg_id`=0,`server_id`=1,`connection_id`=-1,`keepalive`=0,`contact`=' sip:[email protected]:61183' where `ruid`='uloc-2-5e542c06-34ac-1', result 0 Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:85]: add_contact(): Release record Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:87]: add_contact(): Unlock udomain Having set modparam("usrloc", "server_id_filter", 1), shouldn't the node receiving locations records via DMQ *not* update the database (as the server_id param won't match with the one in the DMQ request)? Thanks again! Joel. H On Mon, Feb 24, 2020 at 12:43 PM Joel Serrano <[email protected]> wrote: > Absolutely, I'll report back shortly! > > On Mon, Feb 24, 2020 at 12:19 PM Daniel-Constantin Mierla < > [email protected]> wrote: > >> Hello, >> >> the database connection is not initialized for that process based on the >> c code, as it is started with no child init callback, where the db >> connection is usually created: >> src/modules/dmq/dmq.c >> >> 283: newpid = fork_process(PROC_NOCHLDINIT, "DMQ WORKER", 0); >> >> Can you try with PROC_RPC instead of PROC_NOCHLDINIT and see if works ok? >> >> Cheers, >> Daniel >> On 24.02.20 21:04, Joel Serrano wrote: >> >> Hi Daniel, >> >> The process desc says: DMQ WORKER >> >> Joel. >> >> On Mon, Feb 24, 2020 at 11:55 AM Daniel-Constantin Mierla < >> [email protected]> wrote: >> >>> Can you get the type of the process throwing the error? The pid is >>> printed in the log message and the type you can find in the description >>> corresponding to the pid in the output of: >>> >>> kamctl ps >>> >>> Cheers, >>> Daniel >>> On 24.02.20 20:42, Joel Serrano wrote: >>> >>> Ok so my findings so far: >>> >>> 1- db_mode=2 --> No errors. >>> 2- db_mode=1 --> The reported error. >>> 3- modparam("usrloc", "server_id_filter", 1) --> Irrelevant, errors >>> happen with or without it on db_mode=1. >>> >>> Anything else I can try in order to find out what is happening? My next >>> step was to enable mysql logs or capture traffic and see what exactly is >>> Kam sending... >>> >>> Let me know your thoughts. >>> >>> Thanks, >>> Joel. >>> >>> >>> >>> On Mon, Feb 24, 2020 at 11:26 AM Joel Serrano <[email protected]> wrote: >>> >>>> Hi Henning, >>>> >>>> Every time I register, NodeA says OK, sends location by DMQ to NodeB, >>>> nodeB logs the error. >>>> >>>> I tried moving the VIP from NodeA to NodeB, and the error stops >>>> happening on NodeA, and starts happening on NodeB. >>>> >>>> I did not see this happening with db_mode=2, and I *think* it started >>>> happening when I enabled the server_id related params? I'm not 100% sure of >>>> this so I'm going to try some different combinations to see if I can narrow >>>> it down a little and provide more accurate details for you guys to find the >>>> issue. >>>> >>>> I'll report back shortly... >>>> >>>> Thanks! >>>> Joel. >>>> >>>> >>>> >>>> On Mon, Feb 24, 2020 at 11:20 AM Henning Westerholt <[email protected]> >>>> wrote: >>>> >>>>> Hello Joel, >>>>> >>>>> >>>>> >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: <core> [db.c:481]: >>>>> db_use_table(): invalid parameter value >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: usrloc [ucontact.c:670]: >>>>> db_insert_ucontact(): sql use_table failed for: location >>>>> >>>>> >>>>> >>>>> So the table name seems to be ok - this function also log this error >>>>> if the DB connection handle is not initialized. But still the question why >>>>> this happens. >>>>> >>>>> >>>>> >>>>> Cheers, >>>>> >>>>> >>>>> >>>>> Henning >>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> Henning Westerholt – https://skalatan.de/blog/ >>>>> >>>>> Kamailio services – https://gilawa.com >>>>> >>>>> >>>>> >>>>> *From:* sr-users <[email protected]> *On Behalf Of *Joel >>>>> Serrano >>>>> *Sent:* Monday, February 24, 2020 8:10 PM >>>>> *To:* Daniel-Constantin Mierla <[email protected]> >>>>> *Cc:* Kamailio (SER) - Users Mailing List <[email protected] >>>>> > >>>>> *Subject:* Re: [SR-Users] Database errors with usrloc db_mode=1 and >>>>> dmq_usrloc >>>>> >>>>> >>>>> >>>>> Hello Daniel, >>>>> >>>>> >>>>> >>>>> I built from master and re-tested, debug log: >>>>> >>>>> >>>>> >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core> >>>>> [core/socket_info.c:635]: grep_sock_info(): checking if port 5060 >>>>> (advertise 0) matches port 5060 >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:423]: usrloc_dmq_execute_action(): Received DMQ_UPDATE. >>>>> Update contact info... >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:75]: add_contact(): aor: 1000107 >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:76]: add_contact(): ci->ruid: uloc-2-5e507d6a-1bd5-3 >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:77]: add_contact(): aorhash: 928379112 >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:115]: add_contact(): '1000107' Not found in usrloc >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:117]: add_contact(): Insert record >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: usrloc [ul_callback.h:83]: >>>>> run_ul_callbacks(): contact=0x7f3a8cfdf170, callback type 1/15, id 0 >>>>> entered >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:800]: dmq_ul_cb_contact(): Callback from usrloc with type=1 >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:826]: dmq_ul_cb_contact(): Contact received from DMQ... >>>>> skip >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: <core> [db.c:481]: >>>>> db_use_table(): invalid parameter value >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: usrloc [ucontact.c:670]: >>>>> db_insert_ucontact(): sql use_table failed for: location >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: usrloc [urecord.c:598]: >>>>> insert_ucontact(): failed to insert in database >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:121]: add_contact(): Insert ucontact >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:124]: add_contact(): Release record >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:126]: add_contact(): Unlock udomain >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: sl [sl.c:305]: send_reply(): >>>>> reply in stateless mode (sl) >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core> >>>>> [core/msg_translator.c:161]: check_via_address(): (10.2.1.181, 10.2.1.181, >>>>> 0) >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dialog [dlg_cseq.c:116]: >>>>> dlg_cseq_prepare_new_msg(): prepare new msg for cseq update operations >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core> >>>>> [core/parser/msg_parser.c:620]: parse_msg(): SIP Reply (status): >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core> >>>>> [core/parser/msg_parser.c:622]: parse_msg(): version: <SIP/2.0> >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core> >>>>> [core/parser/msg_parser.c:624]: parse_msg(): status: <200> >>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core> >>>>> [core/parser/msg_parser.c:626]: parse_msg(): reason: <OK> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Joel. >>>>> >>>>> >>>>> >>>>> On Mon, Feb 24, 2020 at 1:36 AM Daniel-Constantin Mierla < >>>>> [email protected]> wrote: >>>>> >>>>> Hello, >>>>> >>>>> the error message: >>>>> >>>>> Feb 21 00:11:25 csbc02 csbc[13916]: ERROR: <core> [db.c:481]: >>>>> db_use_table(): invalid parameter value >>>>> >>>>> Is printed if the database table name is not given as parameter (eg, >>>>> null or empty). I pushed a patch to usrloc to print the table in the log >>>>> message, can you try with latest master (or patch from next commit) and >>>>> send again the log messages? >>>>> >>>>> * >>>>> https://github.com/kamailio/kamailio/commit/3843ae18599bfc1806ccb62df44bbc8ad41d60dd >>>>> >>>>> Cheers, >>>>> Daniel >>>>> >>>>> On 21.02.20 22:40, Joel Serrano wrote: >>>>> >>>>> Hello, >>>>> >>>>> >>>>> >>>>> I have a setup with 2 kamailio in active/idle, locations are shared >>>>> via dmq_usrloc, but database is also used as a cache where other systems >>>>> can access and query "real-time" data. >>>>> >>>>> >>>>> >>>>> The config we have is: >>>>> >>>>> >>>>> >>>>> # ----- usrloc params ----- >>>>> modparam("usrloc", "db_url", DBURL) >>>>> modparam("usrloc", "db_mode", 1) >>>>> modparam("usrloc", "db_load", 0) >>>>> modparam("usrloc", "db_insert_update", 1) >>>>> modparam("usrloc", "db_timer_clean", 1) >>>>> modparam("usrloc", "server_id_filter", 1) >>>>> modparam("usrloc", "timer_procs", 2) >>>>> modparam("usrloc", "timer_interval", 60) >>>>> >>>>> >>>>> >>>>> # ----- dmq_usrloc params ----- >>>>> modparam("dmq_usrloc", "enable", 1) >>>>> modparam("dmq_usrloc", "sync", 1) >>>>> modparam("dmq_usrloc", "batch_msg_contacts", 50) # 50 contacts / >>>>> message >>>>> modparam("dmq_usrloc", "batch_size", 10000) # 10000 contacts / >>>>> batch >>>>> modparam("dmq_usrloc", "batch_usleep", 500000) # one batch every >>>>> 500ms >>>>> >>>>> >>>>> >>>>> With this setup, we assume: >>>>> >>>>> >>>>> >>>>> 1- almost real-time updated location info in the database. >>>>> >>>>> 2- as long as one of the two Kam is up, location information will not >>>>> be lost (as they will sync with each other via DMQ) >>>>> >>>>> 3- if both servers go down, locations are lost (as they will not >>>>> reload from database on startup). >>>>> >>>>> >>>>> >>>>> Our tests work well with the above config, with one exception, on the >>>>> idle node, we see a lot of (only happens on the node receiving location >>>>> via >>>>> dmq): >>>>> >>>>> >>>>> >>>>> Feb 21 00:11:25 csbc02 csbc[13916]: ERROR: <core> [db.c:481]: >>>>> db_use_table(): invalid parameter value >>>>> Feb 21 00:11:25 csbc02 csbc[13916]: ERROR: usrloc [ucontact.c:1136]: >>>>> db_update_ucontact_ruid(): sql use_table failed >>>>> Feb 21 00:11:25 csbc02 csbc[13916]: ERROR: usrloc [ucontact.c:1657]: >>>>> update_contact_db(): failed to update database >>>>> >>>>> >>>>> >>>>> I added debug logs to see if I could understand the reason: >>>>> >>>>> >>>>> >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:478]: usrloc_dmq_handle_msg(): dmq message received from >>>>> sip:[email protected]:5050 >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:423]: usrloc_dmq_execute_action(): Received DMQ_UPDATE. >>>>> Update contact info... >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:75]: add_contact(): aor: 1063362 >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:76]: add_contact(): ci->ruid: uloc-1-5e502cd3-5629-1 >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:77]: add_contact(): aorhash: 928261667 >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:115]: add_contact(): '1063362' Not found in usrloc >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:117]: add_contact(): Insert record >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: usrloc [ul_callback.h:83]: >>>>> run_ul_callbacks(): contact=0x7fe41ff41f40, callback type 1/15, id 0 >>>>> entered >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:800]: dmq_ul_cb_contact(): Callback from usrloc with type=1 >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:826]: dmq_ul_cb_contact(): Contact received from DMQ... >>>>> skip >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: ERROR: <core> [db.c:481]: >>>>> db_use_table(): invalid parameter value >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: ERROR: usrloc [ucontact.c:669]: >>>>> db_insert_ucontact(): sql use_table failed >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: ERROR: usrloc [urecord.c:598]: >>>>> insert_ucontact(): failed to insert in database >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:121]: add_contact(): Insert ucontact >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:124]: add_contact(): Release record >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc >>>>> [usrloc_sync.c:126]: add_contact(): Unlock udomain >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: sl [sl.c:305]: >>>>> send_reply(): reply in stateless mode (sl) >>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: <core> >>>>> [core/msg_translator.c:161]: check_via_address(): (10.2.1.181, 10.2.1.181, >>>>> 0) >>>>> >>>>> >>>>> >>>>> Can anyone help me understand what's the problem? I have the server_id >>>>> setting, and the modparam server_id_filter, so shouldn't the receiving >>>>> node >>>>> just ignore any database tasks? >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Joel. >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> >>>>> Kamailio (SER) - Users Mailing List >>>>> >>>>> [email protected] >>>>> >>>>> https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users >>>>> >>>>> -- >>>>> >>>>> Daniel-Constantin Mierla -- www.asipto.com >>>>> >>>>> www.twitter.com/miconda -- www.linkedin.com/in/miconda >>>>> >>>>> Kamailio Advanced Training - March 9-11, 2020, Berlin - www.asipto.com >>>>> >>>>> Kamailio World Conference - April 27-29, 2020, in Berlin -- >>>>> www.kamailioworld.com >>>>> >>>>> -- >>> Daniel-Constantin Mierla -- www.asipto.comwww.twitter.com/miconda -- >>> www.linkedin.com/in/miconda >>> Kamailio Advanced Training - March 9-11, 2020, Berlin - www.asipto.com >>> Kamailio World Conference - April 27-29, 2020, in Berlin -- >>> www.kamailioworld.com >>> >>> -- >> Daniel-Constantin Mierla -- www.asipto.comwww.twitter.com/miconda -- >> www.linkedin.com/in/miconda >> Kamailio Advanced Training - March 9-11, 2020, Berlin - www.asipto.com >> Kamailio World Conference - April 27-29, 2020, in Berlin -- >> www.kamailioworld.com >> >>
_______________________________________________ Kamailio (SER) - Users Mailing List [email protected] https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
