Hello,

Since you started to look at it again, let me repeat myself.
The problem is described in detail here: 
http://lists.digium.com/pipermail/asterisk-dev/2015-October/075128.html
It has to do with the fact that at initial load pjsip realtime issues a 
separate db query for each endpoint/aor/etc. in the system.
In my case of ~10K endpoints it took asterisk ~1.5 minutes to load.
Further in that discussion I suggested that having the following API call to 
populate the sorcery cache would go a long way toward reducing the scale of 
this problem:

ast_sorcery_retrieve_by_fields(sip_sorcery, "endpoint",
    AST_RETRIEVE_FLAG_MULTIPLE | AST_RETRIEVE_FLAG_ALL, NULL);
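
To make the scale concrete, here is a toy sketch in C. It is not Asterisk 
code: toy_backend, toy_fetch_one, toy_fetch_all, load_per_object and 
load_bulk are made-up names, and the "database" only counts round trips. 
Assuming one query per object, ~10K endpoints cost ~10K round trips (about 
9 ms each if the ~1.5 minutes were spent entirely on those queries), while a 
single retrieve-all style query costs one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a realtime backend: it only counts queries. */
struct toy_backend {
    size_t rows;     /* number of endpoint rows stored */
    size_t queries;  /* number of queries issued so far */
};

/* Fetch a single row by index; costs one query. Returns 1 if the row exists. */
static size_t toy_fetch_one(struct toy_backend *db, size_t index)
{
    db->queries++;
    return index < db->rows ? 1 : 0;
}

/* Fetch every row at once; costs one query. Returns the number of rows. */
static size_t toy_fetch_all(struct toy_backend *db)
{
    db->queries++;
    return db->rows;
}

/* One query per endpoint, as described for the initial load. */
static size_t load_per_object(struct toy_backend *db)
{
    size_t loaded = 0;

    assert(db != NULL);
    for (size_t i = 0; i < db->rows; i++) {
        loaded += toy_fetch_one(db, i);
    }
    return loaded;
}

/* Prefetch everything with a single query, as the suggested call would. */
static size_t load_bulk(struct toy_backend *db)
{
    assert(db != NULL);
    return toy_fetch_all(db);
}
```

With rows set to 10000, load_per_object() leaves queries at 10000 while 
load_bulk() leaves it at 1; both report 10000 rows loaded. Whether the real 
sorcery cache would be populated this way is exactly the open question.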

I haven't looked at pjsip since the time of that discussion, as that's clearly 
a show-stopper for me, but I doubt anything has changed.
Also, I haven't received any feedback on whether that suggestion is viable, so 
I'd love to hear your (and/or other developers') opinion on it.
Any other idea on how to deal with it is more than welcome as well.

Thanks,
Michael

On Wednesday, March 02, 2016 06:04:15 PM Ross Beer wrote:
> Hi George,
>  
> I have commented out those lines and it hasn't fixed the load times; it's 
> still taking 15 mins. It has improved a little.
>  
> Regards,
>  
> Ross
>  
> From: [email protected]
> Date: Wed, 2 Mar 2016 08:19:01 -0700
> To: [email protected]
> Subject: Re: [asterisk-dev] Asterisk Segfault After PJSIP Commit 5241
> 
> 
> 
> On Wed, Mar 2, 2016 at 2:56 AM, Ross Beer <[email protected]> wrote:
> 
> 
> 
> Hi George,
>  
> I have re-built the 'c1bf014ea08cf66835a6f000e2bd6c7da588da6b' commit and 
> PJSIP and Asterisk haven't crashed after reload. However, it did take 25 mins 
> to load.
>  
> As requested I have opened a ticket for the realtime issue:
>  
> https://issues.asterisk.org/jira/browse/ASTERISK-25826
> Got it, thanks.
>  
> Basically, I think this could be resolved by a configuration option that 
> stops sorcery/pjsip loading all peers at start-up, as this is not needed for 
> the current setup. This has been discussed before on the mailing list; however, 
> it doesn't look like it progressed any further.
> 
> If you're up for trying something, you can comment out the 
> qualify_and_schedule_all function in lines 1135-1147 of 
> res/res_pjsip/pjsip_options.c, then comment out its 2 references on lines 
> 1245 and 1281.  If that drops your startup times, then we know we're on the 
> right track.
>  
>  
> I would like to thank you for all of your help trying to identify the issue 
> and hope that we can resolve it soon.
> 
> No worries!
>  
> Kind regards,
>  
> Ross
>  
> From: [email protected]
> Date: Tue, 1 Mar 2016 16:27:06 -0700
> To: [email protected]
> Subject: Re: [asterisk-dev] Asterisk Segfault After PJSIP Commit 5241
> 
> 
> 
> On Tue, Mar 1, 2016 at 3:07 PM, Ross Beer <[email protected]> wrote:
> 
> 
> 
> ok,
>  
> That took 15 mins to load and then crashed. This will be due to the 
> pjsip_dlg_create_uas_and_inc_lock commit.
> It should not have crashed.  That commit had the fix for it.  If it did 
> crash with that commit, open a Jira issue and attach a full backtrace.
>  
> However 15 mins to start is a long time and would cause issues in a 
> production environment.
> Would you open a Jira issue on the realtime problem (if one isn't already 
> open)? I'm starting to look at alternatives.
> 
> 
>  
> Thank you for your help here,
>  
> Ross
> 
>  
> From: [email protected]
> Date: Tue, 1 Mar 2016 14:02:38 -0700
> To: [email protected]
> Subject: Re: [asterisk-dev] Asterisk Segfault After PJSIP Commit 5241
> 
> 
> 
> On Tue, Mar 1, 2016 at 1:04 PM, Ross Beer <[email protected]> wrote:
> 
> 
> 
> Hi George,
>  
> Using a development test box for testing!!
>  
> Asterisk 13.7.2 with no cache takes 4:12 to load; that's with PJSIP Commit 5240
> 
> Ok, try this combination:
> "git checkout c1bf014ea08cf66835a6f000e2bd6c7da588da6b"
> pjproject from trunk
> with caching
> The commit I referenced is the one that handles the 
> pjsip_dlg_create_uas_and_inc_lock
> 
> 
>   
> Qualify time on the aor is set to zero. I guess a query could be made to 
> check for a value greater than zero instead of loading all endpoints.
>  
> Ross
>  
> From: [email protected]
> Date: Tue, 1 Mar 2016 12:45:28 -0700
> To: [email protected]
> Subject: Re: [asterisk-dev] Asterisk Segfault After PJSIP Commit 5241
> 
> 
> 
> On Tue, Mar 1, 2016 at 12:21 PM, Ross Beer <[email protected]> wrote:
> 
> 
> 
> Hi George,
>  
> No endpoints are qualified. There are 20,000 endpoints with only 75 static 
> contacts defined in the aors. The database is a MySQL cluster.
>  
> With the current Asterisk 13 branch with cache disabled and the latest PJSIP 
> it takes 5 mins and then crashes before finishing.
>  
> With Asterisk 13.7.2 with cache it takes around 1 1/2 min to load; however, 
> due to the bug with PJSIP Commit 5241 asterisk crashes when using TLS devices.
> 
> Try 13.7.2 without the cache.  I'm trying to understand where the time is 
> being spent.  I know it will crash because of that bug.  You're not doing 
> this on a production system are you??
> The main issue here is that the endpoints are loaded as soon as PJSIP loads. 
> Ideally, endpoints would only be loaded once a device registers or attempts to 
> make a call, much in the same way as Asterisk 1.8 chan_sip manages realtime.
>  
> There is no need to load the endpoints as they are not qualified.
> 
> How do you know they're not qualified if you don't load them? :)
> Time to load up a database with 20,000 endpoints I guess.
> Ross
>  
> From: [email protected]
> Date: Tue, 1 Mar 2016 11:58:15 -0700
> To: [email protected]
> Subject: Re: [asterisk-dev] Asterisk Segfault After PJSIP Commit 5241
> 
> 
> 
> On Tue, Mar 1, 2016 at 11:38 AM, Michael Ulitskiy <[email protected]> 
> wrote:
> 
> 
> Hello,
>  
> Please see this discussion 
> http://lists.digium.com/pipermail/asterisk-dev/2015-October/075122.html
> I guess you're talking about the same problem.
> It's possible.
>  
> 
>  
> Michael
>  
> On Tuesday, March 01, 2016 06:26:27 PM Ross Beer wrote:
> > Hi George,
> >  
> > We need to store contacts in realtime for our system. However, not all 
> > endpoints are registered, only about 200, yet asterisk loops through every 
> > endpoint which has been defined. It does this whether or not contacts are 
> > in realtime.
> >  
> > It's almost like pjsip is loading them to check if they need to be qualified 
> > etc.
> >  
> > Asterisk 1.8 only put things into cache once they were accessed. Is this an 
> > option for sorcery?
> Well, in order to initiate qualify of contacts, Asterisk does have to 
> "access" them all so I'm not quite sure what the problem is.
> Can we reset to a known config and see what happens?
> 
> pjproject from the published 2.4.5 tarball.
> Asterisk from the published 13.7.2 tarball.
> Disable memory_cache altogether in sorcery.conf.
> 
> See what happens.
> Give me an estimate of how many endpoints and aors there are in the database, 
> how many of those aors have static contacts defined, and what's your qualify 
> interval.
> An idea of your database setup would help as well.  Same server, local, 
> remote, etc.
> Let's solve 1 problem at a time.
>  
> 
> 
> 
> 
> 
-- 
_____________________________________________________________________
-- Bandwidth and Colocation Provided by http://www.api-digital.com --

asterisk-dev mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-dev
