On 04/12/08 17:45, Mathieu Arnold wrote:
> +-le 04.12.2008 12:42:09 -0500, Milenko a dit :
> |> -----Original Message-----
> |> From: [EMAIL PROTECTED] [mailto:tilesathome-
> |> [EMAIL PROTECTED] On Behalf Of Mathieu Arnold
> |> Sent: Thursday, December 04, 2008 10:24 AM
> |> To: 'TilesAtHome'
> |> Subject: Re: [Tilesathome] Two new ROMA...
> |>
> |> +-le 04.12.2008 10:12:09 -0500, Milenko a dit :
> |> | OK - the map.fcgi that I have I just downloaded from florians server, so
> |> | that version does need to be updated.
> |>
> |> Yes, that should end up somewhere in the svn :-)
> |>
> |> | Yes I see that. It's returning results in under a second or two at the
> |> | moment. Most of the current tiles look empty or pretty sparse though.
> |> | We'll see what happens when missingtiles runs next.
> |>
> |> The thing is that if possible, the clients that were hitting your server
> |> directly would be really better served by the load balancer, so that the load
> |> balancer does not set your server as being down because the concurrency limit
> |> is reached on your end. (You could also do as I did, set the concurrency
> |> around 20 and let the LB do the job.)
> |>
> |> --
> |> Mathieu Arnold
> |
> | Could you drop the LB to something more like 8 instead of 10? Those extra
> | two requests really slow things down.
>
> It was at 9, it's at 8 now.
>
> | Do you have any idea what causes the large groups on requests all at one
> | time? I'm seeing a pattern of no new requests for a minute or so and then
> | 4-7 new requests all within a second or two. Is this by design on the LB?
> | If so, spreading these requests out would probably make all of the ROMA
> | servers more efficient.
>
> Well, it's a bit hard to debug things with not much informations :-)
> It may be because your server is marked down, thus no hit, and comes up, thus
> you getting assigned what's left in the queue you can get :-)
>
I think that is exactly what is happening. As soon as the server comes back up, the LB will assign it the full 8 requests, up to maxconn, if there are still requests left in the queue. I think there is an option called slowstart or warmup, or something like that, which lets you slowly ramp up the requests once the server comes back online.

The main problem, however, is that your server keeps getting marked as down. In 20 minutes it has been down 19 times, i.e. basically whenever it was assigned 9 requests, at least one of them failed with a 503, triggering the server to be marked down. Can you tell how many requests hit your server that do not come through the load balancer?

Another possibility is that there are a few stale pid files left in the directory that map.fcgi uses to determine the concurrency. It would then think there are more requests in progress than there actually are. Could you check whether there are any stale pids left in your $stampdir? One reason for stale pid files is that fcgi scripts can get killed if they run too long, leaving the script no time to clean up after itself. On my Ubuntu system this timeout is only 40 seconds, which is far too short for large requests.

Also, your check for the stale db does not seem ideally placed. As the check comes rather late, the load balancer's health check requests come back claiming everything is fine, but all actual requests fail with 503. Would it be possible to move the db check before the healthcheck bbox check?

Kai
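
P.S. In case it helps with the stale pid check, here is a rough sketch in Python. It assumes that each stamp file in the directory is named after the pid of the process that created it. That naming, and the helper names pid_is_alive and find_stale_pid_files, are my own assumptions for illustration and may not match what map.fcgi actually does. It also assumes a POSIX system, where sending signal 0 only tests whether a process exists.

```python
import os
from pathlib import Path


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this pid currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check without signalling
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def find_stale_pid_files(stampdir):
    """List stamp files whose (assumed) pid-named owner is no longer running."""
    stale = []
    for entry in Path(stampdir).iterdir():
        if not entry.is_file():
            continue
        try:
            pid = int(entry.name)  # hypothetical convention: file name == pid
        except ValueError:
            continue  # not a pid stamp file under this assumed naming
        if not pid_is_alive(pid):
            stale.append(entry)
    return sorted(stale)
```

Calling find_stale_pid_files on your $stampdir would give the list of files that could be removed by hand. If the count is non-zero while the server is idle, that would explain the concurrency limit being hit too early.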
