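Hi,

On Linux, select() overwrites the timeout with the time that was left, so
after the first timeout tval stays at 0 and every later call returns at once.
Try moving

tval.tv_sec = 300;
tval.tv_usec = 0;

into the

while(1){
....
}

loop, just before select().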
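
Roughly like this. It is only a minimal sketch: wait_readable() is a made-up
helper name, not something that exists in cachelogd.c, and it assumes the
caller keeps its master descriptor set separately, as cachelogd does with mask.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Hypothetical helper (not part of cachelogd.c).
 * Waits until a descriptor in *watched becomes readable or timeout_sec
 * seconds pass. Returns whatever select() returns:
 *   > 0  number of ready descriptors (marked in *ready)
 *     0  timeout expired
 *    -1  error, see errno
 */
static int wait_readable(int nfds, const fd_set *watched, fd_set *ready,
                         long timeout_sec)
{
    struct timeval tval;

    assert(watched != NULL && ready != NULL);

    /* Fill in the timeout on every call: select() may overwrite it
     * (Linux stores the remaining time there), so a value set once
     * outside the loop ends up as 0 after the first timeout. */
    tval.tv_sec = timeout_sec;
    tval.tv_usec = 0;

    /* select() overwrites the descriptor set too, so work on a copy. */
    *ready = *watched;

    return select(nfds, ready, NULL, NULL, &tval);
}
```

In the loop this would stand in for the select() line, for example
sel=wait_readable(32,&mask,&msk,300); with the rest left as it is.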
Ramil.
Vanja Hrustic wrote:
> Hi!
>
> I have noticed that cachelogd will start consuming 100% of CPU time
> *exactly* 5 minutes after the daemon is started. Gdb showed that the
> reason for the load is the select() loop.
>
> In source code, I could see:
>
> ----- cachelogd.c
> <...code...>
> tval.tv_sec = 300;
> tval.tv_usec = 0;
> <...code...>
> while (1) {
> fd_set msk;
>
> msk=mask;
> sel=select(32,&msk,0,0,&tval);
>
> if(sel==0){
> /* Time limit expired */
> continue;
> }
> if(sel<1){
> /* FIXME: add errno checking */
> continue;
> }
>
> /* Check whether new client has connected */
> if(FD_ISSET(ctl_sock,&msk)){
> int cl,addrlen;
> addrlen = sizeof(his_addr);
> for(cl=0;cl<MAXCLIENT;cl++){
> if(log_client[cl].fd==0){
> fprintf(stderr,"%s Client #%d connected\n",time_pid_info(),cl);
> log_client[cl].fd=accept(ctl_sock, (struct sockaddr *)&his_addr, &addrlen);
> FD_SET(log_client[cl].fd,&mask);
> log_client[cl].state=STATE_CMD;
> log_client[cl].rbytes=sizeof(UDM_LOGD_CMD);
> break;
> }
> }
> }
> <code...>
> -----
>
> Now... I might be talking complete rubbish, but I hope someone will correct me :)
>
> From what I could find, there are 2 'ways' to use select()/accept(). One
> way is to accept(), then use select() later - select() has a timeout, and
> if nothing happens during that timeout period on a socket, select()
> returns 0 and some action can be performed (close the socket, or
> whatever - depending on needs).
>
> In another situation, select() is used first, and accept() later (as in
> cachelogd.c). But, select() is called with timeout NULL, which makes it
> 'block' until some input comes in.
>
> What happens right now in cachelogd (as much as I can see, but I'm not a
> programmer by 'definition', so... ;) is that cachelogd will be ok for 5
> minutes (while select() is actually sleeping), but once the timer reaches
> 0, select() will start the flood. It can be checked in gdb as well.
> Something like:
>
> -----
> [root@emx sbin]# gdb ./cachelogd
> <loading...>
> (gdb) b 416
> Breakpoint 1 at 0x80494a7: file cachelogd.c, line 416.
> (gdb) r
> Starting program: /opt/mnogosearch/sbin/./cachelogd
> Wed 21 16:49:59 [21785] Open logs 0 0
> Wed 21 16:49:59 [21785] cachelogd started. Accepting 128 connections.
>
> Breakpoint 1, main (argc=1, argv=0xbffffce4) at cachelogd.c:416
> warning: Source file is more recent than executable.
>
> 416 sel=select(32,&msk,0,0,&tval);
> [This select() is the 1st one that gets executed, and tval.tv_sec is 300
> at this point.]
> (gdb) c
> Continuing.
>
> [exactly 300 seconds later...]
>
> Breakpoint 1, main (argc=1, argv=0xbffffce4) at cachelogd.c:416
> 416 sel=select(32,&msk,0,0,&tval);
> (gdb) p tval.tv_sec
> $1 = 0
> (gdb) c
> Continuing.
>
> Breakpoint 1, main (argc=1, argv=0xbffffce4) at cachelogd.c:416
> 416 sel=select(32,&msk,0,0,&tval);
> (gdb) c
> Continuing.
>
> Breakpoint 1, main (argc=1, argv=0xbffffce4) at cachelogd.c:416
> 416 sel=select(32,&msk,0,0,&tval);
> (gdb) c
> Continuing.
>
> etc... (repeats forever)
>
> -----
>
> I can think of 3 possible ways to fix this. But I would *really*
> appreciate it if someone with more 'socket experience' gave the proper fix
> and possibly explained the real issue here :)
>
> 1. Do something like:
>
> if(sel==0){
> tval.tv_sec = 300; /* reset the timer when it reaches 0 */
> /* Time limit expired */
> continue;
> }
>
> In this case, the timer will get reset every time it reaches 0. Seems to
> work ok, no 'side-effects' noticed (tried for 30 mins and re-indexed a few
> thousand pages)
>
> 2. Do something like:
>
> instead of:
> sel=select(32,&msk,0,0,&tval);
> use:
> /* I prefer NULL over 0 - just for 'aesthetic' purposes, sorry :) */
> sel=select(32,&msk,NULL,NULL,(struct timeval *)NULL);
>
> This *should* make select() "block" until there is actually something it
> can deal with (new connection, etc). Seems to work ok, no 'side-effects'
> noticed (still running, re-indexing 10,000 pages)
>
> 3. Rewrite this part using accept(), and then select()
>
> Don't think it's really needed :)
>
>
>
> I hope there is someone more experienced to check this out :)
>
> Thanks.
>