Hi Henrik,

Thanks for sharing. I get the following when running the ws-demo:

 ./pil pl-web/ws-demo/main.l -go
..
!? (wsServer)
wsServer -- Undefined

I can't find the definition of wsServer anywhere. Is it missing from the
repo?

Thanks,
Joe

On Mon, Jan 4, 2016 at 4:27 PM, Henrik Sarvell <hsarv...@gmail.com> wrote:

> Update:
>
> The socket server is now completely reliant on Redis, using Redis'
> pub/sub functionality: http://redis.io/topics/pubsub
>
> The reason for this is that I was using the websocket server to
> handle all websocket functionality for the site I'm being paid to
> work on, and it started running into problems as the site grew. The
> first issue was an easy fix after Alex pointed me to it: increasing
> the number of file descriptors in src64/sys/x86-64.linux.defs.l. My
> line #115 now looks like this:
>
> (equ FD_SET 1024)  # 1024 bit
>
> After re-compiling I could easily handle more than 500 clients and all
> was well for a while.
>
> Unfortunately the site grew so quickly that just some months later
> the parent / root process started intermittently running at 100%
> CPU utilization, and the service stopped working for perhaps 10-20
> minutes before resolving on its own. At this point peak usage
> involved 2000 clients connected at the same time.
>
> Alex suspects that the issue has to do with how the internal logic
> handles the creation of new processes when a lot of them are already
> present. In a normal HTTP server scenario this probably never
> happens: if every request takes on average 1 second to perform
> before the socket closes, you would need about 2000 requests per
> second to trigger the CPU problem, and in a non-trivial scenario
> you'll run into many other issues long before that happens (trust
> me, I've tested).
>
> In the end we switched over to a node.js-based solution that also
> relies on Redis' pub/sub functionality (that's where I got the idea
> to make the PL-based solution use it too).
>
> I have tried to replicate the real-world situation, both in load
> and in number of clients, but have not been able to trigger the CPU
> issue (which also seems to imply that Alex's suspicion is not
> completely on target). It's impossible for me to replicate the real
> situation exactly, since I can't commandeer hundreds of machines all
> over the world to connect to my test server. What I did manage to
> trigger was fairly high CPU usage in the child processes, a
> situation that also involved loss of service. Since the switch to
> pub/sub I haven't been able to trigger it, so that's a win at least.
>
> Now for the real improvement: making HTTP requests to publish
> something becomes redundant when publishing from server to client,
> since it's just a matter of issuing a publish call directly to
> Redis instead. That lowers the number of processes created by more
> than 90% in my use case.
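>
> To make that concrete, here is a minimal sketch (my own
> illustration, not code from pl-web; redisPublish is a hypothetical
> name, and the +Redis class in the repo may do this differently) of
> issuing a PUBLISH command to Redis directly from PicoLisp over the
> raw wire protocol (RESP):
>
> (de redisPublish (Host Port Chan Msg)
>    # Send 'PUBLISH Chan Msg' in the Redis wire protocol over a raw
>    # TCP socket. Note that 'length' counts characters, which matches
>    # byte counts only for plain ASCII payloads.
>    (let? Sock (connect Host Port)
>       (out Sock
>          (prin
>             "*3^M^J$7^M^JPUBLISH^M^J$"
>             (length Chan) "^M^J" Chan "^M^J$"
>             (length Msg) "^M^J" Msg "^M^J" ) )
>       (close Sock) ) )
>
> Each child process holding websocket clients can then subscribe to
> the channel and push whatever arrives, with no extra HTTP request
> or process creation on the publishing side.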
>
> Even though I can't be 100% sure as things currently stand, I
> believe that if I had implemented the websocket server with Redis'
> pub/sub from the beginning, the CPU issue would probably never have
> happened and there would have been no need to switch over to
> node.js.
>
> That being said, this type of service / application is better
> suited to threads, since the cost in RAM etc. is lower.
>
> Final note: my decision to use one socket per feature was poor. It
> bought me a simpler architecture, but had I opted for one socket
> with "routing" logic implemented in the browser instead, I could
> have cut the number of simultaneous sockets by up to a factor of 8.
> Peak usage would then have been 2000 / 8 = 250 processes. Not only
> that, it turns out that IE (yes, even version 11 / Edge) only
> allows 6 simultaneous sockets (including in iframes) per page.
> We've therefore been forced to turn off, for instance, the
> tournament functionality for IE users.
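>
> A minimal sketch of that single-socket envelope, purely as an
> assumption of how the browser-side "routing" could work (tagMsg is
> a hypothetical helper, not part of pl-web):
>
> (de tagMsg (Feature Msg)
>    # Hypothetical "feature|payload" envelope: a single browser-side
>    # dispatcher splits on the first "|" and hands the payload to
>    # the handler registered for that feature, so one websocket can
>    # serve every feature.
>    (pack Feature "|" Msg) )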
>
>
>
> On Fri, Jun 26, 2015 at 9:30 PM, Henrik Sarvell <hsarv...@gmail.com>
> wrote:
> > Hi all, after more than a month without any of the prior issues,
> > I now consider the websockets part of pl-web stable:
> > https://bitbucket.org/hsarvell/pl-web Gone are the days of 100%
> > CPU usage and zombie processes.
> >
> > With Alex's help the main web server is now more stable (he made me
> > throw away a few throws in favour of a few byes). The throws were
> > causing the zombies.
> >
> > I was also including dbg.l (it was causing hung processes at 100%
> > CPU); it has basically been deprecated or something, I'll leave it
> > to him to elaborate. It's just something I've been including out
> > of habit, ever since I needed it for some kind of debugging years
> > ago.
> >
> > Anyway, atm the WS router is regularly routing up to 40 messages
> > per second to upwards of 300-500 clients, which means that roughly
> > 20,000 messages are being pushed out per second during peak hours.
> >
> > The PL processes show up with 0 CPU and 0 RAM usage when I run
> > top, sometimes 1% CPU :) They hardly register even in aggregate;
> > the server would be running 99% idle if it were only running the
> > WS server.
> >
> > To work around the inter-process limit of 4096-byte messages, the
> > router now supports storing the messages in Redis (raw disk is
> > also supported if Redis is not available). This is also in effect
> > in production and has been working flawlessly for months.
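> >
> > As a hedged illustration of that work-around (my own naming, not
> > the actual pl-web code; redisSet, wsDeliver and wsDeliverKey are
> > hypothetical): 'tell' messages travel over a pipe with a
> > 4096-byte limit, so oversized payloads are parked in Redis and
> > only the key is sent between processes:
> >
> > (de sendBig (Msg)
> >    # If the payload is close to the 4096-byte pipe limit, store
> >    # it in Redis under a fresh key and send only the key to the
> >    # family processes; otherwise send the message itself.
> >    # ('length' counts characters; close enough for ASCII.)
> >    (if (>= (length Msg) 4000)
> >       (let Key (pack "pl-ws-" (usec))
> >          (redisSet Key Msg)           # hypothetical Redis setter
> >          (tell 'wsDeliverKey Key) )   # children fetch via the key
> >       (tell 'wsDeliver Msg) ) )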
> >
> > This is how I start the WS server in production:
> >
> > (load "pl-web/pl-web.l")
> >
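> > # Use a +Redis object as the message store ("pl-ws-" is
> > # presumably a key prefix)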
> > (setq *Mobj (new '(+Redis) "pl-ws-"))
> >
> > (undef 'app)
> >
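> > # Clients must present this key to "send" on the "notifications"
> > # socket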
> > (setq *WsAuth
> >    '(("notifications" (("send" ("put your password/key here"))))) )
> >
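> > # Each incoming connection is handed straight to the websocket app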
> > (de app ()
> >    (splitPath)
> >    (wsApp)
> >    (bye))
> >
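> > # Start the websocket machinery, then listen for HTTP on port 9090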
> > (de go ()
> >    (wsServer)
> >    (server 9090) )
