Re: Websockets now considered stable

2016-01-16 Thread Henrik Sarvell
Hi Joe, thanks for the pointer; I had totally forgotten about that demo app.

I've removed the call from the demo; the wsServer logic has been
removed completely, since Redis is now responsible for
routing messages through its pub / sub handling.

I've also updated the readme with the fact that Redis is now a
required dependency.


On Thu, Jan 14, 2016 at 6:17 PM, Joe Bogner  wrote:
> Hi Henrik,
>
> Thanks for sharing. I get the following when running the ws-demo:
>
>  ./pil pl-web/ws-demo/main.l -go
> ...
> !? (wsServer)
> wsServer -- Undefined
>
> I can't find the definition of wsServer anywhere. Is it missing from the
> repo?
>
> Thanks,
> Joe
>
> On Mon, Jan 4, 2016 at 4:27 PM, Henrik Sarvell  wrote:
>>
>> Update:
>>
>> The socketserver is now completely reliant on Redis, using Redis' pub
>> / sub functionality: http://redis.io/topics/pubsub
>>
>> The reason for this is that I was using the websocket server to handle
>> all websockets functionality for the site I'm being paid to work on,
>> and it started running into problems as the site grew. The first issue
>> was an easy fix after Alex pointed me to it: increasing the number of
>> file descriptors in src64/sys/x86-64.linux.defs.l, so that my line #115 now
>> looks like this: (equ FD_SET 1024)  # 1024 bit
>>
>> After re-compiling I could easily handle more than 500 clients and all
>> was well for a while.
>>
>> Unfortunately the site is growing so fast that just some month(s)
>> later the parent / root process started intermittently running at 100%
>> CPU utilization and the service stopped working for perhaps 10-20
>> minutes before resolving on its own. At this point peak usage involved
>> 2000 clients being connected at the same time.
>>
>> Alex suspects that the issue has to do with how the internal logic
>> handles new processes being created when there are already a lot of
>> them present. In a normal HTTP server scenario this probably never
>> happens: imagine that every request takes on average 1 second to
>> complete before the socket closes; you would then need about 2000
>> requests per second in order to trigger the CPU problem, and you'll run
>> into many other issues long before that happens in a non-trivial
>> scenario (trust me, I've tested).
>>
>> In the end we switched over to a node.js based solution that also
>> relies on Redis' pub / sub functionality (that's where I got the idea
>> from to make the PL based solution also use it).
>>
>> I have tried to replicate the real world situation, both load wise and
>> number of clients wise, but have not been able to trigger the CPU issue
>> (which also seems to imply that Alex's suspicion is not completely on
>> target). It's impossible for me to replicate the real world situation
>> exactly, since I can't commandeer hundreds of machines all over the world
>> to connect to my test server. What I did manage to trigger, though, was
>> fairly high CPU usage in the child processes, a situation that
>> also involved loss of service. After the switch to using pub / sub I
>> haven't been able to trigger it, so that's a win at least.
>>
>> Now for the real improvement: making HTTP requests to publish
>> something from server to client becomes redundant,
>> since it's just a matter of issuing a publish call directly to Redis
>> instead. That lowers the amount of process creation by more than 90%
>> in my use case.
>>
>> Even though I can't be 100% sure as it currently stands, I believe that
>> if I had implemented the websocket server using Redis' pub / sub to
>> begin with, the CPU issue would probably never have happened and there
>> would've been no need to switch over to node.js.
>>
>> That being said, this type of service / application is better suited
>> for threads since the cost in RAM etc is lower.
>>
>> Final note: my decision to use one socket per feature was poor. It
>> allowed me a simpler architecture, but had I opted for one socket with
>> "routing" logic implemented in the browser instead, I could have
>> reduced the number of simultaneous sockets by up to a factor of eight. Peak
>> usage would then have been 2000 / 8 = 250 processes. Not only that, it turns
>> out that IE (yes, even version 11 / Edge) only allows 6 simultaneous
>> sockets (including in iframes) per page. We've therefore been forced
>> to turn off, for instance, the tournament functionality for IE users.
>>
>>
>>
>> On Fri, Jun 26, 2015 at 9:30 PM, Henrik Sarvell 
>> wrote:
>> > Hi all, after over a month without any of the prior issues I now
>> > consider the websockets part of pl-web stable:
>> > https://bitbucket.org/hsarvell/pl-web Gone are the days of 100% CPU
>> > usage and zombie processes.
>> >
>> > With Alex's help the main web server is now more stable (he made me
>> > throw away a few throws in favour of a few byes). The throws were
>> > causing the zombies.
>> >
>> > I was also including dbg.l (it was causing hung processes at 100%
>> > CPU), it's 

Re: Websockets now considered stable

2016-01-14 Thread Joe Bogner
Hi Henrik,

Thanks for sharing. I get the following when running the ws-demo:

 ./pil pl-web/ws-demo/main.l -go
..
!? (wsServer)
wsServer -- Undefined

I can't find the definition of wsServer anywhere. Is it missing from the
repo?

Thanks,
Joe

On Mon, Jan 4, 2016 at 4:27 PM, Henrik Sarvell  wrote:

> Update:
>
> The socketserver is now completely reliant on Redis, using Redis' pub
> / sub functionality: http://redis.io/topics/pubsub
>
> The reason for this is that I was using the websocket server to handle
> all websockets functionality for the site I'm being paid to work on,
> and it started running into problems as the site grew. The first issue
> was an easy fix after Alex pointed me to it: increasing the number of
> file descriptors in src64/sys/x86-64.linux.defs.l, so that my line #115 now
> looks like this: (equ FD_SET 1024)  # 1024 bit
>
> After re-compiling I could easily handle more than 500 clients and all
> was well for a while.
>
> Unfortunately the site is growing so fast that just some month(s)
> later the parent / root process started intermittently running at 100%
> CPU utilization and the service stopped working for perhaps 10-20
> minutes before resolving on its own. At this point peak usage involved
> 2000 clients being connected at the same time.
>
> Alex suspects that the issue has to do with how the internal logic
> handles new processes being created when there are already a lot of
> them present. In a normal HTTP server scenario this probably never
> happens: imagine that every request takes on average 1 second to
> complete before the socket closes; you would then need about 2000
> requests per second in order to trigger the CPU problem, and you'll run
> into many other issues long before that happens in a non-trivial
> scenario (trust me, I've tested).
>
> In the end we switched over to a node.js based solution that also
> relies on Redis' pub / sub functionality (that's where I got the idea
> from to make the PL based solution also use it).
>
> I have tried to replicate the real world situation, both load wise and
> number of clients wise, but have not been able to trigger the CPU issue
> (which also seems to imply that Alex's suspicion is not completely on
> target). It's impossible for me to replicate the real world situation
> exactly, since I can't commandeer hundreds of machines all over the world
> to connect to my test server. What I did manage to trigger, though, was
> fairly high CPU usage in the child processes, a situation that
> also involved loss of service. After the switch to using pub / sub I
> haven't been able to trigger it, so that's a win at least.
>
> Now for the real improvement: making HTTP requests to publish
> something from server to client becomes redundant,
> since it's just a matter of issuing a publish call directly to Redis
> instead. That lowers the amount of process creation by more than 90%
> in my use case.
>
> Even though I can't be 100% sure as it currently stands, I believe that
> if I had implemented the websocket server using Redis' pub / sub to
> begin with, the CPU issue would probably never have happened and there
> would've been no need to switch over to node.js.
>
> That being said, this type of service / application is better suited
> for threads since the cost in RAM etc is lower.
>
> Final note: my decision to use one socket per feature was poor. It
> allowed me a simpler architecture, but had I opted for one socket with
> "routing" logic implemented in the browser instead, I could have
> reduced the number of simultaneous sockets by up to a factor of eight. Peak
> usage would then have been 2000 / 8 = 250 processes. Not only that, it turns
> out that IE (yes, even version 11 / Edge) only allows 6 simultaneous
> sockets (including in iframes) per page. We've therefore been forced
> to turn off, for instance, the tournament functionality for IE users.
>
>
>
> On Fri, Jun 26, 2015 at 9:30 PM, Henrik Sarvell 
> wrote:
> > Hi all, after over a month without any of the prior issues I now
> > consider the websockets part of pl-web stable:
> > https://bitbucket.org/hsarvell/pl-web Gone are the days of 100% CPU
> > usage and zombie processes.
> >
> > With Alex's help the main web server is now more stable (he made me
> > throw away a few throws in favour of a few byes). The throws were
> > causing the zombies.
> >
> > I was also including dbg.l (it was causing hung processes at 100%
> > CPU); it has basically been deprecated or something, I'll leave it up to
> > him to elaborate. It's just something I've been including out of habit,
> > ever since I needed it years ago to do some kind of debugging.
> >
> > Anyway, atm the WS router is regularly routing up to 40 messages per
> > second to upwards of 300-500 clients, which means that roughly 20,000
> > messages are being pushed out per second during peak hours.
> >
> > The PL processes show up with 0 CPU and 0 RAM usage when I run top,
> > sometimes 1% CPU 

Re: Websockets now considered stable

2016-01-04 Thread Henrik Sarvell
Update:

The socketserver is now completely reliant on Redis, using Redis' pub
/ sub functionality: http://redis.io/topics/pubsub
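
For anyone unfamiliar with the pattern, here is a rough sketch of the pub / sub
idea in Python with the redis-py client (just an illustration, not the pl-web
code; the channel name and client list are made up):

import redis

r = redis.Redis(host="localhost", port=6379)
pubsub = r.pubsub()
pubsub.subscribe("notifications")   # made-up channel name, e.g. one per feature

connected_clients = []              # stand-in for the router's websocket connections

for msg in pubsub.listen():         # blocks, yielding messages as they are published
    if msg["type"] != "message":
        continue                    # skip the subscribe confirmation
    for client in connected_clients:
        client.send(msg["data"])    # fan the payload out to every connected client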

The reason for this is that I was using the websocket server to handle
all websockets functionality for the site I'm being paid to work on,
and it started running into problems as the site grew. The first issue
was an easy fix after Alex pointed me to it: increasing the number of
file descriptors in src64/sys/x86-64.linux.defs.l, so that my line #115 now
looks like this: (equ FD_SET 1024)  # 1024 bit

After re-compiling I could easily handle more than 500 clients and all
was well for a while.

Unfortunately the site is growing so fast that just some month(s)
later the parent / root process started intermittently running at 100%
CPU utilization and the service stopped working for perhaps 10-20
minutes before resolving on its own. At this point peak usage involved
2000 clients being connected at the same time.

Alex suspects that the issue has to do with how the internal logic
handles new processes being created when there are already a lot of
them present. In a normal HTTP server scenario this probably never
happens: imagine that every request takes on average 1 second to
complete before the socket closes; you would then need about 2000
requests per second in order to trigger the CPU problem, and you'll run
into many other issues long before that happens in a non-trivial
scenario (trust me, I've tested).

In the end we switched over to a node.js based solution that also
relies on Redis' pub / sub functionality (that's where I got the idea
from to make the PL based solution also use it).

I have tried to replicate the real world situation, both load wise and
number of clients wise, but have not been able to trigger the CPU issue
(which also seems to imply that Alex's suspicion is not completely on
target). It's impossible for me to replicate the real world situation
exactly, since I can't commandeer hundreds of machines all over the world
to connect to my test server. What I did manage to trigger, though, was
fairly high CPU usage in the child processes, a situation that
also involved loss of service. After the switch to using pub / sub I
haven't been able to trigger it, so that's a win at least.

Now for the real improvement: making HTTP requests to publish
something from server to client becomes redundant,
since it's just a matter of issuing a publish call directly to Redis
instead. That lowers the amount of process creation by more than 90%
in my use case.
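
In Python terms the shortcut looks roughly like this (again redis-py and a
made-up channel name, purely as an illustration):

import redis

r = redis.Redis()   # default localhost:6379

# Instead of an HTTP round trip that makes the web server fork yet another
# process, the application publishes straight to the channel the router
# subscribes to.
r.publish("notifications", "tournament started")   # made-up channel and payload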

Even though I can't be 100% sure as it currently stands, I believe that
if I had implemented the websocket server using Redis' pub / sub to
begin with, the CPU issue would probably never have happened and there
would've been no need to switch over to node.js.

That being said, this type of service / application is better suited
for threads since the cost in RAM etc is lower.

Final note: my decision to use one socket per feature was poor. It
allowed me a simpler architecture, but had I opted for one socket with
"routing" logic implemented in the browser instead, I could have
reduced the number of simultaneous sockets by up to a factor of eight. Peak
usage would then have been 2000 / 8 = 250 processes. Not only that, it turns
out that IE (yes, even version 11 / Edge) only allows 6 simultaneous
sockets (including in iframes) per page. We've therefore been forced
to turn off, for instance, the tournament functionality for IE users.



On Fri, Jun 26, 2015 at 9:30 PM, Henrik Sarvell  wrote:
> Hi all, after over a month without any of the prior issues I now
> consider the websockets part of pl-web stable:
> https://bitbucket.org/hsarvell/pl-web Gone are the days of 100% CPU
> usage and zombie processes.
>
> With Alex's help the main web server is now more stable (he made me
> throw away a few throws in favour of a few byes). The throws were
> causing the zombies.
>
> I was also including dbg.l (it was causing hung processes at 100%
> CPU); it has basically been deprecated or something, I'll leave it up to
> him to elaborate. It's just something I've been including out of habit,
> ever since I needed it years ago to do some kind of debugging.
>
> Anyway, atm the WS router is regularly routing up to 40 messages per
> second to upwards of 300-500 clients, which means that roughly 20,000
> messages are being pushed out per second during peak hours.
>
> The PL processes show up with 0 CPU and 0 RAM usage when I run top,
> sometimes 1% CPU :) They hardly register even in aggregate; the server
> would be running 99% idle if it was only running the WS server.
>
> To work around the inter-process limit of 4096-byte messages, the
> router now supports storing the messages in Redis (raw disk is also
> supported if Redis is not available). This is also in effect in
> production and has been working flawlessly for months.
>
> This is how I start the WS server in production:
>
> (load "pl-web/pl-web.l")
>
> (setq *Mobj (new 

Re: Websockets now considered stable

2015-07-02 Thread Rick Hanson
 Hi Rick, it seems like a fix would be a check there: if the sessions dir
 doesn't exist (and Redis isn't used to store the session), create it
 and move on instead of breaking down in tears.

Hi Henrik!  Yes, I agree.  BTW, thanks.  I forgot to thank you before
for sharing this!


Re: Websockets now considered stable

2015-07-02 Thread Henrik Sarvell
Hi Rick, it seems like a fix would be a check there: if the sessions dir doesn't
exist (and Redis isn't used to store the session), create it and move on
instead of breaking down in tears.
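
Something along these lines, sketched in Python only to show the idea (pl-web
itself is PicoLisp; directory and file names are made up):

import os

SESSIONS_DIR = "sessions"   # relative to the app directory, as in the error above

# Create the directory on first use instead of failing with an open error.
os.makedirs(SESSIONS_DIR, exist_ok=True)

with open(os.path.join(SESSIONS_DIR, "example-session-id"), "w") as f:
    f.write("(session data)")   # stand-in for the real session record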

On Sun, Jun 28, 2015 at 10:47 PM, Rick Hanson cryptor...@gmail.com wrote:

 I downloaded pl-web and ext and ran the demo-app.  When I went to the
 login page, I got this Open error in the console:

 # excmd redefined
 # exlst redefined
 # exlst redefined
 !? (out Sf (print (list (list sid *Sid

 /home/rick/projects/pl-web-demo/./sessions/6d61fa61b9cc1d8fd878f4b534703473
 -- Open error: No such file or directory
 ?

 But after I quit, issued a:

 $ mkdir sessions

 and re-started the server, I never got the error again -- everything
 worked as expected.  (I got the login creds from main.l.)



Re: Websockets now considered stable

2015-06-28 Thread Rick Hanson
What gives?!  This stuff is broken!!!

$ git clone https://bitbucket.org/hsarvell/pl-web
Cloning into 'pl-web'...
fatal: repository 'https://bitbucket.org/hsarvell/pl-web/' not found

Just yanking your chain.  I know this is a Mercurial repo.  :)

Thanks, man.  Looks good. I'll study the code when I get time in the
next few days.

On Fri, Jun 26, 2015 at 3:30 PM, Henrik Sarvell hsarv...@gmail.com wrote:
 Hi all, after over a month without any of the prior issues I now
 consider the websockets part of pl-web stable:
 https://bitbucket.org/hsarvell/pl-web Gone are the days of 100% CPU
 usage and zombie processes.

 With Alex's help the main web server is now more stable (he made me
 throw away a few throws in favour of a few byes). The throws were
 causing the zombies.

 I was also including dbg.l (it was causing hung processes at 100%
 CPU); it has basically been deprecated or something, I'll leave it up to
 him to elaborate. It's just something I've been including out of habit,
 ever since I needed it years ago to do some kind of debugging.

 Anyway, atm the WS router is regularly routing up to 40 messages per
 second to upwards of 300-500 clients, which means that roughly 20,000
 messages are being pushed out per second during peak hours.

 The PL processes show up with 0 CPU and 0 RAM usage when I run top,
 sometimes 1% CPU :) They hardly register even in aggregate; the server
 would be running 99% idle if it was only running the WS server.

 To work around the inter-process limit of 4096-byte messages, the
 router now supports storing the messages in Redis (raw disk is also
 supported if Redis is not available). This is also in effect in
 production and has been working flawlessly for months.

 This is how I start the WS server in production:

 (load "pl-web/pl-web.l")

 (setq *Mobj (new '(+Redis) "pl-ws-"))

 (undef 'app)

 (setq *WsAuth '((notifications ((send (put your password/key here))

 (de app ()
(splitPath)
(wsApp)
(bye))

 (de go ()
(wsServer)
(server 9090) )


Re: Websockets now considered stable

2015-06-27 Thread Alexander Burger
Hi Henrik, hi Andreas,

  Question:
  To work around the inter-process limit of 4096 byte long messages the
  router now supports storing the messages in Redis
 
  1) Where does this limit come from?
  POSIX IPC? PicoLisp IPC?

 1) As far as I remember from a discussion with Alex it's a hard
 limit (OS related).


I think this was about the constant PIPE_BUF

/usr/include/linux/limits.h

   #define PIPE_BUF        4096  /* # bytes in atomic write to a pipe */

It used to be just 512 bytes on older Unixes.
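
A quick way to check the value on a given system, sketched in Python (the
constant is what matters here, not the language):

import os

# Writes of up to PIPE_BUF bytes to a pipe are atomic per POSIX; larger writes
# may be interleaved with writes from other processes.
r_fd, w_fd = os.pipe()
print(os.fpathconf(w_fd, "PC_PIPE_BUF"))   # typically 4096 on Linux
os.close(r_fd)
os.close(w_fd)
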
♪♫ Alex


Re: Websockets now considered stable

2015-06-27 Thread Alexander Burger
On Fri, Jun 26, 2015 at 09:30:58PM +0200, Henrik Sarvell wrote:
 I was also including dbg.l (it was causing hung processes at 100%
 CPU); it has basically been deprecated or something, I'll leave it up to
 him to elaborate.

IIRC, the problem was not so much including dbg.l, but starting the
application without stdio redirection to some log file. As a result,
errors or other messages in the background server caused broken pipe
exceptions.

♪♫ Alex


RE: Websockets now considered stable

2015-06-26 Thread andreas
Hi Henrik

Awesome! That's really cool, thank you for your effort and for sharing the code :-)
20k messages per second with nearly zero server load sounds very impressive.

Question:
 To work around the inter-process limit of 4096 byte long messages the
 router now supports storing the messages in Redis

1) Where does this limit come from?
POSIX IPC? PicoLisp IPC?

2) I couldn't find the Redis part in the code, maybe you can give me a hint
where to look?


Thanks, your work on websockets will definitely help me in the future :-)
- beneroth



- Original Message -
From: Henrik Sarvell [mailto:hsarv...@gmail.com]
To: picolisp@software-lab.de
Sent: Fri, 26 Jun 2015 21:30:58 +0200
Subject:

Hi all, after over a month without any of the prior issues I now
consider the websockets part of pl-web stable:
https://bitbucket.org/hsarvell/pl-web Gone are the days of 100% CPU
usage and zombie processes.

With Alex's help the main web server is now more stable (he made me
throw away a few throws in favour of a few byes). The throws were
causing the zombies.

I was also including dbg.l (it was causing hung processes at 100%
CPU); it has basically been deprecated or something, I'll leave it up to
him to elaborate. It's just something I've been including out of habit,
ever since I needed it years ago to do some kind of debugging.

Anyway, atm the WS router is regularly routing up to 40 messages per
second to upwards of 300-500 clients, which means that roughly 20,000
messages are being pushed out per second during peak hours.

The PL processes show up with 0 CPU and 0 RAM usage when I run top,
sometimes 1% CPU :) They hardly register even in aggregate; the server
would be running 99% idle if it was only running the WS server.

To work around the inter-process limit of 4096-byte messages, the
router now supports storing the messages in Redis (raw disk is also
supported if Redis is not available). This is also in effect in
production and has been working flawlessly for months.
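
The indirection can be pictured roughly like this, sketched in Python with
redis-py (the key scheme is made up; the actual PicoLisp implementation is in
pl-web.l, see the pointer later in this thread):

import os
import uuid

import redis

r = redis.Redis()

def send_large(pipe_w, payload):
    # Park the full payload in Redis and push only a short key through the
    # pipe, keeping the write well under the 4096-byte atomic limit.
    key = "pl-ws-msg:" + uuid.uuid4().hex   # made-up key scheme
    r.setex(key, 60, payload)               # expire after 60 s as a safety net
    os.write(pipe_w, (key + "\n").encode())

def recv_large(pipe_r):
    key = os.read(pipe_r, 256).decode().strip()   # the reference itself is tiny
    return r.get(key)                             # fetch the actual message body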

This is how I start the WS server in production:

(load "pl-web/pl-web.l")

(setq *Mobj (new '(+Redis) "pl-ws-"))

(undef 'app)

(setq *WsAuth '((notifications ((send (put your password/key here))

(de app ()
   (splitPath)
   (wsApp)
   (bye))

(de go ()
   (wsServer)
   (server 9090) )



Re: Websockets now considered stable

2015-06-26 Thread Henrik Sarvell
Hi Andreas.

1) As far as I remember from a discussion with Alex it's a hard
limit (OS related).

2) Lines 369-372 here:
https://bitbucket.org/hsarvell/pl-web/src/c445ca3861159d0b28ea779a183572c91b7b8458/pl-web.l?at=default



On Fri, Jun 26, 2015 at 9:50 PM,  andr...@itship.ch wrote:
 Hi Henrik

 Awesome! That's really cool, thank you for your effort and for sharing the
 code :-)
 20k messages per second with nearly zero server load sounds very impressive.

 Question:
 To work around the inter-process limit of 4096 byte long messages the
 router now supports storing the messages in Redis

 1) Where does this limit come from?
 POSIX IPC? PicoLisp IPC?

 2) I couldn't find the Redis part in the code, maybe you can give me a hint
 where to look?


 Thanks, your work on websockets will definitely help me in the future :-)
 - beneroth



 - Original Message -
 From: Henrik Sarvell [mailto:hsarv...@gmail.com]
 To: picolisp@software-lab.de
 Sent: Fri, 26 Jun 2015 21:30:58 +0200
 Subject:

 Hi all, after over a month without any of the prior issues I now
 consider the websockets part of pl-web stable:
 https://bitbucket.org/hsarvell/pl-web Gone are the days of 100% CPU
 usage and zombie processes.

 With Alex's help the main web server is now more stable (he made me
 throw away a few throws in favour of a few byes). The throws were
 causing the zombies.

 I was also including dbg.l (it was causing hung processes at 100%
 CPU); it has basically been deprecated or something, I'll leave it up to
 him to elaborate. It's just something I've been including out of habit,
 ever since I needed it years ago to do some kind of debugging.

 Anyway, atm the WS router is regularly routing up to 40 messages per
 second to upwards of 300-500 clients, which means that roughly 20,000
 messages are being pushed out per second during peak hours.

 The PL processes show up with 0 CPU and 0 RAM usage when I run top,
 sometimes 1% CPU :) They hardly register even in aggregate; the server
 would be running 99% idle if it was only running the WS server.

 To work around the inter-process limit of 4096-byte messages, the
 router now supports storing the messages in Redis (raw disk is also
 supported if Redis is not available). This is also in effect in
 production and has been working flawlessly for months.

 This is how I start the WS server in production:

 (load "pl-web/pl-web.l")

 (setq *Mobj (new '(+Redis) "pl-ws-"))

 (undef 'app)

 (setq *WsAuth '((notifications ((send (put your password/key here))

 (de app ()
(splitPath)
(wsApp)
(bye))

 (de go ()
(wsServer)
(server 9090) )