[web2py] Re: Web2py and threads

John Heenan Wed, 25 Aug 2010 15:13:47 -0700

Linux lost a PR war over the exact same issue that I am addressing
now.

The point is to be aware of the scalability issues that arise when
services use far more OS resources than required, and at greater
risk.

The tiny amount of time a single thread, thread A, may be blocked
during a read is not the point. The OS is still going to block and do
expensive context switches that allow other threads to run before
examining if thread A should be allowed to unblock. If the server is
heavily loaded then it could be quite some time before thread A
unblocks due to the large amount of total time other unblocked threads
consume before they block again.
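
As a rough illustration of the thread-per-read pattern described above
(a minimal sketch, not web2py's actual code; `read_in_threads`,
`demo_threads` and the file names are hypothetical), each worker thread
below blocks in the OS during its read while the scheduler decides
which thread runs next:

```python
import os
import tempfile
import threading


def read_in_threads(paths):
    """Read each file in its own thread; every read is a blocking call."""
    results = {}
    lock = threading.Lock()

    def worker(path):
        # Blocking read: this thread waits in the OS until the data is
        # available, and other threads may be scheduled in the meantime.
        with open(path, "rb") as f:
            data = f.read()
        with lock:
            results[path] = len(data)

    threads = [threading.Thread(target=worker, args=(p,)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def demo_threads():
    """Create three small temporary files and read them concurrently."""
    with tempfile.TemporaryDirectory() as d:
        paths = []
        for i in range(3):
            p = os.path.join(d, "view%d.html" % i)
            with open(p, "wb") as f:
                f.write(b"x" * ((i + 1) * 10))
            paths.append(p)
        sizes = read_in_threads(paths)
        return [sizes[p] for p in paths]


if __name__ == "__main__":
    print(demo_threads())
```

The cost being discussed is not visible in a toy like this; it only
shows where the blocking call sits and that one thread is spent per
outstanding read.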

However, in the context of how the Python GIL works, it is not even
as simple as above because the GIL imposes another layer.

In objective tests, having the OS notify the application that an
OS-mediated action is ready beats using multiple threads that just
become unblocked. For example, Lighttpd beats Apache when serving
static files with lots of concurrent connections. Lighttpd can serve up
to 10,000 concurrent connections. Lighttpd uses event notification to
service http requests, which lets Lighttpd know when the OS is finished
with some action that would block a thread if actioned synchronously.
Apache wastefully uses a separate thread for each separate request and
allows the OS to unblock threads that expect synchronous action.
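
The event notification approach can be sketched in Python with the
standard `selectors` module, which wraps epoll, kqueue or select
depending on the platform. This is only an illustration under
assumptions, not Lighttpd's implementation: `serve_ready_sockets` and
`demo_events` are hypothetical names, and local socket pairs stand in
for HTTP connections.

```python
import selectors
import socket


def serve_ready_sockets(server_sides):
    """Handle many sockets in one thread, acting only on ready ones."""
    sel = selectors.DefaultSelector()
    replies = {}
    for name, sock in server_sides.items():
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ, data=name)

    pending = len(server_sides)
    while pending:
        # The OS reports which sockets are readable; nothing blocks
        # on any single connection.
        events = sel.select(timeout=1.0)
        if not events:
            break  # nothing became ready; give up in this sketch
        for key, _mask in events:
            data = key.fileobj.recv(1024)
            replies[key.data] = data.upper()
            sel.unregister(key.fileobj)
            pending -= 1
    sel.close()
    return replies


def demo_events():
    """Three fake clients send a message; one thread answers them all."""
    clients, servers = {}, {}
    for name in ("a", "b", "c"):
        client, server = socket.socketpair()
        clients[name] = client
        servers[name] = server
    for name, client in clients.items():
        client.sendall(("hello from " + name).encode())
    replies = serve_ready_sockets(servers)
    for sock in list(clients.values()) + list(servers.values()):
        sock.close()
    return replies


if __name__ == "__main__":
    print(demo_events())
```

Note that this covers sockets only. Ordinary disk files are generally
always reported as ready by select-style interfaces, so file reads need
a different mechanism to avoid blocking.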

Years ago there was a big PR war between Windows and Linux. There was
a hell of a lot riding on the outcome of web server scaling tests and
the war got an awful lot of attention in the computer media. Linux had
God-like status. The Linux zealots were absolutely convinced tests
would show Linux would beat what they regarded as inferior Windows
bloat and poor design. They had the journalists convinced and everyone
else convinced the tests would be just a formality that would prove
Linux was superior and would formally justify the utter contempt
Windows was held in. Linux advocates got their best people. Linux
lost. After the tests Linux lost its God-like status in the media,
really just became another sideshow and never recovered.

So what went wrong for Linux? Simple I reckon. Microsoft had an API
that allowed the OS to notify a process asynchronously that data was
ready. Linux at the time did not and relied on its antique UNIX select
function to allow threads to block. At small scale there is little
effective difference. However Apache was unable to scale as well as
the Microsoft IIS web server.

John Heenan

On Aug 26, 6:46 am, mdipierro <[email protected]> wrote:
> The time to execute a typical web2py action on my server is 10-20ms. The
> time to open a file or write a small file is so small that it is not
> measurable. I am not sure I believe there is any issue here or perhaps
> I do not understand the problem. Can you provide a test case?
>
> On Aug 25, 2:11 pm, John Heenan <[email protected]> wrote:
>
> > Even with file reading there is no way the disk drive, its controllers
> > and various buses can keep up with the CPU.
>
> > Hence reading any file from disk will cause the OS to intervene and
> > block (reading a template view file, controller file or otherwise),
> > albeit a 'short' time.
>
> > Here are two choices.
>
> > 1) Put the read function in a thread, wait for the thread to unblock
> > and continue servicing any other threads that are no longer blocked.
> > This is what web2py does using a Python file read. The OS
> > automatically provides thread scheduling.
>
> > 2) Tell the OS to pass a message to an event callback when the OS is
> > ready. A separate thread is not required if the application chooses to
> > process its own message queue as the thread in effect simply relays
> > the message on.
>
> > There are of course other blocks: file writing, database access and
> > network read and write (including from/to the http request PC).
>
> > It is tempting to say it is not rocket science.
>
> > Anyway the main message is being aware of the plumbing and avoiding
> > blind religious type fixations is important for long term planning and
> > scalability issues.
>
> > We really need to face up to realities that seeing Python as a black
> > box type total solution is not healthy.
>
> > John Heenan
>
> > On Aug 26, 2:27 am, mdipierro <[email protected]> wrote:
>
> > > On Aug 25, 11:00 am, Phyo Arkar <[email protected]> wrote:
>
> > > > > Did I read that reading files inside a controller will block web2py?
> > > > > Does it?
>
> > > > No, web2py does not block. web2py only locks sessions, which means one
> > > > user cannot request two concurrent pages because there would be a race
> > > > condition in saving sessions. Two users can request different pages
> > > which open the same file unless the file is explicitly locked by your
> > > code.
>
> > > > > That's bad news. I am doing a file crawler and while crawling,
> > > > > web2py is blocked even though the process takes only 25% of 1 out of 4
> > > > > CPUs.
>
> > > Tell us more or I cannot help.
>
> > > > On 8/25/10, pierreth <[email protected]> wrote:
>
> > > > > I would appreciate a good reference to understand the concepts you are
> > > > > talking about. It is something new to me and I don't understand.
>
> > > > > On 25 août, 11:22, John Heenan <[email protected]> wrote:
> > > > > >> No, nothing that abstract. Using WSGI forces a new thread for each
> > > > > >> request. This is a simple and inefficient brute force approach that
> > > > > >> really only suits the simplest Python applications and where only a
> > > > > >> small number of concurrent connections might be expected.
>
> > > > >> Any application that provides web services is going to OS block on
> > > > >> file reading (and writing) and on database access. Using threads is a
> > > > >> classic and easy way out that carries a lot of baggage. Windows has
> > > > >> had a way out of this for years with its asynch (or event)
> > > > >> notification set up through an OVERLAPPED structure.
>
> > > > > >> Lighttpd makes use of efficient event notification schemes like
> > > > > >> kqueue and epoll. Apache only uses such schemes for listening and
> > > > > >> Keep-Alives.
>
> > > > > >> No matter how careful one is with threads and processes there always
> > > > > >> appear to be unexpected gotchas. Python has a notorious example, the
> > > > > >> now fixed 'Beazley Effect' that affected the GIL. Also I don't think
> > > > > >> there is a single experienced Python user that trusts the GIL.
>
> > > > >> John Heenan
