Re: squid-smp: synchronization issue & solutions

2009-11-17 Thread Adrian Chadd
Plenty of kernels nowadays do a bit of TCP and socket processing in
process/thread context, so you need to do your socket TX/RX in
different processes/threads to get parallelism on the networking side
of things.

You could fake it somewhat by pushing socket IO into different threads
but then you have all the overhead of shuffling IO and completed IO
between threads. This may be .. complicated.
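
A minimal sketch of that shuffling, assuming a mutex-protected queue is what carries completed IO from an IO thread back to the main thread. HandoffQueue is a hypothetical name, not an existing Squid class; the locking and wake-up on every item are the overhead in question.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical hand-off queue (not Squid code): one thread pushes completed
// IO results, another thread pops them. Every push/pop takes the mutex.
template <typename T>
class HandoffQueue {
public:
    // Called by the producing thread when an IO operation completes.
    void push(T item) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            items_.push(std::move(item));
        }
        ready_.notify_one();  // wake one waiting consumer, if any
    }

    // Called by the consuming thread; blocks until an item is available.
    T pop() {
        std::unique_lock<std::mutex> guard(mutex_);
        ready_.wait(guard, [this] { return !items_.empty(); });
        T item = std::move(items_.front());
        items_.pop();
        return item;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
};
```

Items come out in the order they were pushed. A real design would also need a shutdown signal so that a blocked pop() can return.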


Adrian

2009/11/18 Gonzalo Arana :
> On Tue, Nov 17, 2009 at 12:45 PM, Alex Rousskov wrote:
>> On 11/17/2009 04:09 AM, Sachin Malave wrote:
>>
>> 
>>
>>> I AM THINKING ABOUT HYBRID OF BOTH...
>>>
>>> Somebody might implement process model, Then we would merge both
>>> process and thread models .. together we could have a better squid..
>>> :)
>>> What do u think? 
>
> In my limited squid experience, cpu usage is hardly a bottleneck.  So,
> why not just use smp for the cpu/disk-intensive parts?
>
> The candidates I can think of are:
>  * evaluating regular expressions (url_regex acls).
>  * aufs/diskd (squid already has support for this).
>
> Best regards,
>
> --
> Gonzalo A. Arana
>
>


Re: squid-smp: synchronization issue & solutions

2009-11-17 Thread Robert Collins
On Tue, 2009-11-17 at 15:49 -0300, Gonzalo Arana wrote:


> In my limited squid experience, cpu usage is hardly a bottleneck.  So,
> why not just use smp for the cpu/disk-intensive parts?
> 
> The candidates I can think of are:
>   * evaluating regular expressions (url_regex acls).
>   * aufs/diskd (squid already has support for this).

So, we can drive squid to 100% CPU in production high load environments.
To scale further we need:
 - more CPUs
 - more performance from the CPUs we have

Adrian is working on the latter, and the SMP discussion is about the
former. Simply putting each request in its own thread would go a long
way towards getting much more bang for buck - but that's not actually
trivial to do :)

-Rob




Re: squid-smp: synchronization issue & solutions

2009-11-17 Thread Sachin Malave
On Tue, Nov 17, 2009 at 9:15 PM, Alex Rousskov wrote:
> On 11/17/2009 04:09 AM, Sachin Malave wrote:
>
>>> After spending 2 minutes on openmp.org, I am not very excited about
>>> using OpenMP. Please correct me if I am wrong, but OpenMP seems to be:
>>>
>>> - An "approach" or "model" requiring compiler support and language
>>> extensions. It is _not_ a library. Your examples with #pragmas are a good
>>> illustration.
>
>> Important features of OPENMP you might be interested in...
>>
>> ** If your compiler does not support OPENMP then you don't have to do
>> anything special. The compiler simply ignores these #pragmas
>> and runs the code as if it were a sequential single-threaded program,
>> without affecting the end goal.
>>
>> ** Programmers need not create any locking mechanism or worry
>> about critical sections.
>>
>> ** By default it creates a number of threads equal to the number of
>> processors (* cores per processor) in your system.
>
> All of the above make me think that OPENMP-enabled Squid may be
> significantly slower than multi-instance Squid. I doubt OPENMP is so
> smart that it can correctly and efficiently orchestrate the work of
> Squid "threads" that are often not even visible/identifiable in the
> current code.
>
>>> - Designed for parallelizing computation-intensive programs such as
>>> various math models running on massively parallel computers. AFAICT, the
>>> OpenMP steering group is comprised of folks that deal with such models
>>> in such environments. Our environment and performance goals are rather
>>> different.
>>>
>>
>> But that doesn't mean that we cannot have independent threads,
>
> It means that there is a high probability that it will not work well for
> other, very different, problem areas. It may work, but not work well enough.
>
>>> I think our first questions should instead include:
>>>
>>> Q1. What are the major areas or units of asynchronous code execution?
>>> Some of us may prefer large areas such as "http_port acceptor" or
>>> "cache" or "server side". Others may root for AsyncJob as the largest
>>> asynchronous unit of execution. These two approaches and their
>>> implications differ a lot. There may be other designs worth considering.
>>>
>>
>> See my sample code, which I sent in my last mail.. There I have separated out
>> the schedule() and dial() functions, where one thread is registering
>> calls in AsyncCallQueue and another is dispatching them..
>> Well, we can concentrate on other areas also
>
> schedule() and dial() are low level routines that are irrelevant for Q1.
>
>>> Q2. Threads versus processes. Depending on Q1, we may have a choice. The
>>> choice will affect the required locking mechanism and other key decisions.
>>>
>>
>> If you are planning to use processes then it is as good as running
>> multiple squids on single machine..,
>
> I am not planning to use processes yet, but if they are indeed as good
> as running multiple Squids, that is a plus. Hopefully, we can do better
> than multi-instance Squid, but we should be at least as bad/good.
>
>
>> The only thing is they must be
>> accepting requests on different ports... But if we want to distribute
>> a single squid's work then I feel threading is the best choice..
>
> You can have a process accepting a request and then forwarding the work
> to another process or receiving a cache hit from another process.
> Inter-process communication is slower than inter-thread communication,
> but it is not impossible.
>
>
>> I AM THINKING ABOUT HYBRID OF BOTH...
>>
>> Somebody might implement process model, Then we would merge both
>> process and thread models .. together we could have a better squid..
>> :)
>> What do u think? 
>
> I doubt we have the resources to do a generic process model so I would
> rather decide on a single primary direction (processes or threads) and
> try to generalize that later if needed. However, a process (if we decide
> to go down that route) may still have lower-level threads, but that is a
> secondary question/decision.
>

OK then, please be precise.
What exactly are you thinking?
Tell me the areas where I should concentrate.
I want to know what exactly is going on in your mind so that I can
start working and experimenting in that direction ... :)

Meanwhile I will also keep experimenting with threading, as I am doing
right now. It should help me when we start actual development. Is that
OK?


Thanx..



> Cheers,
>
> Alex.
>



-- 
Mr. S. H. Malave
Computer Science & Engineering Department,
Walchand College of Engineering,Sangli.
sachinmal...@wce.org.in


Re: squid-smp: synchronization issue & solutions

2009-11-17 Thread Gonzalo Arana
On Tue, Nov 17, 2009 at 12:45 PM, Alex Rousskov wrote:
> On 11/17/2009 04:09 AM, Sachin Malave wrote:
>
> 
>
>> I AM THINKING ABOUT HYBRID OF BOTH...
>>
>> Somebody might implement process model, Then we would merge both
>> process and thread models .. together we could have a better squid..
>> :)
>> What do u think? 

In my limited squid experience, cpu usage is hardly a bottleneck.  So,
why not just use smp for the cpu/disk-intensive parts?

The candidates I can think of are:
  * evaluating regular expressions (url_regex acls).
  * aufs/diskd (squid already has support for this).

Best regards,

-- 
Gonzalo A. Arana


Odd shadowing in client_side_reply.cc

2009-11-17 Thread Kinkie
Hi all,
   in trunk client_side_reply.cc:357 we define a local "old_rep"
variable, which is shadowed 8 lines later in a nested block, in case
of a not-modified answer, but then referenced again in line 396.
This looks a bit suspicious to me, but I don't have enough knowledge
about how that data-path is supposed to behave to properly understand
and possibly fix it.
Can anyone look into it?

Thanks

-- 
/kinkie
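
A minimal sketch of the shadowing pattern described above. This is not the actual client_side_reply.cc code: Reply and statusAfterNestedBlock are hypothetical names chosen for illustration. It shows why a later reference to the outer variable does not see anything done through the inner one.

```cpp
#include <string>

// Hypothetical stand-in for a reply object; not Squid's real type.
struct Reply {
    std::string status;
};

// Returns the status seen by code that runs after the nested block.
std::string statusAfterNestedBlock(bool notModified)
{
    Reply outer{"200"};
    Reply *old_rep = &outer;  // outer declaration

    if (notModified) {
        Reply inner{"304"};
        Reply *old_rep = &inner;  // shadows the outer old_rep in this block only
        old_rep->status = "304-handled";  // modifies inner, not outer
    }

    // old_rep names the outer variable again here, so this is always "200".
    return old_rep->status;
}
```

Compiling with gcc's or clang's -Wshadow option flags declarations like the inner old_rep.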


Re: squid-smp: synchronization issue & solutions

2009-11-17 Thread Alex Rousskov
On 11/17/2009 04:09 AM, Sachin Malave wrote:

>> After spending 2 minutes on openmp.org, I am not very excited about
>> using OpenMP. Please correct me if I am wrong, but OpenMP seems to be:
>>
>> - An "approach" or "model" requiring compiler support and language
>> extensions. It is _not_ a library. Your examples with #pragmas are a good
>> illustration.

> Important features of OPENMP you might be interested in...
> 
> ** If your compiler does not support OPENMP then you don't have to do
> anything special. The compiler simply ignores these #pragmas
> and runs the code as if it were a sequential single-threaded program,
> without affecting the end goal.
> 
> ** Programmers need not create any locking mechanism or worry
> about critical sections.
> 
> ** By default it creates a number of threads equal to the number of
> processors (* cores per processor) in your system.

All of the above make me think that OPENMP-enabled Squid may be
significantly slower than multi-instance Squid. I doubt OPENMP is so
smart that it can correctly and efficiently orchestrate the work of
Squid "threads" that are often not even visible/identifiable in the
current code.

>> - Designed for parallelizing computation-intensive programs such as
>> various math models running on massively parallel computers. AFAICT, the
>> OpenMP steering group is comprised of folks that deal with such models
>> in such environments. Our environment and performance goals are rather
>> different.
>>
> 
> But that doesn't mean that we cannot have independent threads,

It means that there is a high probability that it will not work well for
other, very different, problem areas. It may work, but not work well enough.

>> I think our first questions should instead include:
>>
>> Q1. What are the major areas or units of asynchronous code execution?
>> Some of us may prefer large areas such as "http_port acceptor" or
>> "cache" or "server side". Others may root for AsyncJob as the largest
>> asynchronous unit of execution. These two approaches and their
>> implications differ a lot. There may be other designs worth considering.
>>
> 
> See my sample code, which I sent in my last mail.. There I have separated out
> the schedule() and dial() functions, where one thread is registering
> calls in AsyncCallQueue and another is dispatching them..
> Well, we can concentrate on other areas also

schedule() and dial() are low level routines that are irrelevant for Q1.

>> Q2. Threads versus processes. Depending on Q1, we may have a choice. The
>> choice will affect the required locking mechanism and other key decisions.
>>
> 
> If you are planning to use processes then it is as good as running
> multiple squids on single machine..,

I am not planning to use processes yet, but if they are indeed as good
as running multiple Squids, that is a plus. Hopefully, we can do better
than multi-instance Squid, but we should be at least as bad/good.


> The only thing is they must be
> accepting requests on different ports... But if we want to distribute
> a single squid's work then I feel threading is the best choice..

You can have a process accepting a request and then forwarding the work
to another process or receiving a cache hit from another process.
Inter-process communication is slower than inter-thread communication,
but it is not impossible.
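
A minimal sketch of such a hand-off between two processes, assuming a POSIX system and a socketpair as the channel. forwardToWorker is a hypothetical helper, not Squid code, and it assumes each message is short enough to arrive in a single read().

```cpp
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not Squid code): an "acceptor" process hands one short
// request to a forked "worker" process over a socketpair and reads the reply.
// Assumes request and reply each fit in a single read() call.
std::string forwardToWorker(const std::string &request)
{
    if (request.empty())
        return std::string();  // nothing to forward

    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return std::string();

    const pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return std::string();
    }

    if (pid == 0) {
        // Worker: read the request, send back a reply, exit.
        close(fds[0]);
        char buf[256];
        const ssize_t n = read(fds[1], buf, sizeof(buf));
        if (n > 0) {
            const std::string reply =
                "handled:" + std::string(buf, static_cast<size_t>(n));
            if (write(fds[1], reply.data(), reply.size()) < 0)
                _exit(1);
        }
        close(fds[1]);
        _exit(0);
    }

    // Acceptor: forward the request and wait for the worker's reply.
    close(fds[1]);
    std::string reply;
    if (write(fds[0], request.data(), request.size()) >= 0) {
        char buf[512];
        const ssize_t n = read(fds[0], buf, sizeof(buf));
        if (n > 0)
            reply.assign(buf, static_cast<size_t>(n));
    }
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return reply;
}
```

Each request here costs a fork plus two copies through the kernel. A long-lived worker would avoid the fork, but the copies remain, which is the cost relative to threads sharing memory.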


> I AM THINKING ABOUT HYBRID OF BOTH...
> 
> Somebody might implement process model, Then we would merge both
> process and thread models .. together we could have a better squid..
> :)
> What do u think? 

I doubt we have the resources to do a generic process model so I would
rather decide on a single primary direction (processes or threads) and
try to generalize that later if needed. However, a process (if we decide
to go down that route) may still have lower-level threads, but that is a
secondary question/decision.

Cheers,

Alex.


Re: squid-smp: synchronization issue & solutions

2009-11-17 Thread Sachin Malave
On Mon, Nov 16, 2009 at 9:43 PM, Alex Rousskov wrote:
> On 11/15/2009 11:59 AM, Sachin Malave wrote:
>
>> For the last few days I have been analyzing squid code for smp support. I found
>> one big issue regarding the debugs() function. It is very hard to get rid of
>> this issue as it appears almost everywhere in the code. So for
>> testing purposes I have disabled the debug option in squid.conf as
>> follows
>>
>> ---
>> debug_options 0,0
>> ---
>>
>> Well, this was the only way, as I did not want to spend time on this issue.
>
> You can certainly disable any feature as an intermediate step as long as
> the overall approach allows for the later efficient support of the
> temporary disabled feature. Debugging is probably the worst feature to
> disable though because without it we do not know much about Squid operation.
>
I agree, we should find a way to re-enable this feature. It is
temporarily disabled...
Of course, locking debugs() was not the solution; that's why it is disabled...


>
>> Now concentrating on locking mechanism...
>
> I would not recommend starting with such low-level decisions as locking
> mechanisms. We need to decide what needs to be locked first. AFAIK,
> there is currently no consensus whether we start with processes or
> threads, for example. The locking mechanism would depend on that.
>


>
>> As OpenMP library is widely supported by almost all platforms and
>> compilers, I am taking the locking mechanism from it.
>> Just include omp.h and compile the code with the -fopenmp option if using gcc.
>> Others may use a similar thing on their platform. Well, that is not a big
>> issue..


>
> After spending 2 minutes on openmp.org, I am not very excited about
> using OpenMP. Please correct me if I am wrong, but OpenMP seems to be:
>
> - An "approach" or "model" requiring compiler support and language
> extensions. It is _not_ a library. Your examples with #pragmas are a good
> illustration.
>

We have to use something to create and manage threads. There are some
other libraries and models as well, but I feel we need something that will
work on all platforms.
Important features of OPENMP you might be interested in...

** If your compiler does not support OPENMP then you don't have to do
anything special. The compiler simply ignores these #pragmas
and runs the code as if it were a sequential single-threaded program,
without affecting the end goal.

** Programmers need not create any locking mechanism or worry
about critical sections.

** By default it creates a number of threads equal to the number of
processors (* cores per processor) in your system.

** Its fork and join model is scalable.. (Of course, we must find
such areas in the existing code.)

** OPENMP is OLD but still growing, providing new features with new
releases.. Think about other threading libraries: I think their
development has stopped, some of them are not freely available, and some
of them are available only on WINDOWS..

** IT IS FREE and OPEN-SOURCE like us..

** INTEL has just released TBB (Threading Building Blocks), but I
doubt its performance on AMD (non-Intel) hardware.

** You might be thinking about old Pthreads, but I think OPENMP is
very safe and better than pthreads for programmers,

SPECIALLY ONE WHO IS MAKING CHANGES IN EXISTING CODE, and it is easy to debug.

 Please think about my last point... :)
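
A minimal sketch of the first point, assuming gcc-style handling of unknown pragmas. sumAll is a hypothetical function, not taken from the sample code mentioned in this thread. Without -fopenmp the pragma is ignored and the loop runs sequentially; with -fopenmp the iterations are split across threads. The shared accumulator still has to be declared in a reduction clause (or protected with a critical section), so shared data is not handled automatically.

```cpp
#include <vector>

// Hypothetical example: sums the elements of a vector.
// The loop runs in parallel only when the compiler has OpenMP enabled.
long sumAll(const std::vector<int> &values)
{
    long total = 0;
    const long count = static_cast<long>(values.size());

    // Each thread gets a private copy of total; the copies are added
    // together at the end of the loop because of the reduction clause.
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < count; ++i)
        total += values[static_cast<std::vector<int>::size_type>(i)];

    return total;
}
```

The result is the same either way. Only the number of threads doing the work differs.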






> - Designed for parallelizing computation-intensive programs such as
> various math models running on massively parallel computers. AFAICT, the
> OpenMP steering group is comprised of folks that deal with such models
> in such environments. Our environment and performance goals are rather
> different.
>

But that doesn't mean that we cannot have independent threads. The only
thing is that we have to start these threads in main(), because main
never ends.. Otherwise those independent threads will die after
returning to the calling function..



>
>> 1. hash_link  LOCKED
>>
>> 2. dlink_list  LOCKED
>>
>> 3. ipcache, fqdncache   LOCKED,
>>
>> 4. FD / fde handling ---WELL, SEEMS NOT CREATING PROBLEM, If any then
>> please discuss.
>>
>> 5. statistic counters --- NOT LOCKED ( I know this is very important,
>> But these are scattered all around squid code, Right now they may be
>> holding wrong values)
>>
>> 6. memory manager --- DID NOT FOLLOW
>>
>> 7. configuration objects --- DID NOT FOLLOW
>
> I worry that the end result of this exercise would produce a slow and
> buggy Squid for several reasons:
>
> - Globally locking low-level but interdependent objects is likely to
> create deadlocks when two or more locked objects need to lock other
> locked objects in a circular fashion.
>

Is there any other option? As discussed, Amos is trying to make these
areas as independent as possible, so that we would have less locking
in the code.
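
A minimal sketch of one way around the circular-locking concern quoted above, assuming two structures that each carry their own mutex. LockedTable and moveEntry are hypothetical names, not Squid code. If one thread locks A then B while another locks B then A, both can wait forever; std::lock acquires both mutexes with a deadlock-avoidance algorithm, so the argument order does not matter.

```cpp
#include <mutex>

// Hypothetical stand-in for an independently locked structure
// (for example a hash table or a list); not Squid code.
struct LockedTable {
    std::mutex lock;
    int entries = 0;
};

// Moves one entry from src to dst. src and dst must be distinct objects.
void moveEntry(LockedTable &src, LockedTable &dst)
{
    // Acquire both locks together instead of one after the other.
    std::lock(src.lock, dst.lock);
    std::lock_guard<std::mutex> srcGuard(src.lock, std::adopt_lock);
    std::lock_guard<std::mutex> dstGuard(dst.lock, std::adopt_lock);

    --src.entries;
    ++dst.entries;
}
```

This removes the ordering problem for the pair, but every call still takes two locks, so it does not address the contention concern.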

> - Locking low-level objects without an overall performance-aware plan is
> likely to result in performance-killing competition for critical locks.
> I believe that with