Re: [MERGE] SMP implementation, part 1

2010-07-01 Thread Alex Rousskov
On 07/01/2010 01:31 PM, Mark Nottingham wrote:
> On 30/06/2010, at 6:46 PM, Alex Rousskov wrote:
> 
>> ICP and HTCP _servers_: share listening sockets.

> Am I right that currently, it's effectively a random process that
> handles the query or CLR, and no state / effects are shared among
> processes?

Yes, in this patch, no caching state is shared among worker processes.

By default, the OS decides which process gets to read from a shared
listening socket but you can segregate processes using process-dedicated
listening addresses in squid.conf.

The current implementation makes SMP Squid similar to a Squid farm
behind a load balancer with no cache affinity guarantees. Perfect for
non-caching Squids and inappropriate for Squids that must coordinate
caching activities (everything in between is a gray area).
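
To illustrate the "farm with no cache affinity" point, here is a minimal sketch, not Squid code. The Worker type and countMisses() are hypothetical names, and round-robin dispatch merely stands in for the OS picking whichever process reads from the shared socket. Each worker keeps a private cache, so a URL already cached by one worker is still a miss for the others.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical model: each worker owns a private cache; nothing is shared.
struct Worker {
    std::unordered_set<std::string> cache;

    // Returns true on a cache hit; on a miss, stores the URL (a simulated fetch).
    bool handle(const std::string &url) {
        return !cache.insert(url).second;
    }
};

// Sends the same URL `requests` times, rotating through the workers, and
// counts how many requests missed the handling worker's private cache.
int countMisses(std::size_t workerCount, std::size_t requests, const std::string &url) {
    std::vector<Worker> workers(workerCount);
    int misses = 0;
    for (std::size_t i = 0; i < requests; ++i) {
        if (!workers[i % workerCount].handle(url))
            ++misses;
    }
    return misses;
}
```

With one worker, four requests for the same URL cost one miss; with four workers, each of them misses once.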

HTH,

Alex.


Re: [MERGE] SMP implementation, part 1

2010-07-01 Thread Mark Nottingham

On 30/06/2010, at 6:46 PM, Alex Rousskov wrote:

>   ICP and HTCP _servers_: share listening sockets.


Am I right that currently, it's effectively a random process that handles the 
query or CLR, and no state / effects are shared among processes?

Cheers,


--
Mark Nottingham   m...@yahoo-inc.com




Re: [MERGE] SMP implementation, part 1

2010-07-01 Thread Amos Jeffries

Alex Rousskov wrote:

Hello,

Attached is the first part of the project making Squid SMP-scalable.
The implementation follows SMP design discussed a few months ago (quoted
below). The patch contains most of the promised Phase1 features. There
is more code brewing in the lab, but the attached code is ready to be
synced and merged with trunk. With these changes, you can run Squid in
SMP mode, utilizing many CPU cores, provided you do not mind cache
manager and caching inconsistencies.

A brief list of changes and to-dos is provided below. Besides the
acceptance review, I need help with these two questions:

1. Should we rename the squid.conf option specifying the number of
processes from "main_processes" to "worker_processes" or even "workers"?
The forked processes do almost everything the old daemon process did,
but that will change as we add more specialized processes. There is also
a special "Coordinator" process that is launched automatically in SMP
mode. It does not handle regular HTTP transactions.


"Workers" brings to mind the right meaning, given its use by Apache and 
other software. +1 from me on calling it worker_processes (we will have 
threads later).




2. We had to clone comm_openex into comm_open_uds because Unix Domain
Sockets do not use IP addresses. They use file names. We can unify and
simplify related code if we add the file name (struct sockaddr_un) to
Ip::Address. IIRC, Amos and I discussed this on IRC and decided that it
is OK to add sockaddr_un to Ip::Address (because it is used as a general
"network address" anyway), but I would prefer to hear more opinions
before altering Ip::Address.
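
For readers unfamiliar with the difference, a minimal POSIX sketch of a Unix Domain Socket address follows. It is not taken from the patch: makeUdsAddress() is a hypothetical helper, and it only shows that the "address" is a file name stored in sockaddr_un rather than an IP and port.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <sys/un.h>

// Hypothetical helper: fills a sockaddr_un for the given file name.
// Returns false if the name is empty or does not fit in sun_path.
bool makeUdsAddress(const std::string &path, sockaddr_un &addr) {
    std::memset(&addr, 0, sizeof(addr));
    if (path.empty() || path.size() >= sizeof(addr.sun_path))
        return false;
    addr.sun_family = AF_UNIX;
    // Copy the name including its terminating NUL byte.
    std::memcpy(addr.sun_path, path.c_str(), path.size() + 1);
    return true;
}
```

The resulting structure would be passed to bind() or connect() with sizeof(sockaddr_un) as the length, which is why it does not fit an interface that expects an IP address.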


Change log:

* Added main_processes squid.conf option to specify how many worker
processes to fork and maintain. Zero means old no-daemon mode. One means
the old non-SMP mode.

* Added support for process_name and process_number macros and
if-statement conditionals in squid.conf. Search for .pre changes for
documented details. These features allow the admin to configure each
worker process differently if needed.


Regarding un-bracketed assignments in conditionals: as Henrik keeps 
reminding me, these fail in some compile modes.
 You are using a number of if(a=b) and when(a=b) statements, 
particularly in the parser. These will need to become if((a=b)) etc.
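
A small sketch of the bracketed form, for reference. valueLength() is a made-up function, not parser code from the patch. The assumption here is that the failing "compile modes" are builds where a warning such as GCC's -Wparentheses is promoted to an error.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns the length of the text after the first '=' in a "name=value"
// token, or 0 when the token contains no '='.
std::size_t valueLength(const char *token) {
    const char *eq;
    // The extra parentheses mark the assignment as intentional, so a
    // compiler that warns about "if (a = b)" accepts this form.
    if ((eq = std::strchr(token, '=')))
        return std::strlen(eq + 1);
    return 0;
}
```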




* Support multiple workers listening on the same HTTP[S] port (port
sharing). This allows multiple workers to split the load without any
special rules.


You add a comment question:
  fatal("No HTTP or HTTPS ports configured"); // defaults prohibit this?

... not quite. The defaults make an entry in the config for new 
installs; if it's removed, or an old 2.5 config without a port is used, 
that error case still occurs.


The MacOS people have an open bug requesting that the Mac-service -I 
option be ported to 3.x, which will mean no port is configured by 
default; instead, an open socket is passed in on stdin to the master 
process later.




* Support or prohibit port sharing for WCCP, DNS, ICP, HTCP, SNMP, and
Ident protocols, depending on protocol-specific restrictions. Sharing is
implemented by registering listening socket descriptors with the
Coordinator process and obtaining them from the Coordinator as needed.
Here are protocol-specific notes:

   WCCP: Restricted to the Coordinator process due to how WCCP works.
   Workers do not need access to the WCCP code.

   DNS: Done by each worker with no sharing. Fixed source ports not
   supported unless each worker is given its own outgoing address
   because we do not want to match outgoing queries and incoming
   responses across processes.


This is a good reason for adding the often pondered dns_outgoing_address 
directive. It solves a few issues in other low-priority use cases as well.
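
To make the query/response matching problem concrete, here is a minimal sketch. PendingQueries is a hypothetical class, not Squid's DNS code. Each process remembers only the queries it sent itself, so a response that the OS delivers to a different process on a shared port finds no matching entry there.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical per-process table of outstanding queries, keyed by query ID.
class PendingQueries {
public:
    // Records a query this process has just sent.
    void sent(std::uint16_t id, const std::string &name) { pending_[id] = name; }

    // Matches a response to its query. Returns false for unknown IDs,
    // which is what a process sees for another process's query.
    bool received(std::uint16_t id, std::string &name) {
        const auto it = pending_.find(id);
        if (it == pending_.end())
            return false;
        name = it->second;
        pending_.erase(it);
        return true;
    }

private:
    std::unordered_map<std::uint16_t, std::string> pending_;
};
```

Giving each worker its own outgoing address (or port) ensures a response only ever reaches the process holding the matching entry.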




   SNMP: Workers share incoming and outgoing sockets.


Does this really make sense?
  SNMP stats will be different for each worker, particularly in the 
client table, where entries are indexed by the IPs of the clients that 
connected to each worker.
  I think a single port managed by the master, with a new OID for child 
processes somewhere, would be better. But that design needs to be 
thought out separately from this.




   ICP and HTCP _clients_: Cannot be supported in SMP environment
   unless each process has its own address (i.e., unique IP address
   and/or unique [ICP] port) because we do not want to match outgoing
   queries and incoming responses across processes.


Huh? "or unique port": this is entirely doable, right?



   ICP and HTCP _servers_: share listening sockets.

   Ident clients do not need to share sockets because they use
   unique ports.

* Support management signals (squid -k ...) in SMP mode, acting as a
single Squid instance.

* Refork dying workers, similar to how we reforked dying processes in
non-SMP daemon mode.


Um, have you checked if this new process structure obsoletes the old 
kill-parent hack? That would be nice to kill off cleanly.
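
For context on reforking, a generic POSIX sketch of the pattern follows. It is not the patch's implementation: startWorker(), superviseWorkers() and the maxReforks cap are hypothetical, the cap exists only so the example terminates, and the kill-parent question is not addressed.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks one child that runs work() and exits with its return value.
// Returns the child pid in the parent, or -1 if fork() failed.
static pid_t startWorker(int (*work)()) {
    const pid_t pid = fork();
    if (pid == 0)
        _exit(work());
    return pid;
}

// Starts `workers` children and reforks a replacement each time one dies,
// until `maxReforks` replacements have been started. Returns the number of
// replacements started, after all children have exited.
int superviseWorkers(int workers, int maxReforks, int (*work)()) {
    int alive = 0;
    for (int i = 0; i < workers; ++i) {
        if (startWorker(work) > 0)
            ++alive;
    }
    int reforks = 0;
    while (alive > 0) {
        int status = 0;
        if (waitpid(-1, &status, 0) <= 0)
            break; // no children left to wait for
        --alive;
        if (reforks < maxReforks && startWorker(work) > 0) {
            ++reforks;
            ++alive;
        }
    }
    return reforks;
}
```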




Detailed change descriptions are at
http