On 02/09/11 05:24, Alex Rousskov wrote:
On 08/31/2011 03:18 PM, Amos Jeffries wrote:
On 01/09/11 03:11, Alex Rousskov wrote:
On 08/31/2011 03:32 AM, Amos Jeffries wrote:

Okay. You have brought up a few things I need to ask about before going
further.

I was under the dubious assumption that each SMP process had an
Ipc::Coordinator _Job_ which read UDS packets and called the appropriate
function based on type. With the coordinator _process_ version of that
Job doing more shovelling around stuff in its actions than the workers.
    - my change #1 was to make that function an AsyncCall and schedule it
instead of calling synchronously. All fine. Nothing that could cross
processes there.


   - my questionable change (#2) was to make the CacheManager a Job which
was instantiated during reconfigure/startup. (that would be one for each
worker yes?)

     - during setup that Manager Job uses Ipc::Coordinator::Instance()
and
registers its Mgr::Request/Mgr::Reply actions AsyncCalls to receive mgr:
packets. (that might start in Ipc::Coordinator in each worker?)


What I am trying to achieve is:
    - worker A received mgr:action URL. Sends Mgr::Request (UDS
packet) to
?1?.
    - ?2? schedules Manager:: subscription call to handle the
packet/TypedMsgHdr
    - Question: manager ACKs receipt with Mgr::Response immediately?
    - CacheManager handling it schedules UDS requests for all the workers
through ?2?
    - ?3? in the worker B-F receives UDS packet and produces a reply, and
passes it back to ?3?
    - ?2? passes the worker B-F replies to manager which merged and send
the reply via UDS to worker A.

Halfway sane? or batting off left field?

?1? and ?2? would be the same Ipc::Coordinator Job right? or is ?1? a
worker UDS handler and ?2? the coordinator process UDS handler?

(gah, looks like a mess, hope you can read that.)

I understand most of it, but it does not match how things work today,
and I do not think it matches how things should work after your changes.
Let me try to summarize the overall operation here, but I recommend
reading the corresponding class definitions for details.

0. A kid is Squid processes. Kids have different roles.
     Every kid has a CacheManager class instance.

     There is only one Coordinator kid C.
     C and only C has an Ipc::Coordinator class instance.

1a. A worker kid W receives mgr:action HTTP request:
     CacheManager::Start().

1b. W converts the HTTP request into an IPC request and sends
     that IPC request to Coordinator kid C. The IPC request
     includes the HTTP client connection descriptor so that C
     can respond directly to the HTTP client.
     W expects an ACK from C.
     This step is done using the Mgr::Forwarder class.

2. C receives an IPC request from W:
     Ipc::Coordinator::handleCacheMgrRequest().
     C broadcasts that IPC request to all kids and
     collects their responses. This is done using the
     Mgr::Inquirer class.

3a. A kid K receives Coordinator request:
     Ipc::Strand::handleCacheMgrRequest().
     Many kids will receive this request, including
     worker W mentioned in 1a and 1b above. In general,
     a kid does not need to know whether the HTTP cache manager
     request came to it or to some other kid. And not all kids
     processing IPC cache manager requests are workers.

3b. K creates the right Action object and
     asks that object to collect the necessary data,
     eventually responding to Coordinator:
     action->respond(request).


The above omits processing of IPC responses, but the code is rather
symmetric: For every handleCacheMgrRequest() mentioned above, there
should be matching handleCacheMgrResponse().


As you can see, there are two major classes working on the cache manager
request: Ipc::Coordinator (only present in C) and Ipc::Strand (present
in most other processes). Their instances, one per process, are created
in main.cc:

      if (IamCoordinatorProcess())
          AsyncJob::Start(Ipc::Coordinator::Instance());
      else if (UsingSmp()&&   IamWorkerProcess())
          AsyncJob::Start(new Ipc::Strand);

Both Ipc::Coordinator and Ipc::Strand have a common parent, Ipc::Port.

Aha. Thank you! that seems to be the missing piece
  So Ipc::Strand instance lookup is what CacheManager needs to be using
in that else instead of throwing. In order to send its Async reply back
to the IPC.


I do not think CacheManager::handleIpcRequest() should be created as a
part of this project. Coordinator and Strand (possibly with Port
assistance) should continue handling ICP requests. Coordinator and
Strand handle IPC requests differently so it is wrong to put that
handling code into the CacheManager class which is the same for all kids.

And when the code in question is returned to Ipc::Strand, there will be
no reason to lookup its own instance.



If you want to teach cache manager-related code to use Subscription API,
teach Ipc::Port if possible and Ipc::{Coordinator,Strand} if not.

Do not fiddle with CacheManager class as a part of this project:
CacheManager may eventually deal with subscriptions directly, but it
would be overall better if you do not try to change two very different
things at once: how the messages are received and who receives them.


Aye.

I think I am going to have to change the Ipc::Coordinator ACK logics a
bit so that when a message arrives and needs an ACK it is the one to do
so at the point the Asynccall is scheduled for handling.

I believe that is how the code already works. See unpatched
Ipc::Coordinator::handleCacheMgrRequest().

Ipc::Coordinator::handleCacheMgrRequest() -> depends libmgr.la

libmgr.la -> depends libcomm.la

libcomm.la -> depends libipc.la (aka. Ipc::Coordinator::*)

The whole point of this is to break that loop.

With the extra bonus that libipc -> libmgr -> HttpRequest -> Universe chain is also erased.


Right now I think Coordinator (and Strand maybe) need to use a _generic_ ACK message to ACK a certain message ID they sent. This will allow removal of the component-specific ACK types from Ipc::Coordinator::handle*().

Under the subscription model the component specific handle*() are simply a table of AsyncCall callback pointers to be scheduled.

Before we go any further I want to make a new patch for that and get you to audit that.


Allowing us to
NACK it immediately if not handled. Then I can pull all that request
handler Instance() dependency back out of CacheManager.

Coordinator accepts all cache manager messages. There should be no
reason for sending a NACK to Strand as Coordinator can respond to the
client directly if there are any problems with the request it got from
the Strand.

What happens if component X (Cache Manager in this instance) is not registered to receive that message type? My understanding is that it will hang in the sending process until some timeout.


It is possible that I am misreading your intentions, but it feels like
you are still trying to fix the original patch which was based on wrong
design assumptions. I would recommend starting from scratch and
following the steps I outlined at the end of the previous email. In
short: subscribe Ipc::Coordinator and Ipc::Strand to receive IPC
messages, placing any common code in Ipc::Port. Leave CacheManager as is.

So leaving the core problem of library dependency loops unresolved?

The goal of this effort is simply to make CacheManager (upper layer) subscribe to IPC (lower layer).

Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.15
  Beta testers wanted for 3.2.0.10

Reply via email to