On Mon, Oct 01, 2001 at 11:54:44PM -0700, Justin Erenkrantz wrote:
> On Mon, Oct 01, 2001 at 10:13:16PM -0700, Ian Holsman wrote:
> > * Simple 'GET/POST/HEAD' function just does it and returns
> > the contents in a bucket-brigade
>
> Sure. I'd like to make it able to speak any protocol that
> has similar functionality to HTTP but isn't quite HTTP/1.1 - i.e.
> try to be as protocol-agnostic as we can make it.
Euh... let's do HTTP/1.1 *today*. At some point in the future, we can look
back and see what kinds of tweaks/changes are necessary to meet other needs.
Just remember that our charter is based solely on an HTTP client library.
*Not* a general purpose "http-like" protocol library.
Say "oh, but protocol X uses request/response, too... let's add that!" is
out of bounds for 1.0 :-)
>...
> I'd like to see this client API be
> able to handle that as well.
In truth, I think the API is going to be very damned flexible and *could*
support other protocols. But let's *please* not go down that rathole until
we have a 1.0 release? Please?
Just having an HTTP client library is a *big* win in and of itself.
>...
> > * Async version of above.. (like LWP::Parallel) I just push
> > GET requests and it notifies me when there is an event for
> > one of them.
>
> I'm just not sure that we really want to make the API event-based.
> I'm really not a fan of event-based or callback-based programming
> (if you want event-based, use libwww in its various incarnations).
> I think it makes things too confusing. I much prefer the
> parallelism to be at the thread-level rather than select/poll-based.
See my previous response to Ian, and my notes about the initial proposal. I
really think we can create an API that easily moves between the push/pull
and sync/async models.
You can ask for a response's brigade (and the lib will fetch it off the
wire), or the library will call you with the brigade. We're talking just a
minor difference in how the app uses the library. Both models focus on the
brigade, but the call sequence is something like:
    serf_send_request(session, req_brigade, NULL)
      :
    resp_brigade = serf_get_response(session)

or:

    serf_send_request(session, req_brigade, response_cb)
      :
    serf_process(ctx)

    apr_status_t response_cb(resp_brigade) { ... }
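
Fleshed out a bit in C. Every serf_* type and signature below is just an
illustrative sketch of the two call patterns, not a settled API:

    #include <apr_errno.h>
    #include <apr_buckets.h>        /* apr_bucket_brigade */

    /* hypothetical opaque types */
    typedef struct serf_session_t serf_session_t;
    typedef struct serf_context_t serf_context_t;

    typedef apr_status_t (*serf_response_cb_t)(apr_bucket_brigade *resp);

    /* hypothetical entry points */
    apr_status_t serf_send_request(serf_session_t *sess,
                                   apr_bucket_brigade *req,
                                   serf_response_cb_t cb);
    apr_bucket_brigade *serf_get_response(serf_session_t *sess);
    apr_status_t serf_process(serf_context_t *ctx);

    /* pull (sync): the app asks for the response brigade */
    static apr_status_t demo_sync(serf_session_t *sess,
                                  apr_bucket_brigade *req)
    {
        apr_bucket_brigade *resp;

        serf_send_request(sess, req, NULL);
        resp = serf_get_response(sess);
        /* ... consume resp ... */
        return APR_SUCCESS;
    }

    /* push (async): the library calls the app with the brigade */
    static apr_status_t response_cb(apr_bucket_brigade *resp)
    {
        /* ... consume resp as it arrives ... */
        return APR_SUCCESS;
    }

    static apr_status_t demo_async(serf_session_t *sess,
                                   apr_bucket_brigade *req,
                                   serf_context_t *ctx)
    {
        serf_send_request(sess, req, response_cb);
        return serf_process(ctx);   /* drive the network; fire callbacks */
    }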
>...
> also find that most web tasks don't need to be event-driven -
In Subversion, we need to push request data at the network (yet,
unfortunately, Neon wants to pull data). On the response side, we want the
network to push data at Subversion (which Neon does).
So for the response, I would say that serf needs to have a callback mode for
delivering the response.
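
In API terms, the request side could look something like this hypothetical
shape (none of these names are settled; the point is only the direction of
data flow):

    #include <apr_errno.h>

    typedef struct serf_request_t serf_request_t;   /* hypothetical */

    /* hypothetical push-style body API */
    apr_status_t serf_request_write(serf_request_t *req,
                                    const char *data, apr_size_t len);
    apr_status_t serf_request_done(serf_request_t *req);

    /* hypothetical app-side data source (think: svndiff data) */
    apr_size_t app_generate_data(char *buf, apr_size_t bufsize);

    /* the app pushes body data at the network as it is generated,
       rather than the library pulling it via a provider callback
       (the Neon style) */
    static apr_status_t push_body(serf_request_t *req)
    {
        char buf[8192];
        apr_size_t len;

        while ((len = app_generate_data(buf, sizeof(buf))) > 0)
            serf_request_write(req, buf, len);

        return serf_request_done(req);
    }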
> they are all essentially parallel tasks. I think you sacrifice
> too much of the API to get callbacks and async.
I don't buy this. I'm not seeing that a "sacrifice" is going to happen.
Let's see what happens. I suspect the presence of callbacks is not going to
bug you, as long as you have the sync mode.
> > * filters used for 'serf' are similar in syntax/use to HTTPD
>
> One thing I'd like to see is that we use the same filter framework
> that is already in httpd.
No, on several counts.
The presence of f->r and f->c is severely whacked. The filter chain should
operate independently of any global context. "Reaching out" is bad juju.
I also don't believe the registration and naming of filters is useful for
the serf. We're a library, not an application. The notion of names and
simplified insertion is an application concept.
[ that said: attaching a name to a filter could be useful for debugging,
much like we have names for buckets ]
Hmm. That is a good analogy: the filters are going to be referenced,
inserted, etc., much more like buckets. We'd have some static structure
defining the filter, and we simply call a function to insert the bugger in
the filter chain.
To some extent, httpd is building those "static" structures from runtime
parameters to the registration functions (type, name, func, etc). No real
reason for that; I think we just kind of fell into it because we were more
focused on naming them so a config file could refer to them.
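
Concretely, I'm picturing something like this (all names hypothetical;
just sketching the bucket-like style):

    #include <apr_buckets.h>

    typedef struct serf_filter_t serf_filter_t;     /* hypothetical */

    typedef apr_status_t (*serf_filter_func_t)(serf_filter_t *filter,
                                               apr_bucket_brigade *bb);

    /* the static structure defining a filter; no global registry,
       and the name is for debugging only (like bucket type names) */
    typedef struct serf_filter_type_t {
        const char *name;
        serf_filter_func_t func;
    } serf_filter_type_t;

    static apr_status_t dechunk_func(serf_filter_t *filter,
                                     apr_bucket_brigade *bb)
    {
        /* ... strip the chunk framing out of bb ... */
        return APR_SUCCESS;
    }

    static const serf_filter_type_t serf_dechunk_filter = {
        "DECHUNK", dechunk_func
    };

    /* insertion is just a function call against that static structure,
       e.g. serf_filter_insert(&chain, &serf_dechunk_filter, NULL); */
    apr_status_t serf_filter_insert(serf_filter_t **chain,
                                    const serf_filter_type_t *type,
                                    void *ctx);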
> This is good in two respects - we force
> ourselves to clean up the filtering API in httpd (most of us are
> active in httpd as well) and we can reuse the same code in both
> places.
Agreed on both counts. However, I would suggest that it goes in the other
direction. *We* define a Good filter stack and httpd moves towards it. We
don't have the legacy stuff with the headers, nor the needs of f->r and f->c
to get in our way of doing it Right.
> This would mean that all of the filter code may need
> to get moved to a neutral repository. Not sure I know where
> it'd go or if this is even a good idea. But, most likely, any
> filters that we write will be implemented in both places - that
> seems a bit silly.
More than likely, serf just exports the filters for use by httpd. Recall
that one of our targets is for proxy to use the serf. Thus, the code will
already be "within" httpd.
> > * HTTP/1.1 Support
>
> As Greg said, we should only send out HTTP/1.1 requests. This is
> the way we did it with flood. The only thing we don't handle
> correctly as part of HTTP/1.1 is chunking. And, I've got a start
> on it in my tree, but I got sidetracked by the input filtering
> in httpd and now this.
The serf will have both chunking and dechunking code. Since the request
could be "pushed" at the library, chunking is required. It is also possible
to give a brigade to the serf and it could compute the length and send a
request with a Content-Length header.
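
For example, deciding between the two framings could look like this
(serf_request_set_header is hypothetical; apr_brigade_length is the real
apr-util call):

    #include <apr_buckets.h>
    #include <apr_strings.h>        /* apr_off_t_toa */

    typedef struct serf_request_t serf_request_t;   /* hypothetical */
    void serf_request_set_header(serf_request_t *req,
                                 const char *name, const char *value);

    static void choose_body_framing(serf_request_t *req,
                                    apr_bucket_brigade *body,
                                    apr_pool_t *pool)
    {
        apr_off_t len = -1;

        /* read_all == 0: don't force buckets to be read just to count
           their bytes; len comes back as -1 if the length is unknown */
        apr_brigade_length(body, 0, &len);

        if (len >= 0)
            serf_request_set_header(req, "Content-Length",
                                    apr_off_t_toa(pool, len));
        else
            serf_request_set_header(req, "Transfer-Encoding", "chunked");
    }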
On the response side, we'll definitely have dechunking.
> > * SSL Support
>
> flood already does this with OpenSSL, so I think it should be
> straightforward as most of the heavy lifting has already been
> done.
You betcha :-)
IMO, I don't see that we need to use an SSL filter. I think the terminal
filter (analog to CORE_IN / CORE_OUT) could simply handle SSL as necessary.
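
Rough shape of what I mean for the read side (conn_t and its fields are
hypothetical; SSL_read and apr_socket_recv are the real OpenSSL/APR calls):

    #include <openssl/ssl.h>
    #include <apr_network_io.h>

    typedef struct conn_t {
        apr_socket_t *sock;
        SSL *ssl;               /* NULL for plain http */
    } conn_t;

    /* the terminal filter reads off the wire, going through OpenSSL
       only when the connection is https */
    static apr_status_t terminal_read(conn_t *conn,
                                      char *buf, apr_size_t *len)
    {
        if (conn->ssl != NULL) {
            int n = SSL_read(conn->ssl, buf, (int)*len);
            if (n <= 0)
                return APR_EGENERAL;  /* real code: map SSL_get_error() */
            *len = (apr_size_t)n;
            return APR_SUCCESS;
        }
        return apr_socket_recv(conn->sock, buf, len);
    }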
> Client certs might need to be added, but that's not too hard.
Yes, they will be needed. (/me thinking of SVN)
> > * Connection Pooling per server (ie we keep a max of n open connections
> > to a server and we re-use these on different requests)
>
> Eh, I'm not terribly sure about this though. I'm not sure how you
> would get the lifetimes right. Ideally, the connections should
> timeout themselves. I guess we could enforce the lifetimes with
> socket timeouts (but that'd only be on our side not the server
> side). I think that if you want connection-pooling, that *might* be
> outside of the scope of this library. But, I'm not really sure.
> Thoughts?
All of this would be subject to various parameters and controls, TBD.
I generally agree that it is an advanced concept and possibly something we
can relegate to the app, but I also think that many apps will want multiple
connections and multiple outstanding requests. Thus, a pool of connections.
> FWIW, I think RFC 2616 8.1.4 says that you should only keep two
> connections per server open at any time for single-client machines.
> Proxy<->Proxy connections may be at 2*N (N == number of concurrent
> users) per server.
Huh. Neat. Hadn't seen that one.
> I don't expect that we should have knowledge
> of this within apr-serf - I just think this is something left to
> the program that uses apr-serf. My $.02.
Agreed. We provide the controls, the docs provide guidelines (including refs
to the RFC), and the app does what it will.
Otherwise, I'm sure you'd be peeved if flood could only open a max of two
connections :-)
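
To make "controls" concrete, I'm imagining knobs along these lines
(hypothetical, nothing settled):

    #include <apr_time.h>

    typedef struct serf_conn_limits_t {
        int max_conns_per_server;          /* RFC 2616 8.1.4 suggests 2 */
        int max_outstanding_reqs;          /* pipelining depth per conn */
        apr_interval_time_t idle_timeout;  /* reap idle pooled conns */
    } serf_conn_limits_t;

    /* a well-behaved interactive client: */
    serf_conn_limits_t polite = { 2, 4, apr_time_from_sec(15) };

    /* flood, hammering away: */
    serf_conn_limits_t flood_cfg = { 200, 1, apr_time_from_sec(60) };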
> > * send requests to a server from a list of servers/ports (via round-robin)
>
> Oh, Flood does this. =) -- justin
:-)
I think the above is simply an aspect of the connection grouping and
pooling. More on that in a bit.
Cheers,
-g
--
Greg Stein, http://www.lyra.org/