Hi Joan,

thanks for clarifying!

Klaus


On 29.10.2014 03:24, Joan Touzet wrote:
> Hi Klaus,
> 
> This client doesn't do inline responses so I'll top-reply, sorry about
> that.
> 
> I respect it being a good enough proxy. My argument is that making
> it any better of a proxy is something on which I don't think we should
> spend cycles.
> 
> I have seen a bunch of couch externals coded against other DB systems
> or referencing other resources that are not distributed. This is my
> primary concern in a clustered CouchDB with externals -- people coding
> that kind of thing will have bad behaviour.
> 
> If we are comfortable pushing back on users who have bad behaviour in
> these scenarios - knowing that we will not implement "sticky sessions"
> or similar - then we could conceivably keep externals. I just think
> it'll lead to an uptick in reported issues on IRC from people who
> don't know how to code a stateless application server, or who are
> attempting to use _externals to interface Couch with a secondary storage
> system that doesn't share the same semantics.
> 
> I have indeed seen this done at prior employers -- it's not a strawman.
> They're using _externals to do things like generate "global sequence
> numbers" or guarantee ordering of events in Couch. Neither of these makes
> sense in a clustered setup, viz. the Fallacies of Distributed Computing.
> 
> -Joan
> 
> ----- Original Message -----
> From: "Klaus Trainer" <[email protected]>
> To: [email protected]
> Sent: Tuesday, October 28, 2014 9:25:01 PM
> Subject: Re: [DISCUSS] Deprecating _externals?
> 
> Hi Joan,
> 
> one reason why I've never missed an "official" plugin API is that
> CouchDB provides the externals API, which is documented and works well.
> 
> Please see my further comments inline.
> 
>> Today, someone came to the #couchdb channel asking about
>> _externals. For a long while it's been on my mind that perhaps
>> we should deprecate the entire _externals feature for a number
>> of reasons:
>>
>>   1. Couch is not a great reverse proxy. Making it into one is
>>      as hard as rewriting nginx or haproxy in Erlang. It's a
>>      distraction to our development team and far outside our
>>      core competency.
> 
> From my user perspective, it's a *good enough* reverse proxy.  It does
> its job of forwarding HTTP requests and returning the responses.  Beyond that,
> there is no other reverse proxy implementation that I know of that has a
> RESTful HTTP API that can be used to start and stop services during
> runtime and on top of that can be used as a central storage for storing
> configuration of services.  I don't want to assert that it's always a
> good idea to use that combination of features in place of other, more
> common solutions, but there are scenarios where this can provide some
> significant advantages.
> 
> Also, can you explain in what way doing HTTP and managing external OS
> processes is "far outside of our core competency" while both are
> actually essential to CouchDB's core feature set?
> 
>>   2. In a clustered CouchDB (the default in 2.0), the
>>      assumptions around externals change drastically. For an 
>>      _external to work, it must be stateless and not rely upon
>>      multiple sequential requests to hit the same node (assuming
>>      the standard n-node cluster + a load balancer/reverse proxy
>>      at the front.)
> 
> Maybe I'm missing some aspects, but I can't see how "the assumptions
> around externals change drastically".  We're using a stateless protocol,
> and we've never made any guarantees with regard to people's application
> state.  I can only see a straw man here.
> 
>>      People who wrote a CouchDB 1.x external could reasonably
>>      expect to write an old-school singleton app (i.e., the only
>>      copy of that external process running, on a single machine).
>>
>>      If they engaged in any of a number of bad behaviours for
>>      distributed systems - storing content on local disk, locking
>>      or blocking connections to other services/databases in a 
>>      "single-threaded" pattern, or even expecting CouchDB not to
>>      possibly introduce a conflict or "read your writes" - they
>>      will probably fail outright at best, or at worst introduce
>>      subtle and confusing behaviour.
> 
> Even for an "old-school singleton" web app it's common best practice to
> put any application state that's beyond a request/response cycle into
> the database and nowhere else.  Assuming you follow that best practice,
> I can't see any new problem when it comes to running multiple instances
> of such an app in parallel.  Regarding missing "read your writes"
> guarantee and possible conflicts: these are database and not application
> properties.  The related problems affect any client, and they are not
> specific to using the externals API at all!  Also, they are not new
> insofar as these problems exist already today as soon as replication
> comes into play.
> 
>> TL;DR: We're changing the contract we give to _externals in a
>> backward-compatibility-breaking way. We either need to document it
>> straight up, along with all of the admonishments required for
>> people who expect it to operate the same as in 1.x, or we need to
>> remove it.
> 
> What contract are you talking about?  Unless you have something
> specific, I will assume that possible changes in semantics, which will
> only occur in combination with some application-specific behaviour
> anyway, are already dealt with by a major version number increment.
> 
>> My opinion is that now that the default CouchDB rollout will be
>> a cluster with a reverse proxy, that _externals should be exposed
>> through the load balancer, which can then reference 1 or more
>> processes distributed either on the same CouchDB nodes, or on
>> different hosts should compute needs demand it.
>>
>> The exception here would be a single-node CouchDB, which could
>> still use the same approach. However I don't see the issue with
>> deploying an haproxy on that same node and using the same approach
>> I describe above.
> 
> I do see an issue especially with regard to small applications that
> suddenly need an additional piece of infrastructure that needs to be
> configured and maintained.  Maybe I'm just naive, but I can't see the
> large burden you seem to suggest that would justify removing that
> feature.  That is, I *am* willing to accept a tradeoff if it seems
> worth it, but in this case I'm not quite convinced.
> 
> Cheers,
> Klaus
> 

