Here's an issue to split the WSGI app into two WSGI apps: https://pulp.plan.io/issues/3132
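
For anyone skimming the thread, here is a rough sketch of what "two WSGI apps" could look like at the Python level. Everything in it is hypothetical: `api_app`, `content_app`, `MOUNTS`, `dispatcher` and `call` are illustration names, not Pulp code. The prefixes are borrowed from the example Apache config quoted below. The dispatcher only approximates what a mount point does (the matched prefix moves into SCRIPT_NAME and the remainder stays in PATH_INFO); in a real deployment the web server does this routing.

```python
from wsgiref.util import setup_testing_defaults


def _respond(start_response, status, body):
    """Send a plain-text response and return the WSGI body iterable."""
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]


def api_app(environ, start_response):
    # Hypothetical stand-in for the REST API / status app (api.wsgi).
    body = b"api: " + environ.get("PATH_INFO", "").encode("utf-8")
    return _respond(start_response, "200 OK", body)


def content_app(environ, start_response):
    # Hypothetical stand-in for the content + live API app (content.wsgi).
    body = b"content: " + environ.get("PATH_INFO", "").encode("utf-8")
    return _respond(start_response, "200 OK", body)


# Prefix table mirroring the four WSGIScriptAlias lines in the quoted config.
MOUNTS = [
    ("/status", api_app),
    ("/api/v3", api_app),
    ("/content", content_app),
    ("/v1", content_app),
]


def dispatcher(environ, start_response):
    """Route a request to one of the two apps by URL prefix."""
    path = environ.get("PATH_INFO", "")
    for prefix, app in MOUNTS:
        if path == prefix or path.startswith(prefix + "/"):
            child = dict(environ)
            # Move the matched prefix from PATH_INFO into SCRIPT_NAME.
            child["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
            child["PATH_INFO"] = path[len(prefix):]
            return app(child, start_response)
    return _respond(start_response, "404 Not Found", b"not found")


def call(app, path):
    """Invoke a WSGI app in-process and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["SCRIPT_NAME"] = ""
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

Calling `call(dispatcher, "/content/foo.rpm")` exercises the content app, while `call(dispatcher, "/api/v3/repositories/")` exercises the API app, all within one Python process.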
On Mon, Nov 13, 2017 at 2:04 PM, Brian Bouterse <[email protected]> wrote:

> OK, in looking at this more, the mod_wsgi docs resolved my concerns about
> shipping multiple WSGI applications. From the docs [0] we can serve
> multiple WSGI applications as easily as we can serve one, even within one
> process group. In Pulp2 we just defaulted to 1 process group per WSGI app,
> but one process group can serve multiple WSGI apps. Here are some example
> configs. Say we have two WSGI applications (content + live APIs, and the
> REST API); call them content.wsgi and api.wsgi respectively. An Apache
> config could then be:
>
> WSGIScriptAlias /status /path/to/installed/wsgi/files/api.wsgi  <----- the API WSGI app serving the status API
> WSGIScriptAlias /api/v3 /path/to/installed/wsgi/files/api.wsgi  <----- the API WSGI app serving the rest of the API
> WSGIScriptAlias /content /path/to/installed/wsgi/files/content.wsgi  <---- the "content" app in Pulp
> WSGIScriptAlias /v1/ /path/to/installed/wsgi/files/content.wsgi  <--- any live APIs you need
>
> Other webservers can be configured similarly. So does shipping multiple
> WSGI apps sound good to everyone? I think this is what I've heard from
> others in this thread and now I agree.
>
> What about shipping 3 WSGI apps versus 2? One for the REST API (status and
> API), one for the content that the Pulp content app serves, and a third
> for any "other urls"? If we have only two WSGI apps, we probably have to
> have plugin writers identify which WSGI app a url should be for. Sometimes
> they may want them deployed with the content (live API), or with the REST
> API (complex upload use cases).
>
> @ehelms, I'll try to share some responses. The config above could be mixed
> with other applications on the same vhost. It would also allow them to be
> deployed together or separately. The content WSGI app still needs database
> access to serve that content, so running them over a WAN is not something
> I would recommend.
> Pushing content to be served "at the edge", "in another geo", or via a
> "CDN" are a whole group of important use cases. This won't help with
> those use cases, unfortunately.
>
> [0]: http://modwsgi.readthedocs.io/en/develop/user-guides/configuration-guidelines.html#the-wsgiscriptalias-directive
>
> On Fri, Nov 10, 2017 at 11:17 AM, Eric Helms <[email protected]> wrote:
>
>> First, I appreciate this being a proposal and discussion before going
>> this route, given its implications for applications used to consuming
>> Pulp heavily. Second, I believe some of my questions and concerns have
>> been asked and addressed throughout the thread, but I do feel like it's
>> reached a point where a summary would be useful for those just now
>> entering the conversation to parse.
>>
>> I know some of these concerns start to quickly broach into more advanced
>> architecture discussion than the original proposal raised, Brian. I am
>> happy to break those out into other threads, but for now, since someone
>> else mentioned it, I am including them.
>>
>> My generalized concerns are:
>>
>> * How do I deploy Pulp alongside another Apache-based application (aka
>> the current Katello use case)?
>> * Can I deploy Pulp and the Content separately? From two perspectives:
>> splitting load across multiple hosts, or separating concerns into
>> independent pieces that can be scaled per differing demands? (e.g. the
>> Pulp API itself may get a little traffic, whereas my 50k hosts might
>> hammer the content API)
>> * If the Content is separate from the main Pulp API, does that mean that
>> I can scale my content delivery more easily horizontally and across
>> geographies? This goes to the question of how, in the new world, the
>> current setup of Pulp talking to Pulp to create replicated data
>> endpoints for geography and scaling would look, given that it is
>> affected by how the URL namespace is consumed.
>>
>> Eric
>>
>> On Fri, Nov 10, 2017 at 11:03 AM, Patrick Creech <[email protected]>
>> wrote:
>>
>>> On Fri, 2017-11-10 at 10:49 -0500, Brian Bouterse wrote:
>>>
>>> From a deployment perspective, it's been a key use case to deploy crane
>>> at the perimeter, rsync published image files out to a file or CDN
>>> service, and run the rest of Pulp on a well-protected internal network.
>>>
>>> Pulp can also be installed at the perimeter. Core should support a
>>> setting that enables/disables the REST API. Each plugin could support a
>>> setting that enables/disables its content API.
>>>
>>> I think we're envisioning a similar goal, but with a different
>>> mechanism. I like the idea of a user selecting which components should
>>> be active. Making each component a WSGI app is very easy for us and very
>>> convenient for users. You can see Pulp 2's WSGI apps defined here:
>>>
>>> https://github.com/pulp/pulp/tree/master/server/usr/share/pulp/wsgi
>>>
>>> Depending on whether a user wants to run each component embedded in
>>> normal httpd processes, or in separate daemon processes, it's just a
>>> matter of enabling, or not, a small httpd config file like this one:
>>>
>>> https://github.com/pulp/pulp/blob/master/server/etc/httpd/conf.d/pulp_content.conf
>>>
>>> This gives the most flexibility. A user won't need to deploy the entire
>>> stack of Pulp dependencies with all of their plugins at the perimeter if
>>> they don't want to; we can choose to deliver each WSGI app separately,
>>> or not, depending on what is convenient.
>>>
>>> How can one process group serve multiple WSGI applications? I don't
>>> think it can, so it requires the user to deploy it with multiple process
>>> groups (one for each WSGI application). This prevents a use case that
>>> goes like: "As a user, I can deploy Pulp to serve content, REST API, and
>>> live APIs all from one WSGI process".
>>> This use case is valuable because it's both a simple deployment model
>>> (fewer WSGI apps) and it uses less memory because there are fewer
>>> process groups. This is why I'm suggesting we ship all urls to be
>>> handled by one WSGI application, which also allows for the deployment
>>> that you outline. So shipping one WSGI app makes Pulp the most flexible.
>>>
>>> This separation has worked very well in Pulp 2, and as far as I know
>>> there have been no complaints about it.
>>>
>>> There are complaints that Pulp is hard to deploy (multiple WSGI apps),
>>> and that it uses too much memory.
>>>
>>> There are some important isolation concerns from the security and
>>> reliability points of view here that should be weighed as well. You are
>>> talking about having general 'user facing/unauthenticated' services like
>>> content serving share the same process and memory space as your
>>> management interface (REST API). There should probably be some thought
>>> given to what your acceptable level of risk and exposure is when someone
>>> finds a flaw in your content serving code and can now see your
>>> management interface's memory footprint. Or say you have some unstable
>>> code that can crash in your management interface, which will end up
>>> bringing down your entire application, instead of just the management
>>> interface. Just some food for thought while thinking about all the ins
>>> and outs of this tradeoff.
>>>
>>> _______________________________________________
>>> Pulp-dev mailing list
>>> [email protected]
>>> https://www.redhat.com/mailman/listinfo/pulp-dev
