"* 75% of all hosts (vm's) are tomcat hosts (I'll focus on just those from here);
ok * every specific tomcat setup is deployed as two nodes (not a real cluster, but mostly stateless applications behind a loadbalancer); * every cluster typically has 1 application (1 deployed war with 1 context path in tomcat speak, basically providing http://node/app ); this sounds somewhat like serving multiple customers or different variations on a project from a shared infrastructure? i.e. AcmeCorp and BetaCorp? This seems to imply groups here to me so far. * occasionally a node/cluster will have more than one such 'application' hosted. This can be on the same Tomcat instance (same tcp port 8080), but could also be living on another port (which calls the need for a separate ip/port combination or pool on the load balancer) This seems to imply each node/cluster has a playbook that defines what groups get what roles. If you want to generate those, that could be reasonable depending on use case. * every application cluster typically is part of a larger application which can vary from one to several application clusters * the big applications are part of a project, a project is part of an organisation AWX is pretty useful for segrating things and permissions between organizations, if you're talking about access control. Can be useful. Just throwing that out there. * every application has three instances in each environment: development, testing and production (clustered in the same way, everywhere) This seems like you might want to maintain three seperate inventories, that way "-i development" never risks managing production and there is no crossing of the streams (assuming people have seen Ghost Busters) * the loadbalancer performs typically one, but sometimes more, health checks per application (a basic GET, and checking a string in the response), and will automatically mark a node as down if that fails * some applications can communicate with some other applications if need be, but only by communicating through the loadbalancer; this is also enforced by the network; so we need a configuration here that says 'node A may communicate with node B'; we do that on the load balancer at the time, and every such set needs a separate LB config; * every application is of course consumed in some way or another, and is defined on the load balancer (nodes and pools and virtual servers in F5 speak) Seems unrelated to the above bits (or at least not complicating it). Summary of my suggestion: * groups per "customer" * seperate inventories for QA/stage/prod * define role to server mapping in playbooks, which you might generate if inventory is a source of such knowledge * roles of course still written by hand On Tue, Jan 21, 2014 at 4:18 PM, Serge van Ginderachter < [email protected]> wrote: > Hi list, > > > TL;DR: I'd like to know how people model their inventory data for a large > set of hosts (+500 vm's) that are given the mostly the same role, but with > many varying applications parameters, to the extent where a simple > with_items list or even with_nested list doesn't satisfy anymore. > > > I have been pondering some time on the subject at hand, where I'm hesitant > if the way I started working with ansible and how it growed over time, is > the best possible way. In particular on how to model the inventory > , > variables, but obviously also in the way implement > ing > and nest > ing > groups. > > Rather than showing how I did it, let me explain some of the particulars > of this environment, so I can ask the community "how would you do it?" 
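A minimal sketch of that layout; all names here (acmecorp, app1, app2, webapp) are made up for illustration:

    # inventories/production -- one inventory file per environment,
    # so "-i inventories/production" can never touch dev or test
    [app1]
    app1-prod-node1
    app1-prod-node2

    [app2]
    app2-prod-node1
    app2-prod-node2

    # one group per "customer"
    [acmecorp:children]
    app1
    app2

and the playbook that maps roles onto those groups:

    # site.yml -- the role-to-server mapping lives in the playbook
    - hosts: app1
      roles:
        - tomcat
        - webapp

Because the group names repeat across the per-environment inventory files, "ansible-playbook -i inventories/production site.yml" works unchanged against development or testing just by swapping the -i argument.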
On Tue, Jan 21, 2014 at 4:18 PM, Serge van Ginderachter <[email protected]> wrote:

> Hi list,
>
> TL;DR: I'd like to know how people model their inventory data for a large set of hosts (+500 vm's) that are mostly given the same role, but with many varying application parameters, to the extent where a simple with_items list or even with_nested list doesn't satisfy anymore.
>
> I have been pondering on this subject for some time, hesitant whether the way I started working with ansible, and how it grew over time, is the best possible way; in particular how to model the inventory variables, but obviously also the way of implementing and nesting groups.
>
> Rather than showing how I did it, let me explain some of the particulars of this environment, so I can ask the community "how would you do it?"
>
> We're mostly a Java shop, and have a very standardized, and sometimes particular, setup:
>
> * 75% of all hosts (vm's) are tomcat hosts (I'll focus on just those from here);
> * every specific tomcat setup is deployed as two nodes (not a real cluster, but mostly stateless applications behind a loadbalancer);
> * every cluster typically has 1 application (1 deployed war with 1 context path in tomcat speak, basically providing http://node/app );
> * occasionally a node/cluster will have more than one such 'application' hosted. This can be on the same Tomcat instance (same tcp port 8080), but could also be living on another port (which calls for a separate ip/port combination or pool on the load balancer)
> * every application cluster typically is part of a larger application, which can vary from one to several application clusters
> * the big applications are part of a project, a project is part of an organisation
> * every application has three instances, one per environment: development, testing and production (clustered in the same way, everywhere)
> * the loadbalancer typically performs one, but sometimes more, health checks per application (a basic GET, checking a string in the response), and will automatically mark a node as down if that fails
> * some applications can communicate with some other applications if need be, but only by communicating through the loadbalancer; this is also enforced by the network; so we need a configuration here that says 'node A may communicate with node B'; we do that on the load balancer at this time, and every such set needs a separate LB config;
> * every application is of course consumed in some way or another, and is defined on the load balancer (nodes and pools and virtual servers in F5 speak)
>
> Yes, this means every tomcat application lives on, in total, 6 instances (2 cluster nodes x 3 environments), hence 6 virtual machines.
>
> A basic inventory would hence show as:
>
> all inventory
> |_ organisation 1
>    |_ project 1
>       |_ application 1
>          |_ dev
>             |_ node 1
>             |_ node 2
>          |_ test
>             |_ ..
>          |_ prod
>             |_ ..
>       |_ application 2
>          |_ ..
>    |_ project 2
>       |_ ..
> |_ organisation 2
>    |_ ..
>
> Some other implemented groups are:
>
> |_ development
>    |_ organisation1-dev
>       |_ application1-dev
> |_ testing
> |_ production
>
> or
>
> - tomcat
>   |_ application1
>   |_ application2
> - <some_other_server_role_besides_tomcat>
>   |_ application7
>   |_ application9
>
> Our environment counts around 100 applications, hence 600 vm's at this moment, so keeping everything rigorously standard is very important. Automating the load balancer from a config per application has become a key issue. So when looking beyond the pure per-group and per-node inventory, on a node we get the following data, important for configuring things on the load balancer:
>
> * Within an application server:
>
> node
> |_ subapp1
>    |_ healthcheck1
>    |_ healthcheck2
> |_ subapp2
> ......
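One way to capture that per-node tree is a host_vars file per node; a minimal sketch, with the variable name subapps and all paths and values hypothetical:

    # host_vars/app1-prod-node1
    subapps:
      - name: subapp1
        port: 8080
        context_path: /subapp1
        healthchecks:
          - { path: /subapp1/status, expect: "OK" }
          - { path: /subapp1/ping,   expect: "pong" }
      - name: subapp2
        port: 8081
        context_path: /subapp2
        healthchecks:
          - { path: /subapp2/status, expect: "OK" }

A template that loops over subapps, and within each over its healthchecks, can render the F5 pool and monitor definitions in a single pass, which sidesteps deeply nested with_* loops in tasks.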
> * So we also need to define which application cluster may communicate with what other application cluster. Normally this is the same configuration for all environments, but on some occasions a node in environment X might need to communicate with a node in environment Y (e.g. a dev node that needs mail relaying, as we have just one smtp-speaking node, set up in "prod", for all environments; these exceptions are rare, but I tend to think necessary exceptions should be automated as well.)
>
> This cluster-to-cluster communication thing is actually something where I'm not sure what the best way would be to implement it in variables, as at this point it isn't about just a host or group var any more, but about data for multiple hosts (e.g. giving access from app A to app B requires network facts from both clusters).
>
> Also, at this point, data gets nested very deep: looping over separate applications with different paths, on different ports, with each instance having multiple healthchecks. Until here I managed, but now combine this with the need to give certain clusters access to one or more of those instances on one or more other clusters. Basically, I'm bumping against the limits of with_nested here.
>
> So, given this, how would you design the inventory data to implement all this? Am I overdoing it by wanting to put everything in a combined set of complex variables?
>
> I look forward to different viewpoints :)
>
> Thanks,
>
> Serge

--
Michael DeHaan <[email protected]>
CTO, AnsibleWorks, Inc.
http://www.ansibleworks.com/
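For the 'node A may communicate with node B' data, one option is to declare each cluster's allowed peers in its group_vars and let the load balancer template resolve the other side through the groups and hostvars magic variables; a sketch, with the variable name allowed_upstreams and all group names made up:

    # group_vars/app1
    allowed_upstreams:
      - app2    # app1 may call app2, through the LB only

    # fragment of an LB config template (Jinja2):
    {% for cluster in allowed_upstreams %}
    {% for node in groups[cluster] %}
    allow {{ hostvars[node]['ansible_default_ipv4']['address'] }}
    {% endfor %}
    {% endfor %}

This keeps the access rule a plain group var while the "data for multiple hosts" is resolved at render time; note the peer nodes' facts must have been gathered in an earlier play for hostvars to contain them. The rare cross-environment exceptions (like the dev-to-prod smtp relay) won't resolve this way if the environments live in separate inventories, so those few peers may have to be listed as explicit addresses rather than group names.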
