Hey Octavia folks!
First off, yes, I'm still alive and kicking. :) I'd like to start a conversation on usage requirements and have a few suggestions. I advocate that, since we will be using TCP and HTTP/HTTPS based protocols, we enable connection logging for all load balancers by default, for several reasons:

1) We can use these logs as the raw and granular data needed to track usage. With logs, the operator has flexibility as to what usage metrics they want to bill against. For example, bandwidth is easy to track and can even be split into header and body data so that the provider can choose whether to bill on header data or not. The provider can also decide whether to bill customers for failed requests that were the fault of the provider themselves. These are just a few examples; the point is the flexible nature of logs.

2) Creating billable usage from logs is easy compared to other options like polling. For example, in our current LBaaS iteration at Rackspace we bill partly on "average concurrent connections". This is based on polling and is not as accurate as it could be. It's very close, but nothing gets more accurate than the logs themselves. Furthermore, polling is more complex and consumes resources on every polling cycle.

3) Enabling logs for all load balancers helps with debugging, support and audit purposes. While the customer may or may not want their logs uploaded to Swift, operators and their support teams can still use this data to help customers out with billing and setup issues. Auditing will also be easier with raw logs.

4) Enabling logs for all load balancers will help mitigate uncertainty in capacity planning. Imagine if every customer suddenly enabled logs when logging had never been turned on before. This could produce a spike in resource utilization that would be hard to manage. Enabling logs from the start means the only unknown left to plan for is the nature of the customer's traffic pattern.
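To make points 1) and 2) a bit more concrete, here is a minimal sketch of how billable numbers could be derived from connection logs. It assumes the logs have already been parsed into records; the ConnRecord type, its field names and both function names are hypothetical and do not correspond to any actual HAProxy log format or Octavia API.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ConnRecord:
    """One parsed connection log entry (hypothetical field names)."""
    start: float          # time the connection was accepted, in seconds
    duration: float       # how long the connection stayed open, in seconds
    header_bytes: int     # bytes transferred as headers
    body_bytes: int       # bytes transferred as body
    provider_fault: bool  # True if the request failed because of the provider


def billable_bandwidth(records: Iterable[ConnRecord],
                       bill_headers: bool = True,
                       bill_provider_faults: bool = False) -> int:
    """Sum billable bytes, leaving the billing policy to the operator."""
    total = 0
    for r in records:
        if r.provider_fault and not bill_provider_faults:
            continue  # do not charge for failures the provider caused
        total += r.body_bytes
        if bill_headers:
            total += r.header_bytes
    return total


def average_concurrent_connections(records: Iterable[ConnRecord],
                                   window_start: float,
                                   window_end: float) -> float:
    """Time-weighted average of open connections over a window.

    Each connection contributes the number of seconds it was open inside
    the window; the sum divided by the window length is the average.
    """
    window = window_end - window_start
    if window <= 0:
        raise ValueError("window_end must be greater than window_start")
    open_seconds = 0.0
    for r in records:
        overlap = min(r.start + r.duration, window_end) - max(r.start, window_start)
        if overlap > 0:
            open_seconds += overlap
    return open_seconds / window
```

Because the average is computed from the exact open and close times of every connection rather than from periodic samples, it does not depend on a polling interval.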
Some cons I can think of (please add more, as I think the pros outweigh the cons):

1) If we ever add UDP based protocols then this model won't work. < 1% of our load balancers at Rackspace are UDP based, so we are not looking at using this protocol for Octavia. I'm more of a fan of building a really good TCP/HTTP/HTTPS based load balancer because UDP load balancing solves a different problem. For me different problem == different product.

2) I'm assuming HAProxy. Thus, if we choose another technology for the amphora then this model may break.

Also, and more generally speaking, I have categorized usage into three categories:

1) Tracking usage - this is usage that will be used by operators and support teams to gain insight into what load balancers are doing, in an attempt to monitor potential issues.

2) Billable usage - this is the subset of tracking usage that is used to bill customers.

3) Real-time usage - this is usage that should be exposed via the API so that customers can make decisions that affect their configuration (e.g. "Based off of the number of connections my web heads can handle, when should I add another node to my pool?").

These are my preliminary thoughts, and I'd love to gain insight into what the community thinks. I have built about 3 usage collection systems thus far (1 with Brandon) and have learned a lot. Some basic rules I have discovered with collecting usage are:

1) Always collect granular usage as it "paints a picture" of what actually happened. Massaged/un-granular usage == lost information.

2) Never imply, always be explicit. Implications usually stem from bad assumptions.

Last but not least, we need to store every user and system load balancer event, such as creation, updates, suspension and deletion, so that we can bill on things like uptime and serve our customers better by knowing what happened and when.
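As a rough illustration of why storing every event matters, here is a minimal sketch of deriving billable uptime for one load balancer from its event history. The event names and the billable_uptime function are hypothetical; in particular, I am assuming an "unsuspend" event exists to mark the end of a suspension, which is not something we have defined yet.

```python
from typing import Iterable, Tuple

# Hypothetical event names; "unsuspend" is an assumption for this sketch.
ACTIVATING = {"create", "unsuspend"}
DEACTIVATING = {"suspend", "delete"}


def billable_uptime(events: Iterable[Tuple[float, str]],
                    period_start: float,
                    period_end: float) -> float:
    """Return the seconds a load balancer was active within a billing period.

    `events` is a sequence of (timestamp, event_name) pairs. Events that do
    not change the active state, such as "update", are ignored here.
    """
    def clipped(begin: float, end: float) -> float:
        # Length of [begin, end] that falls inside the billing period.
        return max(0.0, min(end, period_end) - max(begin, period_start))

    uptime = 0.0
    active_since = None
    for timestamp, name in sorted(events, key=lambda e: e[0]):
        if name in ACTIVATING and active_since is None:
            active_since = timestamp
        elif name in DEACTIVATING and active_since is not None:
            uptime += clipped(active_since, timestamp)
            active_since = None
    if active_since is not None:
        # Still active: count up to the end of the billing period.
        uptime += clipped(active_since, period_end)
    return uptime
```

Nothing here is implied: the result depends only on explicit, timestamped events, so a disputed bill can be traced back to the exact events that produced it.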
Cheers,
--Jorge