Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

2014-11-06 Thread Stephen Balukoff
Hi Jorge,

So, one can query a predefined UNIX socket or a stats HTTP service (which
can be an in-band service, by the way) and HAProxy will give all kinds of
useful stats on the current listener, its pools, its members, etc. We will
probably be querying this service in any case to detect things like members
going down, etc., for sending notifications upstream. The problem is that this
interface presently resets state whenever haproxy is reloaded, which needs
to happen whenever there's a configuration change. I was able to meet with
the HAProxy team (including Willy Tarreau), and they're interested in
making improvements to HAProxy that we would find useful. Foremost on their
list was the ability to preserve this state information between restarts.
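
For concreteness, that stats query is just a UNIX-socket round trip. The
following is an illustrative Python sketch, not settled Octavia code; the
socket path is an assumption that must match the 'stats socket' directive in
haproxy.cfg, and error handling is omitted:

    import socket

    def haproxy_stats(sock_path="/var/run/haproxy.stats"):
        # 'show stat' returns CSV: a '# '-prefixed header row, then one row
        # per frontend/backend/server with counters such as bin/bout (bytes)
        # and stot (total sessions), and gauges such as scur/smax (sessions).
        s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        s.connect(sock_path)
        s.sendall(b"show stat\n")
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
        s.close()
        lines = b"".join(chunks).decode().splitlines()
        header = lines[0].lstrip("# ").split(",")
        return [dict(zip(header, line.split(","))) for line in lines[1:] if line]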

Until that's ready and in a stable release of haproxy, it's also pretty
trivial to parse the IP addresses and listening ports out of the haproxy
config and use them to populate a series of iptables chains whose entire
purpose is to gather bandwidth I/O data. These chains won't give you things
like max connection counts, etc., but if you're billing on raw bandwidth
usage, these stats are guaranteed to be accurate and survive haproxy
restarts. This approach also does not require scanning logs, and the data is
available cheaply in real time. (This is how we bill for bandwidth on our
current software load balancer product.)
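
A rough sketch of that accounting setup follows; the VIP, port, and chain
name are placeholders, and note the counters survive haproxy reloads but not
a rule flush:

    import subprocess

    def ensure_accounting_rules(vip, port, chain="OCTAVIA_ACCT"):
        # Counting-only chain: the listener's traffic jumps to it and falls
        # straight back through, leaving byte/packet counters on the jump
        # rules as the raw bandwidth data.
        subprocess.call(["iptables", "-N", chain])  # errors harmlessly if it exists
        subprocess.check_call(["iptables", "-I", "INPUT", "-p", "tcp",
                               "-d", vip, "--dport", str(port), "-j", chain])
        subprocess.check_call(["iptables", "-I", "OUTPUT", "-p", "tcp",
                               "-s", vip, "--sport", str(port), "-j", chain])

    def accounted_bytes(chain="OCTAVIA_ACCT"):
        # '-x' prints exact rather than human-rounded counters; field 1 of
        # each rule line is the byte count.
        total = 0
        for builtin in ("INPUT", "OUTPUT"):
            out = subprocess.check_output(
                ["iptables", "-L", builtin, "-v", "-x", "-n"]).decode()
            for line in out.splitlines():
                fields = line.split()
                if chain in fields:
                    total += int(fields[1])
        return total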

My vote would be to use the iptables approach for now, until HAProxy is able
to retain state between restarts. For other stats data (e.g. max connection
counts, total number of requests), I would recommend gathering the data
from the haproxy daemon and keeping an external state file that we update
immediately before restarting haproxy. (Yes, this means we lose some
information on connections that are still open when haproxy restarts, but
it gives us a good approximation, since we anticipate haproxy restarts
being relatively rare in comparison to serving actual requests.)
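
A sketch of that pre-restart snapshot, with rows as returned by a
stats-socket query like the one above (the state file path is hypothetical):

    import json

    STATE_FILE = "/var/lib/octavia/haproxy_totals.json"  # hypothetical path

    def snapshot_before_restart(rows, state_file=STATE_FILE):
        # Fold the soon-to-be-reset cumulative counters into persistent
        # totals immediately before reloading haproxy.
        try:
            with open(state_file) as f:
                totals = json.load(f)
        except (IOError, ValueError):
            totals = {}
        for row in rows:
            key = "%s/%s" % (row["pxname"], row["svname"])
            entry = totals.setdefault(key, {"stot": 0, "smax": 0})
            entry["stot"] += int(row.get("stot") or 0)  # counters accumulate
            entry["smax"] = max(entry["smax"],
                                int(row.get("smax") or 0))  # gauges keep their max
        with open(state_file, "w") as f:
            json.dump(totals, f)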

Logs are still very handy, and I agree that if extreme accuracy in billing
is required, this is the way to get that data. Logs are also very handy for
users to have for troubleshooting purposes. But I think logs are not well
suited to providing data that will be consumed in real time (e.g. data
that will populate a dashboard).

What do y'all think of this?

Stephen


Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

2014-11-05 Thread Eichberger, German
Hi Jorge,

I am still not convinced that we need to use logging for usage metrics. We can
also use the haproxy stats interface (which the haproxy team is willing to
improve based on our input) and/or iptables, as Stephen suggested. That said,
this probably needs more exploration.

From an HP perspective, the full logs on the load balancer are mostly
interesting for the user of the load balancer - we only care about aggregates
for our metering. That said, we would be happy to just move them on demand to a
place the user can access.

Thanks,
German


From: Jorge Miramontes [mailto:jorge.miramon...@rackspace.com]
Sent: Tuesday, November 04, 2014 8:20 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

Hi Susanne,

Thanks for the reply. As Angus pointed out, the one big item that needs to be
addressed with this method is the network I/O of raw logs. One idea to mitigate
this concern is to store the data locally for the operator-configured
granularity, process it, and THEN send it to ceilometer, etc. If we can't
engineer a way to deal with the high network I/O that will inevitably occur, we
may have to move towards a polling approach. Thoughts?

Cheers,
--Jorge

Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

2014-11-05 Thread Jorge Miramontes
Thanks German,

It looks like the conversation is going towards using the HAProxy stats 
interface and/or iptables. I just wanted to explore logging a bit. That said, 
can you and Stephen share your thoughts on how we might implement that 
approach? I'd like to get a spec out soon because I believe metric gathering 
can be worked on in parallel with the rest of the project. In fact, I was 
hoping to get my hands dirty on this one and contribute some code, but a 
strategy and spec are needed first before I can start that ;)

Cheers,
--Jorge


Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

2014-11-04 Thread Susanne Balle
Jorge

I understand your use cases around capturing of metrics, etc.

Today we mine the logs for usage information on our Hadoop cluster. In the
future we'll capture all the metrics via ceilometer.

IMHO the amphorae should have an interface that allows the logs to be moved
to various backends, such as Elasticsearch, Hadoop HDFS, Swift, etc., as well
as, by default (but with the option to disable it), Ceilometer. Ceilometer is
the de facto metering solution for OpenStack, so we need to support it. We
would like the integration with Ceilometer to be based on notifications. I
believe German sent a reference to that in another email. The pre-processing
will need to be optional and the amount of data aggregation configurable.
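
For illustration, notification-based Ceilometer integration could be wired up
with oslo.messaging along these lines; the event type, publisher id, and
payload keys below are invented for this sketch, not a settled contract:

    import oslo_messaging
    from oslo_config import cfg

    def emit_usage(sample):
        # Emit one pre-aggregated usage sample onto the notification bus,
        # where ceilometer's notification agent can pick it up.
        transport = oslo_messaging.get_notification_transport(cfg.CONF)
        notifier = oslo_messaging.Notifier(
            transport,
            publisher_id="octavia.amphora",  # hypothetical publisher id
            driver="messaging",
            topics=["notifications"])
        notifier.info({}, "loadbalancer.usage",  # hypothetical event type
                      {"loadbalancer_id": sample["lb_id"],
                       "bytes_in": sample["bytes_in"],
                       "bytes_out": sample["bytes_out"]})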

What you describe below is, to me, usage gathering/metering. Billing is
independent, since companies with private clouds might not want to bill but
still need usage reports for capacity planning, etc. Billing/charging is just
putting a monetary value on the various forms of usage.

I agree with all points.

 - Capture logs in a scalable way (i.e. capture logs and put them on a
 separate scalable store somewhere so that it doesn't affect the amphora).

 - Every X amount of time (every hour, for example) process the logs and
 send them on their merry way to ceilometer or whatever service an operator
 will be using for billing purposes.

Keep the logs: this is where we would use log forwarding to either Swift or
Elasticsearch, etc.

- Keep logs for some configurable amount of time. This could be anything
 from indefinitely to not at all. Rackspace is planning on keeping them for
 a certain period of time for the following reasons:

It looks like we are in agreement, so I am not sure why it sounded like we were
in disagreement on IRC. It sounded like you were talking about something else
when you brought up the real-time processing. If we are just talking about
moving the logs to your Hadoop cluster, or any backend, in a scalable way, then
we agree.

Susanne



Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

2014-10-28 Thread Jorge Miramontes
Thanks for the reply Angus,

DDoS attacks are definitely a concern we are trying to address here. My
assumptions are based on a solution that is engineered for this type of
thing. Are you more concerned with network I/O during a DoS attack, or with
storing the logs? Under the idea I had, I wanted to make the amount of
time logs are stored configurable so that the operator can choose
whether they want the logs after processing or not. The network I/O of
pumping logs out is a concern of mine, however.

Sampling seems like the go-to solution for gathering usage, but I was
looking for something different, as sampling can get messy and can be
inaccurate for certain metrics. Depending on the sampling rate, this
solution has the potential to miss spikes in traffic if you are gathering
gauge metrics such as active connections/sessions. Using logs would be
100% accurate in this case. Also, I'm assuming LBaaS will have events, so
combining sampling with events (CREATE, UPDATE, SUSPEND, DELETE, etc.)
gets complicated. Combining logs with events is arguably less complicated,
as the granularity of logs is high. Due to this granularity, one can split
the logs based on the event times cleanly. Since sampling has a fixed
cadence, you will have to perform a manual sample at the time of the
event (i.e. added complexity).

At the end of the day there is no free lunch so more insight is
appreciated. Thanks for the feedback.

Cheers,
--Jorge




On 10/27/14 6:55 PM, Angus Lees g...@inodes.org wrote:

On Wed, 22 Oct 2014 11:29:27 AM Robert van Leeuwen wrote:
  I'd like to start a conversation on usage requirements and have a few
  suggestions. I advocate that, since we will be using TCP and HTTP/HTTPS
  based protocols, we inherently enable connection logging for load
  balancers for several reasons:
 Just a request from the operator side of things:
 Please think about the scalability when storing all logs.

 e.g. we are currently logging http requests to one load balanced
 application (that would be a fit for LBaaS). It is about 500 requests per
 second, which adds up to 40GB per day (in Elasticsearch). Please make sure
 whatever solution is chosen, it can cope with machines doing 1000s of
 requests per second...

And to take this further, what happens during a DoS attack (either SYN flood
or full connections)? How do we ensure that we don't lose our logging system
and/or amplify the DoS attack?

One solution is sampling, with a tunable knob for the sampling rate, perhaps
tunable per-VIP. This still increases linearly with attack traffic, unless you
use time-based sampling (1-every-N-seconds rather than 1-every-N-packets).

One of the advantages of (e.g.) polling the number of current sessions is that
the cost of that monitoring is essentially fixed, regardless of the number of
connections passing through. Numerous other metrics (rate of new connections,
etc.) also have this property and could presumably be used for accurate
billing - without amplifying attacks.

I think we should be careful about whether we want logging or metrics for more
accurate billing. Both are useful, but full logging is only really required
for ad-hoc debugging (important! but different).

-- 
 - Gus
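
Gus's time-based sampling point reduces to a fixed-cadence poll; a toy sketch,
where the period and the gauge source are arbitrary choices:

    import time

    def sample_gauges(read_gauge, emit, period=15):
        # 1-every-N-seconds sampling: collection cost stays constant no
        # matter how much traffic, attack or otherwise, is flowing.
        while True:
            emit(time.time(), read_gauge())  # e.g. 'scur' from the stats CSV
            time.sleep(period)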



Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

2014-10-28 Thread Angus Lees
On Tue, 28 Oct 2014 04:42:27 PM Jorge Miramontes wrote:
 Thanks for the reply Angus,
 
 DDoS attacks are definitely a concern we are trying to address here. My
 assumptions are based on a solution that is engineered for this type of
 thing. Are you more concerned with network I/O during a DoS attack, or with
 storing the logs? Under the idea I had, I wanted to make the amount of
 time logs are stored configurable so that the operator can choose
 whether they want the logs after processing or not. The network I/O of
 pumping logs out is a concern of mine, however.

My primary concern was the generated network I/O, and the write bandwidth to 
storage media implied by that (not so much the accumulated volume of data).

We're in an era where 10Gb/s networking is now common for serving/loadbalancer
infrastructure, and as far as I can see the trend for networking is climbing
more steeply than storage I/O, so it's only going to get worse. 10Gb/s of
short-lived connections is a *lot* to try to write to reliable storage
somewhere and later analyse.
It's a useful option for some users, but it would be a shame to have to limit
loadbalancer throughput by the logging infrastructure just because we didn't
have an alternative available.

I think you're right that we don't have an obviously-correct choice here. I
think we need to expose both cheap sampling/polling of counters and more
detailed logging of connections matching patterns (and indeed actual packet
capture would be nice too). Someone could then choose to base their billing
on either data source depending on their own accuracy-vs-cost-of-collection
tradeoffs. I don't see that either approach is going to be sufficiently
universal to obsolete the other :(

Also: UDP. Most providers are all about HTTP now, but there are still some
people that need to bill for UDP, SIP, VPN, etc. traffic.

 - Gus


Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

2014-10-27 Thread Jorge Miramontes
Hey German,

I totally agree on the security/privacy aspect of logs, especially due to
the SSL/TLS Termination feature.

After looking at BP [1] and the spec [2] for metering, it looks like it is
proposing to send more than just billable usage to ceilometer. From my
previous email, I considered this tracking usage (billable usage can be
a subset of tracking usage). It also appears to me that there is an
implied interface for ceilometer, as we need to be able to capture metrics
from various LB devices (HAProxy, Nginx, NetScaler, etc.), standardize
them, and then send them off. That said, what type of implementation was
HP thinking of to gather these metrics? Instead of focusing on my idea of
using logging, I'd like to change the discussion and get a picture as to
what you all are envisioning for a possible implementation direction.
Important items for Rackspace include accuracy of data, no lost data (i.e.
when sending to an upstream system, ensure it gets there), reliability of
the cadence when sending usage to an upstream system, and the ability to
backtrack and audit data whenever there seems to be a discrepancy in a
customer's monthly statement. Keep in mind that we need to integrate with
our current billing pipeline, so we are not planning on using ceilometer at
the moment. Thus, we need to make this somewhat configurable for those not
using ceilometer.

Cheers,
--Jorge

[1] 
https://blueprints.launchpad.net/ceilometer/+spec/ceilometer-meter-lbaas

[2] https://review.openstack.org/#/c/94958/12/specs/juno/lbaas_metering.rst
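
The implied interface Jorge mentions might look something like the sketch
below; the class and field names are invented for illustration, while the
HAProxy fields come from its stats CSV:

    import abc

    class UsageDriver(abc.ABC):
        # One driver per LB device (HAProxy, Nginx, NetScaler, ...) that
        # normalizes native counters into a single schema before they are
        # shipped to ceilometer or a custom billing pipeline.
        @abc.abstractmethod
        def get_metrics(self, listener_id):
            """Return {'bytes_in', 'bytes_out', 'total_conns', 'max_conns'}."""

    class HAProxyUsageDriver(UsageDriver):
        def __init__(self, read_stats):
            self.read_stats = read_stats  # e.g. the stats-socket reader sketched earlier

        def get_metrics(self, listener_id):
            row = next(r for r in self.read_stats()
                       if r["pxname"] == listener_id and r["svname"] == "FRONTEND")
            return {"bytes_in": int(row["bin"]), "bytes_out": int(row["bout"]),
                    "total_conns": int(row["stot"]), "max_conns": int(row["smax"])}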



Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

2014-10-24 Thread Eichberger, German
Hi Jorge,

I agree completely with the points you make about the logs. We still feel that
metering and logging are two different problems. The Ceilometer community has
a proposal on how to meter LBaaS (see
http://specs.openstack.org/openstack/ceilometer-specs/specs/juno/lbaas_metering.html)
and we at HP think that those values will be sufficient for us for the time
being.

I think our discussion is mostly about connection logs, which are emitted in
some way from the amphora (e.g. haproxy logs). Since they are customers' logs,
we need to explore on our end the privacy implications (I assume at RAX you
have controls in place to make sure that there is no violation :-). Also, I
need to check whether our central logging system is scalable enough and
whether we can send logs there without creating security holes.

Another possibility is to ship our amphora agent logs, syslog-style, to a
central system to help with troubleshooting and debugging. Those could be
sufficiently anonymized to avoid privacy issues. What are your thoughts on
logging those?

Thanks,
German


Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

2014-10-23 Thread Jorge Miramontes
Hey German/Susanne,

To continue our conversation from our IRC meeting, could you all provide
more insight into your usage requirements? Also, I'd like to clarify a few
points related to using logging.

I am advocating that logs be used for multiple purposes, including
billing. Billing requirements are different from connection logging
requirements. However, connection logging is a very accurate mechanism to
capture billable metrics and thus is related. My vision for this is
something like the following:

- Capture logs in a scalable way (i.e. capture logs and put them on a
separate scalable store somewhere so that it doesn't affect the amphora).
- Every X amount of time (every hour, for example), process the logs and
send them on their merry way to ceilometer or whatever service an operator
will be using for billing purposes.
- Keep logs for some configurable amount of time. This could be anything
from indefinitely to not at all. Rackspace is planning on keeping them for
a certain period of time for the following reasons:

A) We have connection logging as a planned feature. If a customer turns
on the connection logging feature for their load balancer, it will already
have a history. One important aspect of this is that customers (at least
ours) tend to turn on logging after they realize they need it (usually
after a tragic lb event). By already capturing the logs, I'm sure customers
will be extremely happy to see that there are already X days' worth of logs
they can immediately sift through.
B) Operators and their support teams can leverage logs when providing
service to their customers. This is huge for finding issues and resolving
them quickly.
C) Albeit a minor point, building support for logs from the get-go
mitigates capacity management uncertainty. My example earlier was the
extreme case of every customer turning on logging at the same time. While
unlikely, I would hate to manage that!

I agree that there are other ways to capture billing metrics but, from my
experience, those tend to be more complex than what I am advocating, and
without the added benefits listed above. An understanding of HP's desires
on this matter will hopefully get this to a point where we can start
working on a spec.

Cheers,
--Jorge

P.S. Real-time stats is a different beast, and I envision there being an
API call that returns real-time data such as this:
http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#9.


Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

2014-10-22 Thread Stephen Balukoff
Hi Jorge!

Welcome back, eh! You've been missed.

Anyway, I just wanted to say that your proposal sounds great to me, and
it's good to finally be closer to having concrete requirements for logging,
eh. Once this discussion is nearing a conclusion, could you write up the
specifics of logging into a specification proposal document?

Regarding the discussion itself: I think we can ignore UDP for now, as
there doesn't seem to be high demand for it, and it certainly won't be
supported in v 0.5 of Octavia (and maybe not in v1 or v2 either, unless we
see real demand).

Regarding the 'real-time usage' information: I have some ideas regarding
getting this from a combination of iptables and / or the haproxy stats
interface. Were you thinking something different that involves on-the-fly
analysis of the logs or something?  (I tend to find that logs are great for
non-real time data, but can often be lacking if you need, say, a gauge like
'currently open connections' or something.)

One other thing: If there's a chance we'll be storing logs on the amphorae
themselves, then we need to have log rotation as part of the configuration
here. It would be silly to have an amphora failure just because its
ephemeral disk fills up, eh.
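
That rotation requirement could be met with an ordinary logrotate policy
dropped in by the amphora agent; a sketch, where the paths, thresholds, and
rsyslog reload are assumptions about the image rather than settled choices:

    LOGROTATE_POLICY = """\
    /var/log/haproxy.log {
        rotate 4
        size 256M
        compress
        missingok
        postrotate
            kill -HUP $(cat /var/run/rsyslogd.pid) 2>/dev/null || true
        endscript
    }
    """

    def install_logrotate_policy(path="/etc/logrotate.d/amphora-haproxy"):
        # Cap on-box log growth so a busy amphora cannot fill its
        # ephemeral disk before logs are offloaded.
        with open(path, "w") as f:
            f.write(LOGROTATE_POLICY)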

Stephen

On Wed, Oct 15, 2014 at 4:03 PM, Jorge Miramontes 
jorge.miramon...@rackspace.com wrote:

 Hey Octavia folks!


 First off, yes, I'm still alive and kicking. :)

 I'd like to start a conversation on usage requirements and have a few
 suggestions. I advocate that, since we will be using TCP and HTTP/HTTPS
 based protocols, we inherently enable connection logging for load
 balancers for several reasons:

 1) We can use these logs as the raw and granular data needed to track
 usage. With logs, the operator has flexibility as to what usage metrics
 they want to bill against. For example, bandwidth is easy to track and can
 even be split into header and body data, so that the provider can choose if
 they want to bill on header data or not. Also, the provider can determine
 if they will bill their customers for failed requests that were the fault
 of the provider themselves. These are just a few examples; the point is
 the flexible nature of logs. (A log-parsing sketch follows this message.)

 2) Creating billable usage from logs is easy compared to other options
 like polling. For example, in our current LBaaS iteration at Rackspace we
 bill partly on average concurrent connections. This is based on polling
 and is not as accurate as it possibly could be. It's very close, but it
 doesn't get more accurate than the logs themselves. Furthermore, polling
 is more complex and uses up resources on each polling cadence.

 3) Enabling logs for all load balancers can be used for debugging, support
 and audit purposes. While the customer may or may not want their logs
 uploaded to Swift, operators and their support teams can still use this
 data to help customers out with billing and setup issues. Auditing will
 also be easier with raw logs.

 4) Enabling logs for all load balancers will help mitigate uncertainty in
 terms of capacity planning. Imagine if every customer suddenly enabled
 logs after never having them turned on. This could produce a spike in
 resource utilization that would be hard to manage. Enabling logs from the
 start means we are certain what to plan for, beyond the nature of
 the customer's traffic pattern.

 Some cons I can think of (please add more, as I think the pros outweigh the
 cons):

 1) If we ever add UDP based protocols then this model won't work. 1% of
 our load balancers at Rackspace are UDP based, so we are not looking at
 using this protocol for Octavia. I'm more of a fan of building a really
 good TCP/HTTP/HTTPS based load balancer, because UDP load balancing solves
 a different problem. For me, different problem == different product.

 2) I'm assuming HAProxy. Thus, if we choose another technology for the
 amphora then this model may break.


 Also, and more generally speaking, I have categorized usage into three
 categories:

 1) Tracking usage - this is usage that will be used by operators and
 support teams to gain insight into what load balancers are doing, in an
 attempt to monitor potential issues.
 2) Billable usage - this is usage that is a subset of tracking usage, used
 to bill customers.
 3) Real-time usage - this is usage that should be exposed via the API so
 that customers can make decisions that affect their configuration (e.g.
 based on the number of connections my web heads can handle, when
 should I add another node to my pool?).

 These are my preliminary thoughts, and I'd love to gain insight into what
 the community thinks. I have built about 3 usage collection systems thus
 far (1 with Brandon) and have learned a lot. Some basic rules I have
 discovered with collecting usage are:

 1) Always collect granular usage as it paints a picture of what actually
 happened. Massaged/un-granular usage == lost information.
 2) Never imply, always be explicit. Implications usually stem from bad
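
For point 1 in Jorge's list, billable bytes can be pulled straight out of
HAProxy's httplog lines. A rough parsing sketch against the assumed HAProxy
1.5 field order (splitting header vs. body bytes would additionally require a
custom log-format):

    import re

    # Assumed 'option httplog' layout: after the client address and accept
    # date come frontend, backend/server, timers, status, then bytes_read
    # (total bytes sent to the client, headers included).
    HTTPLOG = re.compile(
        r'(?P<client>\S+) \[[^\]]+\] (?P<frontend>\S+) (?P<backend>\S+) '
        r'[\d/+-]+ (?P<status>[\d-]+) (?P<bytes>\d+)')

    def billable_bytes(line):
        # Returns 0 for non-traffic lines (startup notices, etc.).
        m = HTTPLOG.search(line)
        return int(m.group("bytes")) if m else 0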
 

Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

2014-10-22 Thread Robert van Leeuwen
 I'd like to start a conversation on usage requirements and have a few
 suggestions. I advocate that, since we will be using TCP and HTTP/HTTPS
 based protocols, we inherently enable connection logging for load
 balancers for several reasons:

Just a request from the operator side of things:
please think about the scalability when storing all logs.

e.g. we are currently logging http requests to one load balanced application
(that would be a fit for LBaaS). It is about 500 requests per second, which
adds up to 40GB per day (in Elasticsearch). Please make sure whatever
solution is chosen, it can cope with machines doing 1000s of requests per
second...

Cheers,
Robert van Leeuwen


Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

2014-10-22 Thread Jorge Miramontes
Hey Stephen (and Robert),

For real-time usage I was thinking something similar to what you are proposing.
Using logs for this would be overkill IMO, so your suggestions were what I was
thinking of starting with.

As far as storing logs is concerned, I was definitely thinking of offloading
these onto separate storage devices. Robert, I totally hear you on the
scalability part, as our current LBaaS setup generates TBs of request logs. I'll
start planning out a spec and then I'll let everyone chime in there. I just
wanted to get a general feel for the ideas I had mentioned. I'll also bring it
up in today's meeting.

Cheers,
--Jorge


Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

2014-10-22 Thread Eichberger, German
Hi Jorge,

Good discussion so far + glad to have you back :)

I am not a big fan of using logs for billing information, since ultimately (at
least at HP) we need to pump it into Ceilometer. So I am envisioning either
having the amphora (via a proxy) pump it straight into that system, or
collecting it on the controller and pumping it from there.

Allowing/enabling logging creates some requirements on the hardware; mainly,
that it can handle the I/O coming from logging. Some operators might choose to
hook up very cheap and poorly performing disks which might not be able to deal
with the log traffic. So I would suggest that there is some rate limiting on
the log output to help with that.
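
German's rate-limiting suggestion is essentially a token bucket applied to
log lines; a minimal sketch with illustrative defaults:

    import time

    class LogRateLimiter(object):
        # Allow 'rate' lines per second with bursts up to 'burst'; callers
        # drop (or count and drop) lines when allow() returns False, which
        # keeps log I/O within what cheap disks can absorb.
        def __init__(self, rate=500.0, burst=1000):
            self.rate, self.burst = rate, burst
            self.tokens, self.last = float(burst), time.monotonic()

        def allow(self):
            now = time.monotonic()
            self.tokens = min(self.burst,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return True
            return False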

Thanks,
German

From: Jorge Miramontes [mailto:jorge.miramon...@rackspace.com]
Sent: Wednesday, October 22, 2014 6:51 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

Hey Stephen (and Robert),

For real-time usage I was thinking something similar to what you are proposing. 
Using logs for this would be overkill IMO so your suggestions were what I was 
thinking of starting with.

As far as storing logs is concerned I was definitely thinking of offloading 
these onto separate storage devices. Robert, I totally hear you on the 
scalability part as our current LBaaS setup generates TB of request logs. I'll 
start planning out a spec and then I'll let everyone chime in there. I just 
wanted to get a general feel for the ideas I had mentioned. I'll also bring it 
up in today's meeting.

Cheers,
--Jorge

From: Stephen Balukoff sbaluk...@bluebox.net
Reply-To: OpenStack Development Mailing List (not for usage questions) 
openstack-dev@lists.openstack.org
Date: Wednesday, October 22, 2014 4:04 AM
To: OpenStack Development Mailing List (not for usage questions) 
openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

Hi Jorge!

Welcome back, eh! You've been missed.

Anyway, I just wanted to say that your proposal sounds great to me, and it's 
good to finally be closer to having concrete requirements for logging, eh. Once 
this discussion is nearing a conclusion, could you write up the specifics of 
logging into a specification proposal document?

Regarding the discussion itself: I think we can ignore UDP for now, as there 
doesn't seem to be high demand for it, and it certainly won't be supported in v 
0.5 of Octavia (and maybe not in v1 or v2 either, unless we see real demand).

Regarding the 'real-time usage' information: I have some ideas about getting 
this from a combination of iptables and/or the haproxy stats interface. Were 
you thinking of something different that involves on-the-fly analysis of the 
logs? (I tend to find that logs are great for non-real-time data, but they can 
often be lacking if you need, say, a gauge like 'currently open connections'.)
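
For the gauges, the kind of thing I have in mind is simply asking haproxy's
admin socket for 'show stat' and reading the columns we care about. A rough
sketch in Python (it assumes haproxy.cfg has something like
'stats socket /var/run/haproxy.sock'; the path is illustrative):

    import csv
    import socket

    def haproxy_stats(sock_path='/var/run/haproxy.sock'):
        """Fetch the stats CSV from haproxy's admin socket."""
        s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        s.connect(sock_path)
        s.sendall(b'show stat\n')
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
        s.close()
        # the first line is the CSV header, prefixed with '# '
        text = b''.join(chunks).decode('ascii').lstrip('# ')
        return list(csv.DictReader(text.splitlines()))

    # e.g. 'scur' is the current-connections gauge that logs can't
    # give us in real time
    for row in haproxy_stats():
        print('%s/%s scur=%s' % (row['pxname'], row['svname'], row['scur']))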

One other thing: If there's a chance we'll be storing logs on the amphorae 
themselves, then we need to have log rotation as part of the configuration 
here. It would be silly to have an amphora failure just because its ephemeral 
disk fills up, eh.
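
A stock logrotate stanza baked into the amphora image would probably cover
it. Sketch only; the path, size, and retention count are placeholders:

    # /etc/logrotate.d/haproxy on the amphora: rotate by size so the
    # ephemeral disk can never fill up, and keep only a few generations
    /var/log/haproxy.log {
        size 100M
        rotate 3
        compress
        missingok
        notifempty
        postrotate
            invoke-rc.d rsyslog rotate > /dev/null 2>&1 || true
        endscript
    }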

Stephen


[openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

2014-10-15 Thread Jorge Miramontes
Hey Octavia folks!


First off, yes, I'm still alive and kicking. :)

I'd like to start a conversation on usage requirements and have a few
suggestions. I advocate that, since we will be using TCP and HTTP/HTTPS
based protocols, we inherently enable connection logging for load
balancers for several reasons:

1) We can use these logs as the raw and granular data needed to track
usage. With logs, the operator has flexibility as to what usage metrics
they want to bill against. For example, bandwidth is easy to track and can
even be split into header and body data so that the provider can choose if
they want to bill on header data or not. Also, the provider can determine
if they will bill their customers for failed requests that were the fault
of the provider themselves. These are just a few examples; the point is
the flexible nature of logs.

2) Creating billable usage from logs is easy compared to other options
like polling. For example, in our current LBaaS iteration at Rackspace we
bill partly on average concurrent connections. This is based on polling
and is not as accurate as it could be. It's very close, but it doesn't get
more accurate than the logs themselves. Furthermore, polling is more complex
and consumes resources on every polling cycle. (See the sketch after this
list for what log-based accounting could look like.)

3) Enabling logs for all load balancers can be used for debugging, support
and audit purposes. While the customer may or may not want their logs
uploaded to swift, operators and their support teams can still use this
data to help customers out with billing and setup issues. Auditing will
also be easier with raw logs.

4) Enabling logs for all load balancers will help mitigate uncertainty in
terms of capacity planning. Imagine if logging were off by default and every
customer suddenly enabled it: that could produce a spike in resource
utilization that would be hard to manage. Enabling logs from the start means
we know what to plan for; the only remaining unknown is the nature of each
customer's traffic pattern.
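
To make pro 2) concrete, the sketch I have in mind for turning logs into
billable bandwidth is roughly the following (Python; it assumes haproxy's
default httplog field layout and that the syslog prefix has already been
stripped from each line; the real field handling belongs in the spec):

    from collections import defaultdict

    def bytes_by_frontend(log_lines):
        """Sum billable bytes_read per frontend.

        Assumes each stripped line looks like:
          client:port [date] frontend backend/server Tq/Tw/Tc/Tr/Tt
          status bytes_read ...
        which is haproxy's default httplog layout.
        """
        totals = defaultdict(int)
        for line in log_lines:
            fields = line.split()
            totals[fields[2]] += int(fields[6])
        return totals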

Some Cons I can think of (please add more as I think the pros outweigh the
cons):

1) If we ever add UDP-based protocols then this model won't work. Less than
1% of our load balancers at Rackspace are UDP based, so we are not looking at
using this protocol for Octavia. I'm more of a fan of building a really
good TCP/HTTP/HTTPS based load balancer because UDP load balancing solves
a different problem. For me different problem == different product.

2) I'm assuming HAProxy. Thus, if we choose another technology for the
amphora then this model may break.


Also, and more generally speaking, I have grouped usage into three
categories:

1) Tracking usage - this is usage that will be used by operators and
support teams to gain insight into what load balancers are doing in an
attempt to monitor potential issues.
2) Billable usage - this is usage that is a subset of tracking usage used
to bill customers.
3) Real-time usage - this is usage that should be exposed via the API so
that customers can make decisions that affect their configuration (e.g.,
based on the number of connections my web heads can handle, when should I
add another node to my pool?).

These are my preliminary thoughts, and I'd love to gain insight into what
the community thinks. I have built about 3 usage collection systems thus
far (1 with Brandon) and have learned a lot. Some basic rules I have
discovered with collecting usage are:

1) Always collect granular usage as it paints a picture of what actually
happened. Massaged/un-granular usage == lost information.
2) Never imply, always be explicit. Implications usually stem from bad
assumptions.


Last but not least, we need to store every user and system load balancer
event such as creation, updates, suspension and deletion so that we may
bill on things like uptime and serve our customers better by knowing what
happened and when.
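
As a strawman, the event records themselves can be tiny; something like the
following (Python; the field names are made up for illustration, and the
real schema belongs in the spec):

    import datetime
    import uuid

    def make_event(lb_id, event_type, actor):
        """One billable load balancer event.

        event_type: e.g. 'CREATE', 'UPDATE', 'SUSPEND', 'DELETE'
        actor:      'USER' or 'SYSTEM', so we know who triggered it
        """
        return {
            'event_id': str(uuid.uuid4()),
            'loadbalancer_id': lb_id,
            'event_type': event_type,
            'actor': actor,
            'timestamp': datetime.datetime.utcnow().isoformat(),
        }

    # uptime for billing is then just the span between a CREATE (or
    # resume) event and the matching SUSPEND/DELETE event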


Cheers,
--Jorge


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev