Re: [openstack-dev] [all] how to send messages (and events) to our users

2015-04-08 Thread Halterman, Jonathan
The ability to send general purpose notifications is clearly a cross-cutting
concern. The absence of an AWS SNS-like service in OpenStack is the reason
that services like Monasca had to roll their own notifications. This has
been a gaping hole in the OpenStack portfolio for a while, and I think the
right way to think of a solution is as a new service built around a pub/sub
notification API (again, see SNS), as opposed to something which merely
exposes OpenStack's internal messaging infrastructure in some way (that
would be inappropriate).
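
To make this concrete, here is a purely hypothetical sketch of what such a
pub/sub notification API could look like from a tenant's point of view. The
endpoint, resource names and payload fields below are invented for
illustration only; no such service exists in OpenStack today.

import json
import requests

TOKEN = "gAAAA..."                                     # keystone token (placeholder)
NOTIFY_API = "https://notifications.example.com/v1"    # hypothetical endpoint
HEADERS = {"X-Auth-Token": TOKEN, "Content-Type": "application/json"}

# 1. The tenant creates a topic it owns.
topic = requests.post(NOTIFY_API + "/topics", headers=HEADERS,
                      data=json.dumps({"name": "instance-events"})).json()

# 2. The tenant subscribes a webhook (or email, SMS, ...) to the topic.
requests.post(NOTIFY_API + "/topics/%s/subscriptions" % topic["id"],
              headers=HEADERS,
              data=json.dumps({"protocol": "https",
                               "endpoint": "https://dashboard.example.com/hook"}))

# 3. A service publishes once; every subscriber receives the message.
requests.post(NOTIFY_API + "/topics/%s/publish" % topic["id"],
              headers=HEADERS,
              data=json.dumps({"subject": "instance.create.end",
                               "message": {"instance_id": "abc123"}}))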

Cheers,
Jonathan

From:  Vipul Sabhaya vip...@gmail.com
Reply-To:  OpenStack Development Mailing List (not for usage questions)
openstack-dev@lists.openstack.org
Date:  Wednesday, April 8, 2015 at 5:18 PM
To:  OpenStack Development Mailing List (not for usage questions)
openstack-dev@lists.openstack.org
Subject:  Re: [openstack-dev] [all] how to send messages (and events) to our
users

 
 
 On Wed, Apr 8, 2015 at 4:45 PM, Min Pae sputni...@gmail.com wrote:
 
 
 an under-the-cloud service? - That is not what I am after here.
 
 I think the thread went off on a tangent and this point got lost.  A user-
 facing notification system absolutely should use a web-centric protocol, as I
 imagine one of the big consumers of such a system will be monitoring
 dashboards, which are trending more and more toward rich client-side "Single
 Page Applications". AMQP would not work well in such cases.
  
 
 So is the yagi + atom hopper solution something we can point end-users to?
 Is it per-tenant etc...
 
 While I haven't seen it yet, if that solution provides a means to expose the
 atom events to end users, it seems like a promising start. The thing that's
 required, though, is authentication/authorization that's tied in to keystone,
 so that notifications regarding a tenant's resources are available only to
 that tenant.
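
As a rough illustration of the tenant scoping meant here, a minimal sketch
(the helper below is hypothetical; it simply assumes keystonemiddleware has
already validated the token and injected the project id into the WSGI
environment):

def visible_events(environ, all_events):
    """Return only the events that belong to the authenticated project."""
    # keystonemiddleware.auth_token sets X-Project-Id on validated requests.
    project_id = environ.get("HTTP_X_PROJECT_ID")
    if project_id is None:
        return []  # unauthenticated callers see nothing
    return [e for e in all_events if e.get("project_id") == project_id]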
 
 
 Sandy, do you have a write up somewhere on how to set this up so I can
 experiment a bit?
 
 Maybe this needs to be a part of Cue?
 
 Sorry, Cue's goal is to provision Message Queue/Broker services and manage
 them, just like Trove provisions and manages databases. Cue would ideally be
 used to stand up and scale the RabbitMQ cluster providing messaging for an
 application backend, but it does not provide messaging itself (that would be
 Zaqar).
  
 
 Agreed - I don't think a multi-tenant notification service (which we seem to
 be after here) is the goal of Cue.
 
 That said, Monasca https://wiki.openstack.org/wiki/Monasca seems to have
 implemented the collection, aggregation, and notification of these events.
 What may be missing in Monasca is a mechanism for the tenant to consume
 these events via something other than AMQP.
 
  
 - Min
 
 
 






Re: [openstack-dev] [swift] On Object placement

2015-03-03 Thread Halterman, Jonathan
Hi Christian,

Sorry for the slow response. I was looking into the feasibility of your
suggestion for Sahara in particular and it took a bit.

On 2/19/15, 2:46 AM, Christian Schwede christian.schw...@enovance.com
wrote:

Hello Jonathan,

On 18.02.15 18:13, Halterman, Jonathan wrote:
 1. Swift should allow authorized services to place a given number
 of object replicas onto a particular rack, and onto separate
 racks.
 
 This is already possible if you use zones and regions in your ring
 files. For example, if you have 2 racks, you could assign one zone
 to each of them and Swift places at least one replica on each
 rack.
 
 Because Swift takes care of the device weight you could also ensure
 that a specific rack gets two copies, and another rack only one.
 
 Presumably a deployment would/should match the DC layout, where
 racks could correspond to AZs.

yes, that makes a lot of sense (to assign zones to racks), because in
this case you can ensure that there aren't multiple replicas stored
within the same rack. You can still access your data if a rack goes down
(power, network, maintenance).
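
For illustration, a rough sketch of building such a ring with one zone per
rack using Swift's RingBuilder (this is roughly what the swift-ring-builder
CLI does under the hood; the IPs, device names and weights are placeholders,
and the exact set of required device fields may vary between Swift versions):

from swift.common.ring.builder import RingBuilder

builder = RingBuilder(part_power=10, replicas=3, min_part_hours=1)

racks = {1: ["10.0.1.1", "10.0.1.2"],   # zone 1 = rack 1
         2: ["10.0.2.1", "10.0.2.2"],   # zone 2 = rack 2
         3: ["10.0.3.1", "10.0.3.2"]}   # zone 3 = rack 3

for zone, hosts in racks.items():
    for ip in hosts:
        builder.add_dev({"region": 1, "zone": zone, "ip": ip,
                         "port": 6000, "device": "sda", "weight": 100})

builder.rebalance()
# With one zone per rack and three replicas, each replica lands on a
# different rack (as long as all primary nodes are reachable).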

 However, this is only true as long as all primary nodes are
 accessible. If Swift stores data on a handoff node this data might
 be written to a different node first, and moved to the primary node
 later on.
 
 Note that placing objects on other than the primary nodes (for
 example using an authorized service you described) will only store
 the data on these nodes until the replicator moves the data to the
 primary nodes described by the ring. As far as I can see there is
 no way to ensure that an authorized service can decide where to
 place data, and that this data stays on the selected nodes. That
 would require a fundamental change within Swift.
 
 So - how can we influence where data is stored? In terms of
 placement based on a hash ring, I'm thinking of perhaps restricting
 the placement of an object to a subset of the ring based on a zone.
 We can still hash an object somewhere on the ring; for the purposes
 of controlling locality, we just want it to fall within (or outside) a
 particular zone. Any ideas?

You can't (at least not from the client side). The ring determines the
placement and if you have more zones (or regions) than replicas you
can't ensure an object replica is stored within a determined rack. Even
if you store it on a handoff node it will be moved to the primary node
sooner or later.
Forcing an object to be stored in a specific zone is not possible
with the current architecture; you can only discover in which zone it
will ultimately be placed (based on the ring).

What you could do (especially if you have more racks than replicas) is
to use storage policies and only assign three racks to each policy,
split into three zones (if you store three replicas).
For example, let's assume you have 5 racks; then you create 5 storage
policies (SP) with the following assignment:

   Rack
SP 1   2   3   4   5
0  x   x   x
1  x   x   x
2  x   x   x
3  x   x   x
4  x   x   x

Doing this you can ensure the following:
- Data is distributed roughly evenly across the cluster (provided the
storage policies themselves are used evenly)
- For a given SP you can ensure that a replica is stored in a specific
rack; and because an SP is assigned to a container, you can determine the
SP based on the container metadata (name SP0 rack_1_2_3 and so on to
make it even simpler for the application to determine the racks).
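
For example, a client could recover the rack set from the container's
storage policy with something like the sketch below (assuming
python-swiftclient; the "rack_1_2_3" naming convention is the one suggested
above, not anything Swift itself enforces):

from swiftclient import client as swift_client

def racks_for_container(storage_url, token, container):
    # Containers report their policy in the X-Storage-Policy header.
    headers = swift_client.head_container(storage_url, token, container)
    policy = headers.get("x-storage-policy", "")   # e.g. "rack_1_2_3"
    return [part for part in policy.split("_") if part.isdigit()]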

Would that help in your case?

While this wouldn’t give us all the control we need (2 replicas on 1 rack,
1 replica on another rack), ensuring at least 1 copy winds up on a
particular rack is part way there. With the way that Swift’s placement
works, are the other replicas likely to end up on different racks?

Where this might not work is for services that need to control rack
locality and allow users to select the containers that data is placed in.
This is currently the case with Sahara.



 2. Swift should allow authorized services and administrators to
 learn which racks an object resides on, along with endpoints.
 
 You already mentioned the endpoint middleware, though it is
 currently not protected and unauthenticated access is allowed if
 enabled.
 
 This is good to know. We still need to learn which rack an object
 resides on though. This information is important in determining
 whether a swift object resides on the same rack as a VM.

Well, that information is available using the /endpoints middleware? You
know the server IPs in a rack, and can compare them to the output from the
endpoint middleware.

We don’t actually know the server IPs in a rack though, and collecting and
maintaining this host-rack information is something we’d like to avoid
having various individual services do. Currently Sahara does collect this
information

Re: [openstack-dev] [swift] On Object placement

2015-02-18 Thread Halterman, Jonathan
Hi Christian - thanks for the response,

On 2/18/15, 1:53 AM, Christian Schwede christian.schw...@enovance.com
wrote:

Hello Jonathan,

On 17.02.15 22:17, Halterman, Jonathan wrote:
 Various services desire the ability to control the location of data
 placed in Swift in order to minimize network saturation when moving data
 to compute, or in the case of services like Hadoop, to ensure that
 compute can be moved to wherever the data resides. Read/write latency
 can also be minimized by allowing authorized services to place one or
 more replicas onto the same rack (with other replicas being placed on
 separate racks). Fault tolerance can also be enhanced by ensuring that
 some replica(s) are placed onto separate racks. Breaking this down we
 come up with the following potential requirements:
 
 1. Swift should allow authorized services to place a given number of
 object replicas onto a particular rack, and onto separate racks.

This is already possible if you use zones and regions in your ring
files. For example, if you have 2 racks, you could assign one zone to
each of them and Swift places at least one replica on each rack.

Because Swift takes care of the device weight you could also ensure that
a specific rack gets two copies, and another rack only one.

Presumably a deployment would/should match the DC layout, where racks
could correspond to AZs.

However, this is only true as long as all primary nodes are accessible.
If Swift stores data on a handoff node this data might be written to a
different node first, and moved to the primary node later on.

Note that placing objects on other than the primary nodes (for example
using an authorized service you described) will only store the data on
these nodes until the replicator moves the data to the primary nodes
described by the ring.
As far as I can see there is no way to ensure that an authorized service
can decide where to place data, and that this data stays on the selected
nodes. That would require a fundamental change within Swift.

So - how can we influence where data is stored? In terms of placement
based on a hash ring, I'm thinking of perhaps restricting the placement of
an object to a subset of the ring based on a zone. We can still hash an
object somewhere on the ring; for the purposes of controlling locality, we
just want it to fall within (or outside) a particular zone. Any ideas?


 2. Swift should allow authorized services and administrators to learn
 which racks an object resides on, along with endpoints.

You already mentioned the endpoint middleware, though it is currently
not protected and unauthenticated access is allowed if enabled.

This is good to know. We still need to learn which rack an object resides
on though. This information is important in determining whether a swift
object resides on the same rack as a VM.

You
could easily add another small middleware in the pipeline to check
authentication and grant or deny access to /endpoints based on the
authentication.
You can also get the node (and disk) if you have access to the ring
files. There is a tool included in the Swift source code called
swift-get-nodes; you could also simply reuse the existing code and
include it in your own projects.
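
A minimal sketch of such a middleware (placed in the proxy pipeline after
keystone's auth_token middleware; the "allowed_role" option and the default
"service" role are placeholders):

from swift.common.swob import HTTPForbidden

class EndpointsAuth(object):
    """Only let requests carrying a configured role reach /endpoints."""

    def __init__(self, app, conf):
        self.app = app
        self.allowed_role = conf.get("allowed_role", "service")

    def __call__(self, environ, start_response):
        if environ.get("PATH_INFO", "").startswith("/endpoints"):
            # keystone's auth_token middleware sets the X-Roles header.
            roles = environ.get("HTTP_X_ROLES", "").split(",")
            if self.allowed_role not in roles:
                return HTTPForbidden()(environ, start_response)
        return self.app(environ, start_response)

def filter_factory(global_conf, **local_conf):
    conf = dict(global_conf, **local_conf)
    return lambda app: EndpointsAuth(app, conf)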

I'm guessing this would not work for in-cloud services?

- jonathan


Christian





[openstack-dev] [nova] On VM placement

2015-02-17 Thread Halterman, Jonathan
I'm working on some services that require the ability to place VMs onto the
same or separate racks, and I wanted to start a discussion to see what the
community thinks the best way of achieving this with Nova might be.

Quick overview:

Various clustered datastores require related data to be placed in close
proximity (such as on the same rack) for optimum read latency across
contiguous/partitioned datasets. Additionally, clustered datastores may
require that replicas be placed in particular locations, such as on the same
rack to minimize network saturation or on separate racks to enhance fault
tolerance. An example of this is Hadoop's common policy of placing two
replicas onto one rack and another onto a separate rack. For datastores that
use ephemeral storage, the ability to control the rack locality of Nova VMs
is crucial for meeting these needs. Breaking this down we come up with the
following potential requirements:

1. Nova should allow a VM to be booted onto the same rack as existing VMs
(rack affinity).
2. Nova should allow a VM to be booted onto a different rack from existing
VMs (rack anti-affinity).
3. Nova should allow authorized services to learn which rack a VM resides
on.

Currently, host aggregates are the best way to approximate a solution for
requirements 1 and 2. One could create host aggregates to represent the
physical racks in a datacenter and boot VMs into those racks as necessary,
but there are some challenges with this approach, including the management of
different flavors to correspond to host aggregates, the need to determine
the placement of existing VMs, and the general problem of maintaining the
host aggregate information as hosts come and go. There is also no direct way
to boot VMs with server-group-style rack affinity or anti-affinity.
Requirement 3 is a move towards allowing authorized in-cloud services to
learn about their location relative to other cloud resources, such as Swift,
so that they might place compute and data in close proximity.
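
For reference, a hedged sketch of that host-aggregate workaround using
python-novaclient (credentials, host names and the rack label are
placeholders, and the scheduler would need AggregateInstanceExtraSpecsFilter
enabled for the flavor metadata to constrain placement):

from novaclient import client as nova_client

nova = nova_client.Client("2", "user", "password", "project",
                          "http://keystone.example.com:5000/v2.0")

# Model rack 42 as a host aggregate and tag it.
agg = nova.aggregates.create("rack-42", None)
nova.aggregates.add_host(agg, "compute-42-01")
nova.aggregates.add_host(agg, "compute-42-02")
nova.aggregates.set_metadata(agg, {"rack": "42"})

# One flavor per rack: its extra spec pins instances to the tagged aggregate.
flavor = nova.flavors.create("m1.small.rack42", ram=2048, vcpus=1, disk=20)
flavor.set_keys({"aggregate_instance_extra_specs:rack": "42"})

# Booting with this flavor approximates rack affinity for the new VM.
nova.servers.create("hdfs-datanode-1", image="<image-id>", flavor=flavor)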

I'm interested to gather input on how we might approach this problem and
what the best path forward for implementing a solution might be. Please
share your ideas and input. It's also worth noting that a similar/related
need exists for Swift which I'm addressing in a separate message.

Cheers,
Jonathan






[openstack-dev] [swift] On Object placement

2015-02-17 Thread Halterman, Jonathan
I've been working on some services that require the ability to exploit the
co-location of compute and data storage (via Swift) onto the same racks, and
I wanted to start a discussion to see what the best way of controlling the
physical placement of Swift replicas might be.

Quick overview:

Various services desire the ability to control the location of data placed
in Swift in order to minimize network saturation when moving data to
compute, or in the case of services like Hadoop, to ensure that compute can
be moved to wherever the data resides. Read/write latency can also be
minimized by allowing authorized services to place one or more replicas onto
the same rack (with other replicas being placed on separate racks). Fault
tolerance can also be enhanced by ensuring that some replica(s) are placed
onto separate racks. Breaking this down we come up with the following
potential requirements:

1. Swift should allow authorized services to place a given number of object
replicas onto a particular rack, and onto separate racks.
2. Swift should allow authorized services and administrators to learn which
racks an object resides on, along with endpoints.

While requirement 1 addresses the rack-local writing of objects, requirement
2 facilitates the rack-local reading of objects. Swift's middleware
currently offers a list endpoints capability which could allow services to
select an endpoint on the same rack to read an object from, but there
doesn't appear to be a comparable solution for authorized in-cloud services.
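
As a sketch of how a service with access to the list endpoints middleware
might pick a rack-local replica today (the /endpoints path is the
middleware's default; the ip_to_rack mapping is exactly the knowledge that
requirement 2 asks Swift to expose, so here it is just a placeholder):

import requests
from urlparse import urlparse  # Python 2, as used by OpenStack at the time

def local_endpoint(proxy_url, account, container, obj, local_rack, ip_to_rack):
    """Return an object endpoint on local_rack, or None if there is none."""
    url = "%s/endpoints/%s/%s/%s" % (proxy_url, account, container, obj)
    for endpoint in requests.get(url).json():
        ip = urlparse(endpoint).hostname
        if ip_to_rack.get(ip) == local_rack:
            return endpoint
    return None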

Currently I'm not sure of the best way to approach this problem. While
storage policies might offer some solution, I'm interested to gather input
on how we might move forward on a solution that addresses these requirements
in as direct a way as possible. Please share your ideas and input. It's also
worth noting that a similar need exists for Nova which I'm addressing in a
separate message.

Cheers,
Jonathan



