Last Tuesday the TC held the first graduation review for Zaqar. During
the meeting some concerns arose. I've listed those concerns below with
some comments, hoping they will help start a discussion before the
next meeting. In addition, I've added some comments about the project's
stability at the bottom and an etherpad link pointing to a list of use
cases for Zaqar.

# Concerns

- Concern on the operational burden of adding NoSQL deployment expertise
to the mix of OpenStack operational skills

For those of you not familiar with Zaqar, it currently supports two
NoSQL drivers - MongoDB and Redis - and those are the only drivers it
supports for now. This means operators willing to use Zaqar will have to
maintain a new (?) NoSQL technology in their systems. Before expressing
our thoughts on this matter, let me say that:

        1. By removing the SQLAlchemy driver, we basically removed the chance
for operators to use an already deployed "OpenStack-technology"
        2. Zaqar won't be backed by any AMQP based messaging technology for
now. Here's[0] a summary of the research the team (mostly done by
Victoria) did during Juno
        3. We (OpenStack) used to require Redis for the zmq matchmaker
        4. We (OpenStack) also use memcached for caching and as the oslo
caching lib becomes available - or a wrapper on top of dogpile.cache -
Redis may be used in place of memcached in more and more deployments.
        5. Ceilometer's recommended storage driver is still MongoDB, although
Ceilometer now also supports sqlalchemy. (Please correct me if I'm wrong.)

That being said, it's obvious we already promote some NoSQL
technologies, at least to some extent. However, for the sake of the
discussion, let's assume we don't.

I truly believe, with my OpenStack (not Zaqar) hat on, that we can't
keep avoiding these technologies. NoSQL technologies have been around
for years and we - including OpenStack operators - should be prepared to
support them. Not every tool is good for all tasks - that's one of the
reasons we removed the sqlalchemy driver in the first place - therefore
it's impossible to keep a homogeneous environment for all services.

With this, I'm not suggesting we ignore the risks and the extra burden
this adds. But instead of attempting to avoid it completely by not
evolving the stack of services we provide, we should probably work on
defining a reasonable subset of NoSQL services we are OK with
supporting. This will help keep the burden small and give operators
the option to choose.


- Concern on whether we should really reinvent a queuing system rather
than piggyback on an existing one

As mentioned in the meeting on Tuesday, Zaqar is not reinventing message
brokers. Zaqar provides a service akin to SQS from AWS with an OpenStack
flavor on top. [0]

Some of the things that differentiate Zaqar from SQS are its capability
to support different protocols without sacrificing multi-tenancy and the
other intrinsic features it provides. Some protocols you may consider
for Zaqar are STOMP and MQTT.
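To make the "SQS-like service" point more concrete, here is a minimal sketch of how a message-post request for Zaqar's HTTP transport might be composed. This is illustrative only - it is not python-zaqarclient, and while the path and header names follow the v1 API docs as I recall them, treat the exact names as assumptions:

```python
import json

# Illustrative sketch (not the official client): composing a request
# for Zaqar's HTTP transport, roughly following the v1 API layout.
def build_post_messages_request(queue, messages, client_id, token):
    """Return (method, path, headers, body) for posting messages."""
    path = "/v1/queues/%s/messages" % queue
    headers = {
        "Content-Type": "application/json",
        "Client-ID": client_id,    # per-client UUID required by v1
        "X-Auth-Token": token,     # Keystone token; this is where
                                   # multi-tenancy plugs in
    }
    # Each message carries its own TTL (seconds) and an arbitrary
    # JSON body chosen by the producer.
    body = json.dumps([{"ttl": ttl, "body": b} for ttl, b in messages])
    return ("POST", path, headers, body)

method, path, headers, body = build_post_messages_request(
    "jobs", [(300, {"task": "resize", "id": 42})], "client-uuid", "token")
```

Note how tenancy and identity travel in headers rather than in the payload, which is what lets the same queue API sit behind other protocols later.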

As far as the backend goes, Zaqar is not reinventing it either. It sits
on top of existing storage technologies that have proven to be fast and
reliable for this task. The choice of NoSQL technologies has a lot to do
with exactly that, and with the fact that Zaqar needs a store capable of
scaling and replicating, with good support for failover.
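To illustrate what the storage actually has to provide, here is a toy, in-memory model - illustrative only, not Zaqar's code - of the claim semantics a backend needs to support atomically: a consumer claims a batch of messages, unacknowledged claims expire after a TTL, and their messages then become visible to other consumers again:

```python
import time
import uuid

# Toy in-memory model (not Zaqar code) of claim semantics. A real
# backend must do claim/release/ack atomically, which is why stores
# like Redis and MongoDB are a good fit.
class ToyQueue:
    def __init__(self):
        self.messages = []   # list of (msg_id, body), in arrival order
        self.claims = {}     # claim_id -> (expires_at, [msg_ids])

    def post(self, body):
        msg_id = uuid.uuid4().hex
        self.messages.append((msg_id, body))
        return msg_id

    def claim(self, limit, ttl):
        now = time.time()
        # Drop expired claims first: their messages become visible again.
        for cid in [c for c, (exp, _) in self.claims.items() if exp <= now]:
            del self.claims[cid]
        claimed = {m for _, (_, ids) in self.claims.items() for m in ids}
        visible = [(i, b) for i, b in self.messages if i not in claimed]
        batch = visible[:limit]
        claim_id = uuid.uuid4().hex
        self.claims[claim_id] = (now + ttl, [i for i, _ in batch])
        return claim_id, batch

    def ack(self, claim_id):
        # Acknowledge: delete the claimed messages and the claim itself.
        _, ids = self.claims.pop(claim_id)
        self.messages = [(i, b) for i, b in self.messages if i not in ids]

q = ToyQueue()
q.post({"task": "resize"})
claim_id, batch = q.claim(limit=10, ttl=60)
q.ack(claim_id)
```

The expiry-then-claim step is the part that must not race between consumers, and it's the kind of operation these backends handle well.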


- Concern on dataplane vs. controlplane: should we add more dataplane
things in the integrated release?

I'm really not sure I understand the arguments against dataplane
services. What concerns do people have about these services?
As far as I can tell, we already have several services - some in the
lower layers - that provide a data plane API. For example:

        * keystone (service catalogs and tokens)
        * glance (image management)
        * swift (object storage)
        * ceilometer (metrics)
        * heat (provisioning)
        * barbican (key management)

Are the concerns specific to Zaqar's dataplane API?

- Concern on an API v2 being already planned

At the meeting, we discussed a bit about Zaqar's API and more
importantly how stable it is. During that discussion I mentioned a
hypothetical v2 of the API. I'd like to clarify that a v2 is not being
planned for Kilo; what we would like to do is to gather feedback from
the community and services consuming Zaqar about the existing API and
use that feedback to design a new version of the API if necessary.

All this has yet to be discussed but most importantly, we would first
like to get more feedback from the community. We have already gotten
some feedback, but it has been fairly limited because most people are
waiting for us to graduate before kicking the tires.

We do have some endpoints that will go away in the API v2 - getting
messages by id, for example. Nonetheless, we believe the current
1.x API already delivers a lot of value and can deliver on use cases
such as those put forward by the Heat team. 1.0 has been stable for a
good 6 months or more now, and 1.1 just adds some minor polishing.
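For context on the endpoints in question, the two consumption styles differ at the path level. The helpers below are hypothetical and only sketch the v1 layout as I recall it (treat the exact paths and body fields as assumptions):

```python
# Hypothetical path helpers contrasting the two v1 consumption styles:
# fetching a single message by id (slated to go away in a future v2)
# vs. claiming a batch of messages.
def message_by_id_path(queue, message_id):
    # GET a specific message directly by its id.
    return "/v1/queues/%s/messages/%s" % (queue, message_id)

def claims_path(queue):
    # POST a body like {"ttl": ..., "grace": ...} here to claim a batch.
    return "/v1/queues/%s/claims" % queue
```

Claim-based consumption is the path we expect most workloads, like Heat's, to use.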

- Concern on the maturity of the NoSQL (not AGPL) backend (Redis)

The Redis backend just landed, and I've been working on a gate job for
it today. Although it hasn't been tested in production, if Zaqar
graduates, it still has a full development cycle to be tested and
improved before the first integrated release happens.

# Stability

During Juno, the team spent most of its time on the following blueprints:

        1. API v1.1
        2. Redis storage driver
        3. Docs

The first one contains a subset of smaller blueprints, all intended to
polish the existing v1 of the API based on feedback from the community
and things we realized while working on the client library. The second
blueprint, as already mentioned in other parts of this email, refers to
the work on the new Redis driver. And the third item tracked the work
to complete Zaqar's documentation from a user, developer and operator
perspective. There were other items we worked on - like queue flavors
and py33 support - but those had a lower priority in our queue.

The above is to say that during this cycle, we focused on extending
Zaqar's feature set, integrating with the OpenStack and Python
ecosystems, and making sure the project is ready to be consumed
without too much burden. The project is considered stable and has been
running in production in Rackspace's Cloud Queues. I've also heard
Catalyst is looking to deploy Zaqar in production as well.

During this cycle, we also spent some time working on a small benchmark
tool[0] - that we will contribute back to Rally as soon as it has
support for load benchmarks - and we posted the results of the tests we
ran here[1]. More benchmarks will be sent to the list soon.


# Use Cases

In addition to the aforementioned concerns and comments, I'd also like
to share an etherpad that contains some use cases other integrated
projects have for Zaqar[0]. The list is not exhaustive and will grow
with more information before the next meeting.


Thanks to all of you who attended the meeting, I'm looking forward to
your feedback and obviously the next meeting :)

Flavio Percoco
