Re: [openstack-dev] [oslo][messaging][zmq] Discussion on zmq driver design issues

2015-03-21 Thread Li Ma
On Tue, Mar 10, 2015 at 8:14 PM, ozamiatin ozamia...@mirantis.com wrote:
 Hi Li Ma,

 Thank you very much for your reply


 It's good to hear you have a live deployment with the zmq driver.
 Is there a big divergence between your production and upstream versions
 of the driver? Besides the [1] and [2] fixes for redis, we have [5] and [6],
 critical multi-backend issues for using the driver in real-world
 deployment.

Actually, there's no big divergence between our driver and the
upstream version. We didn't refactor it much; we just fixed all the
bugs we had met and implemented a socket reuse mechanism that
greatly improves its performance. For some of the remaining bugs, especially
cinder multi-backend and neutron multi-worker, we patched cinder and
neutron to work around them.

I discussed the problems you mentioned above with our cinder developer
several times. Due to the current architecture, it is
really difficult to fix them in the zeromq driver. However, it is very easy
to deal with them in cinder. We have patches on hand, but the
implementation is a little tricky, so upstream may not accept it.
:-( No worries, I'll sort it out soon.

By the way, we are discussing fanout performance and message
persistence. I don't have code available, but I've got some ideas on how
to implement it.
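
For the fanout side, the natural zmq building block is PUB/SUB, where every connected subscriber receives its own copy of each message. A minimal sketch (assuming pyzmq; the endpoint and subscriber count are illustrative only, not driver code):

```python
import time
import zmq

ctx = zmq.Context.instance()

# One publisher fans out to every connected subscriber.
pub = ctx.socket(zmq.PUB)
pub.bind("inproc://fanout")

subs = []
for _ in range(2):
    sub = ctx.socket(zmq.SUB)
    sub.connect("inproc://fanout")
    sub.setsockopt(zmq.SUBSCRIBE, b"")  # subscribe to all messages
    subs.append(sub)

time.sleep(0.2)  # let subscriptions propagate (the slow-joiner problem)
pub.send(b"notify")
received = [s.recv() for s in subs]  # each subscriber gets its own copy
```

Persistence is a separate question: plain PUB/SUB drops messages for absent subscribers, so durable delivery would need an explicit store-and-forward layer on top.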


 The only functionality for large-scale deployment that is lacking in the
 current upstream codebase is socket pool scheduling (to provide
 lifecycle management, including recycling and reusing zeromq sockets). It
 was done several months ago and we are willing to contribute it. I plan
 to propose a blueprint in the next release.

 Pool, recycle and reuse sounds good for performance.

Yes, actually our implementation is a little ugly and there are no unit
tests available. Right now, I'm trying to refactor it, and hopefully
I'll submit a spec soon.
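
Roughly, the socket reuse mechanism keeps connected sockets in a pool keyed by endpoint instead of opening and closing one per message. A hypothetical sketch (made-up names, not our actual implementation; the factory callable stands in for zmq socket creation):

```python
import time

class SocketPool:
    """Cache 'sockets' per endpoint and recycle them when idle too long."""

    def __init__(self, factory, ttl=30):
        self.factory = factory   # callable: endpoint -> new connected socket
        self.ttl = ttl           # seconds a socket may sit idle before recycling
        self._pool = {}          # endpoint -> (socket, time it went idle)

    def get(self, endpoint):
        sock, idle_since = self._pool.pop(endpoint, (None, 0.0))
        if sock is not None and time.time() - idle_since < self.ttl:
            return sock          # reuse a warm, already-connected socket
        return self.factory(endpoint)

    def put(self, endpoint, sock):
        # Hand the socket back for later reuse and note when it went idle.
        self._pool[endpoint] = (sock, time.time())
```

A real pool would also close stale sockets and bound the pool size, but this is the lifecycle idea: get, use, put, recycle.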

 We also need a refactoring of the driver to reduce redundant entities
 or reconsider them (like ZmqClient or InternalContext) and to reduce code
 duplication (like with topics).
 There is also some topic management needed.
 Clear code == fewer bugs == easier to understand == easier to contribute.
 We need a discussion (with a related spec and UML diagrams) about what the driver
 architecture should be (I'm working on that).

+1, couldn't agree with you more.

 3. ZeroMQ integration

 I've been working on the integration of ZeroMQ and DevStack for a
 while and actually it is working right now. I updated the deployment
 guide [3].

 That's true, it works! :)

 I think it is time to bring up a non-voting gate for ZeroMQ so that we
 can make the functional tests work.

 You can trigger it with 'check experimental'. It is broken now.

I'll figure it out soon.

 5. ZeroMQ discussion

 Here I'd like to apologize for this driver. Due to spare time and
 timezone constraints, I'm not available for IRC or other meetings or discussions.
 But if possible, should we create a subgroup for ZeroMQ and
 schedule meetings for it? If we can schedule in advance or at a fixed
 date & time, I'm in.

 That's a great idea.
 +1 for zmq subgroup and meetings

I'll open another thread to discuss this topic.


 A subfolder is actually what I mean (a python package like '_drivers');
 it should stay in oslo.messaging. A separate package like
 oslo.messaging.zeromq is overkill.
 As Doug proposed, we can do it consistently with the AMQP driver.

I suggest you go for it right now. It is really important for further
development.
If I submit new code based on the current code structure, it will
greatly affect this work in the future.

Best regards,
-- 
Li Ma (Nick)
Email: skywalker.n...@gmail.com

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo][messaging][zmq] Discussion on zmq driver design issues

2015-03-10 Thread ozamiatin

Hi Li Ma,

Thank you very much for your reply

On 06.03.15 05:01, Li Ma wrote:

Hi all, actually I was writing a mail on the same topic for the zeromq
driver, but I hadn't finished it yet. Thank you for proposing this topic,
ozamiatin.

1. ZeroMQ functionality

Actually, I proposed a session topic for the coming summit to present our
production system, named 'Distributed Messaging System for OpenStack
at Scale'. I don't know whether it will be accepted.
Otherwise, if possible, I can share my experience at the design
summit.

Currently, AWCloud (the company I work for) has deployed more than 20
private clouds and 3 public clouds for our customers in production,
scaling from 10 to 500 physical nodes without any performance issue.
Its performance surpasses all the existing drivers in every aspect.
All of these deployments use the ZeroMQ driver. We started improving the
ZeroMQ driver in Icehouse, and the modified driver has since been ported
to oslo.messaging.

As everyone knows, the ZeroMQ driver has been unmaintained for a long
time. My colleagues and I continuously contribute patches upstream.
Progress is a little slow because we are doing everything in our spare
time and the review procedure is also not efficient.

Here are two important patches, [1] and [2], for matchmaker redis. When
they land, I think the ZeroMQ driver will be capable of running in small
deployments.

It's good to hear you have a live deployment with the zmq driver.
Is there a big divergence between your production and upstream versions
of the driver? Besides the [1] and [2] fixes for redis, we have [5] and [6],
critical multi-backend issues for using the driver in real-world
deployment.

The only functionality for large-scale deployment that is lacking in the
current upstream codebase is socket pool scheduling (to provide
lifecycle management, including recycling and reusing zeromq sockets). It
was done several months ago and we are willing to contribute it. I plan
to propose a blueprint in the next release.

Pool, recycle and reuse sounds good for performance.
We also need a refactoring of the driver to reduce redundant entities
or reconsider them (like ZmqClient or InternalContext) and to reduce
code duplication (like with topics).

There is also some topic management needed.
Clear code == fewer bugs == easier to understand == easier to contribute.
We need a discussion (with a related spec and UML diagrams) about what the
driver architecture should be (I'm working on that).


2. Why ZeroMQ matters for OpenStack

ZeroMQ is the only driver that depends on a stable library rather than an
open source product. This is the most important thing that comes to my
mind. When we deploy clouds with RabbitMQ or Qpid, we need
comprehensive knowledge from their communities, from deployment best
practices to performance tuning for different scales. As with any open
source product, no doubt bugs are always there. You need to push lots of
things in different communities rather than just the OpenStack community.
In the end, it doesn't work that well, and you all know it, right?

The ZeroMQ library itself is just an encapsulation of sockets; it is
stable enough and has been widely used in large-scale cluster
communication for a long time. We can build our own messaging system for
inter-component RPC. We can improve it for OpenStack and have governance
over the codebase. We don't need to rely on products outside the
community. Actually, only ZeroMQ makes this possible.

IMO, we can just keep it and improve it, and eventually it will become
another choice for operators.

Zmq is also an open source product and it has its own community,
but I agree that rabbit is a complicated black-box piece of software we
depend on, while zmq is just a library and is much simpler (and more
reliable) inside.
The zmq driver is lower-level than rabbit and qpid and therefore
provides more flexibility.

For now, it is the only driver where a brokerless implementation is possible.


3. ZeroMQ integration

I've been working on the integration of ZeroMQ and DevStack for a
while and actually it is working right now. I updated the deployment
guide [3].

That's true, it works! :)

I think it is time to bring up a non-voting gate for ZeroMQ so that we
can make the functional tests work.

You can trigger it with 'check experimental'. It is broken now.


4. ZeroMQ blueprints

We'd love to provide blueprints to improve ZeroMQ, as ozamiatin does.
By my estimation, ZeroMQ can become another choice for
production in 1-2 release cycles, given the bp review and patch review
procedures.

5. ZeroMQ discussion

Here I'd like to apologize for this driver. Due to spare time and
timezone constraints, I'm not available for IRC or other meetings or
discussions. But if possible, should we create a subgroup for ZeroMQ and
schedule meetings for it? If we can schedule in advance or at a fixed
date & time, I'm in.

That's a great idea.
+1 for zmq subgroup and meetings

6. Feedback to ozamiatin's suggestions

I'm with you on most of the proposals, but for packages, I think we
can just separate all the components into a sub-directory. This step is
enough at the current stage.

Re: [openstack-dev] [oslo][messaging][zmq] Discussion on zmq driver design issues

2015-03-10 Thread ozamiatin

Hi, Eric

Thanks a lot for your comments.

On 06.03.15 06:21, Eric Windisch wrote:
On Wed, Mar 4, 2015 at 12:10 PM, ozamiatin ozamia...@mirantis.com wrote:


Hi,

By this e-mail I'd like to start a discussion about current zmq
driver internal design problems I've found out.
I wish to collect here all proposals and known issues. I hope this
discussion will be continued on Liberty design summit.
And hope it will drive our further zmq driver development efforts.

ZMQ Driver issues list (I address all issues with # and references
are in []):

1. ZMQContext per socket (blocker is neutron improper usage of
messaging via fork) [3]
2. Too many different contexts.
We have InternalContext used for ZmqProxy, RPCContext used in
ZmqReactor, and ZmqListener.
There is also zmq.Context, which is a zmq API entity. We need to
consider unifying their usage via inheritance
(maybe sticking to RPCContext)
or to hide them as internal entities in their modules (see
refactoring #6)


The code, when I abandoned it, was moving toward fixing these issues, 
but for backwards compatibility was doing so in a staged fashion 
across the stable releases.


I agree it's pretty bad. Fixing this now, with the driver in a less
stable state, should be easier, as maintaining compatibility is of less
importance.


3. Topic related code everywhere. We have no topic entity. It is
all string operations.
We need some topics management entity and topic itself as an
entity (not a string).
It causes issues like [4], [5]. (I'm already working on it).
There was a spec related [7].


Good! It's ugly. I had proposed a patch at one point, but I believe 
the decision was that it was better and cleaner to move toward the 
oslo.messaging abstraction as we solve the topic issue. Now that 
oslo.messaging exists, I agree it's well past time to fix this 
particular ugliness.


4. Manual implementation of messaging patterns.
   Now we can observe poor usage of zmq features in zmq driver.
Almost everything is implemented over PUSH/PULL.

4.1 Manual polling - use zmq.Poller (listening and replying
for multiple sockets)
4.2 Manual request/reply implementation for call [1].
Using REQ/REP (ROUTER/DEALER) sockets would solve many
issues. A lot of code could be removed.
4.3 Timeouts waiting


There are very specific reasons for the use of PUSH/PULL. I'm firmly 
of the belief that it's the only viable solution for an OpenStack RPC 
driver. This has to do with how asynchronous programming in Python is 
performed, with how edge-triggered versus level-triggered events are 
processed, and general state management for REQ/REP sockets.


I could be proven wrong, but I burned quite a bit of time in the 
beginning of the ZMQ effort looking at REQ/REP before realizing that 
PUSH/PULL was the only reasonable solution. Granted, this was over 3 
years ago, so I would not be too surprised if my assumptions are no 
longer valid.


I agree that REQ/REP is very limited because of its synchronous nature
and 1-to-1 connection model.
But there are ROUTER/DEALER proxy sockets, recommended for use with
REQ/REP to compose 1-to-N and N-to-N asynchronous patterns.
I'm researching that now and haven't made a final decision yet. When
everything is clear to me, I'll come up with a spec on that.
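
The composed pattern looks roughly like this (assuming pyzmq; the endpoints and the echo worker are illustrative, not a proposed design): a ROUTER socket faces the synchronous REQ clients, a DEALER faces the REP workers, and zmq.proxy() shuttles frames between them to get 1-to-N request/reply:

```python
import threading
import zmq

ctx = zmq.Context.instance()

# ROUTER faces REQ clients, DEALER faces REP workers.
frontend = ctx.socket(zmq.ROUTER)
frontend.bind("inproc://frontend")
backend = ctx.socket(zmq.DEALER)
backend.bind("inproc://backend")
# The proxy blocks, forwarding frames both ways, so run it in a thread.
threading.Thread(target=zmq.proxy, args=(frontend, backend),
                 daemon=True).start()

def worker():
    rep = ctx.socket(zmq.REP)
    rep.connect("inproc://backend")
    while True:
        rep.send(b"reply:" + rep.recv())  # simple synchronous echo worker

threading.Thread(target=worker, daemon=True).start()

req = ctx.socket(zmq.REQ)
req.connect("inproc://frontend")
req.send(b"ping")
reply = req.recv()
```

Each REQ/REP endpoint stays simple and synchronous; the asynchrony lives in the proxy, which can serve many clients and many workers at once.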


5. Add possibility to work without eventlet [2]. #4.1 is also
related here, we can reuse many of the implemented solutions
   like zmq.Poller over asynchronous sockets in one separate
thread (instead of spawning on each new socket).
   I will update the spec [2] on that.


Great. This was one of the motivations behind oslo.messaging and it 
would be great to see this come to fruition.


6. Put all zmq driver related stuff (matchmakers, most classes
from zmq_impl) into a separate package.
   Don't keep all classes (ZmqClient, ZmqProxy, Topics management,
ZmqListener, ZmqSocket, ZmqReactor)
   in one impl_zmq.py module.


Seems fine. In fact, I think a lot of code could be shared with an 
AMQP v1 driver...


I'll check what can be shared. Actually, I haven't dug into the AMQP v1
driver enough yet.


7. Need more technical documentation on the driver like [6].
   I'm willing to prepare a current driver architecture overview
with some graphics UML charts, and to continue discuss the driver
architecture.


Documentation has always been a sore point. +2
--
Regards,
Eric Windisch



Regards,
Oleksii Zamiatin

Re: [openstack-dev] [oslo][messaging][zmq] Discussion on zmq driver design issues

2015-03-05 Thread Li Ma
Hi all, actually I was writing a mail on the same topic for the zeromq
driver, but I hadn't finished it yet. Thank you for proposing this topic,
ozamiatin.

1. ZeroMQ functionality

Actually, I proposed a session topic for the coming summit to present our
production system, named 'Distributed Messaging System for OpenStack
at Scale'. I don't know whether it will be accepted.
Otherwise, if possible, I can share my experience at the design
summit.

Currently, AWCloud (the company I work for) has deployed more than 20
private clouds and 3 public clouds for our customers in production,
scaling from 10 to 500 physical nodes without any performance issue.
Its performance surpasses all the existing drivers in every aspect.
All of these deployments use the ZeroMQ driver. We started improving the
ZeroMQ driver in Icehouse, and the modified driver has since been ported
to oslo.messaging.

As everyone knows, the ZeroMQ driver has been unmaintained for a long
time. My colleagues and I continuously contribute patches upstream.
Progress is a little slow because we are doing everything in our spare
time and the review procedure is also not efficient.

Here are two important patches, [1] and [2], for matchmaker redis. When
they land, I think the ZeroMQ driver will be capable of running in small
deployments.

The only functionality for large-scale deployment that is lacking in the
current upstream codebase is socket pool scheduling (to provide
lifecycle management, including recycling and reusing zeromq sockets). It
was done several months ago and we are willing to contribute it. I plan
to propose a blueprint in the next release.

2. Why ZeroMQ matters for OpenStack

ZeroMQ is the only driver that depends on a stable library rather than an
open source product. This is the most important thing that comes to my
mind. When we deploy clouds with RabbitMQ or Qpid, we need
comprehensive knowledge from their communities, from deployment best
practices to performance tuning for different scales. As with any open
source product, no doubt bugs are always there. You need to push lots of
things in different communities rather than just the OpenStack community.
In the end, it doesn't work that well, and you all know it, right?

The ZeroMQ library itself is just an encapsulation of sockets; it is
stable enough and has been widely used in large-scale cluster
communication for a long time. We can build our own messaging system for
inter-component RPC. We can improve it for OpenStack and have governance
over the codebase. We don't need to rely on products outside the
community. Actually, only ZeroMQ makes this possible.

IMO, we can just keep it and improve it, and eventually it will become
another choice for operators.

3. ZeroMQ integration

I've been working on the integration of ZeroMQ and DevStack for a
while and actually it is working right now. I updated the deployment
guide [3].

I think it is time to bring up a non-voting gate for ZeroMQ so that we
can make the functional tests work.

4. ZeroMQ blueprints

We'd love to provide blueprints to improve ZeroMQ, as ozamiatin does.
By my estimation, ZeroMQ can become another choice for
production in 1-2 release cycles, given the bp review and patch review
procedures.

5. ZeroMQ discussion

Here I'd like to apologize for this driver. Due to spare time and
timezone constraints, I'm not available for IRC or other meetings or
discussions. But if possible, should we create a subgroup for ZeroMQ and
schedule meetings for it? If we can schedule in advance or at a fixed
date & time, I'm in.

6. Feedback to ozamiatin's suggestions

I'm with you on most of the proposals, but for packages, I think we
can just separate all the components into a sub-directory. This step is
enough at the current stage.

Packaging the components is complicated. I don't think it is possible
for oslo.messaging to break into two packages, like oslo.messaging and
oslo.messaging.zeromq. And I cannot see the benefit clearly.

For priorities, I think numbers 1, 6 and 7 have the highest priority,
especially 7. Because ZeroMQ is pretty new to everyone, we do need
more written material to promote it and introduce it to the community. By
the way, I made a wiki a while ago and everyone is welcome to update it [4].

[1] https://review.openstack.org/#/c/152471/
[2] https://review.openstack.org/#/c/155673/
[3] https://review.openstack.org/#/c/130943/
[4] https://wiki.openstack.org/wiki/ZeroMQ

Thanks a lot,
Li Ma (Nick)



Re: [openstack-dev] [oslo][messaging][zmq] Discussion on zmq driver design issues

2015-03-05 Thread Eric Windisch
On Wed, Mar 4, 2015 at 12:10 PM, ozamiatin ozamia...@mirantis.com wrote:

 Hi,

 By this e-mail I'd like to start a discussion about current zmq driver
 internal design problems I've found out.
 I wish to collect here all proposals and known issues. I hope this
 discussion will be continued on Liberty design summit.
 And hope it will drive our further zmq driver development efforts.

 ZMQ Driver issues list (I address all issues with # and references are in
 []):

 1. ZMQContext per socket (blocker is neutron improper usage of messaging
 via fork) [3]
 2. Too many different contexts.
 We have InternalContext used for ZmqProxy, RPCContext used in
 ZmqReactor, and ZmqListener.
 There is also zmq.Context, which is a zmq API entity. We need to consider
 unifying their usage via inheritance (maybe sticking to
 RPCContext)
 or to hide them as internal entities in their modules (see refactoring
 #6)


The code, when I abandoned it, was moving toward fixing these issues, but
for backwards compatibility was doing so in a staged fashion across the
stable releases.

I agree it's pretty bad. Fixing this now, with the driver in a less stable
state, should be easier, as maintaining compatibility is of less importance.



 3. Topic related code everywhere. We have no topic entity. It is all
 string operations.
 We need some topics management entity and topic itself as an entity
 (not a string).
 It causes issues like [4], [5]. (I'm already working on it).
 There was a spec related [7].


Good! It's ugly. I had proposed a patch at one point, but I believe the
decision was that it was better and cleaner to move toward the
oslo.messaging abstraction as we solve the topic issue. Now that
oslo.messaging exists, I agree it's well past time to fix this particular
ugliness.
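
As one possible shape for such a topic entity (hypothetical names, assuming oslo.messaging-style "topic.server" strings), the string parsing would then live in exactly one place:

```python
class Topic(object):
    """A topic as an entity rather than ad-hoc string splitting."""

    def __init__(self, name, server=None):
        self.name = name      # e.g. "cinder-volume"
        self.server = server  # e.g. "host1", or None for a round-robin topic

    @classmethod
    def from_string(cls, raw):
        # "topic.server" -> Topic("topic", "server"); bare "topic" -> no server
        name, _, server = raw.partition('.')
        return cls(name, server or None)

    def __str__(self):
        return self.name if not self.server else '%s.%s' % (self.name, self.server)
```

With the entity in place, fanout naming, matchmaker lookups, and the string round-trip all go through one class instead of scattered string operations.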


 4. Manual implementation of messaging patterns.
Now we can observe poor usage of zmq features in zmq driver. Almost
 everything is implemented over PUSH/PULL.

 4.1 Manual polling - use zmq.Poller (listening and replying for
 multiple sockets)
 4.2 Manual request/reply implementation for call [1].
 Using REQ/REP (ROUTER/DEALER) sockets would solve many issues. A lot
 of code could be removed.
 4.3 Timeouts waiting


There are very specific reasons for the use of PUSH/PULL. I'm firmly of the
belief that it's the only viable solution for an OpenStack RPC driver. This
has to do with how asynchronous programming in Python is performed, with
how edge-triggered versus level-triggered events are processed, and general
state management for REQ/REP sockets.

I could be proven wrong, but I burned quite a bit of time in the beginning
of the ZMQ effort looking at REQ/REP before realizing that PUSH/PULL was
the only reasonable solution. Granted, this was over 3 years ago, so I
would not be too surprised if my assumptions are no longer valid.
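
For reference, the zmq.Poller approach from #4.1 fits PUSH/PULL naturally: one poller waits on many sockets instead of a hand-rolled loop per socket. A minimal sketch (assuming pyzmq; the endpoints are illustrative only):

```python
import zmq

ctx = zmq.Context.instance()

# Two PULL listeners standing in for per-topic sockets.
pull_a = ctx.socket(zmq.PULL)
pull_a.bind("inproc://topic-a")
pull_b = ctx.socket(zmq.PULL)
pull_b.bind("inproc://topic-b")

push = ctx.socket(zmq.PUSH)
push.connect("inproc://topic-a")
push.send(b"hello")

# One poller multiplexes readiness over all registered sockets.
poller = zmq.Poller()
poller.register(pull_a, zmq.POLLIN)
poller.register(pull_b, zmq.POLLIN)

events = dict(poller.poll(timeout=1000))  # timeout in milliseconds
msg = pull_a.recv() if pull_a in events else None
```

PUSH/PULL stays unidirectional and level-triggered, which is exactly what makes it easy to drive from a single polling loop under eventlet or a plain thread.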



 5. Add possibility to work without eventlet [2]. #4.1 is also related
 here, we can reuse many of the implemented solutions
like zmq.Poller over asynchronous sockets in one separate thread
 (instead of spawning on each new socket).
I will update the spec [2] on that.


Great. This was one of the motivations behind oslo.messaging and it would
be great to see this come to fruition.


 6. Put all zmq driver related stuff (matchmakers, most classes from
 zmq_impl) into a separate package.
Don't keep all classes (ZmqClient, ZmqProxy, Topics management,
 ZmqListener, ZmqSocket, ZmqReactor)
in one impl_zmq.py module.


Seems fine. In fact, I think a lot of code could be shared with an AMQP v1
driver...


 7. Need more technical documentation on the driver like [6].
I'm willing to prepare a current driver architecture overview with some
 graphics UML charts, and to continue discuss the driver architecture.


Documentation has always been a sore point. +2

-- 
Regards,
Eric Windisch


Re: [openstack-dev] [oslo][messaging][zmq] Discussion on zmq driver design issues

2015-03-04 Thread Doug Hellmann
Thanks for pulling this list together, Oleksii. More comments inline. -
Doug

On Wed, Mar 4, 2015, at 12:10 PM, ozamiatin wrote:
 Hi,
 
 By this e-mail I'd like to start a discussion about current zmq driver 
 internal design problems I've found out.
 I wish to collect here all proposals and known issues. I hope this 
 discussion will be continued on Liberty design summit.
 And hope it will drive our further zmq driver development efforts.
 
 ZMQ Driver issues list (I address all issues with # and references are 
 in []):
 
 1. ZMQContext per socket (blocker is neutron improper usage of messaging 
 via fork) [3]

It looks like I had a question about managing backwards-compatibility on
that, and Mehdi responded that he thinks things are broken enough that ZMQ
can't actually be used in production now. If that's true, then I agree
we don't need to be concerned about upgrades. Can you add a comment to
the review with your impression of the current version's suitability for
production use?
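
One general technique for the fork problem (a sketch of the idea only, not the patch under review) is to key the zmq.Context on the current pid, so a child that forked after import lazily gets its own context instead of sharing the parent's:

```python
import os
import zmq

_ctx = None
_ctx_pid = None

def get_context():
    """Return a per-process zmq.Context, recreating it after fork().

    A zmq.Context must never be shared across fork(); comparing pids
    protects services (e.g. neutron workers) that fork after import.
    """
    global _ctx, _ctx_pid
    pid = os.getpid()
    if _ctx is None or _ctx_pid != pid:
        _ctx = zmq.Context()
        _ctx_pid = pid
    return _ctx
```

Within one process the same context is always returned, which is also what makes a context-per-socket unnecessary.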

 
 2. Too many different contexts.
  We have InternalContext used for ZmqProxy, RPCContext used in 
 ZmqReactor, and ZmqListener.
  There is also zmq.Context, which is a zmq API entity. We need to 
  consider unifying their usage via inheritance (maybe 
  sticking to RPCContext)
  or to hide them as internal entities in their modules (see 
 refactoring #6)
 
 
 3. Topic related code everywhere. We have no topic entity. It is all 
 string operations.
  We need some topics management entity and topic itself as an entity 
 (not a string).
  It causes issues like [4], [5]. (I'm already working on it).
  There was a spec related [7].
 
 
 4. Manual implementation of messaging patterns.
 Now we can observe poor usage of zmq features in zmq driver. Almost 
 everything is implemented over PUSH/PULL.
 
  4.1 Manual polling - use zmq.Poller (listening and replying for 
 multiple sockets)
  4.2 Manual request/reply implementation for call [1].
  Using REQ/REP (ROUTER/DEALER) sockets would solve many issues. A 
  lot of code could be removed.
  4.3 Timeouts waiting
 
 
 5. Add possibility to work without eventlet [2]. #4.1 is also related 
 here, we can reuse many of the implemented solutions
 like zmq.Poller over asynchronous sockets in one separate thread 
 (instead of spawning on each new socket).
 I will update the spec [2] on that.
 
 
 6. Put all zmq driver related stuff (matchmakers, most classes from 
 zmq_impl) into a separate package.
 Don't keep all classes (ZmqClient, ZmqProxy, Topics management, 
 ZmqListener, ZmqSocket, ZmqReactor)
 in one impl_zmq.py module.

The AMQP 1.0 driver work did something similar under a protocols
directory. It would be nice to be consistent with the existing work on
organizing driver-related files.

 
  _drivers (package)
  +-- impl_rabbit.py
  +-- impl_zmq.py - leave only ZmqDriver class here
  +-- zmq_driver (package)
  |+--- matchmaker.py
  |+--- matchmaker_ring.py
  |+--- matchmaker_redis.py
  |+--- matchmaker_.py
  ...
  |+--- client.py
  |+--- reactor.py
  |+--- proxy.py
  |+--- topic.py
  ...
 
 7. Need more technical documentation on the driver like [6].
 I'm willing to prepare a current driver architecture overview with 
 some graphics UML charts, and to continue discuss the driver
 architecture.
 
 Please feel free to add or to argue about any issue, I'd like to have 
 your feedback on these issues.

This looks like a good list, and I'm encouraged to see activity around
the ZMQ driver. I would like to see more participation in reviews for
the ZMQ-related specs before the summit, so we can use our time together
in person to resolve remaining issues rather than starting from scratch.

Doug

 Thanks.
 
 Oleksii Zamiatin
 
 
 References:
 
 [1] https://review.openstack.org/#/c/154094/
 [2] https://review.openstack.org/#/c/151185/
 [3] https://review.openstack.org/#/c/150735/
 [4] https://bugs.launchpad.net/oslo.messaging/+bug/1282297
 [5] https://bugs.launchpad.net/oslo.messaging/+bug/1381972
 [6] https://review.openstack.org/#/c/130943/8
 [7] https://review.openstack.org/#/c/144149/1
 
