Re: [openstack-dev] [Requirements] Freeze
No worries, good to know I did not miss anything about release procedures :-) Thanks, Dmitry On Mon, Jan 30, 2017 at 6:51 PM, Matthew Thode <prometheanf...@gentoo.org> wrote: > On 01/30/2017 03:24 AM, Dmitry Mescheryakov wrote: > > Hello Matthew, > > > > I see that you have frozen my > > CR https://review.openstack.org/#/c/425132/ , but it is for > > stable/newton. Shouldn't the freeze apply to master only? > > > > Thanks, > > > > Dmitry > > > > Yep, fixed, sorry about that. > > > -- > Matthew Thode (prometheanfire) > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Requirements] Freeze
Hello Matthew, I see that you have frozen my CR https://review.openstack.org/#/c/425132/ , but it is for stable/newton. Shouldn't the freeze apply to master only? Thanks, Dmitry On Wed, Jan 25, 2017 at 12:22 AM, Matthew Thode wrote: > We are going to be freezing Thursday at ~20:00 UTC. > > So if you need any changes we'll be needing them in soon, with > reasoning. Thanks. > > -- > Matthew Thode (prometheanfire) > > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [fuel][plugins] Detached components plugin update requirement
Sergii, I am curious - does it mean that the plugins will stop working with older versions of Fuel? Thanks, Dmitry 2016-01-20 19:58 GMT+03:00 Sergii Golovatiuk: > Hi, > > Recently I merged the change to master and 8.0 that moves one task from > Nailgun to Library [1]. Actually, it replaces [2] to allow operators more > flexibility with repository management. However, it affects the detached > components, as they will require one more task to be added, as described at [3]. > Please adapt your plugin accordingly. > > [1] > https://review.openstack.org/#/q/I1b83e3bfaebecdb8455d5697e320f24fb4941536 > [2] > https://github.com/openstack/fuel-web/blob/master/nailgun/nailgun/orchestrator/tasks_serializer.py#L149-L190 > [3] https://review.openstack.org/#/c/270232/1/deployment_tasks.yaml > > -- > Best regards, > Sergii Golovatiuk, > Skype #golserge > IRC #holser > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [fuel][plugins] Detached components plugin update requirement
2016-01-20 22:34 GMT+03:00 Sergii Golovatiuk <sgolovat...@mirantis.com>: > Plugin master branch won't be compatible with older versions. Though the > plugin developer may create stable branch to have compatibility with older > versions. > Got it, thank you for clarifying this. Dmitry > -- > Best regards, > Sergii Golovatiuk, > Skype #golserge > IRC #holser > > On Wed, Jan 20, 2016 at 6:41 PM, Dmitry Mescheryakov < > dmescherya...@mirantis.com> wrote: > >> Sergii, >> >> I am curious - does it mean that the plugins will stop working with older >> versions of Fuel? >> >> Thanks, >> >> Dmitry >> >> 2016-01-20 19:58 GMT+03:00 Sergii Golovatiuk <sgolovat...@mirantis.com>: >> >>> Hi, >>> >>> Recently I merged the change to master and 8.0 that moves one task from >>> Nailgun to Library [1]. Actually, it replaces [2] to allow operator more >>> flexibility with repository management. However, it affects the detached >>> components as they will require one more task to add as written at [3]. >>> Please adapt your plugin accordingly. 
>>> >>> [1] >>> https://review.openstack.org/#/q/I1b83e3bfaebecdb8455d5697e320f24fb4941536 >>> [2] >>> https://github.com/openstack/fuel-web/blob/master/nailgun/nailgun/orchestrator/tasks_serializer.py#L149-L190 >>> [3] https://review.openstack.org/#/c/270232/1/deployment_tasks.yaml >>> >>> -- >>> Best regards, >>> Sergii Golovatiuk, >>> Skype #golserge >>> IRC #holser >>> >>> >>> __ >>> OpenStack Development Mailing List (not for usage questions) >>> Unsubscribe: >>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe >>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev >>> >>> >> >> __ >> OpenStack Development Mailing List (not for usage questions) >> Unsubscribe: >> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev >> >> > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Fuel] Extend FFE for "Disable queue mirroring for RPC queues in RabbitMQ"
Folks, First, let me report the current feature status: we continued the work with Bogdan Dobrelya and Sergii Golovatiuk, and I have incorporated their feedback into the change. I have also fully tested it on a custom ISO, and Fuel CI passes successfully. In addition, I have approval from Bogdan on the implementation of the change (see his +2 before he cast -2 because of FF). Now, I still would like to ask for the change to be merged into 8.0 for the following reasons: 1. It is small and isolated 2. It is disabled by default and marked as experimental 3. It promises big value by reducing load on RabbitMQ, which becomes a bottleneck on big environments Thanks, Dmitry The CR: https://review.openstack.org/#/c/249180 2015-12-08 13:11 GMT+03:00 Igor Kalnitsky <ikalnit...@mirantis.com>: > Hey Dmitry, > > Despite the fact the feature promises performance boost (IIUC) and > it's really nice to have it, I agree with Mike's opinion - it's late > to continue working on features. Each delay means less time to test > it, and we need to focus on quality. > > I'm sorry, but I have to say "No" on requested exception. > > - Igor > > On Tue, Dec 8, 2015 at 9:55 AM, Mike Scherbakov > <mscherba...@mirantis.com> wrote: > > Hi Dmitry, > > as much as I support the change, and glad that we got time for it, my > > opinion is that we should not extend a FFE. I have following reasons to > > think this way: > > > > 1) Feature Freeze is time based milestone, with the rational "FF ensures > > that sufficient share of the ReleaseCycle is dedicated to QA, until we > > produce the first release candidates. Limiting the changes that affect > the > > behavior of the software allow for consistent testing and efficient > > bugfixing" [1]. Even though this feature will be disabled by default, it > is > > important to note the first part of this rationale - we need to focus on > > stability now, not on features. 
> > 2) 7 FFEs for Fuel [2] I'd subjectively consider as high number, as in > total > > there are ~25 major blueprints to be delivered. Dmitry, our PTL, > > unfortunately is absent for a couple of weeks, but his opinion is quite > > similar: "The list of exceptions is much longer than I'd like, and some > have > > larger impact than I'd like, lets all of us make sure we don't come to > > regret granting these exceptions." [3]. Taking any exception further > means > > moving FF, in fact. That means moving of release date, which I don't > think > > we should even consider doing. > > 3) Exception to exception, in my opinion, should only be allowed in > > extremely rare cases for essential features only. When it becomes clear > that > > the whole release has a major gap or serious issue, which can only be > > resolved by finishing an essential feature. I have no evidence to think > that > > this functionality, which will be disabled by default, can fall into this > > category. > > 4) Changeset [4] has a change to the packaging spec. Any small change to > > packaging after FF imposes additional risk, as there is no good test > > automation for such kind of changes. Even if it's just include of a new > > file. In case of regression, we may easily lose a day for figuring out > what > > is wrong and reverting a change. > > > > I'd like to hear component leads while PTL is absent these days > > > > [1] https://wiki.openstack.org/wiki/FeatureFreeze > > [2] > > > http://lists.openstack.org/pipermail/openstack-dev/2015-December/081131.html > > [3] > > > http://lists.openstack.org/pipermail/openstack-dev/2015-December/081149.html > > [4] https://review.openstack.org/#/c/249180/ > > > > On Mon, Dec 7, 2015 at 2:30 PM Adam Heczko <ahec...@mirantis.com> wrote: > >> > >> Hello Dmitry, > >> I like this idea and very much appreciate it. > >> +1 from me :) > >> > >> A. 
> >> > >> On Mon, Dec 7, 2015 at 9:48 PM, Dmitry Mescheryakov > >> <dmescherya...@mirantis.com> wrote: > >>> > >>> Hello folks, > >>> > >>> I'd like to request an extension of the current FFE for the feature [1]. > During > >>> the three FFE days we merged the spec [2] after a big discussion and > made a > >>> couple of iterations over the implementation [3]. We had a chat with > Bogdan on > >>> how to progress, and here are the action items that still need to be done: > >>> * the part of the change responsible for the RabbitMQ policy needs to be > >>> upstreamed first to the RabbitMQ repo. > >>> * the change needs to be reviewed and merged by our library folks.
[openstack-dev] [Fuel] Extend FFE for "Disable queue mirroring for RPC queues in RabbitMQ"
Hello folks, I'd like to request an extension of the current FFE for the feature [1]. During the three FFE days we merged the spec [2] after a big discussion and made a couple of iterations over the implementation [3]. We had a chat with Bogdan on how to progress, and here are the action items that still need to be done: * the part of the change responsible for the RabbitMQ policy needs to be upstreamed first to the RabbitMQ repo. * the change needs to be reviewed and merged by our library folks. Overall I think that 2-3 more days should be enough to finish it. What do you think, folks? Dmitry [1] https://blueprints.launchpad.net/fuel/+spec/rabbitmq-disable-mirroring-for-rpc [2] https://review.openstack.org/247517 [3] https://review.openstack.org/249180 __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Fuel][FFE] Disabling HA for RPC queues in RabbitMQ
2015-12-02 13:11 GMT+03:00 Bogdan Dobrelya <bdobre...@mirantis.com>: > On 01.12.2015 23:34, Peter Lemenkov wrote: > > Hello All! > > > > Well, side-effects (or any other effects) are quite obvious and > > predictable - this will decrease availability of RPC queues a bit. > > That's for sure. > > And consistency. Without messages and queues being synced between all of > the rabbit_hosts, how exactly would dispatching RPC calls work when > workers are connected to different AMQP URLs? > There will be no problem with consistency here. Since we will disable HA, queues will not be synced across the cluster and there will be exactly one node hosting messages for a queue. > Perhaps that change would only raise the partitions tolerance to the > very high degree? But this should be clearly shown by load tests - under > network partitions with mirroring against network partitions w/o > mirroring. Rally could help here a lot. Nope, the change will not increase partitioning tolerance at all. What I expect is that it will not get worse. Regarding tests, sure we are going to perform destructive testing to verify that there is no regression in recovery time. > > > > > However, Dmitry's guess is that the overall messaging backplane > > stability increase (RabbitMQ won't fail too often in some cases) would > > compensate for this change. This issue is very much real - speaking of > > Agree, that should be proven by (rally) tests for the specific case I > described in the spec [0]. Please correct it as I may understand things > wrong, but here it is: > - client 1 submits RPC call request R to the server 1 connected to the > AMQP host X > - worker A listens for the jobs topic on the AMQP host X > - worker B listens for the jobs topic on the AMQP host Y > - a job by the R was dispatched to the worker B > Q: would the B never receive its job message because it just cannot see > messages at the X? > Q: timeout failure as the result. 
> > And things may go even much more weird for more complex scenarios. > Yes, in the described scenario B will receive the job. Node Y will proxy B's consumption from node X, so we will not experience a timeout. Also, I have replied in the review. > > [0] https://review.openstack.org/247517 > > > me I've seen an awful cluster's performance degradation when a failing > > RabbitMQ node was killed by some watchdog application (or even worse > > wasn't killed at all). One of these issues was quite recently, and I'd > > love to see them less frequently. > > > > That said I'm uncertain about the stability impact of this change, yet > > I see a reasoning worth discussing behind it. > > I would support this to the 8.0 if only proven by the load tests within > the scenario I described plus standard destructive tests As I said in my initial email, I've run the boot_and_delete_server_with_secgroups Rally scenario to verify my change. I think I should provide more details: the scale team considers this test to be the worst case we have for RabbitMQ. I ran the test on a 200-node lab, and what I saw is that when I disable HA, the test time becomes 2 times smaller. That clearly shows that there is a test where our current messaging system is the bottleneck, and just tuning it considerably improves the performance of OpenStack as a whole. Also, while there was a small fail rate in HA mode (around 1-2%), in non-HA mode all tests always completed successfully. Overall, I think the current results are already enough to consider the change useful. What is left is to confirm that it does not make our failover worse.
> >> > >> On Tue, Dec 1, 2015 at 7:34 PM, Dmitry Mescheryakov > >> <dmescherya...@mirantis.com> wrote: > >>> > >>> Folks, > >>> > >>> I would like to request feature freeze exception for disabling HA for > RPC > >>> queues in RabbitMQ [1]. > >>> > >>> As I already wrote in another thread [2], I've conducted tests which > >>> clearly show benefit we will get from that change. The change itself > is a > >>> very small patch [3]. The only thing which I want to do before > proposing to > >>> merge this change is to conduct destructive tests against it in order > to > >>> make sure that we do not have a regression here. That should take just > >>> several days, so if there will be no other objections, we will be able > to > >>> merge the change in a week or two timeframe. > >>> >
Re: [openstack-dev] [Fuel][FFE] Disabling HA for RPC queues in RabbitMQ
2015-12-02 16:52 GMT+03:00 Jordan Pittier <jordan.pitt...@scality.com>: > > On Wed, Dec 2, 2015 at 1:05 PM, Dmitry Mescheryakov < > dmescherya...@mirantis.com> wrote: > >> >> >> My point is simple - lets increase our architecture scalability by 2-3 >> times by _maybe_ causing more errors for users during failover. The >> failover time itself should not get worse (to be tested by me) and errors >> should be correctly handled by services anyway. >> > > Scalability is great, but what about correctness? > Jordan, users will encounter problems only when some of the RabbitMQ nodes go down. Under normal circumstances it will not cause any additional errors. And when RabbitMQ goes down and oslo.messaging fails over to the alive hosts, we anyway have a couple of minutes of messaging downtime at the moment, which disrupts almost all RPC calls. On the other side, disabling mirroring greatly reduces the chance that a RabbitMQ node goes down due to high load. > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Fuel][FFE] Disabling HA for RPC queues in RabbitMQ
2015-12-02 12:48 GMT+03:00 Sergii Golovatiuk <sgolovat...@mirantis.com>: > Hi, > > > On Tue, Dec 1, 2015 at 11:34 PM, Peter Lemenkov <lemen...@gmail.com> > wrote: > >> Hello All! >> >> Well, side-effects (or any other effects) are quite obvious and >> predictable - this will decrease availability of RPC queues a bit. >> That's for sure. >> > > Imagine the case when a user creates a VM instance, and some nova messages are > lost. I am not sure we want half-created instances. Who is going to clean > them up? Since we do not have results of destructive tests, I vote -2 for > FFE for this feature. > Sergii, actually the messaging layer cannot provide any guarantee that this will not happen, even if all messages are preserved. Assume the following scenario: * nova-scheduler (or conductor?) sends a request to nova-compute to spawn a VM * nova-compute receives the message and spawns the VM * due to some reason (rabbitmq unavailable, nova-compute lagged) nova-compute does not respond within the timeout (1 minute, I think) * nova-scheduler does not get a response within 1 minute and marks the VM with Error status. In that scenario no message was lost, but we still have a half-spawned VM, and it is up to Nova to handle the error and do the cleanup in that case. Such issues already happen here and there when something glitches. For instance, our favorite MessagingTimeout exception can be caused by exactly such a scenario: in the example above, when nova-scheduler times out waiting for the reply, it throws exactly that exception. My point is simple - let's increase our architecture scalability by 2-3 times by _maybe_ causing more errors for users during failover. The failover time itself should not get worse (to be tested by me), and errors should be correctly handled by services anyway. >> However, Dmitry's guess is that the overall messaging backplane >> stability increase (RabbitMQ won't fail too often in some cases) would >> compensate for this change. 
This issue is very much real - speaking of >> me I've seen an awful cluster's performance degradation when a failing >> RabbitMQ node was killed by some watchdog application (or even worse >> wasn't killed at all). One of these issues was quite recently, and I'd >> love to see them less frequently. >> >> That said I'm uncertain about the stability impact of this change, yet >> I see a reasoning worth discussing behind it. >> >> 2015-12-01 20:53 GMT+01:00 Sergii Golovatiuk <sgolovat...@mirantis.com>: >> > Hi, >> > >> > -1 for FFE for disabling HA for RPC queue as we do not know all side >> effects >> > in HA scenarios. >> > >> > On Tue, Dec 1, 2015 at 7:34 PM, Dmitry Mescheryakov >> > <dmescherya...@mirantis.com> wrote: >> >> >> >> Folks, >> >> >> >> I would like to request feature freeze exception for disabling HA for >> RPC >> >> queues in RabbitMQ [1]. >> >> >> >> As I already wrote in another thread [2], I've conducted tests which >> >> clearly show benefit we will get from that change. The change itself >> is a >> >> very small patch [3]. The only thing which I want to do before >> proposing to >> >> merge this change is to conduct destructive tests against it in order >> to >> >> make sure that we do not have a regression here. That should take just >> >> several days, so if there will be no other objections, we will be able >> to >> >> merge the change in a week or two timeframe. 
>> >> >> >> Thanks, >> >> >> >> Dmitry >> >> >> >> [1] https://review.openstack.org/247517 >> >> [2] >> >> >> http://lists.openstack.org/pipermail/openstack-dev/2015-December/081006.html >> >> [3] https://review.openstack.org/249180 >> >> >> >> >> __ >> >> OpenStack Development Mailing List (not for usage questions) >> >> Unsubscribe: >> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe >> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev >> >> >> > >> > >> > >> __ >> > OpenStack Development Mailing List (not for usage questions) >> > Unsubscribe: >> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe >> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev >> > >> >> >> >> -- >> With
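The half-created-VM argument above (no message is lost, yet the caller still times out) can be illustrated with a small self-contained simulation. The timeout value and component names are stand-ins, not the real nova/oslo.messaging code:

```python
# Self-contained simulation of the scenario above: the request is delivered,
# the VM is spawned, but the reply arrives after the caller's deadline, so
# the caller still reports an error. Timeout values are stand-ins.
import queue
import threading
import time

RPC_TIMEOUT = 0.2   # stand-in for the ~1 minute RPC response timeout

reply_queue = queue.Queue()
vm_state = {"status": "BUILDING"}

def compute_worker():
    # "nova-compute" receives the message and spawns the VM, but replies late
    time.sleep(0.5)                  # lagging compute / overloaded broker
    vm_state["status"] = "ACTIVE"
    reply_queue.put("done")

worker = threading.Thread(target=compute_worker)
worker.start()

try:
    reply_queue.get(timeout=RPC_TIMEOUT)
    caller_view = "ACTIVE"
except queue.Empty:
    caller_view = "ERROR"            # scheduler marks the VM with Error status

worker.join()
print(caller_view, vm_state["status"])   # caller saw ERROR, VM actually ACTIVE
```

No message was dropped anywhere in this run; the inconsistency comes purely from the caller's deadline, which is why message durability alone cannot prevent half-created instances.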
[openstack-dev] [Fuel] Disabling HA for RPC queues in RabbitMQ
Hello guys, I would like to propose disabling HA for OpenStack RPC queues. The rationale is to reduce the load on RabbitMQ by removing the necessity for it to replicate messages across the cluster. You can find more details about the proposal in the spec [1]. To what is in the spec I can add that I've run a test at scale which confirms that there is at least one case where our messaging stack is currently the bottleneck. That is a Rally boot_and_delete_server_with_secgroups run against a setup with Neutron VXLAN with DVR. Just removing the HA policy reduces the test time 2 times and increases message throughput 2-3 times. I think that is a very clear indication of the benefit we can get. I do understand that we are almost in Feature Freeze, so I will request a feature freeze exception for that change in a separate thread with a detailed plan. Thanks, Dmitry [1] https://review.openstack.org/#/c/247517/ __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
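For readers unfamiliar with the mechanism: the "HA policy" mentioned above is RabbitMQ's ha-mode policy, managed with rabbitmqctl. A hedged sketch of the idea — the policy names and queue patterns below are illustrative, not the actual ones used by the Fuel change, and the commands require a live RabbitMQ node:

```shell
# Illustrative only: names and patterns are hypothetical, not the Fuel change.
# Mirror all queues (the pre-change behaviour):
rabbitmqctl set_policy ha-all '.*' '{"ha-mode": "all"}'
# Stop mirroring RPC queues by removing the blanket policy...
rabbitmqctl clear_policy ha-all
# ...and re-adding mirroring only where durability still matters:
rabbitmqctl set_policy ha-notif '^notifications\.' '{"ha-mode": "all"}'
```

With no matching ha-mode policy, each RPC queue lives on a single node and RabbitMQ no longer replicates every message across the cluster, which is where the throughput gain comes from.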
[openstack-dev] [Fuel][FFE] Disabling HA for RPC queues in RabbitMQ
Folks, I would like to request a feature freeze exception for disabling HA for RPC queues in RabbitMQ [1]. As I already wrote in another thread [2], I've conducted tests which clearly show the benefit we will get from that change. The change itself is a very small patch [3]. The only thing which I want to do before proposing to merge this change is to conduct destructive tests against it in order to make sure that we do not have a regression here. That should take just several days, so if there are no other objections, we will be able to merge the change within a week or two. Thanks, Dmitry [1] https://review.openstack.org/247517 [2] http://lists.openstack.org/pipermail/openstack-dev/2015-December/081006.html [3] https://review.openstack.org/249180 __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Fuel][Plugins] Plugin deployment questions
Hello folks, I second Patrick's idea. In our case we would like to install a standalone RabbitMQ cluster with the Fuel reference architecture to perform destructive tests on it. The requirement to install a controller is an excessive burden in that case. Thanks, Dmitry 2015-10-19 13:44 GMT+03:00 Patrick Petit: > Hi There, > > There are situations where we'd like to deploy only Fuel plugins in an > environment. > That's typically the case with the Elasticsearch and InfluxDB plugins of the LMA > tools. > Currently it's not possible because you need to have at least one > controller. > What exactly is causing that limitation? How hard would it be to have it > removed? > > Thanks > Patrick > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Fuel][RabbitMQ][puppet] Kilo status of the heartbeats implementation
Bogdan, Answering your questions: in the MOS 7.0 source code, heartbeats are enabled by default (with the heartbeat_timeout_threshold value set to 60). We patched our version of upstream stable/kilo to do so. So, in an installed environment heartbeats are enabled for every component except Neutron, for which the puppet manifests explicitly disable heartbeats, as you already noted. Regarding how we should proceed: right now we are testing how the upstream implementation performs in our environments. I suggest postponing the decision until we have enough data. Thanks, Dmitry 2015-07-28 12:06 GMT+03:00 Bogdan Dobrelya bdobre...@mirantis.com: Folks, it seems the situation with Kilo support for RabbitMQ heartbeats should be elaborated. There is a bug [0] and a ML [1] related. The questions are: a) Should Fuel 7.0 with Kilo *explicitly* disable via puppet the upstream implementation of heartbeats for all OpenStack components (Neutron example [2]) and keep the MOS specific implementation of heartbeats configured the same way as it was for Juno? b) Or should we change nothing additionally, allowing the Oslo defaults for Kilo to be populated for heartbeat settings out of the box? Related question - what are the upstream heartbeat defaults in MOS, do they differ from upstream ones? [0] https://bugs.launchpad.net/fuel/+bug/1477689 [1] http://lists.openstack.org/pipermail/openstack-dev/2015-July/068751.html [2] https://review.openstack.org/#/c/194381/ -- Best regards, Bogdan Dobrelya, Irc #bogdando __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
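For context, the setting discussed above is an oslo.messaging rabbit driver option. A minimal sketch of the MOS 7.0 default described in this message, as it would appear in a service's configuration file (assuming the upstream Kilo option names and section):

```ini
[oslo_messaging_rabbit]
# Enable AMQP heartbeats; 0 would disable them. 60 is the MOS 7.0 default
# mentioned in this thread.
heartbeat_timeout_threshold = 60
# How many times per timeout interval the heartbeat is checked.
heartbeat_rate = 2
```

Setting heartbeat_timeout_threshold = 0 in a service's config is what "explicitly disable via puppet" would amount to in option terms.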
Re: [openstack-dev] [Fuel] Default templates for Sahara feature in 6.0
Oops, the last line should be read as On the other side, it is a nice UX feature we really want to have 6.0 Dmitry 2014-11-15 3:50 GMT+03:00 Dmitry Mescheryakov dmescherya...@mirantis.com: Dmitry, Lets review the CR from the point of danger to current deployment process: in the essence it is 43 lines of change in puppet module. The module calls a shell script which always returns 0. So whatever happens inside, the deployment will not fail. The only changes (non-get requests) the script does, it does to Sahara. It tries to upload cluster and node-group templates. That is not dangerous operation for Sahara - in the worst case the templates will just not be created and that is all. It will not affect Sahara correctness in any way. On the other side, it is a nice UX feature we really want to have 5.1.1. Thanks, Dmitry 2014-11-15 3:04 GMT+03:00 Dmitry Borodaenko dborodae...@mirantis.com: +286 lines a week after Feature Freeze, IMHO it's too late to make an exception for this one. On Wed, Nov 12, 2014 at 7:37 AM, Dmitry Mescheryakov dmescherya...@mirantis.com wrote: Hello fuelers, I would like to request you merging CR [1] which implements blueprint [2]. It is a nice UX feature we really would like to have in 6.0. On the other side, the implementation is really small: it is a small piece of puppet which runs a shell script. The script always exits with 0, so the change should not be dangerous. Other files in the change are used in the shell script only. Please consider reviewing and merging this though we've already reached FF. 
Thanks, Dmitry [1] https://review.openstack.org/#/c/132196/ [2] https://blueprints.launchpad.net/mos/+spec/sahara-create-default-templates ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Dmitry Borodaenko ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Fuel] Default templates for Sahara feature in 6.0
Dmitry, Lets review the CR from the point of danger to current deployment process: in the essence it is 43 lines of change in puppet module. The module calls a shell script which always returns 0. So whatever happens inside, the deployment will not fail. The only changes (non-get requests) the script does, it does to Sahara. It tries to upload cluster and node-group templates. That is not dangerous operation for Sahara - in the worst case the templates will just not be created and that is all. It will not affect Sahara correctness in any way. On the other side, it is a nice UX feature we really want to have 5.1.1. Thanks, Dmitry 2014-11-15 3:04 GMT+03:00 Dmitry Borodaenko dborodae...@mirantis.com: +286 lines a week after Feature Freeze, IMHO it's too late to make an exception for this one. On Wed, Nov 12, 2014 at 7:37 AM, Dmitry Mescheryakov dmescherya...@mirantis.com wrote: Hello fuelers, I would like to request you merging CR [1] which implements blueprint [2]. It is a nice UX feature we really would like to have in 6.0. On the other side, the implementation is really small: it is a small piece of puppet which runs a shell script. The script always exits with 0, so the change should not be dangerous. Other files in the change are used in the shell script only. Please consider reviewing and merging this though we've already reached FF. Thanks, Dmitry [1] https://review.openstack.org/#/c/132196/ [2] https://blueprints.launchpad.net/mos/+spec/sahara-create-default-templates ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Dmitry Borodaenko ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Fuel] Default templates for Sahara feature in 6.0
Hello fuelers, I would like to request you to merge CR [1], which implements blueprint [2]. It is a nice UX feature we really would like to have in 6.0. On the other side, the implementation is really small: it is a small piece of puppet which runs a shell script. The script always exits with 0, so the change should not be dangerous. Other files in the change are used in the shell script only. Please consider reviewing and merging this even though we've already reached FF. Thanks, Dmitry [1] https://review.openstack.org/#/c/132196/ [2] https://blueprints.launchpad.net/mos/+spec/sahara-create-default-templates ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [keystone][swift] Has anybody considered storing tokens in Swift?
Hey Jay, Did you consider Swift's eventual consistency? The general use case for many OpenStack applications is: 1. obtain the token from Keystone 2. perform some operation in OpenStack providing the token as credentials. As a result of operation #1 the token will be saved into Swift by Keystone. But due to eventual consistency it could happen that validation of the token in operation #2 will not see the saved token. The probability depends on the time gap between ops #1 and #2: the smaller the gap, the higher the probability (less time to sync). Also it depends on the Swift installation size: the bigger the installation, the higher the probability (a bigger 'space' for inconsistency). I believe I've seen such inconsistency in Rackspace Cloud Files a couple of years ago. We uploaded a file into the Files using an application, but saw it in the browser only a couple of minutes later. It is my understanding that Ceph exposing the Swift API is not affected though, as it is strongly consistent. Thanks, Dmitry 2014-09-29 20:12 GMT+04:00 Jay Pipes jaypi...@gmail.com: Hey Stackers, So, I had a thought this morning (uh-oh, I know...). What if we wrote a token driver in Keystone that uses Swift for backend storage? I have long been an advocate of the memcache token driver versus the SQL driver for performance reasons. However, the problem with the memcache token driver is that if you want to run multiple OpenStack regions, you could share the identity data in Keystone using replicated database technology (mysql galera/PXC, pgpool II, or even standard mysql master/slave), but each region needs to have its own memcache service for tokens. This means that tokens are not shared across regions, which means that users have to log in separately to each region's dashboard. I personally considered this a tradeoff worth accepting. But then, today, I thought... what about storing tokens in a globally-distributed Swift cluster? 
That would take care of the replication needs automatically, since Swift would do the needful. And, add to that, Swift was designed for storing lots of small objects, which tokens are... Thoughts? I think it would be a cool dogfooding effort if nothing else, and give users yet another choice in how they handle multi-region tokens. Best, -jay ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
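The read-after-write race Dmitry describes can be sketched with a toy model. This is purely illustrative: the store below is a hypothetical stand-in, not Swift's real replication logic, and the retrying validator is just one way a token driver could ride out replication lag.

```python
import itertools

class EventuallyConsistentStore:
    """Toy model of an eventually consistent object store (not real Swift).

    A write becomes visible only after `lag` subsequent reads, imitating
    replicas that have not converged yet.
    """
    def __init__(self, lag=2):
        self._lag = lag
        self._pending = {}   # key -> [value, reads remaining until visible]
        self._data = {}

    def put(self, key, value):
        self._pending[key] = [value, self._lag]

    def get(self, key):
        if key in self._pending:
            entry = self._pending[key]
            entry[1] -= 1
            if entry[1] <= 0:
                # replication "caught up": the write is now visible
                self._data[key] = entry[0]
                del self._pending[key]
                return self._data[key]
        return self._data.get(key)

def validate_token(store, token_id, max_retries=5):
    """Token validation that retries to ride out replication lag."""
    for attempt in itertools.count(1):
        token = store.get(token_id)
        if token is not None:
            return token, attempt
        if attempt >= max_retries:
            raise LookupError('token not visible after %d reads' % attempt)

store = EventuallyConsistentStore(lag=2)
store.put('tok-123', {'user': 'demo'})       # op #1: Keystone saves the token
token, attempts = validate_token(store, 'tok-123')  # op #2: validation
print(attempts)  # -> 2 (the first read misses the not-yet-synced write)
```

The gap/probability argument from the mail maps directly onto the `lag` parameter: the longer replication takes relative to the time between save and validation, the more reads come back empty.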
Re: [openstack-dev] [Sahara][Doc] Updating documentation for overview; Needs changing image
Hello, I used Google Docs to create the initial image. If you want to edit that one, copy the doc [1] to your drive and edit it. It is not the latest version of the image, but the only difference is that this one has the project's very first name, EHO, in place of Sahara. Thanks, Dmitry [1] https://docs.google.com/a/mirantis.com/drawings/d/1kCahSrGI0OvPeQBcqjX9GV54GZYBZpt_W4nOt5nsKB8/edit 2014-09-22 16:29 GMT+04:00 Sharan Kumar M sharan.monikan...@gmail.com: Hi all, The bug on updating documentation for overview / details https://bugs.launchpad.net/sahara/+bug/1350063 also requires changing the image openstack-interop.png. So is there any specific tool used for creating the image? Since I am working on fixing this bug, I thought I could also update the image. Thanks, Sharan Kumar M ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Fuel] Doc team working with bugs
Hello Fuelers, At the previous meeting a topic was raised on how the Fuel doc team should work with bugs, see [1] for details. We agreed to move the discussion into the mailing list. The thing is, there are two members in the team at the moment (Meg and Irina) and they need to distribute work among themselves. The natural way to distribute the load is to assign bugs. But frequently they document bugs which are in the process of being fixed, so these are already assigned to an engineer. I.e. a bug needs to be assigned to an engineer and a tech writer at the same time. I've proposed to create a separate series 'docs' in Launchpad (it is the same kind of thing as '5.0.x' or '5.1.x'). If a bug affects several series, a different engineer can be assigned on each of them. So the doc team will be free to assign bugs to themselves within this new series. Mike Scherbakov and Dmitry Borodaenko objected to creating another series in Launchpad. Instead they proposed to mark bugs with tags like 'docs-irina' and 'docs-meg', thus assigning them. What do you think is the best way to handle this? As for me, I don't have a strong preference. One last note: the question actually applies to two Launchpad projects: Fuel and MOS. But naturally we want to do this the same way in both projects. Thanks, Dmitry [1] http://eavesdrop.openstack.org/meetings/fuel/2014/fuel.2014-09-04-15.59.log.html#l-310 ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Fuel] Working on 6.0 and new releases in general
Hello Fuelers, Right now we have the following policy in place: the branches for a release are opened only after its 'parent' release has reached hard code freeze (HCF). Say, the 5.1 release is the parent release for 5.1.1 and 6.0. And that is the problem: if the parent release is delayed, we can't properly start development of a child release because we don't have branches to commit to. That is the current issue with 6.0: we have already started to work on pushing Juno into 6.0, but if we are to make changes to our deployment code we have nowhere to store them. IMHO the issue could easily be resolved by creating pre-release branches, which are merged into the parent branches once the parent reaches HCF. Say, we use the branch 'pre-6.0' for initial development of 6.0. Once 5.1 reaches HCF, we merge pre-6.0 into master and continue development there. After that pre-6.0 is abandoned. What do you think? Thanks, Dmitry ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
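In git terms, the proposed flow could look roughly like this. The branch names come from the mail; the repository here is a throwaway created purely for illustration:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email dev@example.com
git config user.name dev
start=$(git symbolic-ref --short HEAD)     # master (or main on newer git)
git commit -q --allow-empty -m "5.1 work, pre-HCF"
git checkout -q -b pre-6.0                 # child release develops here
git commit -q --allow-empty -m "6.0: push Juno into deployment code"
git checkout -q "$start"                   # 5.1 reached HCF: fold pre-6.0 back
git merge -q --no-edit pre-6.0
git branch -q -d pre-6.0                   # pre-release branch abandoned
git log --oneline
```

The key property is that 6.0 commits accumulate on pre-6.0 without waiting for 5.1's HCF, and the merge back into master is a no-conflict fast-forward as long as master only receives 5.1 stabilization work in the meantime.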
Re: [openstack-dev] [Sahara] Swift authentication and backwards compatibility
Hello people, I think backward compatibility is a good idea. We can make the user/pass inputs for data objects optional (they are required currently), maybe even gray them out in the UI with a checkbox to turn them on, or something like that. This is similar to what I was thinking. We would allow the username and password inputs to accept a blank input. I like the idea of keeping backward compatibility by supporting username/password. And also I really dislike making one more config option (domain for temp users) mandatory. So supporting the old behaviour here also simplifies deployment, which is good especially for new users. Thanks, Dmitry 2014-08-15 18:04 GMT+04:00 mike mccune mimcc...@redhat.com: thanks for the thoughts Trevor, On 08/15/2014 09:32 AM, Trevor McKay wrote: I think backward compatibility is a good idea. We can make the user/pass inputs for data objects optional (they are required currently), maybe even gray them out in the UI with a checkbox to turn them on, or something like that. This is similar to what I was thinking. We would allow the username and password inputs to accept a blank input. I also like the idea of giving some sort of visual reference, like graying out the fields. Sahara can detect whether or not the proxy domain is there, and whether or not it can be created. If Sahara ends up in a situation where it thinks user/pass are required, but the data objects don't have them, we can return a meaningful error. I think it sounds like we are going to avoid having Sahara attempt to create a domain. It will be the duty of a stack administrator to create the domain and give its name in the sahara.conf file. Agreed about meaningful errors. The job manager can key off of the values supplied for the data source objects (no user/pass? must be proxy) and/or cluster configs (for instance, a new cluster config could be added -- if it's absent we assume an old cluster and therefore the old hadoop swift plugin). Workflow can be generated accordingly. This sounds good. 
If there is some way to determine the version of the hadoop-swiftfs on the cluster, that would be ideal. The hadoop swift plugin can look at the config values provided, as you noted yesterday, and get auth tokens in either manner. exactly. mike ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [sahara] team meeting minutes July 3
Again, thanks to everyone who joined the Sahara meeting. Below are the logs from the meeting. Minutes: http://eavesdrop.openstack.org/meetings/sahara/2014/sahara.2014-07-03-18.06.html Logs: http://eavesdrop.openstack.org/meetings/sahara/2014/sahara.2014-07-03-18.06.log.html Thanks, Dmitry ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [sahara] 2014.1.1 preparation
I agree with Andrew and actually think that we do need to have https://review.openstack.org/#/c/87573 (Fix running EDP job on transient cluster) fixed in the stable branch. We also might want to add https://review.openstack.org/#/c/93322/ (Create trusts for admin user with correct tenant name). This is another fix for transient clusters, but it is not even merged into the master branch yet. Thanks, Dmitry 2014-06-03 13:27 GMT+04:00 Sergey Lukjanov slukja...@mirantis.com: Here is an etherpad to track preparation - https://etherpad.openstack.org/p/sahara-2014.1.1 On Tue, Jun 3, 2014 at 10:08 AM, Sergey Lukjanov slukja...@mirantis.com wrote: /me proposing to backport: Docs: https://review.openstack.org/#/c/87531/ Change IRC channel name to #openstack-sahara https://review.openstack.org/#/c/96621/ Added validate_edp method to Plugin SPI doc https://review.openstack.org/#/c/89647/ Updated architecture diagram in docs EDP: https://review.openstack.org/#/c/93564/ On Tue, Jun 3, 2014 at 10:03 AM, Sergey Lukjanov slukja...@mirantis.com wrote: Hey folks, this Thu, June 5 is the date for the 2014.1.1 release. We already have some backported patches in the stable/icehouse branch, so, the question is do we need to backport some more patches? Please, propose them here. 2014.1 - stable/icehouse diff: https://github.com/openstack/sahara/compare/2014.1...stable/icehouse Thanks. -- Sincerely yours, Sergey Lukjanov Sahara Technical Lead (OpenStack Data Processing) Principal Software Engineer Mirantis Inc. -- Sincerely yours, Sergey Lukjanov Sahara Technical Lead (OpenStack Data Processing) Principal Software Engineer Mirantis Inc. -- Sincerely yours, Sergey Lukjanov Sahara Technical Lead (OpenStack Data Processing) Principal Software Engineer Mirantis Inc. 
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Sahara] Split sahara-api into api and engine
Hello people, The following patch set splits the monolithic sahara-api process into two - sahara-api and sahara-engine: https://review.openstack.org/#/c/90350/ After the change is merged, there will be three binaries to run Sahara: * sahara-all - runs Sahara all-in-one (like sahara-api does right now) * sahara-api - runs the Sahara API endpoint and offloads 'heavy' tasks to sahara-engine * sahara-engine - executes tasks which are either 'heavy' or require a remote connection to VMs Most probably you will want to keep running the all-in-one process in your dev environment, so you need to switch to using sahara-all instead of sahara-api. To make the transition smooth, we've merged another change which adds the sahara-all process as an alias for sahara-api. That means that you can switch to using sahara-all right now, so when the patch is merged, you will not notice the change. Thanks, Dmitry ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [olso][neutron] proxying oslo.messaging from management network into tenant network/VMs
Hello Isaku, Thanks for sharing this! Right now in the Sahara project we are considering using Marconi as a means to communicate with VMs. It seems like you are familiar with the discussions that have happened so far. If not, please see the links at the bottom of the UnifiedGuestAgent [1] wiki page. In short, we see Marconi's support for multi-tenancy as a huge advantage over other MQ solutions. Our agent is network-based, so tenant isolation is a real issue here. For clarity, here is the overview scheme of a network-based agent: server - MQ (Marconi) - agent All communication goes over the network. I've made a PoC of the Marconi driver for oslo.messaging, you can find it at [2] We also considered 'hypervisor-dependent' agents (as I called them in the initial thread) like the one you propose. They also provide tenant isolation. But the drawback is a _much_ bigger development cost and a more fragile and complex deployment. In the case of a network-based agent, all the code is: * a Marconi driver for the RPC library (oslo.messaging) * a thin client for the server to make calls * a guest agent with a thin server side If you write your agent in Python, it will work on any OS with any host hypervisor. For a hypervisor-dependent agent it becomes much more complex. You need one additional component - a proxy-agent running on the Compute host, which makes deployment harder. You also need to support various transports for various hypervisors: virtio-serial for KVM, XenStore for Xen, something for Hyper-V, etc. Moreover, the guest OS must have drivers for these transports and you will probably need to write different implementations for different OSes. Also, you mention that in some cases a second proxy-agent is needed, and again in some cases only cast operations can be used. Using cast only is not an option for Sahara, as we do need feedback from the agent, and sometimes getting the return value is the main reason to make an RPC call. I didn't see a discussion in Neutron on which approach to use (if there was one, I missed it). 
I see the simplicity of the network-based agent as a huge advantage. Could you please clarify why you've picked a design depending on the hypervisor? Thanks, Dmitry [1] https://wiki.openstack.org/wiki/UnifiedGuestAgent [2] https://github.com/dmitrymex/oslo.messaging 2014-04-09 12:33 GMT+04:00 Isaku Yamahata isaku.yamah...@gmail.com: Hello developers. As discussed many times so far[1], there are many projects that need to propagate RPC messages into VMs running on OpenStack. Neutron in my case. My idea is to relay RPC messages from the management network into the tenant network over a file-like object. By file-like object, I mean virtio-serial, unix domain socket, unix pipe and so on. I've written some code based on oslo.messaging[2][3] and documentation on use cases.[4][5] Only the file-like transport and proxying messages would be in oslo.messaging; the agent-side code wouldn't be a part of oslo.messaging. use cases: ([5] for more figures) file-like object: virtio-serial, unix domain socket, unix pipe server - AMQP - agent in host -virtio serial- guest agent in VM per VM server - AMQP - agent in host -unix socket/pipe- agent in tenant network - guest agent in VM So far there are security concerns about forwarding oslo.messaging traffic from the management network into a tenant network. One approach is to allow only cast-RPC from server to guest agent in VM, so that the guest agent in VM only receives messages and can't send anything to servers. With a unix pipe, it's write-only for the server, read-only for the guest agent. Thoughts? comments? Details of the Neutron NFV use case[6]: Neutron services so far typically run agents in the host; the agent in the host receives RPCs from the neutron server, then it executes the necessary operations. Sometimes the agent in the host issues RPCs to the neutron server periodically (e.g. status reports etc.). It's desirable to make such services virtualized as Network Function Virtualization (NFV), i.e. make those features run in VMs. So it's a quite natural approach to propagate those RPC messages to agents in VMs. 
[1] https://wiki.openstack.org/wiki/UnifiedGuestAgent [2] https://review.openstack.org/#/c/77862/ [3] https://review.openstack.org/#/c/77863/ [4] https://blueprints.launchpad.net/oslo.messaging/+spec/message-proxy-server [5] https://wiki.openstack.org/wiki/Oslo/blueprints/message-proxy-server [6] https://blueprints.launchpad.net/neutron/+spec/adv-services-in-vms -- Isaku Yamahata isaku.yamah...@gmail.com ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [olso][neutron] proxying oslo.messaging from management network into tenant network/VMs
I agree with those arguments. But I don't see how the network-based agent approach works with Neutron networking for now. Can you please elaborate on it? Here is the scheme of the network-based agent: server - MQ (Marconi) - agent As Doug said, Marconi exposes a REST API, just like any other OpenStack service. The services it provides are similar to the MQ ones (Rabbit MQ, Qpid, etc.). I.e. very simply, there are methods: * put_message(queue_name, message_payload) * get_message(queue_name) Multi-tenancy is provided by the same means as in the other OpenStack projects - the user supplies a Keystone token in the request and it determines the tenant used. As for the network, a network-based agent requires a TCP connection to Marconi. I.e. you need the agent running on the VM to be able to connect to Marconi, but not vice versa. That does not sound like a harsh requirement. The standard MQ solutions like Rabbit and Qpid could actually be used here instead of Marconi with one drawback - it is really hard to reliably implement tenant isolation with them. Thanks, Dmitry 2014-04-09 17:38 GMT+04:00 Isaku Yamahata isaku.yamah...@gmail.com: Hello Dmitry. Thank you for reply. On Wed, Apr 09, 2014 at 03:19:10PM +0400, Dmitry Mescheryakov dmescherya...@mirantis.com wrote: Hello Isaku, Thanks for sharing this! Right now in Sahara project we think to use Marconi as a mean to communicate with VM. Seems like you are familiar with the discussions happened so far. If not, please see links at the bottom of UnifiedGuestAgent [1] wiki page. In short we see Marconi's supports for multi-tenancy as a huge advantage over other MQ solutions. Our agent is network-based, so tenant isolation is a real issue here. For clarity, here is the overview scheme of network based agent: server - MQ (Marconi) - agent All communication goes over network. I've made a PoC of the Marconi driver for oslo.messaging, you can find it at [2] I'm not familiar with Marconi, so please enlighten me first. 
How does MQ (Marconi) communicate with both the management network and the tenant network? Does it work with Neutron network? not nova-network. Neutron isolates not only tenant networks from each other, but also the management network, at L2. So openstack servers can't send any packets to VMs, and VMs can't send any to openstack servers. This is the reason why neutron introduced the HTTP proxy for instance metadata. It is also the reason why I chose to introduce a new agent on the host. If Marconi (or other projects like sahara) has already solved those issues, that's great. We also considered 'hypervisor-dependent' agents (as I called them in the initial thread) like the one you propose. They also provide tenant isolation. But the drawback is _much_ bigger development cost and more fragile and complex deployment. In case of network-based agent all the code is * Marconi driver for RPC library (oslo.messaging) * thin client for server to make calls * a guest agent with thin server-side If you write your agent on python, it will work on any OS with any host hypervisor. For hypervisor dependent-agent it becomes much more complex. You need one more additional component - a proxy-agent running on Compute host, which makes deployment harder. You also need to support various transports for various hypervisors: virtio-serial for KVM, XenStore for Xen, something for Hyper-V, etc. Moreover guest OS must have driver for these transports and you will probably need to write different implementation for different OSes. Also you mention that in some cases a second proxy-agent is needed and again in some cases only cast operations could be used. Using cast only is not an option for Sahara, as we do need feedback from the agent and sometimes getting the return value is the main reason to make an RPC call. I didn't see a discussion in Neutron on which approach to use (if it was, I missed it). I see simplicity of network-based agent as a huge advantage. 
Could you please clarify why you've picked design depending on hypervisor? I agree those arguments. But I don't see how network-based agent approach works with Neutron network for now. Can you please elaborate on it? thanks, Thanks, Dmitry [1] https://wiki.openstack.org/wiki/UnifiedGuestAgent [2] https://github.com/dmitrymex/oslo.messaging 2014-04-09 12:33 GMT+04:00 Isaku Yamahata isaku.yamah...@gmail.com: Hello developers. As discussed many times so far[1], there are many projects that needs to propagate RPC messages into VMs running on OpenStack. Neutron in my case. My idea is to relay RPC messages from management network into tenant network over file-like object. By file-like object, I mean virtio-serial, unix domain socket, unix pipe and so on. I've wrote some code based on oslo.messaging[2][3] and a documentation on use cases.[4][5] Only file-like transport and proxying messages would be in oslo.messaging and agent side code wouldn't be a part
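Dmitry's put_message/get_message description with Keystone-token-scoped tenancy can be sketched as a toy model. The class and method names below are illustrative only and do not reflect Marconi's actual API; the point is that two tenants can use the same queue name without seeing each other's messages.

```python
class FakeQueueService:
    """Toy multi-tenant queue service in the spirit of the mail's
    description; NOT Marconi's real API."""

    def __init__(self, token_to_tenant):
        self._tenants = token_to_tenant      # auth token -> tenant id
        self._queues = {}                    # (tenant, queue_name) -> messages

    def _tenant(self, token):
        try:
            return self._tenants[token]
        except KeyError:
            raise PermissionError('invalid token')

    def put_message(self, token, queue_name, payload):
        # The token, not the queue name, decides whose queue this is.
        key = (self._tenant(token), queue_name)
        self._queues.setdefault(key, []).append(payload)

    def get_message(self, token, queue_name):
        key = (self._tenant(token), queue_name)
        msgs = self._queues.get(key, [])
        return msgs.pop(0) if msgs else None

svc = FakeQueueService({'tok-a': 'tenant-a', 'tok-b': 'tenant-b'})
svc.put_message('tok-a', 'agent-1', {'cmd': 'restart'})
# Tenant B cannot see tenant A's queue even though the name collides:
print(svc.get_message('tok-b', 'agent-1'))  # -> None
print(svc.get_message('tok-a', 'agent-1'))  # -> {'cmd': 'restart'}
```

This is exactly the property that is hard to get from a shared Rabbit/Qpid broker without per-application users and ACLs, as the thread notes.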
Re: [openstack-dev] auto-delete in amqp reply_* queues in OpenStack
Ok, assuming that you've run that query against the 5 stuck queues, I would expect the following results: * if an active listener for a queue lives on one of the compute hosts: that queue was created by a compute service initiating an RPC command. Since you didn't restart them during the switchover, the compute services still use the same queues. * if a queue does not have a listener: the queue was created by the controller which was active before the switchover. That queue could have become stuck not necessarily at the previous switchover, but at some other switchover that occurred in the past. 2014-03-25 0:33 GMT+04:00 Chris Friesen chris.frie...@windriver.com: On 03/24/2014 01:27 PM, Dmitry Mescheryakov wrote: I see two possible explanations for these 5 remaining queues: * They were indeed recreated by 'compute' services. I.e. controller service send some command over rpc and then it was shut down. Its reply queue was automatically deleted, since its only consumer was disconnected. The compute services replied after that and so recreated the queue. According to Rabbit MQ docs, such queue will be stuck alive indefinitely, since it will never have a consumer. * Possibly there are services on compute nodes which initiate RPC calls themselves. I don't know OpenStack architecture enough to say if services running on compute nodes do so. In that case these 5 queues are still used by compute services. Do Rabbit MQ management tools (web or cli) allow to view active consumers for queues? If yes, then you can find out which of the cases above you encountered. Or it maybe be some third case I didn't account for :-) It appears that the cli tools do not provide a way to print the info, but if you query a single queue via the web API it will give the IP address and port of the consumers for the queue. 
The vhost needs to be URL-encoded, so the query looks something like this: curl -i -u guest:guest http://192.168.204.2:15672/api/queues/%2f/reply_08e35acffe2c4078ae4603f08e9d0997 Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
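Applying Dmitry's two cases to the management API output can be automated. The sample payload below is abbreviated and hand-written, and the `consumer_details`/`channel_details` field names are my recollection of the RabbitMQ management API's per-queue JSON; treat them as an assumption and check your broker's actual output.

```python
import json

# Abbreviated, illustrative shape of /api/queues/<vhost>/<name> output.
sample = json.loads("""
{
  "name": "reply_08e35acffe2c4078ae4603f08e9d0997",
  "consumers": 0,
  "consumer_details": []
}
""")

def classify(queue):
    """Dmitry's two cases: live consumer on a compute host vs. stuck queue."""
    peers = [c['channel_details']['peer_host']
             for c in queue.get('consumer_details', [])]
    if peers:
        return 'in use by consumers at: %s' % ', '.join(peers)
    return 'no consumers - likely stuck'

print(classify(sample))  # -> no consumers - likely stuck
```

Feeding each of the 5 queues' JSON through `classify` would separate the "still used by a compute service" queues from the orphans left behind by an earlier switchover.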
Re: [openstack-dev] auto-delete in amqp reply_* queues in OpenStack
Chris, In oslo.messaging a single reply queue is used to gather results from all the calls. It is created lazily on the first call and is used until the process is killed. I took a quick look at oslo.rpc from oslo-incubator and it seems like it uses the same pattern, which is not surprising since oslo.messaging descends from oslo.rpc. So if you restart some process which does rpc calls (nova-api, I guess), you should see one reply queue gone and another one created instead after some time. Dmitry 2014-03-24 7:55 GMT+04:00 Chris Friesen chris.frie...@windriver.com: Hi, If I run rabbitmqadmin list queues on my controller node I see 28 queues with names of the form reply_uuid. From what I've been reading, these queues are supposed to be used for the replies to rpc calls, they're not 'durable', and they all have auto_delete set to True. Given the above, I would have expected that queues with names of that form would only exist for in-flight rpc operations, and that subsequent listings of the queues would show mostly different ones, but these 28 seem to be fairly persistent. Is this expected or do I have something unusual going on? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] auto-delete in amqp reply_* queues in OpenStack
I see two possible explanations for these 5 remaining queues: * They were indeed recreated by 'compute' services. I.e. a controller service sent some command over RPC and then it was shut down. Its reply queue was automatically deleted, since its only consumer was disconnected. The compute services replied after that and so recreated the queue. According to the RabbitMQ docs, such a queue will stay alive indefinitely, since it will never have a consumer. * Possibly there are services on compute nodes which initiate RPC calls themselves. I don't know the OpenStack architecture well enough to say if services running on compute nodes do so. In that case these 5 queues are still used by the compute services. Do the RabbitMQ management tools (web or cli) allow viewing active consumers for queues? If yes, then you can find out which of the cases above you encountered. Or it may be some third case I didn't account for :-)

I assume that those 5 queues are (re)created by the services running on the compute nodes, but if that's the case then how would the services running on the controller node find out about the names of the queues?

When a process initiating an RPC call is restarted, there is no way for it to know about the queue it used before for receiving replies. The replies simply never get back. On the other hand, the restarted process does not know about the calls it made before the restart, so it is not a big loss anyway. For clarity, here is a simplified algorithm an RPC client (the one initiating an RPC call) uses:

msg_id = uuid.uuid4().hex
if not self.reply_q:
    self.reply_q = 'reply_' + uuid.uuid4().hex
message = {
    'msg_id': msg_id,
    'reply_q': self.reply_q,
    'payload': payload,
}
send(message)
reply = wait_for_reply(queue=self.reply_q, msg_id=msg_id)

Dmitry 2014-03-24 19:52 GMT+04:00 Chris Friesen chris.frie...@windriver.com: On 03/24/2014 02:59 AM, Dmitry Mescheryakov wrote: Chris, In oslo.messaging a single reply queue is used to gather results from all the calls. 
It is created lazily on the first call and is used until the process is killed. I did a quick look at oslo.rpc from oslo-incubator and it seems like it uses the same pattern, which is not surprising since oslo.messaging descends from oslo.rpc. So if you restart some process which does rpc calls (nova-api, I guess), you should see one reply queue gone and another one created instead after some time. Okay, that makes a certain amount of sense. How does it work for queues used by both the controller and the compute node? If I do a controlled switchover from one controller to another (killing and restarting rabbit, nova-api, nova-conductor, nova-scheduler, neutron, cinder, etc.) I see that the number of reply queues drops from 28 down to 5, but those 5 are all ones that existed before. I assume that those 5 queues are (re)created by the services running on the compute nodes, but if that's the case then how would the services running on the controller node find out about the names of the queues? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
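The lazily-created shared reply queue and per-call msg_id correlation described in this thread can be sketched as a runnable toy. This is a simplified stand-in for the pattern, not oslo.messaging's actual code; the in-memory `QUEUES` dict plays the role of the broker.

```python
import uuid
from collections import defaultdict, deque

QUEUES = defaultdict(deque)  # queue name -> pending messages ("the broker")

def send(queue, message):
    QUEUES[queue].append(message)

class RpcClient:
    def __init__(self):
        self.reply_q = None  # created lazily, then reused for every call

    def call(self, payload):
        msg_id = uuid.uuid4().hex
        if not self.reply_q:
            self.reply_q = 'reply_' + uuid.uuid4().hex
        send('server', {'msg_id': msg_id,
                        'reply_q': self.reply_q,
                        'payload': payload})
        return msg_id

def serve_one():
    """Server side: reply to the queue named in the request."""
    req = QUEUES['server'].popleft()
    send(req['reply_q'], {'msg_id': req['msg_id'],
                          'result': req['payload'].upper()})

def wait_for_reply(queue, msg_id):
    """Drain the shared reply queue until our correlation id shows up."""
    skipped = []
    while QUEUES[queue]:
        msg = QUEUES[queue].popleft()
        if msg['msg_id'] == msg_id:
            QUEUES[queue].extend(skipped)  # put back other calls' replies
            return msg['result']
        skipped.append(msg)
    raise RuntimeError('no reply')

client = RpcClient()
id1 = client.call('first')
id2 = client.call('second')   # the same reply_q is reused for both calls
serve_one(); serve_one()
print(wait_for_reply(client.reply_q, id2))  # -> SECOND
print(wait_for_reply(client.reply_q, id1))  # -> FIRST
```

It also illustrates the failure mode from the thread: if `client` is restarted, the new instance lazily creates a fresh `reply_q`, and any reply sent to the old queue name is simply never collected.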
Re: [openstack-dev] [Oslo] [Marconi] oslo.messaging on VMs
Hey folks, Just wanted to thank you all for the input, it is really valuable. Indeed it seems like overall Marconi does what is needed, so I'll experiment with it. Thanks, Dmitry 2014-03-07 0:16 GMT+04:00 Georgy Okrokvertskhov gokrokvertsk...@mirantis.com: As a result of this discussion, I think we need to also involve the Marconi team in this discussion. (I am sorry for changing the Subject). I am not very familiar with Marconi project details, but at first look it seems it can help to set up a separate MQ infrastructure for agent - service communication. I don't have any specific design suggestions and I hope the Marconi team will help us to find the right approach. It looks like the option with the oslo.messaging framework now has lower priority due to security reasons. Thanks Georgy On Thu, Mar 6, 2014 at 11:33 AM, Steven Dake sd...@redhat.com wrote: On 03/06/2014 10:24 AM, Daniel P. Berrange wrote: On Thu, Mar 06, 2014 at 07:25:37PM +0400, Dmitry Mescheryakov wrote: Hello folks, A number of OpenStack and related projects have a need to perform operations inside VMs running on OpenStack. A natural solution would be an agent running inside the VM and performing tasks. One of the key questions here is how to communicate with the agent. An idea which was discussed some time ago is to use oslo.messaging for that. That is an RPC framework - just what is needed. You can use different transports (RabbitMQ, Qpid, ZeroMQ) depending on your preference or the connectivity your OpenStack networking can provide. At the same time there are a number of things to consider, like networking, security, packaging, etc. So, messaging people, what is your opinion on that idea? I've already raised that question in the list [1], but it seems like not everybody who has something to say participated. So I am resending with a different topic. For example, yesterday we started discussing security of the solution in the openstack-oslo channel. 
Doug Hellmann at the start raised two questions: is it possible to separate different tenants or applications with credentials and ACLs so that they use different queues? My opinion is that it is possible using the RabbitMQ/Qpid management interface: for each application we can automatically create a new user with permission to access only its queues. Another question raised by Doug is how to mitigate a DOS attack coming from one tenant so that it does not affect another tenant. The thing is, though different applications will use different queues, they are going to share a single broker. Looking at it from the security POV, I'd absolutely not want to have any tenant VMs connected to the message bus that openstack is using between its hosts. Even if you have security policies in place, the inherent architectural risk of such a design is just far too great. One small bug or misconfiguration and it opens the door to a guest owning the entire cloud infrastructure. Any channel between a guest and host should be isolated per guest, so there's no possibility of guest messages finding their way out to either the host or to other guests. If there was still a desire to use oslo.messaging, then at the very least you'd want a completely isolated message bus for guest comms, with no connection to the message bus used between hosts. Ideally the message bus would be separate per guest too, which means it ceases to be a bus really - just a point-to-point link between the virt host + guest OS that happens to use the oslo.messaging wire format. Regards, Daniel I agree and have raised this in the past. IMO oslo.messaging is a complete nonstarter for guest communication because of security concerns. We do not want guests communicating on the same message bus as infrastructure. The response to that was "well, just have all the guests communicate on their own unique messaging server infrastructure." 
The downside of this is that one guest's activity could damage a different guest because of a lack of isolation and the nature in which message buses work. The only workable solution which ensures security is a unique message bus per guest - which means a unique daemon per guest. Surely there has to be a better way. The idea of isolating guests on a user basis, but allowing them all to exchange messages on one topic, doesn't make logical sense to me. I just don't think it's possible, unless somehow rpc delivery were changed to deliver credentials enforced by the RPC server in addition to calling messages. Then some type of credential management would need to be done for each guest in the infrastructure wishing to use the shared message bus. The requirements of an oslo.messaging solution for a shared agent are that the agent would only be able to listen to and send messages directed towards it (point to point), but would be able to publish messages to a topic for server consumption (the agent service, which may be integrated into other projects). This way any
Re: [openstack-dev] [nova][novaclient] How to get user's credentials for using novaclient API?
Hello Nader, You should use python-keystoneclient [1] to obtain the token. You can find example usage in the helper script [2]. Dmitry [1] https://github.com/openstack/python-keystoneclient [2] https://github.com/openstack/savanna/blob/master/tools/get_auth_token.py#L74 2014-03-10 21:25 GMT+04:00 Nader Lahouti nader.laho...@gmail.com: Hi All, I have a question regarding using the novaclient API. I need to use it for getting a list of instances for a user/project. In order to do that I tried to use: from novaclient.v1_1 import client nc = client.Client(username, token_id, project_id, auth_url, insecure, cacert) nc.servers.list() (however, the comment in the code/documentation says a different thing, which, as far as I tried, didn't work: client = Client(USERNAME, PASSWORD, PROJECT_ID, AUTH_URL) so it seems token_id has to be provided. I can get the token_id using the keystone REST API (http://localhost:5000/v2.0/tokens …-d ' the credentials …username and password'. And my question is: how to get credentials for a user in the code when using Keystone's REST API? Is there any API to get such info? Appreciate your comments. Regards, Nader. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
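For reference, the body that the keystone REST call mentioned above (POST /v2.0/tokens) expects can be sketched as follows. This is only an illustration of the v2.0 request format; the helper function name is mine, not part of any client library:

```python
import json


def build_token_request(username, password, tenant_name=None):
    """Build the JSON body for POST /v2.0/tokens (Keystone v2.0 API).

    The response's access.token.id is the token_id that can then be
    passed to novaclient together with the project and auth URL.
    """
    auth = {"passwordCredentials": {"username": username,
                                    "password": password}}
    if tenant_name is not None:
        # Scoping the token to a tenant is optional in v2.0.
        auth["tenantName"] = tenant_name
    return json.dumps({"auth": auth})
```

In practice python-keystoneclient builds and sends this body for you, which is why it is the recommended route rather than hand-rolling the REST call.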
Re: [openstack-dev] [TripleO] os-cloud-config ssh access to cloud
For what it's worth, in Sahara (formerly Savanna) we inject the second key via userdata. I.e. we add echo ${public_key} >> ${user_home}/.ssh/authorized_keys to the other stuff we do in userdata. Dmitry 2014-03-10 17:10 GMT+04:00 Jiří Stránský ji...@redhat.com: On 7.3.2014 14:50, Imre Farkas wrote: On 03/07/2014 10:30 AM, Jiří Stránský wrote: Hi, there's one step in cloud initialization that is performed over SSH -- calling keystone-manage pki_setup. Here's the relevant code in keystone-init [1], here's a review for moving the functionality to os-cloud-config [2]. The consequence of this is that Tuskar will need a passwordless ssh key to access the overcloud controller. I consider this suboptimal for two reasons: * It creates another security concern. * AFAIK nova is only capable of injecting one public SSH key into authorized_keys on the deployed machine, which means we can either give it Tuskar's public key and allow Tuskar to initialize the overcloud, or we can give it the admin's custom public key and allow the admin to ssh into the overcloud, but not both. (Please correct me if I'm mistaken.) We could probably work around this issue by having Tuskar do the user key injection as part of os-cloud-config, but it's a bit clumsy. This goes outside the scope of my current knowledge; I'm hoping someone knows the answer: Could pki_setup be run by combining the powers of Heat and os-refresh-config? (I presume there's some reason why we're not doing this already.) I think it would help us a good bit if we could avoid having to SSH from Tuskar to the overcloud. Yeah, it came up a couple times on the list. The current solution is because, if you have an HA setup, the nodes can't decide on their own which one should run pki_setup. Robert described this topic and why it needs to be initialized externally during a weekly meeting last December.
Check the topic 'After heat stack-create init operations (lsmola)': http://eavesdrop.openstack.org/meetings/tripleo/2013/tripleo.2013-12-17-19.02.log.html Thanks for the reply Imre. Yeah, I vaguely remember that meeting :) I guess to do HA init we'd need to pick one of the controllers and run the init just there (set some parameter that would then be recognized by os-refresh-config). I couldn't find if Heat can do something like this on its own; probably we'd need to deploy one of the controller nodes with a different parameter set, which feels a bit weird. Hmm, so unless someone comes up with something groundbreaking, we'll probably keep doing what we're doing. Having the ability to inject multiple keys into instances [1] would help us get rid of the Tuskar vs. admin key issue I mentioned in the initial e-mail. We might try asking a fellow Nova developer to help us out here. Jirka [1] https://bugs.launchpad.net/nova/+bug/917850 ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
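The userdata approach Dmitry mentions at the top of this thread can be sketched like this. It is a hypothetical helper, not Sahara code; the default home directory is an illustrative assumption:

```python
def key_injection_userdata(public_keys, user_home="/home/ubuntu"):
    """Compose a user-data shell script that appends extra public keys
    to the instance user's authorized_keys.

    This works around Nova injecting only a single keypair: the first
    key goes in via Nova, any extra keys (e.g. both a Tuskar key and an
    admin key) are appended by this script at boot.
    """
    lines = ["#!/bin/bash"]
    for key in public_keys:
        # Append, never overwrite, so the Nova-injected key survives.
        lines.append('echo "%s" >> %s/.ssh/authorized_keys' % (key, user_home))
    return "\n".join(lines) + "\n"
```

The resulting string would be passed as the `userdata` argument when booting the server, to be executed by cloud-init.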
[openstack-dev] I am taking sick leave for today
Colleagues, I am taking a sick day again today. I have started to recover, but I still do not feel well. I hope to be fully healthy by Tuesday. Dmitry ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] I am taking sick leave for today
Ooops, sorry, wrong recipient :-) On 7 March 2014 at 14:13, Dmitry Mescheryakov dmescherya...@mirantis.com wrote: Colleagues, I am taking a sick day again today. I have started to recover, but I still do not feel well. I hope to be fully healthy by Tuesday. Dmitry ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Oslo] oslo.messaging on VMs
Hello folks, A number of OpenStack and related projects have a need to perform operations inside VMs running on OpenStack. A natural solution would be an agent running inside the VM and performing tasks. One of the key questions here is how to communicate with the agent. An idea which was discussed some time ago is to use oslo.messaging for that. It is an RPC framework - exactly what is needed. You can use different transports (RabbitMQ, Qpid, ZeroMQ) depending on your preference or the connectivity your OpenStack networking can provide. At the same time there are a number of things to consider, like networking, security, packaging, etc. So, messaging people, what is your opinion on that idea? I've already raised that question on the list [1], but it seems like not everybody who had something to say participated. So I am resending with a different topic. For example, yesterday we started discussing the security of the solution in the openstack-oslo channel. Doug Hellmann at the start raised two questions: is it possible to separate different tenants or applications with credentials and ACLs so that they use different queues? My opinion is that it is possible using the RabbitMQ/Qpid management interface: for each application we can automatically create a new user with permission to access only its own queues. Another question raised by Doug is how to mitigate a DOS attack coming from one tenant so that it does not affect another tenant. The thing is that though different applications will use different queues, they are all going to use a single broker. Do you share Doug's concerns, or maybe you have your own? Thanks, Dmitry [1] http://lists.openstack.org/pipermail/openstack-dev/2013-December/021476.html ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
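The per-application credential/ACL idea could look roughly like this. This is only a sketch: the queue-naming scheme is my own assumption, not an established convention. The returned regexes are in the form one would feed to `rabbitmqctl set_permissions` (or the equivalent management-API call) for the automatically created per-application user:

```python
import re


def agent_broker_permissions(app_id):
    """Derive broker ACLs confining an application's user to its own
    queue namespace.

    RabbitMQ permissions are three regexes (configure / write / read)
    matched against resource names; anchoring them to a per-application
    prefix keeps one tenant's agent out of another tenant's queues,
    though all tenants still share the one broker (Doug's DOS concern).
    """
    prefix = "guestagent.%s." % app_id  # hypothetical naming scheme
    pattern = "^%s.*$" % re.escape(prefix)
    return {"configure": pattern, "write": pattern, "read": pattern}
```

Note this addresses only the first of Doug's questions; a noisy tenant can still saturate the shared broker, which is why the thread below argues for separate buses.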
[openstack-dev] Unified Guest Agent in Savanna
Hello folks, Not long ago we had a discussion on a unified guest agent [1] - a way of performing actions 'inside' VMs. Such a thing is needed by PaaS projects for tasks such as application reconfiguration and user request pass-through. As a proof of concept I've made os-collect-config into a guest agent [2] based on the design proposed in [1]. Now I am focused on making an agent for Savanna. I'd like to invite everyone to review my initial CR [3]. All subsequent changes will be listed as dependent on this one. This is going to be a more complex thing than the os-collect-config rewrite. For instance, here we need to handle agent installation and configuration. Also I am going to check what can be done for more fine-grained authorization. Also Sergey Lukjanov and I proposed a talk on the agent, so feel free to vote for it [4] in case you're interested. Thanks, Dmitry [1] http://lists.openstack.org/pipermail/openstack-dev/2013-December/021476.html [2] http://lists.openstack.org/pipermail/openstack-dev/2014-January/024968.html [3] https://review.openstack.org/#/c/71015 [4] https://www.openstack.org/vote-atlanta/Presentation/unified-guest-agent-for-openstack ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [savanna] Choosing provisioning engine during cluster launch
I agree with Andrew. I see no value in letting users select how their cluster is provisioned; it will only make the interface a little more complex. Dmitry 2014/1/30 Andrew Lazarev alaza...@mirantis.com Alexander, What is the purpose of exposing this to the user side? Both engines must do exactly the same thing, and they exist at the same time only for a transition period until the heat engine is stabilized. I don't see any value in the proposed option. Andrew. On Wed, Jan 29, 2014 at 8:44 PM, Alexander Ignatov aigna...@mirantis.com wrote: Today Savanna has two provisioning engines: heat and the old one known as 'direct'. Users can choose which engine will be used by setting a special parameter in 'savanna.conf'. I have an idea to give users the ability to define the provisioning engine not only when savanna is started but also when a new cluster is launched. The idea is simple. We will just add a new field 'provisioning_engine' to the 'cluster' and 'cluster_template' objects. And the benefit is obvious: users can easily switch from one engine to another without restarting the savanna service. Of course, this parameter can be omitted and the default value from 'savanna.conf' will be applied. Is this viable? What do you think? Regards, Alexander Ignatov ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
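The fallback behavior Alexander describes (per-cluster value if set, otherwise the savanna.conf default) is simple to sketch. The names below are illustrative, not Savanna code:

```python
# Stand-in for the value read from savanna.conf at service start.
DEFAULT_PROVISIONING_ENGINE = "direct"


def resolve_engine(cluster):
    """Pick the provisioning engine for a cluster.

    The proposal: a per-cluster 'provisioning_engine' field wins when
    present; when omitted (or unset), the service-wide default applies.
    """
    return cluster.get("provisioning_engine") or DEFAULT_PROVISIONING_ENGINE
```

The counter-argument in the thread is that since both engines must behave identically, this knob exposes an implementation detail to users for no real benefit.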
[openstack-dev] Unified Guest Agent, PoC for os-collect-config
Hello folks, At the end of the previous discussion on the topic [1] I decided to make a PoC based on oslo.messaging. Clint suggested, and I agreed, to make it for os-collect-config. Actually I made a PoC for Savanna first :-) but anyway here is the one for os-collect-config [2]. I've made a couple of observations: First, os-collect-config naturally becomes an RPC server. That gives the advantage of having feedback, i.e. knowing that the desired config was actually received and applied. Second, with the oslo.messaging approach it seems like there is almost nothing to extract into common code. That is rather well seen on a minimal example like os-apply-config. I thought there would be something to share between projects using the agent, but so far it looks like oslo.messaging already covers all the needs. Which IMHO is great! So, any thoughts? [1] http://lists.openstack.org/pipermail/openstack-dev/2013-December/021476.html [2] https://github.com/dmitrymex/os-collect-config ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
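The feedback advantage of the agent being an RPC server can be illustrated with a toy endpoint. This is not the PoC code, just a sketch of the idea: with oslo.messaging, a `call`-style invocation dispatches to an endpoint method and carries its return value back to the caller, which serves as the acknowledgement that the config was applied:

```python
class CollectConfigEndpoint(object):
    """Toy RPC endpoint in the shape oslo.messaging dispatches to:
    public methods become callable operations, and their return value
    travels back to the caller as feedback."""

    def __init__(self):
        self.applied = {}

    def apply_config(self, name, data):
        # Apply (here: just record) the config, then acknowledge it.
        self.applied[name] = data
        return {"name": name, "status": "applied"}
```

With a fire-and-forget delivery (e.g. plain metadata polling) the server never learns whether the config landed; the returned dict above is exactly the piece that polling cannot provide.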
Re: [openstack-dev] [Heat] [Trove] [Savanna] [Oslo] Unified Agents - what is the actual problem?
I agree that enabling communication between guest and cloud service is a common problem for most agent designs. The only exception is an agent based on hypervisor-provided transport. But as far as I understand many people are interested in a network-based agent, so indeed we can start a thread (or continue the discussion in this one) on the problem. Dmitry 2013/12/19 Clint Byrum cl...@fewbar.com So I've seen a lot of really great discussion of the unified agents, and it has made me think a lot about the problem that we're trying to solve. I just wanted to reiterate that we should be trying to solve real problems and not get distracted by doing things right or even better. I actually think there are three problems to solve. * Private network guest to cloud service communication. * Narrow scope highly responsive lean guest agents (Trove, Savanna). * General purpose in-instance management agent (Heat). Since the private network guests problem is the only one they all share, perhaps this is where the three projects should collaborate, and the other pieces should be left to another discussion. Thoughts? ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [trove] My thoughts on the Unified Guest Agent
2013/12/19 Fox, Kevin M kevin@pnnl.gov How about a different approach then... OpenStack has thus far been very successful providing an API and plugins for dealing with things that cloud providers need to be able to switch out to suit their needs. There seem to be two different parts to the unified agent issue: * How to get RPC messages to/from the VM from the thing needing to control it. * How to write a plugin to go from a generic RPC mechanism to doing something useful in the VM. How about standardising what a plugin looks like: a Python API, C++ API, etc. It won't have to deal with transport at all. Also standardize the API the controller uses to talk to the system, REST or AMQP. I think that is what we discussed when we tried to select between the Salt + oslo.messaging and pure oslo.messaging frameworks for the agent. As you can see, we didn't come to an agreement so far :-) Also Clint started a new thread to discuss what, I believe, you defined as the first part of the unified agent issue. For clarity, the thread I am referring to is http://lists.openstack.org/pipermail/openstack-dev/2013-December/022690.html Then the mechanism is an implementation detail. If Rackspace wants to do a VM serial driver, that's cool. If you want to use the network, that works too. Savanna/Trove/etc don't have to care which mechanism is used, only the cloud provider. It's not quite as good as one and only one implementation to rule them all, but it would allow providers to choose what's best for their situation and get as much code shared as can be. What do you think? Thanks, Kevin From: Tim Simpson [tim.simp...@rackspace.com] Sent: Wednesday, December 18, 2013 11:34 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [trove] My thoughts on the Unified Guest Agent Thanks for the summary Dmitry.
I'm ok with these ideas, and while I still disagree with having a single, forced standard for RPC communication, I should probably let things pan out a bit before being too concerned. - Tim From: Dmitry Mescheryakov [dmescherya...@mirantis.com] Sent: Wednesday, December 18, 2013 11:51 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [trove] My thoughts on the Unified Guest Agent Tim, The unified agent we are proposing is based on the following ideas: * the core agent has _no_ functionality at all. It is a pure RPC mechanism with the ability to add whichever API is needed on top of it. * the API is organized into modules which could be reused across different projects. * there will be no single package: each project (Trove/Savanna/Others) assembles its own agent based on the project's API needs. I hope that covers your concerns. Dmitry 2013/12/18 Tim Simpson tim.simp...@rackspace.com I've been following the Unified Agent mailing list thread for awhile now and, as someone who has written a fair amount of code for both of the two existing Trove agents, thought I should give my opinion about it. I like the idea of a unified agent, but believe that forcing Trove to adopt this agent as its default will stifle innovation and harm the project. There are reasons Trove has more than one agent currently. While everyone knows about the Reference Agent written in Python, Rackspace uses a different agent written in C++ because it takes up less memory. The concerns which led to the C++ agent would not be addressed by a unified agent, which if anything would be larger than the Reference Agent is currently. I also believe a unified agent represents the wrong approach philosophically. An agent by design needs to be lightweight, capable of doing exactly what it needs to and no more.
This is especially true for a project like Trove whose goal is not to provide overly general PAAS capabilities but simply installation and maintenance of different datastores. Currently, the Trove daemons handle most logic and leave the agents themselves to do relatively little. This takes some effort as many of the first iterations of Trove features have too much logic put into the guest agents. However, through perseverance, the subsequent designs are usually cleaner and simpler to follow. A community-approved, do-everything agent would endorse the wrong balance and lead to developers piling up logic on the guest side. Over time, features would become dependent on the Unified Agent, making it impossible to run or even contemplate light-weight agents. Trove's interface to agents today is fairly loose and could stand to be made stricter. However, it is flexible and works well enough. Essentially, the duck typed interface of the trove.guestagent.api.API class is used to send messages, and Trove conductor is used to receive them at which point it updates the database. Because
Re: [openstack-dev] [Heat] [Trove] [Savanna] [Oslo] Unified Agents - what is the actual problem?
Tim, IMHO network-based and hypervisor-based agents definitely can co-exist. What I wanted to say is that the problem of enabling communication between guest and cloud service is not relevant for hypervisor-based agents. They simply don't need network access into a VM. Dmitry 2013/12/19 Tim Simpson tim.simp...@rackspace.com I agree that enabling communication between guest and cloud service is a common problem for most agent designs. The only exception is an agent based on hypervisor-provided transport. But as far as I understand many people are interested in a network-based agent, so indeed we can start a thread (or continue the discussion in this one) on the problem. Can't they co-exist? Let's say the interface to talk to an agent is simply some class loaded from a config file, the way it is in Trove. So we have a class which has the methods add_user, get_filesystem_stats. The first, and let's say default, implementation sends a message over Rabbit using oslo.rpc or something like it. All the arguments turn into a JSON object and are deserialized on the agent side using oslo.rpc or some C++ code capable of reading JSON. If someone wants to add a hypervisor-provided transport, they could do so by instead changing this API class to one which contacts a service on the hypervisor node (using oslo.rpc) with arguments that include the guest agent ID and args, which is just a dictionary of the original arguments. This service would then shell out to execute some hypervisor-specific command to talk to the given guest. That's what I meant when I said I liked how Trove handled this now: because it uses a simple, non-prescriptive interface, it's easy to swap out yet still easy to use. That would mean the job of a unified agent framework would be to offer up libraries to ease the creation of the API class by offering Python code to send messages in various styles / formats, as well as Python or C++ code to read and interpret those messages.
Of course, we'd still settle on one default (probably network-based) which would become the standard way of sending messages to guests so that package maintainers, the Infra team, and newbies to OpenStack wouldn't have to deal with dozens of different ways of doing things, but the important thing is that other methods of communication would still be possible. Thanks, Tim From: Dmitry Mescheryakov [mailto:dmescherya...@mirantis.com] Sent: Thursday, December 19, 2013 7:15 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Heat] [Trove] [Savanna] [Oslo] Unified Agents - what is the actual problem? I agree that enabling communication between guest and cloud service is a common problem for most agent designs. The only exception is an agent based on hypervisor-provided transport. But as far as I understand many people are interested in a network-based agent, so indeed we can start a thread (or continue the discussion in this one) on the problem. Dmitry 2013/12/19 Clint Byrum cl...@fewbar.com So I've seen a lot of really great discussion of the unified agents, and it has made me think a lot about the problem that we're trying to solve. I just wanted to reiterate that we should be trying to solve real problems and not get distracted by doing things right or even better. I actually think there are three problems to solve. * Private network guest to cloud service communication. * Narrow scope highly responsive lean guest agents (Trove, Savanna). * General purpose in-instance management agent (Heat). Since the private network guests problem is the only one they all share, perhaps this is where the three projects should collaborate, and the other pieces should be left to another discussion. Thoughts?
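The config-driven swap Tim describes, loading the agent API class from a dotted path so the transport can change without touching callers, can be sketched as follows. The loader and the EchoAPI stand-in are illustrative, not Trove code:

```python
import importlib


def load_agent_api(dotted_path):
    """Resolve a 'package.module.ClassName' string from configuration.

    Deployments pick the transport (network RPC, hypervisor channel,
    ...) by naming a class here; callers only ever see the duck-typed
    interface (add_user, get_filesystem_stats, ...).
    """
    module_name, _, class_name = dotted_path.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)


class EchoAPI(object):
    """Stand-in transport implementation: records the call instead of
    sending it over Rabbit or a hypervisor-specific channel."""

    def add_user(self, name):
        return ("add_user", name)
```

Because the interface is duck-typed, a C++-backed or serial-driver implementation only has to expose the same method names; nothing upstream changes.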
Re: [openstack-dev] Unified Guest Agent proposal
Clint, do you mean * use os-collect-config and its HTTP transport as a base for the PoC, or * migrate os-collect-config onto the PoC after it is implemented on oslo.messaging? I presume the latter, but could you clarify? 2013/12/18 Clint Byrum cl...@fewbar.com Excerpts from Dmitry Mescheryakov's message of 2013-12-17 08:01:38 -0800: Folks, The discussion didn't result in a consensus, but it did reveal a great number of things to be accounted for. I've tried to summarize top-level points in the etherpad [1]. It lists only items everyone (as it seems to me) agrees on, or suggested options where there was no consensus. Let me know if I misunderstood or missed something. The etherpad does not list advantages/disadvantages of options, otherwise it would just be too long. Interested people might search the thread for the arguments :-) . I've thought it over and I agree with people saying we need to move further. Savanna needs the agent and I am going to write a PoC for it. Sure, the PoC will be implemented in a project-independent way. I still think that Salt's limitations outweigh its advantages, so the PoC will be done on top of oslo.messaging without Salt. At least we'll have an example of how it might look. Most probably I will have more questions in the process; for instance, we didn't finish the discussion on enabling networking for the agent yet. In that case I will start a new, more specific thread on the list. If you're not going to investigate using salt, can I suggest you base your POC on os-collect-config? It would not take much to add two-way communication to it. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [trove] My thoughts on the Unified Guest Agent
Tim, The unified agent we are proposing is based on the following ideas: * the core agent has _no_ functionality at all. It is a pure RPC mechanism with the ability to add whichever API is needed on top of it. * the API is organized into modules which could be reused across different projects. * there will be no single package: each project (Trove/Savanna/Others) assembles its own agent based on the project's API needs. I hope that covers your concerns. Dmitry 2013/12/18 Tim Simpson tim.simp...@rackspace.com I've been following the Unified Agent mailing list thread for awhile now and, as someone who has written a fair amount of code for both of the two existing Trove agents, thought I should give my opinion about it. I like the idea of a unified agent, but believe that forcing Trove to adopt this agent as its default will stifle innovation and harm the project. There are reasons Trove has more than one agent currently. While everyone knows about the Reference Agent written in Python, Rackspace uses a different agent written in C++ because it takes up less memory. The concerns which led to the C++ agent would not be addressed by a unified agent, which if anything would be larger than the Reference Agent is currently. I also believe a unified agent represents the wrong approach philosophically. An agent by design needs to be lightweight, capable of doing exactly what it needs to and no more. This is especially true for a project like Trove whose goal is not to provide overly general PAAS capabilities but simply installation and maintenance of different datastores. Currently, the Trove daemons handle most logic and leave the agents themselves to do relatively little. This takes some effort as many of the first iterations of Trove features have too much logic put into the guest agents. However, through perseverance, the subsequent designs are usually cleaner and simpler to follow.
A community-approved, do-everything agent would endorse the wrong balance and lead to developers piling up logic on the guest side. Over time, features would become dependent on the Unified Agent, making it impossible to run or even contemplate light-weight agents. Trove's interface to agents today is fairly loose and could stand to be made stricter. However, it is flexible and works well enough. Essentially, the duck typed interface of the trove.guestagent.api.API class is used to send messages, and Trove conductor is used to receive them at which point it updates the database. Because both of these components can be swapped out if necessary, the code could support the Unified Agent when it appears as well as future agents. It would be a mistake, however, to alter Trove's standard method of communication to please the new Unified Agent. In general, we should try to keep Trove speaking to guest agents in Trove's terms alone to prevent bloat. Thanks, Tim ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [trove] My thoughts on the Unified Guest Agent
2013/12/18 Steven Dake sd...@redhat.com On 12/18/2013 08:34 AM, Tim Simpson wrote: I've been following the Unified Agent mailing list thread for awhile now and, as someone who has written a fair amount of code for both of the two existing Trove agents, thought I should give my opinion about it. I like the idea of a unified agent, but believe that forcing Trove to adopt this agent as its default will stifle innovation and harm the project. There are reasons Trove has more than one agent currently. While everyone knows about the Reference Agent written in Python, Rackspace uses a different agent written in C++ because it takes up less memory. The concerns which led to the C++ agent would not be addressed by a unified agent, which if anything would be larger than the Reference Agent is currently. I also believe a unified agent represents the wrong approach philosophically. An agent by design needs to be lightweight, capable of doing exactly what it needs to and no more. This is especially true for a project like Trove whose goal is not to provide overly general PAAS capabilities but simply installation and maintenance of different datastores. Currently, the Trove daemons handle most logic and leave the agents themselves to do relatively little. This takes some effort as many of the first iterations of Trove features have too much logic put into the guest agents. However, through perseverance, the subsequent designs are usually cleaner and simpler to follow. A community-approved, do-everything agent would endorse the wrong balance and lead to developers piling up logic on the guest side. Over time, features would become dependent on the Unified Agent, making it impossible to run or even contemplate light-weight agents. Trove's interface to agents today is fairly loose and could stand to be made stricter. However, it is flexible and works well enough.
Essentially, the duck typed interface of the trove.guestagent.api.API class is used to send messages, and Trove conductor is used to receive them at which point it updates the database. Because both of these components can be swapped out if necessary, the code could support the Unified Agent when it appears as well as future agents. It would be a mistake, however, to alter Trove's standard method of communication to please the new Unified Agent. In general, we should try to keep Trove speaking to guest agents in Trove's terms alone to prevent bloat. Thanks, Tim Tim, You raise very valid points that I'll summarize into bullet points: * memory footprint of a Python-based agent * guest-agent feature bloat with no clear path to refactoring * an agent should do one thing and do it well The competing viewpoint is from downstream: * How do you get those various agents into the various Linux distributions' cloud images and maintain them A unified agent addresses the downstream viewpoint well, which is "There is only one agent to package and maintain, and it supports all the integrated OpenStack Program projects." Putting on my Fedora Hat for a moment, I'm not a big fan of an agent per OpenStack project going into the Fedora 21 cloud images. Another option that we really haven't discussed on this long long thread is injecting the per-project agents into the VM on bootstrapping of the VM. If we developed common code for this sort of operation and placed it into oslo, *and* agreed to use it as our common unifying mechanism of agent support, each project would be free to ship whatever agents they wanted in their packaging, use the proposed oslo.bootstrap code to bootstrap the VM via cloud-init with the appropriate agents installed in the proper locations, whamo, problem solved for everyone. Funny thing is, the same idea was proposed and discussed among my colleagues and me recently. We saw it as a Heat extension which could be requested to inject a guest agent into the VM.
The list of required modules could be passed as a request parameter. That can make life easier for us Savanna devs, because we will not have to pre-install the agent on our images. Regards -steve ___ OpenStack-dev mailing listOpenStack-dev@lists.openstack.orghttp://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [trove] My thoughts on the Unified Guest Agent
Tim, we definitely don't want to force projects to migrate to the unified agent. I've started making the PoC with the idea that Savanna needs the agent anyway, and we want it to be ready for Icehouse. On the other hand, I believe it will be much easier to drive the discussion further with the PoC ready, as we will have something material to talk over. Dmitry 2013/12/18 Tim Simpson tim.simp...@rackspace.com Thanks for the summary Dmitry. I'm ok with these ideas, and while I still disagree with having a single, forced standard for RPC communication, I should probably let things pan out a bit before being too concerned. - Tim -- *From:* Dmitry Mescheryakov [dmescherya...@mirantis.com] *Sent:* Wednesday, December 18, 2013 11:51 AM *To:* OpenStack Development Mailing List (not for usage questions) *Subject:* Re: [openstack-dev] [trove] My thoughts on the Unified Guest Agent Tim, The unified agent we are proposing is based on the following ideas: * the core agent has _no_ functionality at all. It is a pure RPC mechanism with the ability to add whichever API is needed on top of it. * the API is organized into modules which could be reused across different projects. * there will be no single package: each project (Trove/Savanna/Others) assembles its own agent based on the project's API needs. I hope that covers your concerns. Dmitry 2013/12/18 Tim Simpson tim.simp...@rackspace.com I've been following the Unified Agent mailing list thread for awhile now and, as someone who has written a fair amount of code for both of the two existing Trove agents, thought I should give my opinion about it. I like the idea of a unified agent, but believe that forcing Trove to adopt this agent as its default will stifle innovation and harm the project. There are reasons Trove has more than one agent currently. While everyone knows about the Reference Agent written in Python, Rackspace uses a different agent written in C++ because it takes up less memory.
The concerns which led to the C++ agent would not be addressed by a unified agent, which if anything would be larger than the Reference Agent is currently. I also believe a unified agent represents the wrong approach philosophically. An agent by design needs to be lightweight, capable of doing exactly what it needs to and no more. This is especially true for a project like Trove, whose goal is not to provide overly general PaaS capabilities but simply installation and maintenance of different datastores. Currently, the Trove daemons handle most logic and leave the agents themselves to do relatively little. This takes some effort, as many of the first iterations of Trove features have too much logic put into the guest agents. However, through perseverance, the subsequent designs are usually cleaner and simpler to follow. A community-approved, do-everything agent would endorse the wrong balance and lead to developers piling up logic on the guest side. Over time, features would become dependent on the Unified Agent, making it impossible to run or even contemplate lightweight agents. Trove's interface to agents today is fairly loose and could stand to be made stricter. However, it is flexible and works well enough. Essentially, the duck-typed interface of the trove.guestagent.api.API class is used to send messages, and Trove conductor is used to receive them, at which point it updates the database. Because both of these components can be swapped out if necessary, the code could support the Unified Agent when it appears, as well as future agents. It would be a mistake, however, to alter Trove's standard method of communication to please the new Unified Agent. In general, we should try to keep Trove speaking to guest agents in Trove's terms alone to prevent bloat.
Thanks, Tim ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Unified Guest Agent proposal
Folks, The discussion didn't result in a consensus, but it did reveal a great number of things to be accounted for. I've tried to summarize the top-level points in the etherpad [1]. It lists only items everyone (as it seems to me) agrees on, or suggested options where there was no consensus. Let me know if I misunderstood or missed something. The etherpad does not list advantages/disadvantages of the options, otherwise it would just be too long. Interested people might search the thread for the arguments :-) . I've thought it over and I agree with people saying we need to move further. Savanna needs the agent and I am going to write a PoC for it. Sure, the PoC will be implemented in a project-independent way. I still think that Salt's limitations outweigh its advantages, so the PoC will be done on top of oslo.messaging without Salt. At least we'll have an example of how it might look. Most probably I will have more questions in the process, for instance we didn't finish the discussion on enabling networking for the agent yet. In that case I will start a new, more specific thread in the list. Thanks, Dmitry [1] https://etherpad.openstack.org/p/UnifiedGuestAgent ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Unified Guest Agent proposal
Hello Thomas, I do understand your feelings. The problem is there were already many points raised both pro and contra adopting Salt as an agent. And so far no consensus was reached on that matter. Maybe someone else is willing to step out and write a PoC for Salt-based agent? Then we can agree on a functionality PoC should implement and compare the implementations. The PoCs also can reveal limitations we currently don't see. Thanks, Dmitry 2013/12/17 Thomas Herve thomas.he...@enovance.com The discussion didn't result in a consensus, but it did revealed a great number of things to be accounted. I've tried to summarize top-level points in the etherpad [1]. It lists only items everyone (as it seems to me) agrees on, or suggested options where there was no consensus. Let me know if i misunderstood or missed something. The etherpad does not list advantages/disadvantages of options, otherwise it just would be too long. Interested people might search the thread for the arguments :-) . I've thought it over and I agree with people saying we need to move further. Savanna needs the agent and I am going to write a PoC for it. Sure the PoC will be implemented in project-independent way. I still think that Salt limitations overweight its advantages, so the PoC will be done on top of oslo.messaging without Salt. At least we'll have an example on how it might look. Most probably I will have more questions in the process, for instance we didn't finish discussion on enabling networking for the agent yet. In that case I will start a new, more specific thread in the list. Hi Dimitri, While I agree that using Salt's transport may be wrong for us, the module system they have is really interesting, and a pretty big ecosystem already. It solved things like system-specific information, and it has a simple internal API to create modules. Redoing something from scratch Openstack-specific sounds like a mistake to me. 
As Salt seems to be able to work in a standalone mode, I think it'd be interesting to investigate that. Maybe it's worth separating the discussion between how to deliver messages to the servers (oslo.messaging, Marconi, etc), and what to do on the servers (where I think Salt is a great contender). -- Thomas ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Unified Guest Agent proposal
Clint, Kevin, Thanks for reassuring me :-) I just wanted to make sure that having direct access from VMs to a single facility is not a dead end in terms of security and extensibility. And since it is not, I agree it is much simpler (and hence better) than a hypervisor-dependent design. Then, returning to the two major suggestions made: * Salt * Custom solution specific to our needs The custom solution could be made on top of oslo.messaging. That gives us RPC working on different messaging systems. And that is what we really need - an RPC into the guest supporting various transports. What it lacks at the moment is security - it has neither authentication nor ACLs. Salt also provides an RPC service, but it has a couple of disadvantages: it is tightly coupled with ZeroMQ and it needs a server process to run. A single transport option (ZeroMQ) is a limitation we really want to avoid. OpenStack could be deployed with various messaging providers, and we can't limit the choice to a single option in the guest agent. Though it could be changed in the future, it is an obstacle to consider. Running yet another server process within OpenStack, as was already pointed out, is expensive. It means another server to deploy and take care of, +1 to overall OpenStack complexity. And it does not look like it could be fixed any time soon. For these reasons I favor an agent based on oslo.messaging. Thanks, Dmitry 2013/12/11 Fox, Kevin M kevin@pnnl.gov Yeah. It's likely that the metadata server stuff will get more scalable/hardened over time. If it isn't enough now, let's fix it rather than coming up with a new system to work around it. I like the idea of using the network since all the hypervisors have to support network drivers already. They also already have to support talking to the metadata server. This keeps OpenStack out of the hypervisor driver business.
Kevin From: Clint Byrum [cl...@fewbar.com] Sent: Tuesday, December 10, 2013 1:02 PM To: openstack-dev Subject: Re: [openstack-dev] Unified Guest Agent proposal Excerpts from Dmitry Mescheryakov's message of 2013-12-10 12:37:37 -0800: What is the exact scenario you're trying to avoid? It is a DDoS attack on either the transport (AMQP / ZeroMQ provider) or the server (Salt / our own self-written server). Looking at the design, it doesn't look like the attack could be somehow contained within the tenant it is coming from. We can push a tenant-specific route for the metadata server, and a tenant-specific endpoint for in-agent things. Still simpler than hypervisor-aware guests. I haven't seen anybody ask for this yet, though I'm sure if they run into these problems it will be the next logical step. In the current OpenStack design I see only one similarly vulnerable component - the metadata server. Keeping that in mind, maybe I just overestimate the threat? Anything you expose to the users is vulnerable. By using the localized hypervisor scheme you're now making the compute node itself vulnerable. Only now you're asking that an already complicated thing (nova-compute) add another job, rate limiting. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Unified Guest Agent proposal
Vladik, Thanks for the suggestion, but hypervisor-dependent solution is exactly what scares off people in the thread :-) Thanks, Dmitry 2013/12/11 Vladik Romanovsky vladik.romanov...@enovance.com Maybe it will be useful to use Ovirt guest agent as a base. http://www.ovirt.org/Guest_Agent https://github.com/oVirt/ovirt-guest-agent It is already working well on linux and windows and has a lot of functionality. However, currently it is using virtio-serial for communication, but I think it can be extended for other bindings. Vladik - Original Message - From: Clint Byrum cl...@fewbar.com To: openstack-dev openstack-dev@lists.openstack.org Sent: Tuesday, 10 December, 2013 4:02:41 PM Subject: Re: [openstack-dev] Unified Guest Agent proposal Excerpts from Dmitry Mescheryakov's message of 2013-12-10 12:37:37 -0800: What is the exact scenario you're trying to avoid? It is DDoS attack on either transport (AMQP / ZeroMQ provider) or server (Salt / Our own self-written server). Looking at the design, it doesn't look like the attack could be somehow contained within a tenant it is coming from. We can push a tenant-specific route for the metadata server, and a tenant specific endpoint for in-agent things. Still simpler than hypervisor-aware guests. I haven't seen anybody ask for this yet, though I'm sure if they run into these problems it will be the next logical step. In the current OpenStack design I see only one similarly vulnerable component - metadata server. Keeping that in mind, maybe I just overestimate the threat? Anything you expose to the users is vulnerable. By using the localized hypervisor scheme you're now making the compute node itself vulnerable. Only now you're asking that an already complicated thing (nova-compute) add another job, rate limiting. 
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Unified Guest Agent proposal
Guys, I see two major trends in the thread: * use Salt * write our own solution with an architecture similar to Salt or MCollective There were points raised both pro and contra each solution. But I have a concern which I believe was not covered yet. Both solutions use either ZeroMQ or message queues (AMQP/STOMP) as a transport. The thing is, there is going to be a shared facility between all the tenants. And unlike all other OpenStack services, this facility will be directly accessible from VMs, which leaves tenants very vulnerable to each other. Harm the facility from your VM, and the whole Region/Cell/Availability Zone will be left out of service. Do you think that is solvable, or maybe I overestimate the threat? Thanks, Dmitry 2013/12/9 Dmitry Mescheryakov dmescherya...@mirantis.com 2013/12/9 Kurt Griffiths kurt.griffi...@rackspace.com This list of features makes me *very* nervous from a security standpoint. Are we talking about giving an agent an arbitrary shell command or file to install, and it goes and does that, or are we simply triggering a preconfigured action (at the time the agent itself was installed)? I believe the agent must execute only a set of preconfigured actions exactly due to security reasons. It should be up to the using project (Savanna/Trove) to decide which actions must be exposed by the agent.
From: Steven Dake sd...@redhat.com Reply-To: OpenStack Dev openstack-dev@lists.openstack.org Date: Monday, December 9, 2013 at 11:41 AM To: OpenStack Dev openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] Unified Guest Agent proposal In terms of features: * run shell commands * install files (with selinux properties as well) * create users and groups (with selinux properties as well) * install packages via yum, apt-get, rpm, pypi * start and enable system services for systemd or sysvinit * Install and unpack source tarballs * run scripts * Allow grouping, selection, and ordering of all of the above operations ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Unified Guest Agent proposal
And one more thing, Sandy Walsh pointed to the client Rackspace developed and uses - [1], [2]. Its design is somewhat different and can be expressed by the following formula: App - Host (XenStore) - Guest Agent (taken from the wiki [3]) It has an obvious disadvantage - it is hypervisor-dependent and currently implemented for Xen only. On the other hand, such a design should not have the shared-facility vulnerability, as the Agent accesses the server not directly but via XenStore (which AFAIU is compute-node based). Thanks, Dmitry [1] https://github.com/rackerlabs/openstack-guest-agents-unix [2] https://github.com/rackerlabs/openstack-guest-agents-windows-xenserver [3] https://wiki.openstack.org/wiki/GuestAgent 2013/12/10 Dmitry Mescheryakov dmescherya...@mirantis.com Guys, I see two major trends in the thread: * use Salt * write our own solution with architecture similar to Salt or MCollective There were points raised pro and contra both solutions. But I have a concern which I believe was not covered yet. Both solutions use either ZeroMQ or message queues (AMQP/STOMP) as a transport. The thing is there is going to be a shared facility between all the tenants. And unlike all other OpenStack services, this facility will be directly accessible from VMs, which leaves tenants very vulnerable to each other. Harm the facility from your VM, and the whole Region/Cell/Availability Zone will be left out of service. Do you think that is solvable, or maybe I overestimate the threat? Thanks, Dmitry 2013/12/9 Dmitry Mescheryakov dmescherya...@mirantis.com 2013/12/9 Kurt Griffiths kurt.griffi...@rackspace.com This list of features makes me *very* nervous from a security standpoint. Are we talking about giving an agent an arbitrary shell command or file to install, and it goes and does that, or are we simply triggering a preconfigured action (at the time the agent itself was installed)? I believe the agent must execute only a set of preconfigured actions exactly due to security reasons.
It should be up to the using project (Savanna/Trove) to decide which actions must be exposed by the agent. From: Steven Dake sd...@redhat.com Reply-To: OpenStack Dev openstack-dev@lists.openstack.org Date: Monday, December 9, 2013 at 11:41 AM To: OpenStack Dev openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] Unified Guest Agent proposal In terms of features: * run shell commands * install files (with selinux properties as well) * create users and groups (with selinux properties as well) * install packages via yum, apt-get, rpm, pypi * start and enable system services for systemd or sysvinit * Install and unpack source tarballs * run scripts * Allow grouping, selection, and ordering of all of the above operations ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Unified Guest Agent proposal
2013/12/10 Clint Byrum cl...@fewbar.com Excerpts from Dmitry Mescheryakov's message of 2013-12-10 08:25:26 -0800: And one more thing, Sandy Walsh pointed to the client Rackspace developed and use - [1], [2]. Its design is somewhat different and can be expressed by the following formulae: App - Host (XenStore) - Guest Agent (taken from the wiki [3]) It has an obvious disadvantage - it is hypervisor dependent and currently implemented for Xen only. On the other hand such design should not have shared facility vulnerability as Agent accesses the server not directly but via XenStore (which AFAIU is compute node based). I don't actually see any advantage to this approach. It seems to me that it would be simpler to expose and manage a single network protocol than it would be to expose hypervisor level communications for all hypervisors. I think the Rackspace agent design could be expanded as follows: Controller (Savanna/Trove) - AMQP/ZeroMQ - Agent on Compute host - XenStore - Guest Agent That is somewhat speculative because, if I understood it correctly, the published code covers only the second part of the exchange: Python API / CMD interface - XenStore - Guest Agent Assuming I got it right: while more complex, such a design removes pressure from the AMQP/ZeroMQ providers: on the 'Agent on Compute host' you can easily control the amount of messages emitted by the Guest with throttling. That is easy since such an agent runs on a compute host. In the worst case, if it happens to be abused by a guest, it affects this compute host only and not a whole segment of OpenStack. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
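The per-guest throttling mentioned above could be as simple as a token bucket kept by the compute-host agent for each guest. A minimal sketch, purely illustrative with hypothetical names (none of the projects discussed ship this code):

```python
import time

class TokenBucket:
    """Each guest may emit at most `rate` messages per second,
    with bursts of up to `capacity` messages."""

    def __init__(self, rate, capacity):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self):
        # Refill tokens for the time elapsed since the last call.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # over the limit: drop or defer the message


# One bucket per guest on the compute-host agent (hypothetical usage):
buckets = {}

def accept_message(guest_id, rate=10, capacity=20):
    bucket = buckets.setdefault(guest_id, TokenBucket(rate, capacity))
    return bucket.allow()
```

Because the bucket lives on the compute host, an abusive guest exhausts only its own quota there and never reaches the shared AMQP/ZeroMQ facility.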
Re: [openstack-dev] Unified Guest Agent proposal
What is the exact scenario you're trying to avoid? It is DDoS attack on either transport (AMQP / ZeroMQ provider) or server (Salt / Our own self-written server). Looking at the design, it doesn't look like the attack could be somehow contained within a tenant it is coming from. In the current OpenStack design I see only one similarly vulnerable component - metadata server. Keeping that in mind, maybe I just overestimate the threat? 2013/12/10 Clint Byrum cl...@fewbar.com Excerpts from Dmitry Mescheryakov's message of 2013-12-10 11:08:58 -0800: 2013/12/10 Clint Byrum cl...@fewbar.com Excerpts from Dmitry Mescheryakov's message of 2013-12-10 08:25:26 -0800: And one more thing, Sandy Walsh pointed to the client Rackspace developed and use - [1], [2]. Its design is somewhat different and can be expressed by the following formulae: App - Host (XenStore) - Guest Agent (taken from the wiki [3]) It has an obvious disadvantage - it is hypervisor dependent and currently implemented for Xen only. On the other hand such design should not have shared facility vulnerability as Agent accesses the server not directly but via XenStore (which AFAIU is compute node based). I don't actually see any advantage to this approach. It seems to me that it would be simpler to expose and manage a single network protocol than it would be to expose hypervisor level communications for all hypervisors. I think the Rackspace agent design could be expanded as follows: Controller (Savanna/Trove) - AMQP/ZeroMQ - Agent on Compute host - XenStore - Guest Agent That is somewhat speculative because if I understood it correctly the opened code covers only the second part of exchange: Python API / CMD interface - XenStore - Guest Agent Assuming I got it right: While more complex, such design removes pressure from AMQP/ZeroMQ providers: on the 'Agent on Compute' you can easily control the amount of messages emitted by Guest with throttling. It is easy since such agent runs on a compute host. 
In the worst case, if it is happened to be abused by a guest, it affect this compute host only and not the whole segment of OpenStack. This still requires that we also write a backend to talk to the host for all virt drivers. It also means that any OS we haven't written an implementation for needs to be hypervisor-aware. That sounds like a never ending battle. If it is just a network API, it works the same for everybody. This makes it simpler, and thus easier to scale out independently of compute hosts. It is also something we already support and can very easily expand by just adding a tiny bit of functionality to neutron-metadata-agent. In fact we can even push routes via DHCP to send agent traffic through a different neutron-metadata-agent, so I don't see any issue where we are piling anything on top of an overstressed single resource. We can have neutron route this traffic directly to the Heat API which hosts it, and that can be load balanced and etc. etc. What is the exact scenario you're trying to avoid? ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Unified Guest Agent proposal
2013/12/9 Clint Byrum cl...@fewbar.com Excerpts from Steven Dake's message of 2013-12-09 09:41:06 -0800: On 12/09/2013 09:41 AM, David Boucha wrote: On Sat, Dec 7, 2013 at 11:09 PM, Monty Taylor mord...@inaugust.com mailto:mord...@inaugust.com wrote: On 12/08/2013 07:36 AM, Robert Collins wrote: On 8 December 2013 17:23, Monty Taylor mord...@inaugust.com mailto:mord...@inaugust.com wrote: I suggested salt because we could very easily make trove and savana into salt masters (if we wanted to) just by having them import salt library and run an api call. When they spin up nodes using heat, we could easily have that to the cert exchange - and the admins of the site need not know _anything_ about salt, puppet or chef - only about trove or savana. Are salt masters multi-master / HA safe? E.g. if I've deployed 5 savanna API servers to handle load, and they all do this 'just import', does that work? If not, and we have to have one special one, what happens when it fails / is redeployed? Yes. You can have multiple salt masters. Can salt minions affect each other? Could one pretend to be a master, or snoop requests/responses to another minion? Yes and no. By default no - and this is protected by key encryption and whatnot. They can affect each other if you choose to explicitly grant them the ability to. That is - you can give a minion an acl to allow it inject specific command requests back up into the master. We use this in the infra systems to let a jenkins slave send a signal to our salt system to trigger a puppet run. That's all that slave can do though - send the signal that the puppet run needs to happen. However - I don't think we'd really want to use that in this case, so I think they answer you're looking for is no. Is salt limited: is it possible to assert that we *cannot* run arbitrary code over salt? 
In as much as it is possible to assert that about any piece of software (bugs, of course, blah blah) But the messages that salt sends to a minion are run this thing that you have a local definition for rather than here, have some python and run it Monty Salt was originally designed to be a unified agent for a system like openstack. In fact, many people use it for this purpose right now. I discussed this with our team management and this is something SaltStack wants to support. Are there any specifics things that the salt minion lacks right now to support this use case? David, If I am correct of my parsing of the salt nomenclature, Salt provides a Master (eg a server) and minions (eg agents that connect to the salt server). The salt server tells the minions what to do. This is not desirable for a unified agent (atleast in the case of Heat). The bar is very very very high for introducing new *mandatory* *server* dependencies into OpenStack. Requiring a salt master (or a puppet master, etc) in my view is a non-starter for a unified guest agent proposal. Now if a heat user wants to use puppet, and can provide a puppet master in their cloud environment, that is fine, as long as it is optional. What if we taught Heat to speak salt-master-ese? AFAIK it is basically an RPC system. I think right now it is 0mq, so it would be relatively straight forward to just have Heat start talking to the agents in 0mq. A guest agent should have the following properties: * minimal library dependency chain * no third-party server dependencies * packaged in relevant cloudy distributions That last one only matters if the distributions won't add things like agents to their images post-release. I am pretty sure work well in OpenStack is important for server distributions and thus this is at least something we don't have to freak out about too much. 
In terms of features: * run shell commands * install files (with selinux properties as well) * create users and groups (with selinux properties as well) * install packages via yum, apt-get, rpm, pypi * start and enable system services for systemd or sysvinit * Install and unpack source tarballs * run scripts * Allow grouping, selection, and ordering of all of the above operations All of those things are general-purpose, low-level system configuration features. None of them will be needed for Trove or Savanna. They need to do higher-level things like run a Hadoop job or create a MySQL user. I agree with Clint on this one, Savanna does need high-level domain-specific operations. We can do anything having just a root shell. But security-wise, as it was already mentioned in the
Re: [openstack-dev] Unified Guest Agent proposal
2013/12/9 Kurt Griffiths kurt.griffi...@rackspace.com This list of features makes me *very* nervous from a security standpoint. Are we talking about giving an agent an arbitrary shell command or file to install, and it goes and does that, or are we simply triggering a preconfigured action (at the time the agent itself was installed)? I believe the agent must execute only a set of preconfigured actions, precisely for security reasons. It should be up to the consuming project (Savanna/Trove) to decide which actions must be exposed by the agent. From: Steven Dake sd...@redhat.com Reply-To: OpenStack Dev openstack-dev@lists.openstack.org Date: Monday, December 9, 2013 at 11:41 AM To: OpenStack Dev openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] Unified Guest Agent proposal In terms of features: * run shell commands * install files (with selinux properties as well) * create users and groups (with selinux properties as well) * install packages via yum, apt-get, rpm, pypi * start and enable system services for systemd or sysvinit * Install and unpack source tarballs * run scripts * Allow grouping, selection, and ordering of all of the above operations ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
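The "preconfigured actions only" model amounts to a whitelist dispatcher: the agent refuses any request whose action was not registered when the agent was installed. A minimal Python sketch with hypothetical names (not Savanna or Trove code):

```python
# Sketch of an agent that runs only actions preconfigured at install
# time. Names are illustrative; a real project would register its own
# domain-specific actions (e.g. "restart datanode", "create db user").

ALLOWED_ACTIONS = {}

def action(name):
    """Decorator registering a callable as an exposed agent action."""
    def register(func):
        ALLOWED_ACTIONS[name] = func
        return func
    return register

@action("ping")
def ping():
    return "pong"

def handle_request(name, **kwargs):
    # Arbitrary shell commands are rejected: only whitelisted names run.
    if name not in ALLOWED_ACTIONS:
        raise PermissionError("action %r is not exposed by this agent" % name)
    return ALLOWED_ACTIONS[name](**kwargs)
```

The controller can only trigger names in the whitelist, so the attack surface is exactly the set of actions the consuming project chose to expose.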
[openstack-dev] Unified Guest Agent proposal
Hello all, We would like to push the discussion on a unified guest agent further. You may find the details of our proposal at [1]. Also, let me clarify why we started this conversation. Savanna currently utilizes SSH to install/configure Hadoop on VMs. We were happy with that approach until we recently realized that in many OpenStack deployments VMs are not accessible from the controller. That brought us to the idea of using a guest agent for VM configuration instead. That approach is already used by Trove, Murano and Heat, and we can do the same. Uniting the efforts on a single guest agent brings a couple of advantages: 1. Code reuse across several projects. 2. Simplified deployment of OpenStack. A guest agent requires additional facilities for transport, like a message queue or something similar. Sharing the agent means projects can share transport/config and hence ease the life of deployers. We see it as a library and we think that Oslo is a good place for it. Naturally, since this is going to be a _unified_ agent, we seek input from all interested parties. [1] https://wiki.openstack.org/wiki/UnifiedGuestAgent Thanks, Dmitry ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [savanna] Anti-affinity
Arindam, It is not achievable with the current Savanna. The anti-affinity feature allows running only one VM per compute node. It cannot evenly distribute VMs when the number of compute nodes is lower than the desired size of the Hadoop cluster. Dmitry 2013/12/5 Arindam Choudhury arin...@live.com HI, I have 11 compute nodes. I want to create a hadoop cluster with 1 master(namenode+jobtracker) with 20 worker (datanode+tasktracker). How to configure the Anti-affinty so I can run the master in one host, while others will be hosting two worker? I tried some configuration, but I can not achieve it. Regards, Arindam ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
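The constraint is easy to see with a quick calculation: with anti-affinity enabled for the worker process, placement is capped at one worker per compute node, so 11 nodes can host at most 11 workers. An illustrative sketch of the arithmetic only (not Savanna's scheduler code):

```python
import math

def max_workers_with_anti_affinity(compute_nodes):
    # Anti-affinity in Savanna at this time means at most one VM of the
    # affected process per compute node.
    return compute_nodes

def min_nodes_without_anti_affinity(workers, per_node):
    # Without anti-affinity, several workers can be packed per node.
    return math.ceil(workers / per_node)

# Arindam's case: 11 compute nodes, 20 workers wanted.
print(max_workers_with_anti_affinity(11))      # only 11 of 20 workers fit
print(min_nodes_without_anti_affinity(20, 2))  # 10 nodes suffice at 2 per node
```

So the requested layout (two workers on most hosts) is only reachable by dropping the workers' anti-affinity setting, not by configuring it differently.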
Re: [openstack-dev] [savanna] This request was rate-limited. (HTTP 413)
Hmm, not sure, I am not an expert in Nova. By the way, the link I gave you is for Grizzly. If you are running a different release, take a look at that release's docs, as the configuration might look different there. Dmitry

2013/12/5 Arindam Choudhury arin...@live.com

Hi, I introduced this in my nova api-paste.ini:

[pipeline:openstack_compute_api_v2]
pipeline = faultwrap authtoken keystonecontext ratelimit osapi_compute_app_v2

[pipeline:openstack_volume_api_v1]
pipeline = faultwrap authtoken keystonecontext ratelimit osapi_volume_app_v1

[filter:ratelimit]
paste.filter_factory = nova.api.openstack.compute.limits:RateLimitingMiddleware.factory
limits = (POST, *, .*, 100, MINUTE);(POST, */servers, ^/servers, 500, DAY);(PUT, *, .*, 100, MINUTE);(GET, *changes-since*, .*changes-since.*, 30, MINUTE);(DELETE, *, .*, 100, MINUTE)

I am getting this error when I try to restart openstack-nova-api:

2013-12-05 12:51:20.035 2350 ERROR nova.wsgi [-] Ambiguous section names ['composite:openstack_compute_api_v2', 'pipeline:openstack_compute_api_v2'] for section 'openstack_compute_api_v2' (prefixed by 'app' or 'application' or 'composite' or 'composit' or 'pipeline' or 'filter-app') found in config /etc/nova/api-paste.ini
2013-12-05 12:51:20.044 2549 INFO nova.ec2.wsgi.server [-] (2549) wsgi starting up on http://0.0.0.0:8773/
2013-12-05 12:51:20.045 2350 CRITICAL nova [-] Could not load paste app 'osapi_compute' from /etc/nova/api-paste.ini
2013-12-05 12:51:20.045 2350 TRACE nova Traceback (most recent call last):
2013-12-05 12:51:20.045 2350 TRACE nova   File "/usr/bin/nova-api", line 61, in <module>
2013-12-05 12:51:20.045 2350 TRACE nova     server = service.WSGIService(api, use_ssl=should_use_ssl)
2013-12-05 12:51:20.045 2350 TRACE nova   File "/usr/lib/python2.6/site-packages/nova/service.py", line 598, in __init__
2013-12-05 12:51:20.045 2350 TRACE nova     self.app = self.loader.load_app(name)
2013-12-05 12:51:20.045 2350 TRACE nova   File "/usr/lib/python2.6/site-packages/nova/wsgi.py", line 485, in load_app
2013-12-05 12:51:20.045 2350 TRACE nova     raise exception.PasteAppNotFound(name=name, path=self.config_path)
2013-12-05 12:51:20.045 2350 TRACE nova PasteAppNotFound: Could not load paste app 'osapi_compute' from /etc/nova/api-paste.ini
2013-12-05 12:51:20.045 2350 TRACE nova
2013-12-05 12:51:20.407 2549 INFO nova.service [-] Parent process has died unexpectedly, exiting

Regards, Arindam

-- Date: Thu, 5 Dec 2013 15:37:21 +0400 From: dmescherya...@mirantis.com To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [savanna] This request was rate-limited. (HTTP 413)

Hello Arindam, While deploying a Hadoop cluster Savanna makes quite a lot of API requests to Nova. Naturally, the number of requests is directly proportional to the size of the cluster. On the other hand, Nova has protection against users abusing the API with too many requests. It is called rate limiting. You need to set the limits higher than they are right now if you want to spin up a cluster of that size. You can find details in the Nova docs: http://docs.openstack.org/grizzly/openstack-compute/admin/content//configuring-compute-API.html Dmitry

2013/12/5 Arindam Choudhury arin...@live.com

Hi, When I try to create a big hadoop cluster (21 nodes), sometimes I get this error:

2013-12-05 12:17:57.920 29553 ERROR savanna.context [-] Thread 'cluster-creating-8d093d9b-c675-4222-b53a-3319d54bc61f' fails with exception: 'This request was rate-limited. (HTTP 413)'
2013-12-05 12:17:57.920 29553 TRACE savanna.context Traceback (most recent call last):
2013-12-05 12:17:57.920 29553 TRACE savanna.context   File "/root/savanna/savanna/context.py", line 128, in wrapper
2013-12-05 12:17:57.920 29553 TRACE savanna.context     func(*args, **kwargs)
2013-12-05 12:17:57.920 29553 TRACE savanna.context   File "/root/savanna/savanna/service/api.py", line 123, in _provision_cluster
2013-12-05 12:17:57.920 29553 TRACE savanna.context     i.create_cluster(cluster)
2013-12-05 12:17:57.920 29553 TRACE savanna.context   File "/root/savanna/savanna/service/instances.py", line 56, in create_cluster
2013-12-05 12:17:57.920 29553 TRACE savanna.context     _rollback_cluster_creation(cluster, ex)
2013-12-05 12:17:57.920 29553 TRACE savanna.context   File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
2013-12-05 12:17:57.920 29553 TRACE savanna.context     self.gen.next()
2013-12-05 12:17:57.920 29553 TRACE savanna.context   File "/root/savanna/savanna/service/instances.py", line 36, in create_cluster
2013-12-05 12:17:57.920 29553 TRACE savanna.context     _create_instances(cluster)
2013-12-05 12:17:57.920 29553 TRACE savanna.context   File "/root/savanna/savanna/service/instances.py", line 111, in _create_instances
2013-12-05 12:17:57.920 29553 TRACE savanna.context     _run_instance(cluster, node_group, idx, aa_groups, userdata)
2013-12-05 12:17:57.920 29553 TRACE
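For readers hitting the same HTTP 413: each entry in the limits string above is a (verb, uri-template, regex, count, time-unit) tuple. Here is a small hypothetical Python sketch (not Nova's actual parser) of how such a string decomposes, which makes it easier to see which rule a large cluster launch is tripping over:

```python
# Hypothetical sketch (NOT Nova's real implementation): decompose a
# Nova-style "limits" string into (verb, uri, regex, count, seconds)
# rules so you can see which one a provisioning burst exhausts.
UNITS = {"SECOND": 1, "MINUTE": 60, "HOUR": 3600, "DAY": 86400}

def parse_limits(limits):
    rules = []
    for rule in limits.split(";"):
        # each rule looks like "(POST, */servers, ^/servers, 500, DAY)"
        verb, uri, regex, count, unit = [p.strip() for p in rule.strip("() ").split(",")]
        rules.append((verb, uri, regex, int(count), UNITS[unit]))
    return rules

rules = parse_limits("(POST, *, .*, 100, MINUTE);(POST, */servers, ^/servers, 500, DAY)")
print(rules)
```

With the rules laid out like this, raising the POST-per-minute count (the rule a 21-node provisioning burst most likely hits) and restarting nova-api is exactly the change Dmitry describes.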
Re: [openstack-dev] [savanna] Anti-affinity
No, anti-affinity does not work that way. It allows you to distribute nodes running the same process, but you can't separate nodes running different processes (i.e. master and workers). Dmitry 2013/12/5 Arindam Choudhury arin...@live.com Hi, Is it possible using anti-affinity to reserve a compute node only for the master (namenode+jobtracker)? Regards, Arindam -- From: arin...@live.com To: openstack-dev@lists.openstack.org Date: Thu, 5 Dec 2013 12:52:23 +0100 Subject: Re: [openstack-dev] [savanna] Anti-affinity Hi, Thanks a lot for your reply. Regards, Arindam -- Date: Thu, 5 Dec 2013 15:41:33 +0400 From: dmescherya...@mirantis.com To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [savanna] Anti-affinity Arindam, It is not achievable with the current Savanna. The anti-affinity feature only allows running one VM per compute node. It cannot evenly distribute VMs when the number of compute nodes is lower than the desired size of the Hadoop cluster. Dmitry 2013/12/5 Arindam Choudhury arin...@live.com Hi, I have 11 compute nodes. I want to create a hadoop cluster with 1 master (namenode+jobtracker) and 20 workers (datanode+tasktracker). How do I configure anti-affinity so that the master runs on one host while each of the others hosts two workers? I tried some configurations, but I could not achieve it. Regards, Arindam ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
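For context on how anti-affinity is requested at all: in the Savanna API of that era, you list the processes to spread out in the cluster-create body. Below is a hedged Python sketch of such a body; the field names are from memory of the 0.3-era API and the ids are placeholders, so double-check them against your Savanna version:

```python
import json

# Sketch of a cluster-create request body with anti-affinity enabled for
# the "datanode" process: Savanna would then schedule at most one
# datanode per compute host. All ids below are placeholders, and the
# field names should be verified against your Savanna release.
cluster = {
    "name": "cluster-1",
    "plugin_name": "vanilla",
    "hadoop_version": "1.1.2",
    "cluster_template_id": "<template-id>",
    "default_image_id": "<image-id>",
    "anti_affinity": ["datanode"],  # processes to spread across hosts
}
print(json.dumps(cluster, indent=2))
```

Note this still cannot express Arindam's request ("master alone, two workers per host"): anti-affinity only spreads instances of the listed processes, as Dmitry explains above.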
Re: [openstack-dev] [heat][hadoop][template] Does anyone has a hadoop template
Hello Jay, Just in case you've missed it, there is a project Savanna dedicated to deploying Hadoop clusters on OpenStack: https://github.com/openstack/savanna http://savanna.readthedocs.org/en/0.3/ Dmitry 2013/11/29 Jay Lau jay.lau@gmail.com Hi, I'm now trying to deploy a hadoop cluster with heat, just wondering if someone who has a heat template which can help me do the work. Thanks, Jay ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [savanna] Loading of savanna.conf properties
Hey Jon, Can you post your code as a work-in-progress review? Maybe we can spot from the code what is wrong. Thanks, Dmitry 2013/11/10 Jon Maron jma...@hortonworks.com Hi, I am debugging an issue with the swift integration - I see os_auth_url with a value of 127.0.0.1, indicating that at the time the swift helper is invoked the default value for the auth host is being used rather than the value in the savanna.conf file. Any ideas how that may happen? More detail: We are invoking the swift_helper to configure the swift-associated properties and ending up with the following in core-site.xml:

</property>
<property>
  <name>fs.swift.service.savanna.auth.url</name>
  <value>http://127.0.0.1:35357/v2.0/tokens/</value>
</property>

Which, as expected, yields the following when running on a tasktracker VM:

org.apache.pig.impl.plan.VisitorException: ERROR 6000: file test.pig, line 7, column 0 Output Location Validation Failed for: 'swift://jmaron.savanna/output More info to follow: POST http://127.0.0.1:35357/v2.0/tokens/ failed on exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

-- Jon -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [savanna] Loading of savanna.conf properties
I've got two guesses why it does not work properly: 1. Savanna does not consume the config file at all, and the defaults (by coincidence) fail only for swift. 2. You invoke swift_helper.get_swift_configs() at the top level of some module, like this: SWIFT_CONFIG = swift_helper.get_swift_configs() oslo.config is not initialized at module load time (that is true for any module), which is why you see the defaults instead of the supplied values. Still, if you share the code, other ideas might pop up. Dmitry 2013/11/10 Jon Maron jma...@hortonworks.com I'm not sure that would help - all my code changes are runtime changes associated with EDP actions (i.e. the cluster and its configuration are already set). I was looking for help in trying to ascertain why the swift_helper would not return the savanna.conf value at provisioning time. -- Jon On Nov 10, 2013, at 3:53 AM, Dmitry Mescheryakov dmescherya...@mirantis.com wrote: Hey Jon, Can you post your code as a work-in-progress review? Maybe we can spot from the code what is wrong. Thanks, Dmitry 2013/11/10 Jon Maron jma...@hortonworks.com Hi, I am debugging an issue with the swift integration - I see os_auth_url with a value of 127.0.0.1, indicating that at the time the swift helper is invoked the default value for the auth host is being used rather than the value in the savanna.conf file. Any ideas how that may happen?
More detail: We are invoking the swift_helper to configure the swift-associated properties and ending up with the following in core-site.xml:

</property>
<property>
  <name>fs.swift.service.savanna.auth.url</name>
  <value>http://127.0.0.1:35357/v2.0/tokens/</value>
</property>

Which, as expected, yields the following when running on a tasktracker VM:

org.apache.pig.impl.plan.VisitorException: ERROR 6000: file test.pig, line 7, column 0 Output Location Validation Failed for: 'swift://jmaron.savanna/output More info to follow: POST http://127.0.0.1:35357/v2.0/tokens/ failed on exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

-- Jon -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
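Dmitry's second guess is a classic oslo.config pitfall. Here is a minimal stand-in in plain Python (deliberately not using oslo.config itself) that shows why a value captured at module import time keeps the 127.0.0.1 default, while a value read at call time sees the parsed savanna.conf value:

```python
# Stand-in for oslo.config: CONF starts with defaults and is only
# populated later, the way cfg.CONF(...) parses savanna.conf after the
# modules have already been imported.
CONF = {"os_auth_host": "127.0.0.1"}  # default value

# BAD: evaluated once, at "module load" time, before the config is parsed.
AUTH_URL_AT_IMPORT = "http://%s:35357/v2.0/tokens/" % CONF["os_auth_host"]

def get_auth_url():
    # GOOD: evaluated at call time, after the config has been parsed.
    return "http://%s:35357/v2.0/tokens/" % CONF["os_auth_host"]

CONF["os_auth_host"] = "keystone.example.org"  # "config file parsed here"

print(AUTH_URL_AT_IMPORT)  # still 127.0.0.1 - the symptom Jon reports
print(get_auth_url())      # the configured host
```

The fix in real code is the same shape: call swift_helper.get_swift_configs() from inside a function, never at module top level.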
Re: [openstack-dev] Fwd: [Openstack-Dev] Announcement of the Compass Deployment project
I've noticed you list "Remote install and configure a Hadoop cluster (synergy with Savanna?)" among possible use cases. Recently there was a discussion about Savanna on bare metal provisioning through Nova (see thread [1]). Nobody has tested that yet, but it was concluded that it should work without any changes in the Savanna code. So if Compass can set up bare metal provisioning with Nova, Savanna will possibly work on top of that out of the box. Dmitry [1] http://lists.openstack.org/pipermail/openstack-dev/2013-October/017438.html 2013/11/1 Robert Collins robe...@robertcollins.net On 1 November 2013 20:41, Rochelle Grober roc...@gmail.com wrote: A message from my associate as he wings to the Icehouse OpenStack summit (and yes, we're psyched): Our project, code-named Compass, is a RESTful-API-driven deployment platform that performs discovery of the physical machines attached to a specified set of switches. It then customizes configurations for machines you identify and installs the systems and networks to your configuration specs. Besides presenting the technical internals and design decisions of Compass at the Icehouse summit, we will also have a demo session. Cool - when is it? I'd like to get along. ... We look forward to showing the community our project, receiving and incorporating feedback, brainstorming what else it could do, and integrating it into the OpenStack family. We are a part of the OpenStack community and want to support it both with core participation and with Compass. I'm /particularly/ interested in the interaction with Neutron and network modelling - do you use Neutron for the physical switch interrogation, do you inform Neutron about the topology, and so on. Anyhow, let's make sure we can connect and see where we can collaborate!
Cheers, Rob -- Robert Collins rbtcoll...@hp.com Distinguished Technologist HP Converged Cloud ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Savanna] Savanna on Bare Metal and Base Requirements
Hello Travis, We didn't research Savanna on bare metal, though we considered it some time ago. I know little of bare metal provisioning, so I am rather unsure what problems you might experience. My main concern is images: does bare metal provisioning work with qcow2 images? The vanilla plugin (which installs vanilla Apache Hadoop) requires a pre-built Linux image with Hadoop, so if qcow2 does not work for bare metal, you will need to somehow build images in the required format. On the other hand, the HDP plugin (which installs Hortonworks Data Platform) does not require pre-built images, but works only on Red Hat OSes, as far as I know. Another concern: does bare metal support cloud-init? Savanna relies on it, and reimplementing that functionality some other way might take some time. As for your concern about which API calls Savanna makes: it is a pretty small list of requests. Mainly authentication with keystone, basic operations with VMs via nova (create, list, terminate), and basic operations with images (list, set/get attributes). Snapshots are not used. That is for basic functionality. Other than that, some features might require additional API calls. For instance, Cinder support naturally requires calls for volume create/list/delete. Thanks, Dmitry 2013/10/25 Tripp, Travis S travis.tr...@hp.com Hello Savanna team, I've just skimmed through the online documentation and I'm very interested in this project. We have a grizzly environment with all the latest patches as well as several Havana backports applied. We are doing bare metal provisioning through Nova. It is limited to flat networking. Would Savanna work in this environment? What are the requirements? What is the minimum set of API calls that need to be supported (for example, we can't support snapshots)?
Thank you, Travis ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [savanna] Program name and Mission statement
Mike, and if you looked up 'compute' in a dictionary, you would never guess what OpenStack Compute does :-). I think that 'Data Processing' is a good name which describes in short what Savanna is going to be. The name 'MapReduce' for the program does not cover the whole functionality provided by Savanna. Today's Hadoop distributions include not only MapReduce frameworks, but also a bunch of other products, not all of which are based on MapReduce. In fact the core of Hadoop 2.0, YARN, was built with the idea of supporting other, non-MapReduce frameworks. For instance, Twitter Storm was recently ported to YARN. I am also +1 on Matthew's mission proposal: Mission: To provide the OpenStack community with an open, cutting edge, performant and scalable data processing stack and associated management interfaces. Dmitry 2013/9/10 Mike Spreitzer mspre...@us.ibm.com A quick dictionary lookup of "data processing" yields the following. I wonder if you mean something more specific. data processing |ˈˌdædə ˈprɑsɛsɪŋ| noun a series of operations on data, esp. by a computer, to retrieve, transform, or classify information. From: Matthew Farrellee m...@redhat.com To: OpenStack Development Mailing List openstack-dev@lists.openstack.org, Date: 09/10/2013 09:53 AM Subject: Re: [openstack-dev] [savanna] Program name and Mission statement -- Rough cut - Program: OpenStack Data Processing Mission: To provide the OpenStack community with an open, cutting edge, performant and scalable data processing stack and associated management interfaces. On 09/10/2013 09:26 AM, Sergey Lukjanov wrote: It sounds too broad IMO. Looks like we need to define the Mission Statement first. Sincerely yours, Sergey Lukjanov Savanna Technical Lead Mirantis Inc. On Sep 10, 2013, at 17:09, Alexander Kuznetsov akuznet...@mirantis.com wrote: My suggestion is OpenStack Data Processing.
On Tue, Sep 10, 2013 at 4:15 PM, Sergey Lukjanov slukja...@mirantis.com mailto:slukja...@mirantis.comslukja...@mirantis.com wrote: Hi folks, due to the Incubator Application we should prepare Program name and Mission statement for Savanna, so, I want to start mailing thread about it. Please, provide any ideas here. P.S. List of existing programs: https://wiki.openstack.org/wiki/Programs P.P.S. https://wiki.openstack.org/wiki/Governance/NewPrograms Sincerely yours, Sergey Lukjanov Savanna Technical Lead Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org mailto:OpenStack-dev@lists.openstack.orgOpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org mailto:OpenStack-dev@lists.openstack.orgOpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Savanna] Cluster launch error
Linus, Sorry for taking so long to respond. The cluster machines were removed by a rollback, which was caused by an exception: 2013-08-04 11:08:33.907 3542 INFO savanna.service.instances [-] Cluster 'cluster-test-01' creation rollback (reason: unexpected type <type 'NoneType'> for addr arg) It looks like the code currently hides the exception stacktrace, so I can't tell what caused it. My suggestion would be to change the code to print the stacktrace. That way it will be much easier to diagnose the issue. Dmitry 2013/8/4 Linus Nova ln...@linusnova.com Hi, I installed OpenStack Savanna on the OpenStack Grizzly release. As you can see in savanna.log, the savanna-api starts and operates correctly. When I launch the cluster, the VMs start correctly, but soon after they are removed, as shown in the log file. Do you have any ideas on what is happening? Best regards. Linus Nova ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
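On printing the stacktrace: with Python's standard logging module, the fix is usually a one-liner - log the error with LOG.exception (or exc_info=True) instead of interpolating str(ex). A generic sketch, not the actual Savanna rollback code:

```python
import io
import logging

# Capture the log into a buffer so the difference is easy to see.
buf = io.StringIO()
LOG = logging.getLogger("savanna.service.instances")
LOG.addHandler(logging.StreamHandler(buf))
LOG.setLevel(logging.INFO)

def rollback(cluster, ex):
    # LOG.info("... reason: %s" % ex) would drop the traceback;
    # LOG.exception (called from an except block) keeps it.
    LOG.exception("Cluster %r creation rollback (reason: %s)", cluster, ex)

try:
    {}.no_such_attr  # stand-in for the failing provisioning call
except Exception as ex:
    rollback("cluster-test-01", ex)

print(buf.getvalue())  # message plus the full traceback
```

With this change, the log line Dmitry quotes would be followed by the traceback pointing at whatever produced the NoneType addr.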
Re: [openstack-dev] [Savanna] How to manage Hadoop configurations
Arindam, You may find examples of REST requests here: https://github.com/stackforge/savanna/tree/master/etc/rest-api-samples To get the full list of supported configs, send a GET request to the following URL: /v1.0/$TENANT/plugins/vanilla/1.1.2 Note that each config has a 'scope' parameter. Configs with 'node' scope should be declared in a node group (or node group template). Those with 'cluster' scope - in a cluster (or cluster template). Dmitry 2013/7/17 Arindam Choudhury arin...@live.com Hi, Sorry for posting to the wrong mailing list. Thanks for your reply. I want to know how to do it from the CLI using the API directly. I think I have to put it in ng_master_template_create.json and ng_worker_template_create.json (from the quickstart), but I am confused about the syntax. Regards, Arindam -- Subject: Re: [Savanna-all] How to manage Hadoop configurations From: slukja...@mirantis.com Date: Wed, 17 Jul 2013 14:09:24 +0400 CC: openstack-dev@lists.openstack.org; savanna-...@lists.launchpad.net To: arin...@live.com Hi Arindam, you can specify all Hadoop configurations in node group and cluster templates (after choosing node processes). P.S. We are now using the openstack-dev mailing list instead of savanna-all. Sincerely yours, Sergey Lukjanov Savanna Technical Lead Mirantis Inc. On Jul 17, 2013, at 14:07, Arindam Choudhury arin...@live.com wrote: Hi, Savanna is working very well. I can create a hadoop cluster and launch hadoop jobs. Using the Quick Start tutorial I have a basic understanding of the API, but I cannot figure out how to manage hadoop configurations, e.g. data split size, java heap size. If someone could shed some light on it, it would be very helpful.
Thanks, Arindam -- Mailing list: https://launchpad.net/~savanna-all Post to : savanna-...@lists.launchpad.net Unsubscribe : https://launchpad.net/~savanna-all More help : https://help.launchpad.net/ListHelp ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
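To illustrate Dmitry's point about 'scope': after GETting the plugin description, the returned configs can be split by scope to decide which template each belongs in. A small sketch with made-up config entries (in practice the real names and scopes come from the GET response above):

```python
# Split plugin-reported configs by scope: 'node'-scoped configs go into
# a node group template, 'cluster'-scoped ones into a cluster template.
# These entries are invented for illustration only.
plugin_configs = [
    {"name": "Task Tracker Heap Size", "scope": "node"},
    {"name": "dfs.replication", "scope": "cluster"},
    {"name": "mapred.child.java.opts", "scope": "node"},
]

def by_scope(configs, scope):
    return [c["name"] for c in configs if c["scope"] == scope]

print(by_scope(plugin_configs, "node"))     # -> node group template configs
print(by_scope(plugin_configs, "cluster"))  # -> cluster template configs
```

This mirrors the rule in Dmitry's reply: set node-scoped values in ng_master_template_create.json / ng_worker_template_create.json, and cluster-scoped values in the cluster template.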