[GitHub] cloudstack pull request #1709: CLOUDSTACK-7982 - KVM live migration
Github user mlsorensen commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1709#discussion_r84509497 --- Diff: core/src/com/cloud/agent/api/CancelMigrationCommand.java --- @@ -0,0 +1,35 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. +package com.cloud.agent.api; + +public class CancelMigrationCommand extends Command { --- End diff -- Along these lines of cancellation, I've long thought that we really need the ability to clean up long running jobs if the agent disconnects from the management server for any reason (say upgrade or restart of agent or management server, network problems, etc). Normally the management server will know the job failed but the agent keeps on trucking, causing problems, especially for things like migrations of storage. This may be an important thing to add for this feature, to avoid situations where a migration completes but CloudStack does not know about it because the management server was restarted during the migration. Rather than forcing the management server to know that the agent work needs to be cleaned up and sending a command to the hypervisor that is tailored to each command that can fail, one solution that I've seen implemented that has worked well is for LibvirtComputingResource to hold a global List of tasks, then it overrides the disconnected() method and loops through this list, running the tasks. It then exposes methods addDisconnectHook(Runnable hook) and removeDisconnectHook(Runnable hook) so that commands that are sensitive to being interrupted can add in cancellation logic in the case of disconnect before starting and remove it when finished. Something like: @Override public void disconnected() { this._connected = false; s_logger.info("Detected agent disconnect event, running through " + _disconnectHooks.size() + " disconnect hooks"); for (Runnable hook : _disconnectHooks) { hook.run(); } _disconnectHooks.clear(); } public void addDisconnectHook(Runnable hook) { s_logger.debug("Adding disconnect hook " + hook); _disconnectHooks.add(hook); } public void removeDisconnectHook(Runnable hook) { s_logger.debug("Removing disconnect hook " + hook); if (_disconnectHooks.contains(hook)) { s_logger.debug("Removing disconnect hook " + hook); _disconnectHooks.remove(hook); } else { s_logger.debug("Requested removal of disconnect hook, but hook not found: " + hook); } } An example hook to cancel the migration might look like this: public class MigrationCancelHook extends Thread { private static final Logger LOGGER = Logger.getLogger(MigrationCancelHook.class.getName()); private static final String HOOK_PREFIX = "MigrationCancelHook-"; Domain _migratingDomain; String _vmName; public MigrationCancelHook(Domain migratingDomain) throws LibvirtException { super(HOOK_PREFIX + migratingDomain.getName()); _migratingDomain = migratingDomain; _vmName = migratingDomain.getName(); } @Override public void run() { LOGGER.info("Interrupted migration of " + _vmName); try { if (_migratingDomain.abortJob() == 0) { LOGGER.warn("Aborted migration job for " + _vmName); } } catch (Exception ex) { LOGGER.warn("Failed to abort migration job for " + _vmName, ex); } } } --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if
[GitHub] cloudstack issue #1708: Fwd merge 4.8->4.9 marvintestfixes to master
Github user mlsorensen commented on the issue: https://github.com/apache/cloudstack/pull/1708 If we can confirm that these were false positives, we're good to merge, no? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack issue #1692: Fix Smoke Test Failures
Github user mlsorensen commented on the issue: https://github.com/apache/cloudstack/pull/1692 If the three test failures don't represent regressions, we should probably get this merged per Rohit's suggestion. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack issue #1572: CLOUDSTACK-9395: Add Virtio RNG device to Instances ...
Github user mlsorensen commented on the issue: https://github.com/apache/cloudstack/pull/1572 I realize I'm late to the party on this, but I'd like to suggest that we change from the agent.properties "vm.rng.enable=true" to simply using the libvirt version detection like we do for other things, and perhaps allow a "vm.rng.disable". It really seems like this is a useful setting and should probably be default wherever possible. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack pull request: CLOUDSTACK-8562: Dynamic Role-Based API C...
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/1489#issuecomment-215147096 @koushik-das -- I'd say yes, that testing is good enough. If a test can determine that the default roles allow the expected access to the APIs then we're good. There was some discussion of commands.properties deprecation in 2013. In particular, plugin developers had issues with the way it works. People should really have been using the annotations by now, but I believe it was not well communicated and the fact that commands.properties is still around has confused new developers. From the original discussion, the only reason it was left in place was in case users wanted to disable APIs entirely by setting the bitmask of an API to 0. Do we have that functionality covered now? http://mail-archives.apache.org/mod_mbox/cloudstack-dev/201310.mbox/%3c2b439486-9c4b-41b7-a698-b7ade234b...@netapp.com%3E https://github.com/apache/cloudstack/commit/9f7b4884a7a0bf0b15d94777cff449fde4664af2 Note also that existing installations are not being affected, they have to choose to use the new mechanism. In theory, I'm fine with some sort of staged transition where the dynamic checking is opt-in for a release, but I worry that it will perpetuate the issue we've seen since 2013, and puts a burden on the community to go back and clean up at some later date. For these reasons, I think if it's functionally identical on new installs and doesn't affect existing installs use of commands.properties, we should just cut over. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack pull request: CLOUDSTACK-8562: Dynamic Role-Based API C...
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/1489#issuecomment-214843061 It seems complicated to try to actively promote a choice of multiple ways to check for API authorization (dynamic vs static). I am for deprecating commands.properties, and honestly I wish it would have been removed from the source tree altogether long ago instead of informally deprecated, because people have been adding to it rather than using annotations. I've spoken to committers even recently who didn't realize that you didn't have to use commands.properties. The annotation and dymanic methods are so much cleaner, particularly for plugin developers as it's often the only piece of the source tree that they have to change outside of their own plugin directory. I don't really think the majority of users will care about the implementation, so long as functionally the result is the same. That is to say that the API permissions on a fresh install before and after this is merged should behave the same out of the box. If that's the case then I don't think users will feel forced into anything or even notice that something was changed, and if someone really does care, it sounds simple enough to enable commands.properties. If someone is used to fiddling with this file, they won't have difficulty creating it and toggling a global setting manually. Speaking from having the experience of trying to manage a custom commands.properties, the work involved here to choose static is minor compared to simply maintaining a custom commands.properties, making sure an upgrade doesn't overwrite your change, and merging in new changes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack pull request: systemvm: preserve file permissions, set ...
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/1420#issuecomment-206540043 Let's merge this! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack pull request: CLOUDSTACK-8832 : Update Nuage VSP plugin...
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/801#issuecomment-153863283 John, I probably shouldn't respond in this thread, but from your response I feel like much of my point was missed. In fact I had pointed out the very thing you mention, that there's no requirement for third parties to submit their code to the community, they could ship their plugin standalone and have their users install it. If a user has a problem, it won't matter if the plugin was shipped with CloudStack or installed standalone, the result will be the same. I think we are in agreement that the code should have the same standards, but I think we (and I mean all of us) differ to varying degrees in what those standards should be. One may block an important commit because it uses String.length() == 0 instead of String.isEmpty(), or doesn't print enough debug info, and end up postponing the benefit of the fixes to the next release, while another committer might not care about that level of review and standard. There should probably be a basic list of items to look for so that the reviews themselves can be standardized. I think that would also address the concern I have for the community culture, because contributors will fear the review process less if they can see up front what the criteria are for merge. There's little I can think of that is more detrimental to growing our community of committers than making people feel like they're submitted to an arbitrary level of scrutiny whenever they contribute. When contributing becomes painful then we're held at arms length, rather than encouraging ownership of the code and participation. On Wed, Nov 4, 2015 at 10:35 AM, John Burwell <notificati...@github.com> wrote: > I agree with @runseb <https://github.com/runseb> that we need to move > this discussion to dev@, and re-assess accepting the maintenance > responsibly for code that cannot be tested and verified by the community. > This plugin has already been accepted into 4.5, and as such, would be > "grandfathered" into any decision we would make. Therefore, we should carry > our current precedent forward and not prevent acceptance of this PR for > this reason. We are at a point in the 4.6 release cycle that only blockers > should be accepted. Otherwise, we will never ship 4.6. > > @KrisSterckx <https://github.com/KrisSterckx> I realize my comments may > come across as unappreciative which is not my intention. I greatly > appreciate the work @nlivens <https://github.com/nlivens> and you have > done to contribute this capability to the community. This plugin highlights > a larger issue which is that, as a community, we are delivering releases > containing code that cannot be verified. By contributing this plugin to our > community, we become responsible for its support and maintenance. In 6-12 > months, how do answer answer a user who asks does this plugin work in the > latest release if you are unavailable? > > @runseb <https://github.com/runseb> while there is a fairness issue for > the contributor, I think it is incredibly unfair to our users that we ship > code without on-going validation. When a user sees the availability of > PluginX supporting Device1 in a release, they assume that we have verified > the proper operation of PluginX since we shipped it in that release. In > fact, we haven't, and they may be attempting to use something that is > completely broken. > > @mlsorensen <https://github.com/mlsorensen> yes, plugins provide a > mechanism for vendors to extend CloudStack. However, there is no > requirement for those plugins to be part of the community, OSS codebase. > Therefore, we should not conflate the ability of third-parties to extend > CloudStack using plugins with the code that is accepted as part of the > community's repositories. All code in our repositories, regardless of being > in the core or a plugin, should be held to the same standards. When a user > has a problem, they aren't any less frustrated with the project because > their problem happens to be in a plugin. > > @mike-tutkowski <https://github.com/mike-tutkowski> I don't believe that > code inspection by itself is adequate to determine the quality of code > integrating an external device. For example, I reviewed the code for this > plugin, and while I feel confident that it will behave well within the > management server, I have absolutely no idea if the various network > functions are being properly automated using the Nuage client A
[GitHub] cloudstack pull request: CLOUDSTACK-8832 : Update Nuage VSP plugin...
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/801#issuecomment-141502048 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack pull request: CLOUDSTACK-8530: KVM hosts without active...
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/340#issuecomment-107684409 Looks fine to me. On Mon, Jun 1, 2015 at 12:13 PM, Rohit Yadav notificati...@github.com wrote: pinging KVM and HA gurus to help review this @kishankavala https://github.com/kishankavala @wilderrodrigues https://github.com/wilderrodrigues @remibergsma https://github.com/remibergsma @mlsorensen https://github.com/mlsorensen @abhinandanprateek https://github.com/abhinandanprateek @devdeep https://github.com/devdeep and others thanks :) â Reply to this email directly or view it on GitHub https://github.com/apache/cloudstack/pull/340#issuecomment-107674771. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack pull request: CLOUDSTACK-6181: Allow RBD volumes to be ...
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/281#issuecomment-105419893 Just make sure someone has actually used it to resize a volume with data. On Mon, May 25, 2015 at 5:47 AM, Remi Bergsma notificati...@github.com wrote: What do we want to do with this PR? As far as I can tell it is included in 4.5.1 and master. Just checked master and it has not been reverted. Let me know what we want to do. @andrijapanic https://github.com/andrijapanic requested a backport here http://markmail.org/thread/gau2xngqjpq5cza7. â Reply to this email directly or view it on GitHub https://github.com/apache/cloudstack/pull/281#issuecomment-105227207. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack pull request: CLOUDSTACK-8339: Allow non-root users to ...
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/288#issuecomment-105031461 Oh, I see the original post mentions that! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack pull request: CLOUDSTACK-8339: Allow non-root users to ...
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/288#issuecomment-105031405 Note you still have to add a sudoers entry for the user to run cloudstack-setup-agent. https://cwiki.apache.org/confluence/display/CLOUDSTACK/Allow+non-root+user+to+add+KVM+hypervisor --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack pull request: CLOUDSTACK-8339: Allow non-root users to ...
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/288#issuecomment-105133656 The issue now is that root credentials for the host are stored in the db, and even echoed back if you ask to list hosts with details. It's a huge step forward to have a single user who only has access to run cloudstack-setup-agent. Perhaps you don't have the same root everywhere, but you don't have to have the same non-root user/password everywhere, either. You can even remove/disable this user after the host is added. In addition, many places don't allow root to SSH as a basic infosec policy. I'd like us to get out of the business of dictating and conflicting with configuration management, where possible, including things suggested like changing passwords upon agent setup. These things will rub big shops who have their config management story together the wrong way, and we already do way too much (like touching iptables and libvirt settings) IMO. They also confuse newbies more than they help (why can't I log in anymore?, and I've heard a lot of why did my networking get screwed up and restart? from the existing setups scripts). Also re: agent listening, I don't think we want the agent to listen. It has been a huge boon to have all ports closed on the hypervisor except SSH, everywhere I've gone, from an infosec standpoint. I'm also not sure there's a meaningful gain by having the mgmt server contact the host via a listening agent port vs contacting the host via SSH and a user who has no access other than to run setup. I'm also more confident that SSH would pass a penetration test. An easy fix that I think would accommodate everyone at this point is to only use 'sudo' if the user passed in is not 'root'. Then everyone doing it the current way is unaffected by the sudo and requiretty, and people who want to only pass non-root credentials to cloudstack mgmt server can do that as well with a sudoers line for cloudstack-agent-setup and a -t for the sudo (or perhaps just document the requiretty along with the sudoers line). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack pull request: CLOUDSTACK-6181: Allow RBD volumes to be ...
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/281#issuecomment-104692430 It was fixed for QCOW2 and LVM, back in January or February (unless something has changed since then). It may be fine for RBD, I just wanted to bring it up as a caution. My impression was that there might be an issue with the java bindings, either a bug or a missing call. If I remember correctly, there are two things, there's libvirt managing volumes, and libvirt managing block devices attached to VMs, and they are not the same thing. Resizing a libvirt volume didn't necessarily notify any VM process using that volume of the change, nor resize the block device attached to the live VM. 'virsh blockresize' does this correctly at both the libvirt volume and VM block, though, so we just went back to the script rather than trying to figure out how to patch the java libvirt bindings. Just make sure the existing behavior is not changed and test that you can stop/start and read/write the root disk fine post a resize. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack pull request: CLOUDSTACK-6181: Allow RBD volumes to be ...
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/281#issuecomment-104432629 You may want to be careful with this. We ran into issues with corruption by asking libvirt to resize the qcow2 images and reverted back to the script. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack pull request: Refactor/libvirt resource
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/233#issuecomment-100995179 Yeah, I got past tomcat and other systemd switchover issues on 4.5 code, but then ran into other issues. The mgmt server configs got all messed up by the new RPMs. I got the server xml worked out but it couldn't connect to the db, 'access denied for cloud@localhost'. It doesn't seem to even be reading my db.properties. Mainly master teething issues around upgrade, I presume, but I've spent too much time on it today. I'll try a fresh install later. One thing in particular I'd like to see tested is live migration. If I don't manage to test this please do at least this one. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack pull request: Refactor/libvirt resource
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/233#issuecomment-99635718 Yeah, I think this is a big enough change that we should give all KVM stakeholders an opportunity to review and test before merge. At least the committers, but anyone who wants to participate, of course. Mike T, Wido, myself, Lucian, Andrija, just to try to get an array of use cases and storage types represented. Hopefully at least someone who uses OVS and vxlan. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack pull request: return a state instead of null in Abstrac...
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/211#issuecomment-97475041 I guess for KVM at least, the behavior should be similar to stopping/starting the Agent. It makes the hypervisor go into disconnected, but leaves the VM states as-is, and is reconciled when the Agent is restarted. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack pull request: return a state instead of null in Abstrac...
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/211#issuecomment-97474185 I haven't researched this fix extensively. However, I just want to make sure that the alert state is not accompanied by any undesirable behavior. Moving the hypervisors to alert state to warn the admin is fine, but we don't want to assume that anything actually went down. In the scenario given, it is probably more likely that there was a break in the network between the hypervisors and mgmt server, and VMs are running fine, so as long as we are accounting for that scenario during recovery from alert state and not assuming the VMs are down we should be good. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] cloudstack pull request: CLOUDSTACK-8251: Adding automation test c...
Github user mlsorensen commented on the pull request: https://github.com/apache/cloudstack/pull/128#issuecomment-86101387 Is there a typo in the Jira ticket? CLOUDSTACK-8251 is Allow option for securing KVM Libvirt daemon network connections, which is one I opened, and now references this commit. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---