Re: VPC's VR missing public NIC eth1
If we get the 4.3.0 to 4.3.1 upgrade in, then there is no need for my previous comment. I was concerned it would not happen.
On Wed, Jun 4, 2014 at 9:20 AM, Daan Hoogland daan.hoogl...@gmail.com wrote:
On Wed, Jun 4, 2014 at 4:27 PM, Marcus shadow...@gmail.com wrote:
Regarding 5e80e5d33d9a295b91cdba9377f52d9d963d802a, we should probably do that for IpAssocCommand as well.
I feel like we are in a downward spiral. If the user input is fixed up coming in, it should probably be fixed down going out. When we save a URI in the db we could still present the user with an id. Not when the id can actually be a non-vlan URI, however. This is going to take a lot of work to get right. I doubt IpAssocCommand is going to be the last one.
-- Daan
Re: VPC's VR missing public NIC eth1
That would be a good idea
On Wed, Jun 4, 2014 at 9:16 AM, Daan Hoogland daan.hoogl...@gmail.com wrote:
Marcus, I didn't do the db thing for 4.3 but it is idem-potent and can go in a Upgrade430to431.java as well. This one doesn't exist yet.
Re: VPC's VR missing public NIC eth1
That wasn't the patch I thought it was. Regarding 5e80e5d33d9a295b91cdba9377f52d9d963d802a, we should probably do that for IpAssocCommand as well. I'm not sure we have the db fix in 4.3 yet, and so a fix like this would be required for IpAssocCommand (and perhaps other unfound things).
On Tue, Jun 3, 2014 at 3:22 PM, Marcus shadow...@gmail.com wrote:
Hmm.. ok. I guess we can apply the bandaid patch as well
Re: VPC's VR missing public NIC eth1
Marcus, I didn't do the db thing for 4.3 but it is idem-potent and can go in a Upgrade430to431.java as well. This one doesn't exist yet.
On Wed, Jun 4, 2014 at 4:27 PM, Marcus shadow...@gmail.com wrote:
That wasn't the patch I thought it was. Regarding 5e80e5d33d9a295b91cdba9377f52d9d963d802a, we should probably do that for IpAssocCommand as well. I'm not sure we have the db fix in 4.3 yet, and so a fix like this would be required for IpAssocCommand (and perhaps other unfound things).
Re: VPC's VR missing public NIC eth1
On Wed, Jun 4, 2014 at 4:27 PM, Marcus shadow...@gmail.com wrote:
Regarding 5e80e5d33d9a295b91cdba9377f52d9d963d802a, we should probably do that for IpAssocCommand as well.
I feel like we are in a downward spiral. If the user input is fixed up coming in, it should probably be fixed down going out. When we save a URI in the db we could still present the user with an id. Not when the id can actually be a non-vlan URI, however. This is going to take a lot of work to get right. I doubt IpAssocCommand is going to be the last one.
-- Daan
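Daan's point about fixing the value "down going out" can be sketched roughly as follows. This is only an illustration, not CloudStack code: the class `VlanDisplay` and the method `toDisplayId` are hypothetical names, and the rule (strip the prefix only for `vlan://` values, pass everything else through) is an assumption drawn from his remark that a non-vlan URI cannot be shown as a plain id.

```java
/** Hypothetical helper, not part of CloudStack: maps a stored value to what a user might be shown. */
final class VlanDisplay {
    private static final String VLAN_PREFIX = "vlan://";

    private VlanDisplay() {
    }

    /**
     * Returns the bare id for values stored as vlan://x.
     * Bare ids and non-vlan URIs are returned unchanged, because a
     * non-vlan URI has no plain vlan id to present.
     */
    static String toDisplayId(String stored) {
        if (stored == null) {
            return null;
        }
        if (stored.startsWith(VLAN_PREFIX)) {
            return stored.substring(VLAN_PREFIX.length());
        }
        return stored;
    }

    public static void main(String[] args) {
        System.out.println(toDisplayId("vlan://100"));   // stored in URI form
        System.out.println(toDisplayId("100"));          // stored as a bare id
        System.out.println(toDisplayId("vxlan://5000")); // non-vlan URI is left alone
    }
}
```

Whether users should ever see the URI form for non-vlan values is exactly the open question in this message; the sketch simply leaves such values untouched.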
Re: VPC's VR missing public NIC eth1
One clarification: I was not suggesting changing vlan://x back to x, just the case where x==untagged. I had a little analog discussion with Hugo and he convinced me that untagged has no special meaning in SDN cases, maybe for vxlan. So the problem I saw is at least smaller than in my mind. I have committed the db change to update 4.3.0 to 4.4.0. It will need heavy testing. And I didn't extensively look into other tables that need such a change. networks is the likely candidate but there may be others.
On Mon, Jun 2, 2014 at 6:38 PM, Marcus shadow...@gmail.com wrote:
Just to recap... I was trying to review the issue in my head and thought it might be useful to write it down.
In 4.3 we got the BroadcastDomainType enum introduced, and many parts of the code were changed to use that when dealing with the vlan id. This code, among other things, returns a vlan id in URI format, describing both the technology used to provide the virtual lan and the id. Along the way this seems to have caused the value itself to be stored as a URI (still not sure where, by whom, or if it was intentional). That was fine and seemed to work after some fixing, until there was an upgrade done where the existing database value was NOT in URI format. We had a few places where the code was never changed to use BroadcastDomainType to 'normalize' the info from the database (e.g. the IpAssocVpcCommand the mgmt server constructs), so upgrades are broken.
Most places in the code as it is now are working with a live value of 'vlan://x', regardless of whether the database has 'vlan://x' or just 'x'; thanks to this code, they get the same 'vlan://' form for either stored value. For these places it shouldn't matter if we fix the old databases to store 'vlan://x' or the 4.3 installs to go back to 'x'. However, there are a few places that are broken, like this IpAssocVpcCommand the mgmt server creates and CLOUDSTACK-5505.
If we switch the db value back, we have to identify all of the outstanding ones and fix them. In addition, new code since then may have assumed that the db value is 'vlan://', and might not have bothered to pass it through the normalization, so it may break as well. If we had full coverage on the test suite it would be easy to change the value back in the DB of a 4.3 or 4.4 install and see what breaks.
If we don't switch the value back, and instead update old databases to the current way, it fixes the immediate issue but we end up with code doing the same thing in two different ways. Some places will be using the raw db value and other places will be asking for it to be normalized, and both will have the same result, which is kind of messy and prone to causing issues down the road if something changes again to separate these two.
On Mon, Jun 2, 2014 at 10:01 AM, Marcus shadow...@gmail.com wrote:
I'm not sure the KVM code needs to be changed; you're asking it to deal with an inconsistency from the mgmt server. Don't you find it odd that one Command from the mgmt server provides broadcastUri=vlan://untagged and another provides broadcastUri=untagged? I'm not sure I understand why changing 'untagged' into a URI format changes its meaning, but it seems like that doesn't make any sense to you, so perhaps we can break that out into a separate column so that we can still capture the info, if needed.
If we don't like URI format for the vlan id, that's fine, but we need to make changes to the 4.3 installs and fix 4.4. As mentioned, I remember there being a decent amount of work to handle the vlan:// when it was introduced, and that will need to be done again to change it back. I'm not against that, but I'm not going to be the one doing that work, either :-)
On Mon, Jun 2, 2014 at 3:47 AM, Daan Hoogland daan.hoogl...@gmail.com wrote:
I don't think this should be solved this way after all. 'untagged' actually means no vlan, so it should not be prefixed with 'vlan://'. I think the kvm code should be fixed for this, not the generic code.
On Fri, May 30, 2014 at 10:59 PM, Daan Hoogland daan.hoogl...@gmail.com wrote:
On Fri, May 30, 2014 at 10:51 PM, Marcus shadow...@gmail.com wrote:
Looks good to me, aside from the debug statement.
Ah, the first line was not in my line of sight.
-- Daan
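The db change mentioned at the top of this message (rewriting bare ids into the 'vlan://x' form during upgrade) might look roughly like the sketch below. It works on an in-memory list instead of a real table, so the class `VlanColumnUpgradeSketch`, the method `upgradeInPlace`, and the sample values are all hypothetical. It also assumes 'untagged' is converted like any other value, which is precisely the point still under discussion here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical in-memory stand-in for an upgrade step; not the actual CloudStack upgrade code. */
final class VlanColumnUpgradeSketch {
    private VlanColumnUpgradeSketch() {
    }

    /**
     * Rewrites bare values ("100") as "vlan://100" and leaves values that
     * already contain a scheme untouched. Returns how many rows changed.
     * Running it a second time changes nothing, so it is safe to repeat.
     */
    static int upgradeInPlace(List<String> values) {
        int changed = 0;
        for (int i = 0; i < values.size(); i++) {
            String value = values.get(i);
            if (value != null && !value.contains("://")) {
                values.set(i, "vlan://" + value);
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        // Mixed state: a pre-4.3 row, a 4.3 row, and an untagged row (sample data only).
        List<String> rows = new ArrayList<>(Arrays.asList("100", "vlan://200", "untagged"));
        System.out.println("first run changed:  " + upgradeInPlace(rows));
        System.out.println("second run changed: " + upgradeInPlace(rows));
        System.out.println(rows);
    }
}
```

Because the second run touches no rows, the same step could in principle be carried by more than one upgrade path without harm, provided every affected table is covered.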
Re: VPC's VR missing public NIC eth1
Ok, thanks. It seems there are other cases where the Command being passed from the mgmt server has inconsistent broadcastUri as well, this should blanket fix them. In the meantime there's a growing group of 4.3 upgraders who are getting pitchforks out over at CLOUDSTACK-6464, so we may want to have something in 4.3.1 too.
On Tue, Jun 3, 2014 at 12:30 AM, Daan Hoogland daan.hoogl...@gmail.com wrote:
one clarification, I was not suggesting changing vlan://x back to x, just the case where x==untagged. I had a little analog discussion with Hugo and he convinced me that untagged has no special meaning in SDN cases, maybe for vxlan. So the problem I saw is at least smaller then in my mind. I have committed the db change to update 4.3.0 to 4.4.0. It will need heavy testing. And I didn't extensively look into other tables that need such a change. networks is the likely candidate but there may be others.
RE: VPC's VR missing public NIC eth1
I checked in a commit: 5e80e5d33d9a295b91cdba9377f52d9d963d802a, which will fix some of the mess of vlan id.
-Original Message-
From: Marcus [mailto:shadow...@gmail.com]
Sent: Tuesday, June 03, 2014 9:57 AM
To: Daan Hoogland
Cc: dev
Subject: Re: VPC's VR missing public NIC eth1
Ok, thanks. It seems there are other cases where the Command being passed from the mgmt server has inconsistent broadcastUri as well, this should blanket fix them. In the meantime there's a growing group of 4.3 upgraders who are getting pitchforks out over at CLOUDSTACK-6464, so we may want to have something in 4.3.1 too.
Re: VPC's VR missing public NIC eth1
Hmm.. ok. I guess we can apply the band-aid patch as well.

On Tue, Jun 3, 2014 at 12:16 PM, Edison Su edison...@citrix.com wrote:
I checked in a commit, 5e80e5d33d9a295b91cdba9377f52d9d963d802a, which will fix some of the vlan id mess.

-----Original Message-----
From: Marcus [mailto:shadow...@gmail.com]
Sent: Tuesday, June 03, 2014 9:57 AM
To: Daan Hoogland
Cc: dev
Subject: Re: VPC's VR missing public NIC eth1

Ok, thanks. It seems there are other cases where the Command being passed from the mgmt server has an inconsistent broadcastUri as well; this should blanket fix them. In the meantime there's a growing group of 4.3 upgraders who are getting pitchforks out over at CLOUDSTACK-6464, so we may want to have something in 4.3.1 too.

On Tue, Jun 3, 2014 at 12:30 AM, Daan Hoogland daan.hoogl...@gmail.com wrote:
One clarification: I was not suggesting changing vlan://x back to x, just the case where x==untagged. I had a little analog discussion with Hugo and he convinced me that untagged has no special meaning in SDN cases, maybe for vxlan. So the problem I saw is at least smaller than it was in my mind. I have committed the db change to update 4.3.0 to 4.4.0. It will need heavy testing, and I didn't look extensively into other tables that need such a change. networks is the likely candidate, but there may be others.
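As an illustration of the kind of database change discussed above, here is a minimal sketch of an upgrade step that rewrites bare vlan ids into URI form and is safe to run more than once. It is not the committed upgrade code. It assumes a table named vlan with a text column vlan_id (both names are mentioned elsewhere in this thread), uses an in-memory SQLite database as a stand-in for the real one, and the function name and the include_untagged switch are hypothetical. Whether 'untagged' should be rewritten at all is the open question in this thread, so the sketch leaves that to the caller.

```python
import sqlite3


def normalize_vlan_ids(conn, include_untagged=False):
    """Rewrite bare ids ('500') as URIs ('vlan://500'); returns rows changed.

    Rows that already contain '://' are skipped, so a second run changes nothing.
    """
    sql = (
        "UPDATE vlan SET vlan_id = 'vlan://' || vlan_id "
        "WHERE vlan_id NOT LIKE '%://%'"
    )
    if not include_untagged:
        # Leave 'untagged' alone unless the caller explicitly asks otherwise.
        sql += " AND vlan_id <> 'untagged'"
    cursor = conn.execute(sql)
    conn.commit()
    return cursor.rowcount


# Stand-in database with one bare id, one URI, and one untagged row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vlan (id INTEGER PRIMARY KEY, vlan_id TEXT)")
conn.executemany(
    "INSERT INTO vlan (vlan_id) VALUES (?)",
    [("500",), ("vlan://501",), ("untagged",)],
)

print(normalize_vlan_ids(conn))  # first run rewrites only the bare '500' row
print(normalize_vlan_ids(conn))  # second run finds nothing left to rewrite
print([row[0] for row in conn.execute("SELECT vlan_id FROM vlan ORDER BY id")])
```

Note that `||` is string concatenation in SQLite; on MySQL with default settings the same statement would need CONCAT('vlan://', vlan_id) instead.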
Re: VPC's VR missing public NIC eth1
I don't think this should be solved this way after all. 'untagged' actually means no vlan, so it should not be prepended with 'vlan://'. I think the KVM code should be fixed for this, not the generic code.

On Fri, May 30, 2014 at 10:59 PM, Daan Hoogland daan.hoogl...@gmail.com wrote: On Fri, May 30, 2014 at 10:51 PM, Marcus shadow...@gmail.com wrote: Looks good to me, aside from the debug statement. Ah, the first line was not in my line of sight. -- Daan -- Daan
Re: VPC's VR missing public NIC eth1
I'm not sure the KVM code needs to be changed; you're asking it to deal with an inconsistency from the mgmt server. Don't you find it odd that one Command from the mgmt server provides broadcastUri=vlan://untagged and another provides broadcastUri=untagged?

I'm not sure I understand why changing 'untagged' into a URI format changes its meaning, but it seems like that doesn't make any sense to you, so perhaps we can break that out into a separate column so that we can still capture the info, if needed.

If we don't like URI format for the vlan id, that's fine, but we need to make changes to the 4.3 installs and fix 4.4. As mentioned, I remember there being a decent amount of work to handle the vlan:// when it was introduced, and that will need to be done again to change it back. I'm not against that, but I'm not going to be the one doing that work, either :-)
Re: VPC's VR missing public NIC eth1
Just to recap... I was trying to review the issue in my head and thought it might be useful to write it down.

In 4.3 we got the BroadcastDomainType enum introduced, and many parts of the code were changed to use that when dealing with the vlan id. This code, among other things, returns a vlan id in URI format, describing both the technology used to provide the virtual lan and the id. Along the way this seems to have caused the value itself to be stored as a URI (still not sure where, by whom, or if it was intentional).

That was fine and seemed to work after some fixing, until there was an upgrade done where the existing database value was NOT in URI format. We had a few places where the code was never changed to use BroadcastDomainType to 'normalize' the info from the database (e.g. the IpAssocVpcCommand the mgmt server constructs), so upgrades are broken.

Most places in the code as it is now are working with a live value of 'vlan://x', regardless of whether the database has 'vlan://x' or just 'x'; thanks to this code, they get the same 'vlan://x' for either stored value. For these places it shouldn't matter whether we fix the old databases to store 'vlan://x' or the 4.3 installs to go back to 'x'. However, there are a few places that are broken, like this IpAssocVpcCommand the mgmt server creates, and CLOUDSTACK-5505.

If we switch the db value back, we have to identify all of the outstanding ones and fix them. In addition, new code since then may have assumed that the db value is 'vlan://x' and might not have bothered to pass it through the normalization, so it may break as well. If we had full coverage in the test suite it would be easy to change the value back in the DB of a 4.3 or 4.4 install and see what breaks.

If we don't switch the value back, and instead update old databases to the current way, it fixes the immediate issue but we end up with code doing the same thing in two different ways. Some places will be using the raw db value and other places will be asking for it to be normalized, and both will have the same result, which is kind of messy and prone to causing issues down the road if something changes again to separate these two.
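To make the 'normalize' step in the recap concrete, here is a small sketch of the idea: a helper that returns the same URI whether the stored value is 'x' or 'vlan://x'. This is not CloudStack's BroadcastDomainType code; the function name, the default scheme argument, and the decision to treat 'untagged' like any other value are assumptions made only for illustration.

```python
def normalize_broadcast_uri(value, scheme="vlan"):
    """Return value in 'scheme://id' form, leaving existing URIs untouched."""
    if value is None:
        return None
    value = value.strip()
    if "://" in value:
        # Already a URI (vlan://500, vxlan://42, ...): keep it as stored.
        return value
    return f"{scheme}://{value}"


# An old-style and a new-style database value end up identical after normalizing,
# so code that always normalizes does not care which form is stored.
old_style = normalize_broadcast_uri("500")
new_style = normalize_broadcast_uri("vlan://500")
print(old_style, new_style, old_style == new_style)

# The inconsistency discussed in the thread: a code path that skips this step
# sends 'untagged', while one that applies it sends 'vlan://untagged'.
print(normalize_broadcast_uri("untagged"))
```

The point of the sketch is the failure mode rather than the helper itself: any caller that reads the raw database value and skips the helper will disagree with callers that use it as soon as the stored form differs from the URI form.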
Re: VPC's VR missing public NIC eth1
Hi Andrija,

Daan asked me to have a look at this as well. Looking at your issue, I recall having seen something similar. Back then, when upgrading 4.2.1 to 4.3, I thought it had to do with our own custom-built SVM template. Let me fire off some questions before explaining what the cause was in our case. :)

- What hypervisor (and version) are you using?
- If XS, is the new VR a para-virtualised instance (PV) or hardware assisted (HVM)? Do a xe vm-param-list on the VR uuid and check that param PV-args is set and HVM-boot-policy is unset.
- What is the OS type of the VR in ACS (guest_os_id in the vm_instance table, matched against the guest_os table)?
- What is the OS type of the SVM template?

Now for the explaining. :) In our case the OS type of the new template was not supported on the XenServer version we are running. Therefore the VR was started by XS as an HVM guest. System VMs on XS rely on the arguments passed to them in the PV-args param (which end up on the guest in /var/cache/cloud/cmdline, which in turn is used by cloud-early-config) in order to work. cmdline contains the NIC configuration information. So, long story short, if a VR gets started as an HVM guest it will not get the information needed to configure its NICs.

Workaround: We corrected the os_type_id in the DB (yes, I know editing the DB is something you usually don't want to do, but there is no other way in this case) of the existing VRs and of the systemvmtemplate to something supported by XenServer.

Kind regards,
Joris van Lieshout
Schuberg Philis

On 29/05/14 12:18, Andrija Panic andrija.pa...@gmail.com wrote:
They are 2 traffic types on 1 physical net (that is, both tagged vlan 500 and untagged packets travel over the same KVM bridge, and over eth1 to the outside world)...

On 29 May 2014 12:04, Daan Hoogland daan.hoogl...@gmail.com wrote:
Are these two traffic types in one physical net? or two physical nets on the same interface (seems wrong).
On Thu, May 29, 2014 at 11:35 AM, Jayapal Reddy Uradi jayapalreddy.ur...@citrix.com wrote: I don't think editing DB table will work. -Jayapal On 29-May-2014, at 2:52 PM, Andrija Panic andrija.pa...@gmail.com wrote: It's like this: I have public subnet /24. half is dedicated for Guest traffic (vlan 500) and the second half is dedicated to Public traffic/network (no vlan tags, that is untagged packets) Both vlan500 and untagged packets travel over physical eth1 interface on hypervisors and can reach Internet. Thanks, On 29 May 2014 11:06, Daan Hoogland daan.hoogl...@gmail.com wrote: On Thu, May 29, 2014 at 10:57 AM, Andrija Panic andrija.pa...@gmail.com wrote: 500 is 500 the vlan of your guestnetwork or your physical network? You wouldn't want to have two nets with vlan 500! -- Daan -- Andrija Panić -- http://admintweets.com -- -- Daan -- Andrija Panić -- http://admintweets.com --
Re: VPC's VR missing public NIC eth1
Hi Joris,

thank you for taking time to address this issue :) So...:

- I'm on KVM (stock CentOS 6.2 patched by Inktank for CEPH support), OS is CentOS 6.5, libvirt 1.2.3 compiled.
- ACS 4.3 is having problems; ACS 4.2.1 was fine.
- Not XS, so I guess no answers for this part :)
- guest_os_id is 184 = Debian 7 x64
- SVM = systemvm-kvm-4.3 = os type 184 = Debian 7 x64

This worked previously on 4.2.1; the template was of course systemvm-kvm-4.2, but that was also Debian 7 x64 type, so this should not be the issue (guest not supported by host...).

The only thing that might be out of standard: all SVMs are on CEPH. There are official docs on altering the database to make some new System Offering the default for SSVM and CPVM. I have also done the same config in the DB to make the VR use another System Offering as default, which is NOT explained in the docs (per the docs, you could use the Change Offering... button on an existing, shut-down VR to change it). But still, this all worked fine on 4.2.1...

Regarding /var/cache/cloud/cmdline, the content is the following at the moment:

root@r-801-VM:~# cat /var/cache/cloud/cmdline
vpccidr=10.0.0.0/8 domain=cscloud.internal dns1=8.8.8.8 dns2= template=domP name=r-801-VM eth0ip=169.254.0.75 eth0mask=255.255.0.0 type=vpcrouter disable_rp_filter=true

Also please note that only eth1 does not have IP info; eth0 (control 169.xxx) and all other NICs, eth2 and up, that are used for Tiers get IP info fine. I could also manually add an IP for eth1 (public NIC) and run ifup eth1, and it works fine, but adding a new IP, Port Forwarding etc. does not work...

Daan or somebody said it could be related to my Public network (in the Zones, Physical Network, eth1 listing) being NOT tagged (vlan://untagged)... Interestingly, the only VR that does work fine is the VR used in the Shared network, but that VR is using an IP from the Guest IP range (also effectively public IPs, but on vlan 500).

I was instructed to try to change the Public IP range from untagged to vlan 500, but I'm not sure how to do this, if there is any way at all (editing the vlan table and changing to vlan 500 does not work, after rebooting the VR from the ACS GUI). :)

So, not sure what the rough expected date for 4.4 is, but right now I'm pretty stuck with a big problem of all VPCs not being operational at all...

Thanks,
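For anyone wanting to check a router's /var/cache/cloud/cmdline the same way, here is a minimal sketch that splits the key=value pairs and lists which ethN entries carry an address. The helper names are made up for this example, and the assumption that NIC addresses appear as ethNip keys is taken only from the eth0ip entry in the output pasted above.

```python
import re

# The cmdline content quoted in the message above.
CMDLINE = (
    "vpccidr=10.0.0.0/8 domain=cscloud.internal dns1=8.8.8.8 dns2= "
    "template=domP name=r-801-VM eth0ip=169.254.0.75 eth0mask=255.255.0.0 "
    "type=vpcrouter disable_rp_filter=true"
)


def parse_cmdline(text):
    """Split space-separated key=value tokens into a dict (empty values kept)."""
    return dict(token.split("=", 1) for token in text.split() if "=" in token)


def nics_with_ip(params):
    """Return the sorted NIC indexes that have a non-empty ethNip entry."""
    found = set()
    for key, value in params.items():
        match = re.fullmatch(r"eth(\d+)ip", key)
        if match and value:
            found.add(int(match.group(1)))
    return sorted(found)


params = parse_cmdline(CMDLINE)
print(params["type"])        # vpcrouter
print(nics_with_ip(params))  # [0]
```

In the pasted cmdline only eth0 carries address information, which matches the observation that the public NIC eth1 has no IP configured from this file.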
Re: VPC's VR missing public NIC eth1
Andrija,

Do not just assign a second net vlan://500. You have one like that, and you don't want conflicting nets using the same vlan. I am wondering why 'untagged' comes out as 'vlan://untagged'. I think that is the bug. Did you find the string 'vlan://untagged' in your db?
Re: VPC's VR missing public NIC eth1
Hi Andrija,

Thanks for the answers. Indeed your situation is different, so PV/HVM is not the issue. When reading back the log output you have provided, I noted that the VR messages log indicates that it's waiting for ethnull to be up. This raises the question of where null was introduced instead of 1. The ACS management log output you sent was, I think, from later down the road, where ACS gives up waiting for the VR to come up. If you capture the job-executor in the management log from startCommand till the exception, do you see a mention of ethnull anywhere? You might need to read into the DirectAgent executing the startCommand to find a clue. The thing is that I only have experience with XS-based environments, so I cannot point you to the exact output to look for. On XS, at least, it is:

[c.c.h.x.r.CitrixResourceBase] (DirectAgent-351:ctx-4a51bb9e) Created a vif e4c362bd-764b-f651-dc9a-1abd5cb33c43 on 1

Kind regards,
Joris van Lieshout
Schuberg Philis

On 30/05/14 10:48, Andrija Panic andrija.pa...@gmail.com wrote:
Hi Daan, no, in the DB there is a field vlan_id with the value untagged; that vlan://untagged is shown in the ACS GUI, and is used in the API call (or better said, in the commands that are seen in the management server logs). Best, Andrija
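The log check suggested above (follow one job from its start command to the exception and look for ethnull) can be sketched roughly as follows. The marker strings "StartCommand" and "Exception", the function name, and the sample lines are all assumptions for illustration; the sample lines are placeholders and not real management-server output, so adjust the markers to whatever the actual log contains.

```python
def null_nic_mentions(lines, thread_tag, start="StartCommand", stop="Exception"):
    """Return (line_number, text) for 'ethnull' mentions inside one job's window.

    Only lines containing thread_tag are considered. The window opens at the
    first such line containing `start` and closes at the first containing `stop`.
    """
    hits = []
    inside = False
    for number, line in enumerate(lines, start=1):
        if thread_tag not in line:
            continue
        if not inside and start in line:
            inside = True
        if inside and "ethnull" in line:
            hits.append((number, line.rstrip()))
        if inside and stop in line:
            break
    return hits


# Placeholder lines standing in for a management log (not real output).
sample = [
    "(Job-Executor-77:ctx-a) placeholder: StartCommand sent",
    "(Job-Executor-78:ctx-b) placeholder: unrelated ethnull mention",
    "(Job-Executor-77:ctx-a) placeholder: waiting for ethnull",
    "(Job-Executor-77:ctx-a) placeholder: Exception raised",
    "(Job-Executor-77:ctx-a) placeholder: ethnull after the window",
]

for number, text in null_nic_mentions(sample, "Job-Executor-77:"):
    print(number, text)
```

Including the trailing colon in the tag ("Job-Executor-77:") avoids also matching other executors whose number merely starts with the same digits.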
Re: VPC's VR missing public NIC eth1
Hi Joris,

just to be sure: do you want me to capture the log from the moment I reboot the router, or do you want me to stop it, then start capturing the log, and start it (and continue capturing until the ethnull errors inside the VR)?

Thanks,
Re: VPC's VR missing public NIC eth1
Hi Andrija, Just the start of the VR should be sufficient. Kind regards, Joris van Lieshout Schuberg Philis Boeingavenue 271 1119 PD Schiphol-Rijk schubergphilis.com +31 20-7506672 +31 6-51428188
Re: VPC's VR missing public NIC eth1
Hi Joris,

here is the management log: http://pastebin.com/zxnKxFhk

The interesting parts (to me) are in bold:

2014-05-30 13:56:21,899 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-Executor-77:ctx-ec3d358e ctx-f35b12af) copyAsync inspecting src type TEMPLATE copyAsync inspecting dest type VOLUME

2014-05-30 13:56:21,905 DEBUG [c.c.a.t.Request] (Job-Executor-77:ctx-ec3d358e ctx-f35b12af) Seq 4-1248669612: Sending { Cmd , MgmtId: 161344838950, via: 4(cs2.x.net), Ver: v1, Flags: 100011, [{org.apache.cloudstack.storage.command.CopyCommand:{srcTO:{org.apache.cloudstack.storage.to.TemplateObjectTO:{path:1adc1d2e-56ae-4a0f-b0b4-5e351e7cae55,origUrl:http://download.cloud.com/templates/4.3/systemvm64template-2014-01-14-master-kvm.qcow2.bz2,uuid:1adc1d2e-56ae-4a0f-b0b4-5e351e7cae55,id:414,format:QCOW2,accountId:2,checksum:85a1bed07bf43cbf022451cb2ecae4ff,*hvm:true*,displayText:systemvm-kvm-4.3,imageDataStore:{org.apache.cloudstack.storage.to.PrimaryDataStoreTO:{uuid:5b93422e-1a66-353d-88a8-2203f79b1dc6,id:209,poolType:RBD,host:cephmon.x.net,path:cloudstack,port:6789,url:RBD://cephmon.x.net/cloudstack/?ROLE=PrimarySTOREUUID=5b93422e-1a66-353d-88a8-2203f79b1dc6}},name:414-2-ec331e74-5858-3153-91a9-1d706d9c533e,hypervisorType:KVM}},destTO:{org.apache.cloudstack.storage.to.VolumeObjectTO:{uuid:9c440d3b-cba5-4960-b8bf-dca90291cd2b,volumeType:ROOT,dataStore:{org.apache.cloudstack.storage.to.PrimaryDataStoreTO:{uuid:5b93422e-1a66-353d-88a8-2203f79b1dc6,id:209,poolType:RBD,host:cephmon.x.net,path:cloudstack,port:6789,url:RBD://cephmon.x.net/cloudstack/?ROLE=PrimarySTOREUUID=5b93422e-1a66-353d-88a8-2203f79b1dc6}},name:ROOT-801,size:262144,volumeId:1064,vmName:r-801-VM,accountId:11,format:RAW,id:1064,deviceId:0,hypervisorType:KVM}},executeInSequence:false,options:{},wait:0}}] }

2014-05-30 13:56:23,742 DEBUG [c.c.a.t.Request] (AgentManager-Handler-12:null) Seq 4-1248669612: Processing: { Ans: , MgmtId: 161344838950, via: 4, Ver: v1, Flags: 10, [{org.apache.cloudstack.storage.command.CopyCmdAnswer:{newData:{org.apache.cloudstack.storage.to.VolumeObjectTO:{size:262144,path:9c440d3b-cba5-4960-b8bf-dca90291cd2b,accountId:0,format:RAW,id:0}},result:true,wait:0}}] }

2014-05-30 13:56:23,742 DEBUG [c.c.a.t.Request] (Job-Executor-77:ctx-ec3d358e ctx-f35b12af) Seq 4-1248669612: Received: { Ans: , MgmtId: 161344838950, via: 4, Ver: v1, Flags: 10, { CopyCmdAnswer } }

2014-05-30 13:56:23,773 DEBUG [c.c.n.r.VpcVirtualNetworkApplianceManagerImpl] (Job-Executor-77:ctx-ec3d358e ctx-f35b12af) Removing nic NicProfile[ *1092-801-null*-46.232.xxx.246-vlan://untagged of type Public from the nics passed on vm start. The nic will be plugged later

2014-05-30 13:56:23,773 DEBUG [c.c.n.r.VpcVirtualNetworkApplianceManagerImpl] (Job-Executor-77:ctx-ec3d358e ctx-f35b12af) Removing nic NicProfile[1093-801-cd9fd29a-0573-4715-8742-00ecb9f82c9d-10.0.1.1-vlan://44 of type Guest from the nics passed on vm start. The nic will be plugged later

2014-05-30 13:56:23,773 DEBUG [c.c.n.r.VpcVirtualNetworkApplianceManagerImpl] (Job-Executor-77:ctx-ec3d358e ctx-f35b12af) Removing nic NicProfile[1094-801-cd9fd29a-0573-4715-8742-00ecb9f82c9d-10.0.3.1-vlan://43 of type Guest from the nics passed on vm start. The nic will be plugged later

2014-05-30 13:56:23,773 DEBUG [c.c.n.r.VpcVirtualNetworkApplianceManagerImpl] (Job-Executor-77:ctx-ec3d358e ctx-f35b12af) Removing nic NicProfile[1095-801-cd9fd29a-0573-4715-8742-00ecb9f82c9d-10.0.4.1-vlan://3004 of type Guest from the nics passed on vm start. The nic will be plugged later

2014-05-30 13:56:23,773 DEBUG [c.c.n.r.VpcVirtualNetworkApplianceManagerImpl] (Job-Executor-77:ctx-ec3d358e ctx-f35b12af) Removing nic NicProfile[1095-801-cd9fd29a-0573-4715-8742-00ecb9f82c9d-10.0.4.1-vlan://3004 of type Guest from the nics passed on vm start

Thanks,
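For anyone repeating this check on a longer management log, the NicProfile entries can be pulled out mechanically. The following is a minimal Python sketch, not part of CloudStack: the field layout (nic id, VM id, a third field that is either `null` or a UUID, IP, broadcast URI) is an assumption read off the log lines in this message, and the function names are made up for illustration.

```python
import re

# Assumed layout, inferred only from the log lines in this thread:
#   NicProfile[<nic id>-<vm id>-<null or UUID>-<ip>-<broadcast uri> of type <kind>
# The optional "*" allows for the bold markers added in the mail.
NIC_RE = re.compile(
    r"NicProfile\[\s*\*?(?P<nic_id>\d+)-(?P<vm_id>\d+)-"
    r"(?P<third>null|[0-9a-f-]{36})\*?-"
    r"(?P<ip>[0-9x.]+)-(?P<uri>\S+) of type (?P<kind>\w+)"
)


def parse_nic_profiles(log_text):
    """Return one dict per NicProfile entry found in the log text."""
    return [m.groupdict() for m in NIC_RE.finditer(log_text)]


def nics_with_null_field(log_text):
    """Return only the entries whose third field is the literal 'null'."""
    return [n for n in parse_nic_profiles(log_text) if n["third"] == "null"]


# Two lines copied from the log above (IP anonymised as in the thread).
SAMPLE = (
    "Removing nic NicProfile[ *1092-801-null*-46.232.xxx.246-vlan://untagged "
    "of type Public from the nics passed on vm start.\n"
    "Removing nic NicProfile[1093-801-cd9fd29a-0573-4715-8742-00ecb9f82c9d-"
    "10.0.1.1-vlan://44 of type Guest from the nics passed on vm start.\n"
)

if __name__ == "__main__":
    for nic in nics_with_null_field(SAMPLE):
        print(nic["nic_id"], nic["kind"], nic["uri"])
```

On the sample, only the Public nic 1092 is reported. Whether the `null` in that position is actually related to the ethnull problem is not established here; the sketch only makes the pattern easy to spot.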
Re: VPC's VR missing public NIC eth1
Hi Andrija,

Bold formatting does not come through on the dev list. :) But you might need a bit more info. At a certain point I see this line:

2014-05-30 13:56:23,935 DEBUG [c.c.a.t.Request] (Job-Executor-77:ctx-ec3d358e ctx-f35b12af) Seq 1-609104082: Sending { Cmd , MgmtId: 161344838950, via: 1(cs1.x.net), Ver: v1, Flags: 100111, [{com.cloud.agent.api.StartCommand:{vm:{id:801,name:r-801-VM...

This is where the information is passed on to the agent handlers. For XS this would initiate an agent handler on the management server, but for KVM, if I remember correctly, it passes the command on to the CloudStack agent service on the hypervisor. Can you check the cloud service log on the KVM hypervisor executing the request? It's this server, cs1.x.net; search top down for 609104082 in the log. See if you can provide the log from the agent handler thread started by that sequence.

Kind regards,
Joris van Lieshout
Schuberg Philis
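The search for a sequence number described in this message can be sketched in a few lines of Python. This is only an illustration: the helper name is invented, and the second and third sample lines are hypothetical filler, not real agent output. The one detail worth noting is matching the number as a whole, so that a longer number containing the same digits is not picked up.

```python
import re


def lines_for_sequence(log_lines, seq):
    """Return the lines mentioning `seq` as a complete number.

    The lookarounds reject matches embedded in a longer run of digits.
    """
    pattern = re.compile(r"(?<!\d)" + re.escape(str(seq)) + r"(?!\d)")
    return [line for line in log_lines if pattern.search(line)]


SAMPLE_LOG = [
    # Copied from the management log quoted in this thread (truncated there too).
    "2014-05-30 13:56:23,935 DEBUG [c.c.a.t.Request] (Job-Executor-77:ctx-ec3d358e "
    "ctx-f35b12af) Seq 1-609104082: Sending { Cmd , MgmtId: 161344838950, "
    "via: 1(cs1.x.net), Ver: v1, Flags: 100111, ...",
    # Hypothetical lines, present only to show what is filtered out.
    "hypothetical line for another request, Seq 1-6091040820: Sending ...",
    "hypothetical unrelated line",
]

if __name__ == "__main__":
    for line in lines_for_sequence(SAMPLE_LOG, 609104082):
        print(line)
```

To run it against a real file, pass an open file object instead of the list, for example `lines_for_sequence(open(path), 609104082)`, where `path` is wherever the agent log lives on the hypervisor (the location is installation-specific and not stated in this thread).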
Re: VPC's VR missing public NIC eth1
Hi Joris,

I have turned on DEBUG logging in agent.log on the cs1.xxx/net host.

So, management logs again: http://pastebin.com/F6BRf7Y9
Agent logs on cs1.xxx: http://pastebin.com/BJauKbaC

I'm not trying to play smart, but there is an error:

[kvm.resource.KVMGuestOsMapper] (agentRequest-Handler-3:) Can't find the mapping of guest os: Debian GNU/Linux 7(64-bit)

Best,
Andrija
Re: VPC's VR missing public NIC eth1
Hi Andrija,

That does sound familiar, and in the start XML of KVM you can see <type arch='x86_64' machine='pc'>hvm</type>. I don't know KVM+ACS well enough to judge if this is the cause, but I think focusing on getting the VR started as a PV guest might be worth trying. On the other hand, I do see patchviasocket.pl being executed successfully...

The other thing I see, and now we're getting into Java code, is this:

2014-05-30 14:41:01,386{GMT} DEBUG [kvm.resource.BridgeVifDriver] (agentRequest-Handler-3:) nic=[Nic:Public-46.232.xxx.246-vlan://untagged]

2014-05-30 14:41:01,502{GMT} DEBUG [cloud.agent.Agent] (agentRequest-Handler-3:) Processing command: com.cloud.agent.api.routing.IpAssocVpcCommand

2014-05-30 14:41:01,506{GMT} DEBUG [resource.virtualnetwork.VirtualRoutingResource] (agentRequest-Handler-3:) Executing: /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh vpc_ipassoc.sh 169.254.0.52 -A -l 46.232.xxx.246 -c ethnull -g 46.232.xxx.1 -m 24 -n 46.232.xxx.0

My suspicion is that somewhere in the translation from the nic profile to the actual router_proxy.sh command, ACS fails to find the nic id and returns null. Let me dig a bit deeper and see what I can find, but this is where we might need some help from someone with knowledge of this piece of the code. :)

Kind regards,
Joris van Lieshout
Schuberg Philis
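The `-c ethnull` argument in the router_proxy.sh command is the visible symptom. In Java, concatenating a string with a null reference yields the text "null", so a device name built as "eth" plus a missing device id would come out exactly like this; that this is what happens inside ACS is an assumption, not something confirmed in the thread. The Python sketch below only checks a logged command line for a malformed interface argument; the helper names and the `eth<number>` rule are illustrative assumptions.

```python
import re
import shlex


def interface_arg(command_line, flag="-c"):
    """Return the value following `flag` in a logged command line, or None."""
    tokens = shlex.split(command_line)
    for current, following in zip(tokens, tokens[1:]):
        if current == flag:
            return following
    return None


def is_valid_device(name):
    """Assumed rule: a usable device name looks like eth0, eth1, ..."""
    return name is not None and re.fullmatch(r"eth\d+", name) is not None


# Command copied from the agent log quoted in this message.
LOGGED_COMMAND = (
    "/usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh "
    "vpc_ipassoc.sh 169.254.0.52 -A -l 46.232.xxx.246 -c ethnull "
    "-g 46.232.xxx.1 -m 24 -n 46.232.xxx.0"
)

if __name__ == "__main__":
    device = interface_arg(LOGGED_COMMAND)
    print(device, "valid" if is_valid_device(device) else "malformed")
```

A check like this is handy when scanning many agent logs after an upgrade, since it flags every command where the device id was lost before the script was invoked.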
Re: VPC's VR missing public NIC eth1
OK, thanks Joris.

I will try playing with the OS version option on the systemvm-kvm-4.3 template... Let me know if I can help with anything more.

Thanks.
Andrija
Re: VPC's VR missing public NIC eth1
I've read back a bit in the code, and if you look at BridgeVifDriver.java (this is where the log message with the nic profile is generated) you can see that the nic information might already be off once ACS hits the LibvirtVMDef.InterfaceDef plug function. This leads me to believe that the HVM/PV OS mismatch issue might still be related. Try fixing that first. At least it will allow us to exclude this from the list.

Kind regards,
Joris van Lieshout
Schuberg Philis
Re: VPC's VR missing public NIC eth1
Joris, do you have a recommendation on what in particular to try? I'm not sure how to fix that, except by editing the systemvm-4.3 template to define it as another OS type...?

Thanks again,
Andrija

On 30 May 2014 15:30, Joris van Lieshout jvanliesh...@schubergphilis.com wrote:

I've read back a bit in the code and if you look at BridgeVifDriver.java (this is where the log message with the nic profile is generated) you can see that the nic information might be off already once ACS hits the LibvirtVMDef.InterfaceDef plug function. This leads me to believe that the HVM/PV OS mismatch issue might still be related. Try fixing that first. At least it will allow us to exclude this from the list.

Kind regards,
Joris van Lieshout
Schuberg Philis

On 30/05/14 15:26, Andrija Panic andrija.pa...@gmail.com wrote:

OK, thanks Joris. I will try playing with the OS version option on the systemvm-kvm-4.3 template... Let me know if I can help with anything more.

Thanks.
Andrija

On 30 May 2014 15:19, Joris van Lieshout jvanliesh...@schubergphilis.com wrote:

Hi Andrija,

That does sound familiar, and in the start XML of KVM you can see <type arch='x86_64' machine='pc'>hvm</type>. I don't know KVM+ACS well enough to judge if this is the cause, but I think focusing on getting the VR started as a PV guest might be worth trying. On the other hand I do see patchviasocket.pl being executed successfully...

The other thing I see, and now we're getting into Java code, is this:

2014-05-30 14:41:01,386{GMT} DEBUG [kvm.resource.BridgeVifDriver] (agentRequest-Handler-3:) nic=[Nic:Public-46.232.xxx.246-vlan://untagged]
2014-05-30 14:41:01,502{GMT} DEBUG [cloud.agent.Agent] (agentRequest-Handler-3:) Processing command: com.cloud.agent.api.routing.IpAssocVpcCommand
2014-05-30 14:41:01,506{GMT} DEBUG [resource.virtualnetwork.VirtualRoutingResource] (agentRequest-Handler-3:) Executing: /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh vpc_ipassoc.sh 169.254.0.52 -A -l 46.232.xxx.246 -c ethnull -g 46.232.xxx.1 -m 24 -n 46.232.xxx.0

My suspicion is that somewhere in the translation from the nic profile to the actual router_proxy.sh command ACS fails to find the nic id and returns null. Let me dig a bit deeper and see what I can find, but this is where we might need some help from someone with knowledge of this piece of the code. :)

Kind regards,
Joris van Lieshout
Schuberg Philis
Boeingavenue 271 1119 PD Schiphol-Rijk
schubergphilis.com
+31 20-7506672
+31 6-51428188

On 30/05/14 14:49, Andrija Panic andrija.pa...@gmail.com wrote:

Hi Joris,

I have turned on DEBUG logging in agent.log on cs1.xxx/net host:

So, management logs again: http://pastebin.com/F6BRf7Y9
Agent logs on cs1.xxx: http://pastebin.com/BJauKbaC

Not playing smart, but there is some error:

[kvm.resource.KVMGuestOsMapper] (agentRequest-Handler-3:) Can't find the mapping of guest os: Debian GNU/Linux 7(64-bit)

Best,
Andrija

On 30 May 2014 14:26, Joris van Lieshout jvanliesh...@schubergphilis.com wrote:

Hi Andrija,

Bold formatting does not come through on the dev list. :) But u might need a bit more info. At a certain point I see this line:

2014-05-30 13:56:23,935 DEBUG [c.c.a.t.Request] (Job-Executor-77:ctx-ec3d358e ctx-f35b12af) Seq 1-609104082: Sending { Cmd , MgmtId: 161344838950, via: 1(cs1.x.net), Ver: v1, Flags: 100111, [{com.cloud.agent.api.StartCommand:{vm:{id:801,name:r-801-VM. ..

This is where the information is passed on to the agent handlers. For XS this would initiate an agent handler on the management server, but for KVM, if I remember correctly, it passes the command on to the cloudstack agent service on the hypervisor. Can you check the cloud service log on the KVM hypervisor executing the request? It's this server, cs1.x.net; search top down for 609104082 in the log. See if you can provide the log from the agent handler thread started by that sequence.

Kind regards,
Joris van Lieshout
Schuberg Philis
Boeingavenue 271 1119 PD Schiphol-Rijk
schubergphilis.com
+31 20-7506672
+31 6-51428188

On 30/05/14 14:08, Andrija Panic andrija.pa...@gmail.com wrote:

Hi Joris,

here is the management log: http://pastebin.com/zxnKxFhk

Interesting parts (to me): in bold

2014-05-30 13:56:21,899 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-Executor-77:ctx-ec3d358e ctx-f35b12af) copyAsync inspecting src type TEMPLATE copyAsync inspecting dest type VOLUME
2014-05-30 13:56:21,905 DEBUG [c.c.a.t.Request] (Job-Executor-77:ctx-ec3d358e ctx-f35b12af) Seq 4-1248669612: Sending { Cmd , MgmtId: 161344838950, via: 4(cs2.x.net), Ver: v1, Flags: 100011,
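The "search top down for 609104082" step suggested above can be sketched as a small filter. This is only a minimal sketch, assuming plain-text log lines; the function name and the sample lines are hypothetical (loosely modelled on the excerpts in this thread), and the actual agent log location on the host is not given here.

```python
def lines_for_sequence(log_lines, seq):
    """Return the log lines, in file order, that mention the given sequence number."""
    needle = str(seq)
    return [line for line in log_lines if needle in line]


# Hypothetical sample lines; a real run would read them from the agent log file.
sample = [
    "2014-05-30 13:56:21,905 DEBUG [c.c.a.t.Request] Seq 4-1248669612: Sending ...",
    "2014-05-30 13:56:23,935 DEBUG [c.c.a.t.Request] Seq 1-609104082: Sending ...",
    "2014-05-30 13:56:24,100 DEBUG [cloud.agent.Agent] Request:Seq 1-609104082: ...",
]

matches = lines_for_sequence(sample, 609104082)
for line in matches:
    print(line)
```

A plain substring match is enough for a first pass; the handler thread name found on the matching lines can then be used to pull the rest of that thread's output.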
Re: VPC's VR missing public NIC eth1
Andrija,

The thing is, I don't know how the OS matching on KVM works. There must be a way to list the supported OS types. I also did some querying on the guest_os_hypervisor table (ACS 4.3) and I don't see Debian 7 listed for KVM.

select * from guest_os join guest_os_hypervisor on guest_os.id=guest_os_hypervisor.guest_os_id where guest_os_hypervisor.hypervisor_type='KVM' and guest_os_hypervisor.guest_os_name like '%Debian%';

What was the guest_os_id you were using? Could you try id 72 (Debian 5 64-bit)? Adjust os_type_id in both vm_instance and vm_template (where type='SYSTEM' and hypervisor_type='KVM').

Kind regards,
Joris van Lieshout
Schuberg Philis

On 30/05/14 15:33, Andrija Panic andrija.pa...@gmail.com wrote:

Joris, do you have a recommendation on what in particular to try? I'm not sure how to fix that, except by editing the systemvm-4.3 template to define it as another OS type...?

Thanks again,
Andrija
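To illustrate what the join in this message checks, here is a minimal sketch that runs the same query against an in-memory SQLite database. The table layout is a toy assumption: only the columns named in the query are modelled, the display_name column and the guest_os_name value are hypothetical, and the ids 72 and 184 are the ones mentioned in this thread. It is not the real CloudStack schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE guest_os (id INTEGER PRIMARY KEY, display_name TEXT);
CREATE TABLE guest_os_hypervisor (
    guest_os_id INTEGER, hypervisor_type TEXT, guest_os_name TEXT);
INSERT INTO guest_os VALUES (72, 'Debian GNU/Linux 5(64-bit)');
INSERT INTO guest_os VALUES (184, 'Debian GNU/Linux 7(64-bit)');
-- Toy data: only the Debian 5 row has a KVM mapping.
INSERT INTO guest_os_hypervisor VALUES (72, 'KVM', 'Debian GNU/Linux 5');
""")

rows = conn.execute("""
SELECT guest_os.id, guest_os_hypervisor.guest_os_name
FROM guest_os
JOIN guest_os_hypervisor ON guest_os.id = guest_os_hypervisor.guest_os_id
WHERE guest_os_hypervisor.hypervisor_type = 'KVM'
  AND guest_os_hypervisor.guest_os_name LIKE '%Debian%'
""").fetchall()

print(rows)  # id 184 is absent because it has no KVM mapping row
```

Because the query is an inner join, a guest OS with no row in guest_os_hypervisor for KVM simply does not appear in the result, which matches the "I don't see Debian 7 listed" observation.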
Re: VPC's VR missing public NIC eth1
I confirm, the highest is Debian 5 64-bit.

Per the docs (http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/4.3/rnotes.html#upgrade-from-4-2-x-to-4-3), you should use Debian 7.0 64-bit as the OS type for the system-kvm-4.3 template... (same for the Xen and VMware templates).

I will now change the DB to point to Debian 5, and let you know.

This is the guest_os_id:

184 | 2 | NULL | 986a0e98-39d6-11e3-8f93-0025904e4412 | Debian GNU/Linux 7(64-bit)

On 30 May 2014 15:45, Joris van Lieshout jvanliesh...@schubergphilis.com wrote:

Andrija,

The thing is, I don't know how the OS matching on KVM works. There must be a way to list the supported OS types. I also did some querying on the guest_os_hypervisor table (ACS 4.3) and I don't see Debian 7 listed for KVM.

select * from guest_os join guest_os_hypervisor on guest_os.id=guest_os_hypervisor.guest_os_id where guest_os_hypervisor.hypervisor_type='KVM' and guest_os_hypervisor.guest_os_name like '%Debian%';

What was the guest_os_id you were using? Could you try id 72 (Debian 5 64-bit)? Adjust os_type_id in both vm_instance and vm_template (where type='SYSTEM' and hypervisor_type='KVM').

Kind regards,
Joris van Lieshout
Schuberg Philis
Re: VPC's VR missing public NIC eth1
Nope. I started it and checked: it is now reported as a Debian 5 VM, but it still doesn't work... Rebooted the VPC (destroyed the VR, a new one was created...).

$ virsh dumpxml r-812-VM
...
<description>Debian GNU/Linux 5(64-bit)</description>
...
<os>
<type arch='x86_64' machine='rhel6.5.0'>hvm</type>
<boot dev='cdrom'/>
<boot dev='hd'/>
</os>

:(

On 30 May 2014 16:24, Andrija Panic andrija.pa...@gmail.com wrote:

I confirm, the highest is Debian 5 64-bit.

Per the docs (http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/4.3/rnotes.html#upgrade-from-4-2-x-to-4-3), you should use Debian 7.0 64-bit as the OS type for the system-kvm-4.3 template... (same for the Xen and VMware templates).

I will now change the DB to point to Debian 5, and let you know.

This is the guest_os_id:

184 | 2 | NULL | 986a0e98-39d6-11e3-8f93-0025904e4412 | Debian GNU/Linux 7(64-bit)
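The fields checked in the virsh dumpxml output quoted in this message can also be pulled out programmatically. This is a minimal sketch, assuming the angle brackets stripped by the archive are restored as shown; the enclosing domain element is an assumption, since the original output elides everything around the description and os elements.

```python
import xml.etree.ElementTree as ET

# Fragment reconstructed from the message; the <domain> wrapper is assumed.
domain_xml = """
<domain type='kvm'>
  <description>Debian GNU/Linux 5(64-bit)</description>
  <os>
    <type arch='x86_64' machine='rhel6.5.0'>hvm</type>
    <boot dev='cdrom'/>
    <boot dev='hd'/>
  </os>
</domain>
"""

root = ET.fromstring(domain_xml)
os_type = root.find("os/type")
summary = {
    "description": root.findtext("description"),
    "virt_type": os_type.text,            # hvm or a PV type
    "machine": os_type.get("machine"),
    "boot_order": [b.get("dev") for b in root.findall("os/boot")],
}
print(summary)
```

In this fragment the description changed to Debian 5 while the guest type stayed hvm, which is the detail the message is pointing at.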
Re: VPC's VR missing public NIC eth1
Let me make sure I understand... the 'Plug' of the nic works fine, as it seems you do have an eth1 and can manually assign the IP to get it to work? If that's the case then there's probably not an issue in BridgeVifDriver or the XML. It is definitely in fetching/matching the eth device here:

vpc_ipassoc.sh 169.254.0.52 -A -l 46.232.xxx.246 -c ethnull -g 46.232.xxx.1 -m 24 -n 46.232.xxx.0

On Fri, May 30, 2014 at 8:33 AM, Andrija Panic andrija.pa...@gmail.com wrote:

Nope. I started it and checked: it is now reported as a Debian 5 VM, but it still doesn't work... Rebooted the VPC (destroyed the VR, a new one was created...).

$ virsh dumpxml r-812-VM
...
<description>Debian GNU/Linux 5(64-bit)</description>
...
<os>
<type arch='x86_64' machine='rhel6.5.0'>hvm</type>
<boot dev='cdrom'/>
<boot dev='hd'/>
</os>

:(
Re: VPC's VR missing public NIC eth1
Yes, correct, eth1 is present, and it can be brought up with a static IP configuration...

On 30 May 2014 17:25, Marcus shadow...@gmail.com wrote:

Let me make sure I understand... the 'Plug' of the nic works fine, as it seems you do have an eth1 and can manually assign the IP to get it to work? If that's the case then there's probably not an issue in BridgeVifDriver or the XML. It is definitely in fetching/matching the eth device here:

vpc_ipassoc.sh 169.254.0.52 -A -l 46.232.xxx.246 -c ethnull -g 46.232.xxx.1 -m 24 -n 46.232.xxx.0
Re: VPC's VR missing public NIC eth1
I believe I've found the issue. In 4.3, some changes were made to BroadcastDomainType to standardize broadcast URIs by prepending vlan://. The issue is that your IpAssocVpcCommand doesn't use this new format for the broadcastUri it passes, so the lookup of the plugged device in the broadcastUriToNicNum map fails, resulting in ethnull.

On Fri, May 30, 2014 at 9:25 AM, Marcus shadow...@gmail.com wrote:

Let me make sure I understand... the 'Plug' of the nic works fine, as it seems you do have an eth1 and can manually assign the IP to get it to work? If that's the case then there's probably not an issue in BridgeVifDriver or the XML. It is definitely in fetching/matching the eth device here:

vpc_ipassoc.sh 169.254.0.52 -A -l 46.232.xxx.246 -c ethnull -g 46.232.xxx.1 -m 24 -n 46.232.xxx.0
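The lookup failure described in this message can be sketched in a few lines. This is an illustration only, not the CloudStack code: the function name is hypothetical, the map is modelled as a plain dict keyed by broadcast URI, and the "null" rendering mimics how Java string concatenation prints a null reference.

```python
def device_name(broadcast_uri_to_nic_num, broadcast_uri):
    """Build the eth device name the way a map lookup plus concatenation would."""
    nic_num = broadcast_uri_to_nic_num.get(broadcast_uri)
    # Java renders a null reference as "null" when concatenated; mimic that here.
    return "eth" + ("null" if nic_num is None else str(nic_num))


# The plugged NIC was registered under the new-style URI.
plugged = {"vlan://untagged": 1}

print(device_name(plugged, "vlan://untagged"))  # eth1
print(device_name(plugged, "untagged"))         # ethnull
```

The second call uses the old-style value without the vlan:// prefix, so the key is not found and the device name degrades to ethnull, as seen in the vpc_ipassoc.sh command line.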
Re: VPC's VR missing public NIC eth1
Note the differences in broadcastUri. Here is your plug command:

{ "com.cloud.agent.api.PlugNicCommand": { "nic": { "deviceId": 1, "networkRateMbps": 9, "defaultNic": true, "uuid": "6c782af3-2071-4543-acdc-cb30096e89ff", "ip": "46.232.xxx.246", "netmask": "255.255.255.0", "gateway": "46.232.xxx.1", "mac": "06:53:82:00:00:25", "broadcastType": "Vlan", "type": "Public", "broadcastUri": "vlan://untagged", "isolationUri": "vlan://untagged", "isSecurityGroupEnabled": false, "name": "breth1-500" }, "instanceName": "r-801-VM", "vmType": "DomainRouter", "wait": 0 } }

and here is your ip associate command:

{ "com.cloud.agent.api.routing.IpAssocVpcCommand": { "ipAddresses": [ { "accountId": 11, "publicIp": "46.232.xxx.246", "sourceNat": true, "add": true, "oneToOneNat": false, "firstIP": false, "broadcastUri": "untagged", "vlanGateway": "46.232.xxx.1", "vlanNetmask": "255.255.255.0", "vifMacAddress": "06:53:82:00:00:25", "networkRate": 9, "trafficType": "Public", "networkName": "breth1-500" } ], "accessDetails": { "router.guest.ip": "46.232.xxx.246", "zone.network.type": "Advanced", "router.name": "r-801-VM", "router.ip": "169.254.0.52" }, "wait": 0 } }

On Fri, May 30, 2014 at 9:27 AM, Andrija Panic andrija.pa...@gmail.com wrote:
yes, correct, eth1 is present, and can be started by static IP configuration...

On 30 May 2014 17:25, Marcus shadow...@gmail.com wrote:
Let me make sure I understand... the 'Plug' of the nic works fine, as it seems you do have an eth1 and can manually assign the IP to get it to work? If that's the case then there's probably not an issue in BridgeVifDriver or the XML. It is definitely in fetching/matching the eth device here:

vpc_ipassoc.sh 169.254.0.52 -A -l 46.232.xxx.246 -c ethnull -g 46.232.xxx.1 -m 24 -n 46.232.xxx.0

On Fri, May 30, 2014 at 8:33 AM, Andrija Panic andrija.pa...@gmail.com wrote:
Nope, started, did check, it is reported now as Debian 5 VM, but still doesn't work... Rebooted VPC (destroyed VR, new one created...)

$ virsh dumpxml r-812-VM
...
<description>Debian GNU/Linux 5(64-bit)</description>
...
<os>
<type arch='x86_64' machine='rhel6.5.0'>hvm</type>
<boot dev='cdrom'/>
<boot dev='hd'/>
</os>

:(

On 30 May 2014 16:24, Andrija Panic andrija.pa...@gmail.com wrote:
I confirm, the highest is Debian 5 64bit. Per docs http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/4.3/rnotes.html#upgrade-from-4-2-x-to-4-3 , you should use Debian 7.0 64bit as OS type for system-kvm-4.3 template... (same for xen and vmware templates). Will change now DB to point to debian 5, and let you know. This is guest_os_id:

184 | 2 | NULL | 986a0e98-39d6-11e3-8f93-0025904e4412 | Debian GNU/Linux 7(64-bit)

On 30 May 2014 15:45, Joris van Lieshout jvanliesh...@schubergphilis.com wrote:
Andrija, The thing is I don't know how the os matching on KVM works. There must be a way to list supported os types. I also did some querying on the guest_os_hypervisor table (ACS 4.3) and I don't see Debian 7 for KVM listed.

select * from guest_os join guest_os_hypervisor on guest_os.id=guest_os_hypervisor.guest_os_id where guest_os_hypervisor.hypervisor_type='KVM' and guest_os_hypervisor.guest_os_name like '%Debian%';

What was the guest_os_id you were using? Could you try id 72 (Debian 5 64-bit)? Adjust both os_type_id in vm_instance and vm_template (where type='SYSTEM' and hypervisor_type='KVM').

Kind regards, Joris van Lieshout
Schuberg Philis

On 30/05/14 15:33, Andrija Panic andrija.pa...@gmail.com wrote:
Joris, do you have a recommendation on what in particular to try? I'm not sure how to fix that, except playing with editing systemvm-4.3 template to define it as another OS type...? Thanks again, Andrija

On 30 May 2014 15:30, Joris van Lieshout jvanliesh...@schubergphilis.com wrote:
I've read back a bit in the code and if you look at BridgeVifDriver.java (this is where the log message with the nic profile is generated) you can see that the nic information might be off already once ACS hits the LibvirtVMDef.InterfaceDef plug function. This leads me to believe that the HVM/PV OS mismatch issue might still be related. Try fixing that first. At least it
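
A minimal Java sketch of how a device name such as ethnull could arise. It assumes, hypothetically, that the agent remembers a device id per broadcastUri when a NIC is plugged and looks it up again when an IP is associated. The class, method, and map names are invented for illustration and are not taken from the CloudStack source:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from CloudStack.
final class EthNullSketch {
    // Builds the device name by looking up the broadcastUri in a map that
    // was filled when the NICs were plugged.
    static String deviceFor(Map<String, Integer> uriToDeviceId, String broadcastUri) {
        Integer deviceId = uriToDeviceId.get(broadcastUri); // null if the key format differs
        return "eth" + deviceId; // string concatenation renders a null reference as "null"
    }

    public static void main(String[] args) {
        Map<String, Integer> plugged = new HashMap<>();
        plugged.put("vlan://untagged", 1); // format seen in the PlugNicCommand

        System.out.println(deviceFor(plugged, "vlan://untagged")); // prints eth1
        System.out.println(deviceFor(plugged, "untagged"));        // prints ethnull
    }
}
```

With the two commands quoted in this message, the plug would register vlan://untagged while the IP association asks for untagged, so under this assumption the lookup misses and the script is handed ethnull.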
Re: VPC's VR missing public NIC eth1
I think the commit that caused the change was this, or something related to it. Is there any way you could test a fix? Do you need me to build 4.3 RPMs or can I just provide a patch? What works for you?

commit 53d09c6f1843f04c5f1ab76be9419f5584302d1e
Date: Mon Aug 5 11:52:40 2013 +0200

uri code per broadcast/isolation type , default is to accept anything as uri , vlan and lswitch need some extra tlc

On Fri, May 30, 2014 at 9:36 AM, Marcus shadow...@gmail.com wrote:
Note the differences in broadcastUri. Here is your plug command:

{ "com.cloud.agent.api.PlugNicCommand": { "nic": { "deviceId": 1, "networkRateMbps": 9, "defaultNic": true, "uuid": "6c782af3-2071-4543-acdc-cb30096e89ff", "ip": "46.232.xxx.246", "netmask": "255.255.255.0", "gateway": "46.232.xxx.1", "mac": "06:53:82:00:00:25", "broadcastType": "Vlan", "type": "Public", "broadcastUri": "vlan://untagged", "isolationUri": "vlan://untagged", "isSecurityGroupEnabled": false, "name": "breth1-500" }, "instanceName": "r-801-VM", "vmType": "DomainRouter", "wait": 0 } }

and here is your ip associate command:

{ "com.cloud.agent.api.routing.IpAssocVpcCommand": { "ipAddresses": [ { "accountId": 11, "publicIp": "46.232.xxx.246", "sourceNat": true, "add": true, "oneToOneNat": false, "firstIP": false, "broadcastUri": "untagged", "vlanGateway": "46.232.xxx.1", "vlanNetmask": "255.255.255.0", "vifMacAddress": "06:53:82:00:00:25", "networkRate": 9, "trafficType": "Public", "networkName": "breth1-500" } ], "accessDetails": { "router.guest.ip": "46.232.xxx.246", "zone.network.type": "Advanced", "router.name": "r-801-VM", "router.ip": "169.254.0.52" }, "wait": 0 } }
Re: VPC's VR missing public NIC eth1
Actually, if you're in the position to play a bit... New deployments seem to work, and I believe it's because that broadcastUri is stored in the db in the new format:

mysql> select id,vlan_id from vlan where network_id = (select id from networks where traffic_type="Public");
+----+-----------------+
| id | vlan_id         |
+----+-----------------+
|  1 | vlan://untagged |
+----+-----------------+
1 row in set (0.00 sec)

I believe 4.2 and earlier would say just 'untagged' there. If you want to attempt changing that value to include vlan:// (if it is in fact missing), then restarting everything, that may fix the issue.

On Fri, May 30, 2014 at 9:38 AM, Marcus shadow...@gmail.com wrote:
I think the commit that caused the change was this, or something related to it. Is there any way you could test a fix? Do you need me to build 4.3 RPMs or can I just provide a patch? What works for you?

commit 53d09c6f1843f04c5f1ab76be9419f5584302d1e
Date: Mon Aug 5 11:52:40 2013 +0200

uri code per broadcast/isolation type , default is to accept anything as uri , vlan and lswitch need some extra tlc
Re: VPC's VR missing public NIC eth1
Marcus, I don't think it should be 'vlan://untagged'. If it works as you describe, this seems a bug. 'untagged' should be broadcastdomaintype agnostic, shouldn't it? If it should work that way then it is a bug/omission in the db upgrade.

On Fri, May 30, 2014 at 5:57 PM, Marcus shadow...@gmail.com wrote:
Actually, if you're in the position to play a bit... New deployments seem to work, and I believe it's because that broadcastUri is stored in the db in the new format:

mysql> select id,vlan_id from vlan where network_id = (select id from networks where traffic_type="Public");
+----+-----------------+
| id | vlan_id         |
+----+-----------------+
|  1 | vlan://untagged |
+----+-----------------+
1 row in set (0.00 sec)

I believe 4.2 and earlier would say just 'untagged' there. If you want to attempt changing that value to include vlan:// (if it is in fact missing), then restarting everything, that may fix the issue.
Re: VPC's VR missing public NIC eth1
If that works, then I think the better fix is to update the database upgrade script to look for this and change the vlan_id. That way new installs and upgrades have consistent data. We could fix it in the code by filtering it through BroadcastDomainType.Vlan.toUri(ipAddr.getVlanTag()), but then later, when someone expects the vlan id to be in URI format somewhere else, it will randomly break on people who did an upgrade. The fix to CLOUDSTACK-5505 did this, avoiding the mismatch in format in the DB by fetching the data from elsewhere that filtered it into URI format, and then we just ended up hitting it again here.

On Fri, May 30, 2014 at 9:57 AM, Marcus shadow...@gmail.com wrote:
Actually, if you're in the position to play a bit... New deployments seem to work, and I believe it's because that broadcastUri is stored in the db in the new format:

mysql> select id,vlan_id from vlan where network_id = (select id from networks where traffic_type="Public");
+----+-----------------+
| id | vlan_id         |
+----+-----------------+
|  1 | vlan://untagged |
+----+-----------------+
1 row in set (0.00 sec)

I believe 4.2 and earlier would say just 'untagged' there. If you want to attempt changing that value to include vlan:// (if it is in fact missing), then restarting everything, that may fix the issue.
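
The database-side fix discussed in this message could be sketched roughly as follows. This is only an illustration in plain Java over in-memory values, not the actual upgrade class. The names VlanUpgradeSketch, upgradeVlanId and upgradeAll are hypothetical, and it assumes that any value already containing :// (vlan or otherwise) should be left untouched:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of an idempotent vlan_id rewrite; not CloudStack code.
final class VlanUpgradeSketch {
    // Adds the vlan:// scheme only when no scheme is present,
    // so running the upgrade a second time changes nothing.
    static String upgradeVlanId(String vlanId) {
        if (vlanId == null || vlanId.contains("://")) {
            return vlanId;
        }
        return "vlan://" + vlanId;
    }

    // Applies the rewrite to every value, standing in for the rows of the vlan table.
    static List<String> upgradeAll(List<String> vlanIds) {
        List<String> upgraded = new ArrayList<>();
        for (String vlanId : vlanIds) {
            upgraded.add(upgradeVlanId(vlanId));
        }
        return upgraded;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("untagged", "vlan://untagged", "500", "lswitch://example");
        List<String> once = upgradeAll(rows);
        List<String> twice = upgradeAll(once);

        System.out.println(once);               // [vlan://untagged, vlan://untagged, vlan://500, lswitch://example]
        System.out.println(once.equals(twice)); // true
    }
}
```

Checking for an existing scheme rather than blindly prefixing is what makes the rewrite safe to run on both upgraded and freshly installed databases.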
Re: VPC's VR missing public NIC eth1
CLOUDSTACK-5505 looks alright. As for the solution: isn't 'untagged' a valid uri in itself? I would expect it to always have the value without the 'vlan://'. That said, a solution is better than no solution; maybe the db upgrade path is best. Andrija, can you try as Marcus suggests, editing the db to change 'untagged' to 'vlan://untagged'? thanks, Daan

On Fri, May 30, 2014 at 6:03 PM, Marcus shadow...@gmail.com wrote:
If that works, then I think the better fix is to update the database upgrade script to look for this and change the vlan_id. That way new installs and upgrades have consistent data. We could fix it in the code by filtering it through BroadcastDomainType.Vlan.toUri(ipAddr.getVlanTag()), but then later, when someone expects the vlan id to be in URI format somewhere else, it will randomly break on people who did an upgrade. The fix to CLOUDSTACK-5505 did this, avoiding the mismatch in format in the DB by fetching the data from elsewhere that filtered it into URI format, and then we just ended up hitting it again here.
Re: VPC's VR missing public NIC eth1
I agree, I actually brought this issue up several months ago when we had all of the 'untagged' discussions. I believe I pointed out that vlan:// was now in the db. My impression was that the change was intentional; if you're telling me you didn't intend to do that, then we can change it back. At this point we have to either write a fix for all 4.3.1+ to change the database value back if it has vlan:// AND make sure all of the comparisons still work (though perhaps upgrades like this are catching those points for us), or stick with the vlan:// in the DB and move forward. Either way is fine with me, as long as we can make all installations consistent.

On Fri, May 30, 2014 at 10:02 AM, Daan Hoogland daan.hoogl...@gmail.com wrote:
Marcus, I don't think it should be 'vlan://untagged'. If it works as you describe, this seems a bug. 'untagged' should be broadcastdomaintype agnostic, shouldn't it? If it should work that way then it is a bug/omission in the db upgrade.
Re: VPC's VR missing public NIC eth1
It's not valid if you've got code that asks whether the string 'vlan://untagged' equals 'untagged'.

On Fri, May 30, 2014 at 10:17 AM, Daan Hoogland daan.hoogl...@gmail.com wrote:
CLOUDSTACK-5505 looks alright. As for the solution: isn't 'untagged' a valid uri in itself? I would expect it to always have the value without the 'vlan://'. That said, a solution is better than no solution; maybe the db upgrade path is best. Andrija, can you try as Marcus suggests, editing the db to change 'untagged' to 'vlan://untagged'? thanks, Daan
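
To make the point about the comparison concrete, here is a small sketch using the standard java.net.URI class rather than CloudStack's own BroadcastDomainType code. Both strings parse as URI references, but the bare form carries no scheme, so a plain equality check treats them as different. The class and method names are hypothetical:

```java
import java.net.URI;

// Hypothetical illustration using only the JDK; not CloudStack code.
final class UntaggedUriSketch {
    // True when the string carries an explicit scheme such as "vlan".
    static boolean hasScheme(String raw) {
        return URI.create(raw).getScheme() != null;
    }

    // Plain equality of the parsed forms, as a naive comparison would do.
    static boolean sameUri(String a, String b) {
        return URI.create(a).equals(URI.create(b));
    }

    public static void main(String[] args) {
        System.out.println(hasScheme("untagged"));                  // false
        System.out.println(hasScheme("vlan://untagged"));           // true
        System.out.println(sameUri("untagged", "vlan://untagged")); // false
    }
}
```

So 'untagged' may well be acceptable as a URI reference on its own, yet any code path that compares it directly against the scheme-qualified form will still see a mismatch.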
Re: VPC's VR missing public NIC eth1
On Fri, May 30, 2014 at 10:51 PM, Marcus shadow...@gmail.com wrote:
Looks good to me, aside from the debug statement.

Ah, the first line was not in my line of sight. -- Daan
Re: VPC's VR missing public NIC eth1
Will try, thx. Not sure if it's a good question, but when is 4.4 scheduled to be released: in a few months, or more? Thanks

On 29 May 2014 06:15, Jayapal Reddy Uradi jayapalreddy.ur...@citrix.com wrote:
Hi Andrija, The same issue with a tagged public vlan got fixed, CLOUDSTACK-5505. Thanks, Jayapal

On 29-May-2014, at 9:38 AM, Jayapal Reddy Uradi jayapalreddy.ur...@citrix.com wrote:
Hi Andrija, From the logs, the public subnet is untagged. I think this issue is coming for the untagged public vlan in 4.3.

{"com.cloud.agent.api.PlugNicCommand":{"nic":{"deviceId":1,"networkRateMbps":9,"defaultNic":true,"uuid":"e6b734d4-3302-4113-8ec7-5c205c90959a","ip":"46.232.180.248","netmask":"255.255.255.0","gateway":"46.232.180.1","mac":"06:5e:e8:00:00:27","broadcastType":"Vlan","type":"Public","broadcastUri":"vlan://untagged","isolationUri":"vlan://untagged","isSecurityGroupEnabled":false,"name":"breth1-500"},"instanceName":"r-779-VM","vmType":"DomainRouter","wait":0}},{"com.cloud.agent.api.routing.IpAssocVpcCommand":{"ipAddresses":[{"accountId":2,"publicIp":"46.232.180.248","sourceNat":true,"add":true,"oneToOneNat":false,"firstIP":false,"broadcastUri":"untagged","vlanGateway":"46.232.180.1","vlanNetmask":"255.255.255.0","vifMacAddress":"06:5e:e8:00:00:27","networkRate":9,"trafficType":"Public","networkName":"breth1-500"}],"accessDetails":

From the VR logs, the ipassoc script got the interface id as null.

May 28 12:37:33 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 0 seconds

Thanks, Jayapal

On 29-May-2014, at 1:08 AM, Andrija Panic andrija.pa...@gmail.com wrote:
Thanks Daan, my problem is that I'm in production for the 3rd day now, and restoring the DB and downgrading back to 4.2.1 doesn't seem an option for me at the moment, since I would lose new accounts and single VMs, etc... Thanks, Andrija

On 28 May 2014 21:34, Daan Hoogland daan.hoogl...@gmail.com wrote:
Andrija, nevertheless it sounds familiar. I will be back in the office on monday and ask around.

On Wed, May 28, 2014 at 9:23 PM, Andrija Panic andrija.pa...@gmail.com wrote:
Hi Daan, I don't think this is my issue, at least I don't make use of private gateway - this is just as simple as: create new VPC from scratch - Public IP is not assigned to VR eth1 interface inside VR... I have filed the bug: https://issues.apache.org/jira/browse/CLOUDSTACK-6801 This same thing happened previously to Andrei Mikhailovsky: http://mail-archives.apache.org/mod_mbox/cloudstack-users/201405.mbox/%3C33347835.250.1399336340785.JavaMail.andrei@tuchka%3E and it is not resolved. Thanks, Andrija

On 28 May 2014 21:01, Daan Hoogland daan.hoogl...@gmail.com wrote:
Andrija, this sounds like something we have seen as well. Can you check if this is it: https://issues.apache.org/jira/browse/CLOUDSTACK-6485 thanks, Daan

On Wed, May 28, 2014 at 3:30 PM, Andrija Panic andrija.pa...@gmail.com wrote:
Hi there, I'm having big time problems with Public IP missing from VPC VR's eth1, after upgrade to ACS 4.3.1 - did not find this filed as a bug so far... and it worked all fine on ACS 4.2.1. No help so far from user mailing list... Below is a detailed explanation, and logs from inside VR, and from management (all fine with management logs...) If anybody can help, I would very much appreciate this, since now I have a bunch of VPCs non-operational...
Thanks

-- Forwarded message --
From: Andrija Panic andrija.pa...@gmail.com
Date: 28 May 2014 14:50
Subject: Re: VPC's VR missing public NIC eth1
To: us...@cloudstack.apache.org

and as I said eth1 is present:

root@r-794-VM:~# cat /proc/net/dev
Inter-| Receive | Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
eth3: 11484 131000 0 0 0 11590 131000 0 0 0
lo: 214 2000 0 0 0 214 2000 0 0 0
eth2: 32970 544000 0 0 0 2084 24000 0 0 0
eth1: 0 0000 0 0 0 0 0000 0 0 0
eth0: 1502071319000 0 0 0 264232 1180000 0 0 0

On 28 May 2014 14:47, Andrija Panic andrija.pa...@gmail.com wrote:

Also, from /var/log/messages/ inside VR:

This is a major show stopper - all our VPCs are completely unusable. Anybody...?

May 28 12:37:33 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 0 seconds
May 28 12:37:34 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull
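To make Jayapal's observation concrete: in the commands quoted above, PlugNicCommand carries broadcastUri vlan://untagged while IpAssocVpcCommand carries plain untagged, so a literal string comparison between the two would not match. The following is only a minimal sketch of that mismatch and of one way to normalize both forms before comparing. The function names are hypothetical and this is not CloudStack's code; the real fix discussed in this thread is a Java commit on the management side.

```python
from typing import Optional


def normalize_broadcast_uri(value: Optional[str]) -> Optional[str]:
    """Return a URI form for a broadcast value (hypothetical helper).

    Bare values such as "untagged" or "500" get a "vlan://" prefix;
    values that already carry a scheme are returned unchanged.
    """
    if value is None:
        return None
    value = value.strip()
    if not value:
        return None
    if "://" in value:
        return value
    return "vlan://" + value


def same_broadcast_domain(a: Optional[str], b: Optional[str]) -> bool:
    """Compare two broadcast values after normalizing both."""
    return normalize_broadcast_uri(a) == normalize_broadcast_uri(b)


# Values taken from the two commands quoted in this thread.
plug_nic_uri = "vlan://untagged"   # from PlugNicCommand
ip_assoc_uri = "untagged"          # from IpAssocVpcCommand

print(plug_nic_uri == ip_assoc_uri)                       # False
print(same_broadcast_domain(plug_nic_uri, ip_assoc_uri))  # True
```

Normalizing on both sides of the comparison is a defensive choice: it tolerates either form coming from the database or from user input, at the cost of hiding where the inconsistent value originated.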
Re: VPC's VR missing public NIC eth1
Andrija, we plan to release it soon. It depends on how quickly issues in it are fixed. The branch has been created, so you could test it and report on it if you have an env to spare.

On Thu, May 29, 2014 at 10:09 AM, Andrija Panic andrija.pa...@gmail.com wrote:

Will try, thx. Not sure good question - when is 4.4 scheduled to be releases - few months, or more ? Thanks
Re: VPC's VR missing public NIC eth1
Daan, thanks for info. And one final question - is it possible to change Public vlan/range from untagged to tagged - editing vlan table and making change, does not really make changes after I restart VPC router... Thanks

On 29 May 2014 10:15, Daan Hoogland daan.hoogl...@gmail.com wrote:

Andrija, we plan to release it soon. It depends on how quickly issues in it are fixed. The branch has been created, so you could test it and report on it if you have an env to spare.
Re: VPC's VR missing public NIC eth1
logs...) If anybody can help, I would very much appreciate it, since I now have a bunch of VPCs non-operational...

Thanks

-- Forwarded message --
From: Andrija Panic andrija.pa...@gmail.com
Date: 28 May 2014 14:50
Subject: Re: VPC's VR missing public NIC eth1
To: us...@cloudstack.apache.org

and as I said eth1 is present:

root@r-794-VM:~# cat /proc/net/dev
Inter-| Receive | Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
eth3: 11484 131000 0 0 0 11590 131000 0 0 0
lo: 214 2000 0 0 0 214 2000 0 0 0
eth2: 32970 544000 0 0 0 2084 24000 0 0 0
eth1: 0 0000 0 0 0 0 0000 0 0 0
eth0: 1502071319000 0 0 0 264232 1180000 0 0 0

On 28 May 2014 14:47, Andrija Panic andrija.pa...@gmail.com wrote:

Also, from /var/log/messages/ inside VR:

This is a major show stopper - all our VPCs are completely unusable. Anybody...?

May 28 12:37:33 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 0 seconds
May 28 12:37:34 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 1 seconds
May 28 12:37:35 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 2 seconds
May 28 12:37:36 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 3 seconds
May 28 12:37:37 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 4 seconds
May 28 12:37:38 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 5 seconds
May 28 12:37:39 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 6 seconds
May 28 12:37:40 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 7 seconds
May 28 12:37:41 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 8 seconds
May 28 12:37:42 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 9 seconds
May 28 12:37:43 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 10 seconds
May 28 12:37:44 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 11 seconds
May 28 12:37:45 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 12 seconds
May 28 12:37:46 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 13 seconds
May 28 12:37:47 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 14 seconds
May 28 12:37:48 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 15 seconds
May 28 12:37:49 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 16 seconds
May 28 12:37:50 r-794-VM cloud: vpc_ipassoc.sh:interface ethnull never appeared
May 28 12:37:50 r-794-VM cloud: vpc_ipassoc.sh:Adding ip 46.232.180.246 on interface ethnull
May 28 12:37:50 r-794-VM cloud: vpc_ipassoc.sh:Add routing 46.232.180.246 on interface ethnull
May 28 12:37:50 r-794-VM cloud: vpc_privateGateway.sh:Added SourceNAT 46.232.180.246 on interface ethnull
May 28 12:37:50 r-794-VM cloud: vpc_snat.sh:Added SourceNAT 46.232.180.246 on interface eth1

On 28 May 2014 12:59, Andrija Panic andrija.pa...@gmail.com wrote:

Defined eth1 manually inside /etc/network/interfaces inside the VPC's VR.

iface eth1 inet static
address 46.232.180.246
netmask 255.255.255.0

ifup eth1
ip route add default via 46.232.180.1

so now the VR works fine (it has access to the internet).

But again, adding a new IP to the VR and enabling static NAT is failing... That is, getting a new IP works fine (it is just associated with the account), but enabling static NAT fails due to resource unavailable.

Here are the management logs:

2014-05-28 12:57:00,716 WARN [c.c.n.r.RulesManagerImpl] (catalina-exec-22:ctx-537ac57b ctx-8c44c786) Failed to create static nat rule due to com.cloud.exception.ResourceUnavailableException: Resource [DataCenter:1] is unreachable: Unable to apply static nat rules on router
at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.applyRules(VirtualNetworkApplianceManagerImpl.java:3915
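The log above shows the script polling once per second for an interface literally named "ethnull", giving up after 17 attempts, and then carrying on with that name anyway. As a rough illustration of the failure mode (not a reproduction of vpc_ipassoc.sh, whose source is not quoted in this thread), the sketch below separates the two concerns: refusing to build an interface name from a missing device id, and polling for the interface with a timeout. The helper names and the /sys/class/net existence check are assumptions made for the example.

```python
import os
import time
from typing import Callable, Optional


def interface_name(device_id: Optional[int]) -> str:
    """Build "ethN", refusing to build a name from a missing device id."""
    if device_id is None:
        # Without this guard a null id would yield a name like "ethnull".
        raise ValueError("no device id resolved for the NIC")
    return f"eth{device_id}"


def _exists_in_sysfs(name: str) -> bool:
    """Assumed check: on Linux, network devices appear under /sys/class/net."""
    return os.path.exists(f"/sys/class/net/{name}")


def wait_for_interface(
    name: str,
    timeout_s: int = 17,
    exists: Callable[[str], bool] = _exists_in_sysfs,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll once per second; return True as soon as the interface exists."""
    for _ in range(timeout_s):
        if exists(name):
            return True
        sleep(1)
    return False


if __name__ == "__main__":
    try:
        name = interface_name(None)
    except ValueError as err:
        print(f"refusing to continue: {err}")
```

Failing early on the missing id would surface the real problem (the device id was never resolved) instead of 17 seconds of waiting followed by commands run against a non-existent interface.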
Re: VPC's VR missing public NIC eth1
On Thu, May 29, 2014 at 10:57 AM, Andrija Panic andrija.pa...@gmail.com wrote:

500

Is 500 the VLAN of your guest network or of your physical network? You wouldn't want to have two nets with VLAN 500!

-- Daan
Re: VPC's VR missing public NIC eth1
It's like this: I have a public /24 subnet. Half is dedicated to Guest traffic (vlan 500) and the second half is dedicated to the Public traffic/network (no vlan tags, that is, untagged packets). Both vlan 500 and untagged packets travel over the physical eth1 interface on the hypervisors and can reach the Internet.

Thanks,

On 29 May 2014 11:06, Daan Hoogland daan.hoogl...@gmail.com wrote:

On Thu, May 29, 2014 at 10:57 AM, Andrija Panic andrija.pa...@gmail.com wrote:

500

Is 500 the VLAN of your guest network or of your physical network? You wouldn't want to have two nets with VLAN 500! -- Daan

--
Andrija Panić
http://admintweets.com
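For readers trying to picture the layout described above, here is a small worked example using Python's standard ipaddress module. It assumes the /24 is 46.232.180.0/24 (inferred from the IP and netmask in the logs quoted in this thread) and that "half" means a split on the /25 boundary. The thread does not state either of these, nor which half serves which traffic type, so the sketch only reports which half an address falls into.

```python
import ipaddress

# Assumption: the public /24 is the one implied by ip 46.232.180.248
# with netmask 255.255.255.0 in the logs quoted in this thread.
PUBLIC_SUBNET = ipaddress.ip_network("46.232.180.0/24")

# Assumption: "half" means the two /25 subnets of that /24.
lower_half, upper_half = PUBLIC_SUBNET.subnets(prefixlen_diff=1)


def half_of(ip: str) -> str:
    """Return "lower" or "upper" for an address inside the /24."""
    addr = ipaddress.ip_address(ip)
    if addr not in PUBLIC_SUBNET:
        raise ValueError(f"{ip} is not in {PUBLIC_SUBNET}")
    return "lower" if addr in lower_half else "upper"


print(lower_half, upper_half)
print(half_of("46.232.180.248"))  # public IP from the PlugNicCommand log
print(half_of("46.232.180.1"))    # gateway from the same log
```

Note that both halves would still share the same /24 netmask and gateway, as the logged netmask 255.255.255.0 and gateway 46.232.180.1 suggest; the split is between IP ranges handed to each traffic type, not between routed subnets.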
Re: VPC's VR missing public NIC eth1
I don't think editing DB table will work.

-Jayapal

On 29-May-2014, at 2:52 PM, Andrija Panic andrija.pa...@gmail.com wrote:

It's like this: I have public subnet /24. half is dedicated for Guest traffic (vlan 500) and the second half is dedicated to Public traffic/network (no vlan tags, that is untagged packets) Both vlan500 and untagged packets travel over physical eth1 interface on hypervisors and can reach Internet. Thanks,
Re: VPC's VR missing public NIC eth1
Are these two traffic types in one physical net? or two physical nets on the same interface (seems wrong).

On Thu, May 29, 2014 at 11:35 AM, Jayapal Reddy Uradi jayapalreddy.ur...@citrix.com wrote:

I don't think editing DB table will work. -Jayapal

-- Daan
Re: VPC's VR missing public NIC eth1
They are 2 traffic types on 1 physical net (that is both tagged vlan 500, and untagged packets travel over same KVM bridge, and over eth1 to outside world)...

On 29 May 2014 12:04, Daan Hoogland daan.hoogl...@gmail.com wrote:

Are these two traffic types in one physical net? or two physical nets on the same interface (seems wrong).

--
Andrija Panić
http://admintweets.com
Re: VPC's VR missing public NIC eth1
Andrija, this sound like something we seen as well. can you check if this is it : https://issues.apache.org/jira/browse/CLOUDSTACK-6485

thanks, Daan

On Wed, May 28, 2014 at 3:30 PM, Andrija Panic andrija.pa...@gmail.com wrote:

Hi there, I'm having big time problems with Public IP missing from VPC VR's eth1, after upgrade to ACS 4.3.1 - did not found this filed as bug so far...and it worked all fine on ACS 4.2.1. No help so far from user mailing list... Below is a detailed explanation, and logs from inside VR, and from management (all fine with management logs...) If anybody can help, I would very much appriciate this, since now I have bunch fo VPC unoperational... Thanks
Re: VPC's VR missing public NIC eth1
Hi Daan, I don't think this is my issue, at least I don't make use of private gateway - this is just simple as: create new VPC from scratch - Public IP is not assigned to VR eth1 interface inside VR...

I have filed the bug: https://issues.apache.org/jira/browse/CLOUDSTACK-6801

This same thing happened previously to Andrei Mikhailovsky: http://mail-archives.apache.org/mod_mbox/cloudstack-users/201405.mbox/%3C33347835.250.1399336340785.JavaMail.andrei@tuchka%3E and it is not resolved

Thanks, Andrija

On 28 May 2014 21:01, Daan Hoogland daan.hoogl...@gmail.com wrote:

Andrija, this sound like something we seen as well. can you check if this is it : https://issues.apache.org/jira/browse/CLOUDSTACK-6485 thanks, Daan
Re: VPC's VR missing public NIC eth1
Andrija, nevertheless it sounds familiar. I will be back in the office on monday and ask around.

On Wed, May 28, 2014 at 9:23 PM, Andrija Panic andrija.pa...@gmail.com wrote:

Hi Daan, I don't think this is my issue, at least I don't make use of private gateway - this is just simple as: create new VPC from scratch - Public IP is not assigned to VR eth1 interface inside VR... I have filed the bug: https://issues.apache.org/jira/browse/CLOUDSTACK-6801 This same thing happened previously to Andrei Mikhailovsky: http://mail-archives.apache.org/mod_mbox/cloudstack-users/201405.mbox/%3C33347835.250.1399336340785.JavaMail.andrei@tuchka%3E and it is not resolved Thanks, Andrija
Re: VPC's VR missing public NIC eth1
Thanks Daan, my problem is that I'm in production for 3rd day now, and restoring DB and downgrading back to 4.2.1 doesn't seem as option for me at the moment, since I would loose new acounts and single VMs, etc... Thanks, Andrija On 28 May 2014 21:34, Daan Hoogland daan.hoogl...@gmail.com wrote: Andrija, nevertheless it sounds familiar. I will be back in the office on monday and ask around. On Wed, May 28, 2014 at 9:23 PM, Andrija Panic andrija.pa...@gmail.com wrote: Hi Daan, I don't think this is my issue, at least I don't make use of private gateway - this is just simple as: create new VPC from scratch - Public IP is not assigned to VR eth1 interface inside VR... I have filed the bug: https://issues.apache.org/jira/browse/CLOUDSTACK-6801 This same thing happened previously to Andrei Mikhailovsky: http://mail-archives.apache.org/mod_mbox/cloudstack-users/201405.mbox/%3C33347835.250.1399336340785.JavaMail.andrei@tuchka%3Eand it is not resolved Thanks, Andrija On 28 May 2014 21:01, Daan Hoogland daan.hoogl...@gmail.com wrote: Andrija, this sound like something we seen as well. can you check if this is it : https://issues.apache.org/jira/browse/CLOUDSTACK-6485 thanks, Daan On Wed, May 28, 2014 at 3:30 PM, Andrija Panic andrija.pa...@gmail.com wrote: Hi there, I'm having big time problems with Public IP missing from VPC VR's eth1, after upgrade to ACS 4.3.1 - did not found this filed as bug so far...and it worked all fine on ACS 4.2.1. No help so far from user mailing list... Below is a detailed explanation, and logs from inside VR, and from management (all fine with management logs...) If anybody can help, I would very much appriciate this, since now I have bunch fo VPC unoperational... 
Thanks

-- Forwarded message --
From: Andrija Panic andrija.pa...@gmail.com
Date: 28 May 2014 14:50
Subject: Re: VPC's VR missing public NIC eth1
To: us...@cloudstack.apache.org

And as I said, eth1 is present:

root@r-794-VM:~# cat /proc/net/dev
Inter-| Receive| Transmit
face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
eth3: 11484 131000 0 0 0 11590 131000 0 0 0
lo: 214 2000 0 0 0 214 2000 0 0 0
eth2: 32970 544000 0 0 0 2084 24000 0 0 0
eth1: 0 0000 0 0 0 0 0000 0 0 0
eth0: 1502071319000 0 0 0 264232 1180000 0 0 0

On 28 May 2014 14:47, Andrija Panic andrija.pa...@gmail.com wrote:

Also, from /var/log/messages inside the VR. This is a major show stopper: all our VPCs are completely unusable. Anybody?

May 28 12:37:33 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 0 seconds
May 28 12:37:34 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 1 seconds
May 28 12:37:35 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 2 seconds
May 28 12:37:36 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 3 seconds
May 28 12:37:37 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 4 seconds
May 28 12:37:38 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 5 seconds
May 28 12:37:39 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 6 seconds
May 28 12:37:40 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 7 seconds
May 28 12:37:41 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 8 seconds
May 28 12:37:42 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 9 seconds
May 28 12:37:43 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 10 seconds
May 28 12:37:44 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 11 seconds
May 28 12:37:45 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 12 seconds
May 28 12:37:46 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 13 seconds
May 28 12:37:47 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 14 seconds
May 28 12:37:48 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 15 seconds
May 28 12:37:49 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 16 seconds
May 28 12:37:50 r-794-VM
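The log and the /proc/net/dev listing above fit together: eth1 exists, but the script is waiting for a device literally named "ethnull", which can never show up. A minimal Python sketch of that check follows. It is not the router's actual script; the sample text is simplified and illustrative, and `device_name` is a hypothetical helper that only mimics a missing interface id being rendered as "null".

```python
# Illustrative /proc/net/dev-style text (simplified columns, not the real counters).
SAMPLE = """\
Inter-|   Receive              |  Transmit
 face |bytes packets errs drop |bytes packets errs drop
  eth1:     0       0    0    0     0       0    0    0
    lo:   214       2    0    0   214       2    0    0
"""


def interface_names(proc_net_dev_text):
    """Return the interface names listed in /proc/net/dev-style text."""
    names = []
    for line in proc_net_dev_text.splitlines():
        # Data rows look like "eth1: 0 0 ..."; the two header rows contain "|".
        if ":" not in line or "|" in line:
            continue
        names.append(line.split(":", 1)[0].strip())
    return names


def device_name(device_id):
    """Hypothetical helper: a missing id ends up as the string "null"."""
    return "eth" + ("null" if device_id is None else str(device_id))


present = interface_names(SAMPLE)
print(device_name(1) in present)     # eth1 is listed
print(device_name(None) in present)  # "ethnull" is never listed
```

Under these assumptions, no amount of waiting helps: the name being polled for is wrong, so the fault lies in how the interface id reached the script, not in the NIC itself.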
Re: VPC's VR missing public NIC eth1
Hi Andrija,

From the logs, the public subnet is untagged. I think this issue occurs for an untagged public VLAN in 4.3.

{com.cloud.agent.api.PlugNicCommand:{nic:{deviceId:1,networkRateMbps:9,defaultNic:true,uuid:e6b734d4-3302-4113-8ec7-5c205c90959a,ip:46.232.180.248,netmask:255.255.255.0,gateway:46.232.180.1,mac:06:5e:e8:00:00:27,broadcastType:Vlan,type:Public,broadcastUri:vlan://untagged,isolationUri:vlan://untagged,isSecurityGroupEnabled:false,name:breth1-500},instanceName:r-779-VM,vmType:DomainRouter,wait:0}},
{com.cloud.agent.api.routing.IpAssocVpcCommand:{ipAddresses:[{accountId:2,publicIp:46.232.180.248,sourceNat:true,add:true,oneToOneNat:false,firstIP:false,broadcastUri:untagged,vlanGateway:46.232.180.1,vlanNetmask:255.255.255.0,vifMacAddress:06:5e:e8:00:00:27,networkRate:9,trafficType:Public,networkName:breth1-500}],accessDetails:

From the VR logs, the ipassoc script got the interface id as null.

May 28 12:37:33 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 0 seconds

Thanks,
Jayapal

On 29-May-2014, at 1:08 AM, Andrija Panic andrija.pa...@gmail.com wrote:

Thanks Daan,

My problem is that I have been in production for the 3rd day now, and restoring the DB and downgrading back to 4.2.1 doesn't seem like an option at the moment, since I would lose new accounts, single VMs, etc.

Thanks,
Andrija

On 28 May 2014 21:34, Daan Hoogland daan.hoogl...@gmail.com wrote:

Andrija, nevertheless it sounds familiar. I will be back in the office on Monday and ask around.

On Wed, May 28, 2014 at 9:23 PM, Andrija Panic andrija.pa...@gmail.com wrote:

Hi Daan,

I don't think this is my issue; at least, I don't make use of a private gateway. It is as simple as this: create a new VPC from scratch, and the public IP is not assigned to the eth1 interface inside the VR.
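In the two commands quoted in Jayapal's message, the PlugNicCommand carries broadcastUri:vlan://untagged while the IpAssocVpcCommand carries broadcastUri:untagged. A minimal Python sketch of how such a mismatch could produce a null interface id is shown here. It assumes the device id is looked up by exact string comparison of the broadcast URI, which this thread does not confirm; `find_nic_device_id` and `normalize_broadcast_uri` are hypothetical names, not CloudStack APIs.

```python
def normalize_broadcast_uri(value):
    """Return the value in scheme://tag form, adding "vlan://" to a bare tag.

    Values that already carry a scheme are returned unchanged.
    """
    if value is None:
        return None
    if "://" in value:
        return value
    return "vlan://" + value


def find_nic_device_id(nics, broadcast_uri):
    """Return the deviceId of the NIC with this exact broadcastUri, else None."""
    for nic in nics:
        if nic["broadcastUri"] == broadcast_uri:
            return nic["deviceId"]
    return None


# Values taken from the two commands quoted in the thread.
plugged_nics = [{"deviceId": 1, "broadcastUri": "vlan://untagged"}]
ip_assoc_uri = "untagged"

# Exact comparison fails, so the id is missing (rendered as "ethnull" downstream).
print(find_nic_device_id(plugged_nics, ip_assoc_uri))  # None

# Normalizing the incoming value first finds device 1, i.e. eth1.
print(find_nic_device_id(plugged_nics, normalize_broadcast_uri(ip_assoc_uri)))  # 1
```

Normalizing at the point of lookup is only one option; fixing the stored values in the database, as discussed elsewhere in this thread, addresses the same inconsistency at the source.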
Re: VPC's VR missing public NIC eth1
Hi Andrija,

The same issue with a tagged public VLAN was fixed in CLOUDSTACK-5505.

Thanks,
Jayapal

On 29-May-2014, at 9:38 AM, Jayapal Reddy Uradi jayapalreddy.ur...@citrix.com wrote:

Hi Andrija,

From the logs, the public subnet is untagged. I think this issue occurs for an untagged public VLAN in 4.3.

{com.cloud.agent.api.PlugNicCommand:{nic:{deviceId:1,networkRateMbps:9,defaultNic:true,uuid:e6b734d4-3302-4113-8ec7-5c205c90959a,ip:46.232.180.248,netmask:255.255.255.0,gateway:46.232.180.1,mac:06:5e:e8:00:00:27,broadcastType:Vlan,type:Public,broadcastUri:vlan://untagged,isolationUri:vlan://untagged,isSecurityGroupEnabled:false,name:breth1-500},instanceName:r-779-VM,vmType:DomainRouter,wait:0}},
{com.cloud.agent.api.routing.IpAssocVpcCommand:{ipAddresses:[{accountId:2,publicIp:46.232.180.248,sourceNat:true,add:true,oneToOneNat:false,firstIP:false,broadcastUri:untagged,vlanGateway:46.232.180.1,vlanNetmask:255.255.255.0,vifMacAddress:06:5e:e8:00:00:27,networkRate:9,trafficType:Public,networkName:breth1-500}],accessDetails:

From the VR logs, the ipassoc script got the interface id as null.

May 28 12:37:33 r-794-VM cloud: vpc_ipassoc.sh:Waiting for interface ethnull to appear, 0 seconds

Thanks,
Jayapal

On 29-May-2014, at 1:08 AM, Andrija Panic andrija.pa...@gmail.com wrote:

Thanks Daan,

My problem is that I have been in production for the 3rd day now, and restoring the DB and downgrading back to 4.2.1 doesn't seem like an option at the moment, since I would lose new accounts, single VMs, etc.

Thanks,
Andrija

On 28 May 2014 21:34, Daan Hoogland daan.hoogl...@gmail.com wrote:

Andrija, nevertheless it sounds familiar. I will be back in the office on Monday and ask around.