[ovirt-users] Re: Need to enable STP on ovirt bridges

2019-08-23 Thread Curtis E. Combs Jr.
On Fri, Aug 23, 2019 at 8:50 PM Tony Pearce  wrote:
>
> Is the nic to the network staying up or going down for a period?

Which nic? The one on the pserver or the virtual machine? For clarity,
I've only ever referred to the one on the pserver. I can't even reach
the VM when the pserver becomes unresponsive during a migration, so I
can't reach the console for the VM even if the proxy is not involved
in the link down event.



> I'm just thinking, if the network has been configured to block unknown 
> unicast traffic, I think the VM would need to send a layer 2 frame to the 
> network before the network would send any frames to that switch port destined 
> for the VM.

Please, any and all ideas are welcome, and I'll try all of them. I
love this product, and I want to see it working well. I don't know
that it's configured to block unknown unicast traffic, and I don't know
that it isn't - NOC doesn't tell me squat. Just inductively, I can,
maybe, think that if the switch somehow recognizes a MAC on more than
one port, it shuts them both down - but that's just speculation on a
good day; I have no evidence of that. I can test again to be sure, but
I believe that both the sending and receiving migration server go down
during this event. Even more curious is that it happens over the 10g
FC link as well as the 1g copper TP port.

Would it be a good test of your theory to set up an indefinite ping on
a VM when I can reach it and then migrate it to see if the outage
happens with the VM migrating?
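
Something like this from my workstation, pointed at the VM (address is just a
placeholder), would timestamp every reply and flag any gap while the migration
runs:

  ping -D -O 10.0.0.50 | tee ping-during-migration.log
  # -D prints a timestamp per reply, -O reports "no answer yet" for missed ones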


> After migration, could you use the VM console to send a packet and then see 
> if you can SSH in? Is the default Gateway for the VM on the network side? A 
> ping to the Gateway should be good enough in that case.

During times when it's up after it's thrown its little fit, I can
change the migration network to either the 1g or 10g networks. So I
could set up a ping and let it go like I was saying before. It's worth
a shot.






>
> On Sat., 24 Aug. 2019, 04:20 Curtis E. Combs Jr.,  wrote:
>>
>> It took a while for my servers to come back on the network this time.
>> I think it's due to ovirt continuing to try to migrate the VMs around
>> like I requested. The 3 servers' names are "swm-01, swm-02 and
>> swm-03". Eventually (about 2-3 minutes ago) they all came back online.
>>
>> So I disabled and stopped the lldpad service.
>>
>> Nope. Started some more migrations and swm-02 and swm-03 disappeared
>> again. No ping, SSH hung, same as before - almost as soon as the
>> migration started.
>>
>> If you all have any ideas what switch-level setting might be enabled,
>> let me know, cause I'm stumped. I can add it to the ticket that's
>> requesting the port configurations. I've already added the port
>> numbers and switch name that I got from CDP.
>>
>> Thanks again, I really appreciate the help!
>> cecjr
>>
>>
>>
>> On Fri, Aug 23, 2019 at 3:28 PM Dominik Holler  wrote:
>> >
>> >
>> >
>> > On Fri, Aug 23, 2019 at 9:19 PM Dominik Holler  wrote:
>> >>
>> >>
>> >>
>> >> On Fri, Aug 23, 2019 at 8:03 PM Curtis E. Combs Jr.  
>> >> wrote:
>> >>>
>> >>> This little cluster isn't in production or anything like that yet.
>> >>>
>> >>> So, I went ahead and used your ethtool commands to disable pause
>> >>> frames on both interfaces of each server. I then chose a few VMs to
>> >>> migrate around at random.
>> >>>
>> >>> swm-02 and swm-03 both went out again. Unreachable. Can't ping, can't
>> >>> ssh, and the SSH session that I had open was unresponsive.
>> >>>
>> >>> Any other ideas?
>> >>>
>> >>
>> >> Sorry, no. Looks like two different NICs with different drivers and 
>> >> firmware go down together.
>> >> This is a strong indication that the root cause is related to the switch.
>> >> Maybe you can get some information about the switch config by
>> >> 'lldptool get-tlv -n -i em1'
>> >>
>> >
>> > Another guess:
>> > After the optional 'lldptool get-tlv -n -i em1'
>> > 'systemctl stop lldpad'
>> > another try to migrate.
>> >
>> >
>> >>
>> >>
>> >>>
>> >>> On Fri, Aug 23, 2019 at 1:50 PM Dominik Holler  
>> >>> wrote:
>> >>> >
>> >>> >
>> >>> >
>> >>> > On Fri, Aug 23, 2019 at 6:45 PM Curtis E. Combs Jr. 
>> >>> >  wrote:
>> >>> >>
>> >>> >> Unfortunately, I can't check on the switch. Trust me, I've tried.
>> >>> >> These servers are in a Co-Lo and I've put 5 tickets in asking about
>> >>> >> the port configuration. They just get ignored - but that's par for the
>> >>> >> course for IT here. Only about 2 out of 10 of our tickets get any
>> >>> >> response and usually the response doesn't help. Then the system they
>> >>> >> use auto-closes the ticket. That was why I was suspecting STP before.
>> >>> >>
>> >>> >> I can do ethtool. I do have root on these servers, though. Are you
>> >>> >> trying to get me to turn off link-speed auto-negotiation? Would you
>> >>> >> like me to try that?
>> >>> >>
>> >>> >
>> >>> > It is just a suspicion that the reason is pause frames.
>> >>> > Let's start on a NIC which is not used for ovirtmgmt, I guess em1.

[ovirt-users] Re: Need to enable STP on ovirt bridges

2019-08-23 Thread Tony Pearce
Is the nic to the network staying up or going down for a period?
I'm just thinking, if the network has been configured to block unknown
unicast traffic, I think the VM would need to send a layer 2 frame to the
network before the network would send any frames to that switch port
destined for the VM.

After migration, could you use the VM console to send a packet and then see
if you can SSH in? Is the default Gateway for the VM on the network side? A
ping to the Gateway should be good enough in that case.
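
For example, something along these lines from the VM console right after the
migration (interface name and addresses are placeholders) should be enough to
make the network relearn where the VM's MAC now lives:

  arping -c 3 -U -I eth0 192.168.1.50   # gratuitous ARP announcing the VM's own IP/MAC
  ping -c 3 192.168.1.1                 # any outbound frame to the gateway works too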

On Sat., 24 Aug. 2019, 04:20 Curtis E. Combs Jr., 
wrote:

> It took a while for my servers to come back on the network this time.
> I think it's due to ovirt continuing to try to migrate the VMs around
> like I requested. The 3 servers' names are "swm-01, swm-02 and
> swm-03". Eventually (about 2-3 minutes ago) they all came back online.
>
> So I disabled and stopped the lldpad service.
>
> Nope. Started some more migrations and swm-02 and swm-03 disappeared
> again. No ping, SSH hung, same as before - almost as soon as the
> migration started.
>
> If you all have any ideas what switch-level setting might be enabled,
> let me know, cause I'm stumped. I can add it to the ticket that's
> requesting the port configurations. I've already added the port
> numbers and switch name that I got from CDP.
>
> Thanks again, I really appreciate the help!
> cecjr
>
>
>
> On Fri, Aug 23, 2019 at 3:28 PM Dominik Holler  wrote:
> >
> >
> >
> > On Fri, Aug 23, 2019 at 9:19 PM Dominik Holler 
> wrote:
> >>
> >>
> >>
> >> On Fri, Aug 23, 2019 at 8:03 PM Curtis E. Combs Jr. <
> ej.alb...@gmail.com> wrote:
> >>>
> >>> This little cluster isn't in production or anything like that yet.
> >>>
> >>> So, I went ahead and used your ethtool commands to disable pause
> >>> frames on both interfaces of each server. I then chose a few VMs to
> >>> migrate around at random.
> >>>
> >>> swm-02 and swm-03 both went out again. Unreachable. Can't ping, can't
> >>> ssh, and the SSH session that I had open was unresponsive.
> >>>
> >>> Any other ideas?
> >>>
> >>
> >> Sorry, no. Looks like two different NICs with different drivers and
> firmware go down together.
> >> This is a strong indication that the root cause is related to the
> switch.
> >> Maybe you can get some information about the switch config by
> >> 'lldptool get-tlv -n -i em1'
> >>
> >
> > Another guess:
> > After the optional 'lldptool get-tlv -n -i em1'
> > 'systemctl stop lldpad'
> > another try to migrate.
> >
> >
> >>
> >>
> >>>
> >>> On Fri, Aug 23, 2019 at 1:50 PM Dominik Holler 
> wrote:
> >>> >
> >>> >
> >>> >
> >>> > On Fri, Aug 23, 2019 at 6:45 PM Curtis E. Combs Jr. <
> ej.alb...@gmail.com> wrote:
> >>> >>
> >>> >> Unfortunately, I can't check on the switch. Trust me, I've tried.
> >>> >> These servers are in a Co-Lo and I've put 5 tickets in asking about
> >>> >> the port configuration. They just get ignored - but that's par for
> the
> >>> >> course for IT here. Only about 2 out of 10 of our tickets get any
> >>> >> response and usually the response doesn't help. Then the system they
> >>> >> use auto-closes the ticket. That was why I was suspecting STP
> before.
> >>> >>
> >>> >> I can do ethtool. I do have root on these servers, though. Are you
> >>> >> trying to get me to turn off link-speed auto-negotiation? Would you
> >>> >> like me to try that?
> >>> >>
> >>> >
> >>> > It is just a suspicion that the reason is pause frames.
> >>> > Let's start on a NIC which is not used for ovirtmgmt, I guess em1.
> >>> > Does 'ethtool -S em1  | grep pause' show something?
> >>> > Does 'ethtool em1 | grep pause' indicate support for pause?
> >>> > The current config is shown by 'ethtool -a em1'.
> >>> > '-A autoneg' "Specifies whether pause autonegotiation should be
> enabled." according to ethtool doc.
> >>> > Assuming flow control is enabled by default, I would try to  disable
> it via
> >>> > 'ethtool -A em1 autoneg off rx off tx off'
> >>> > and check if it is applied via
> >>> > 'ethtool -a em1'
> >>> > and check if the behavior under load changes.
> >>> >
> >>> >
> >>> >
> >>> >>
> >>> >> On Fri, Aug 23, 2019 at 12:24 PM Dominik Holler 
> wrote:
> >>> >> >
> >>> >> >
> >>> >> >
> >>> >> > On Fri, Aug 23, 2019 at 5:49 PM Curtis E. Combs Jr. <
> ej.alb...@gmail.com> wrote:
> >>> >> >>
> >>> >> >> Sure! Right now, I only have a 500gb partition on each node
> shared over NFS, added as storage domains. This is on each node - so,
> currently 3.
> >>> >> >>
> >>> >> >> How can the storage cause a node to drop out?
> >>> >> >>
> >>> >> >
> >>> >> > Thanks, I got it.
> >>> >> > All three links go down on load, which causes NFS to fail.
> >>> >> >
> >>> >> > Can you check in the switch port configuration if there is some
> kind of Ethernet flow control enabled?
> >>> >> > Can you try to modify the behavior by changing the settings of
> your host interfaces, e.g.
> >>> >> >
> >>> >> > ethtool -A em1 autoneg off rx off tx off
> >>> >> >
> >>> >> > or
> >>> >> > ethtool -A 

[ovirt-users] Re: Need to enable STP on ovirt bridges

2019-08-23 Thread Curtis E. Combs Jr.
It took a while for my servers to come back on the network this time.
I think it's due to ovirt continuing to try to migrate the VMs around
like I requested. The 3 servers' names are "swm-01, swm-02 and
swm-03". Eventually (about 2-3 minutes ago) they all came back online.

So I disabled and stopped the lldpad service.

Nope. Started some more migrations and swm-02 and swm-03 disappeared
again. No ping, SSH hung, same as before - almost as soon as the
migration started.

If you all have any ideas what switch-level setting might be enabled,
let me know, cause I'm stumped. I can add it to the ticket that's
requesting the port configurations. I've already added the port
numbers and switch name that I got from CDP.
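
For what it's worth, a capture along these lines (interface name is just a
guess) is one way to pull the switch name and port ID from the host side when
the switch itself is off limits:

  tcpdump -nn -v -c 1 -i em1 'ether[20:2] == 0x2000'   # decode one CDP announcement
  tcpdump -nn -v -c 1 -i em1 ether proto 0x88cc        # or one LLDP frame, if the switch sends them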

Thanks again, I really appreciate the help!
cecjr



On Fri, Aug 23, 2019 at 3:28 PM Dominik Holler  wrote:
>
>
>
> On Fri, Aug 23, 2019 at 9:19 PM Dominik Holler  wrote:
>>
>>
>>
>> On Fri, Aug 23, 2019 at 8:03 PM Curtis E. Combs Jr.  
>> wrote:
>>>
>>> This little cluster isn't in production or anything like that yet.
>>>
>>> So, I went ahead and used your ethtool commands to disable pause
>>> frames on both interfaces of each server. I then chose a few VMs to
>>> migrate around at random.
>>>
>>> swm-02 and swm-03 both went out again. Unreachable. Can't ping, can't
>>> ssh, and the SSH session that I had open was unresponsive.
>>>
>>> Any other ideas?
>>>
>>
>> Sorry, no. Looks like two different NICs with different drivers and firmware 
>> go down together.
>> This is a strong indication that the root cause is related to the switch.
>> Maybe you can get some information about the switch config by
>> 'lldptool get-tlv -n -i em1'
>>
>
> Another guess:
> After the optional 'lldptool get-tlv -n -i em1'
> 'systemctl stop lldpad'
> another try to migrate.
>
>
>>
>>
>>>
>>> On Fri, Aug 23, 2019 at 1:50 PM Dominik Holler  wrote:
>>> >
>>> >
>>> >
>>> > On Fri, Aug 23, 2019 at 6:45 PM Curtis E. Combs Jr.  
>>> > wrote:
>>> >>
>>> >> Unfortunately, I can't check on the switch. Trust me, I've tried.
>>> >> These servers are in a Co-Lo and I've put 5 tickets in asking about
>>> >> the port configuration. They just get ignored - but that's par for the
>>> >> course for IT here. Only about 2 out of 10 of our tickets get any
>>> >> response and usually the response doesn't help. Then the system they
>>> >> use auto-closes the ticket. That was why I was suspecting STP before.
>>> >>
>>> >> I can do ethtool. I do have root on these servers, though. Are you
>>> >> trying to get me to turn off link-speed auto-negotiation? Would you
>>> >> like me to try that?
>>> >>
>>> >
>>> > It is just a suspicion that the reason is pause frames.
>>> > Let's start on a NIC which is not used for ovirtmgmt, I guess em1.
>>> > Does 'ethtool -S em1  | grep pause' show something?
>>> > Does 'ethtool em1 | grep pause' indicate support for pause?
>>> > The current config is shown by 'ethtool -a em1'.
>>> > '-A autoneg' "Specifies whether pause autonegotiation should be enabled." 
>>> > according to ethtool doc.
>>> > Assuming flow control is enabled by default, I would try to  disable it 
>>> > via
>>> > 'ethtool -A em1 autoneg off rx off tx off'
>>> > and check if it is applied via
>>> > 'ethtool -a em1'
>>> > and check if the behavior under load changes.
>>> >
>>> >
>>> >
>>> >>
>>> >> On Fri, Aug 23, 2019 at 12:24 PM Dominik Holler  
>>> >> wrote:
>>> >> >
>>> >> >
>>> >> >
>>> >> > On Fri, Aug 23, 2019 at 5:49 PM Curtis E. Combs Jr. 
>>> >> >  wrote:
>>> >> >>
>>> >> >> Sure! Right now, I only have a 500gb partition on each node shared 
>>> >> >> over NFS, added as storage domains. This is on each node - so, 
>>> >> >> currently 3.
>>> >> >>
>>> >> >> How can the storage cause a node to drop out?
>>> >> >>
>>> >> >
>>> >> > Thanks, I got it.
>>> >> > All three links go down on load, which causes NFS to fail.
>>> >> >
>>> >> > Can you check in the switch port configuration if there is some kind 
>>> >> > of Ethernet flow control enabled?
>>> >> > Can you try to modify the behavior by changing the settings of your 
>>> >> > host interfaces, e.g.
>>> >> >
>>> >> > ethtool -A em1 autoneg off rx off tx off
>>> >> >
>>> >> > or
>>> >> > ethtool -A em1 autoneg on rx on tx on
>>> >> > ?
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >>
>>> >> >> On Fri, Aug 23, 2019, 11:46 AM Dominik Holler  
>>> >> >> wrote:
>>> >> >>>
>>> >> >>>
>>> >> >>>
>>> >> >>> On Fri, Aug 23, 2019 at 5:41 PM Curtis E. Combs Jr. 
>>> >> >>>  wrote:
>>> >> 
>>> >>  Also, if it helps, the hosts will sit there, quietly, for hours or
>>> >>  days before anything happens. They're up and working just fine. But
>>> >>  then, when I manually migrate a VM from one host to another, they
>>> >>  become completely inaccessible.
>>> >> 
>>> >> >>>
>>> >> >>> Can you share some details about your storage?
>>> >> >>> Maybe there is a feature used during live migration, which triggers 
>>> >> >>> the issue.
>>> >> >>>
>>> >> >>>
>>> >> 
>>> >>  

[ovirt-users] Re: Need to enable STP on ovirt bridges

2019-08-23 Thread Dominik Holler
On Fri, Aug 23, 2019 at 9:19 PM Dominik Holler  wrote:

>
>
> On Fri, Aug 23, 2019 at 8:03 PM Curtis E. Combs Jr. 
> wrote:
>
>> This little cluster isn't in production or anything like that yet.
>>
>> So, I went ahead and used your ethtool commands to disable pause
>> frames on both interfaces of each server. I then chose a few VMs to
>> migrate around at random.
>>
>> swm-02 and swm-03 both went out again. Unreachable. Can't ping, can't
>> ssh, and the SSH session that I had open was unresponsive.
>>
>> Any other ideas?
>>
>>
> Sorry, no. Looks like two different NICs with different drivers and
> firmware go down together.
> This is a strong indication that the root cause is related to the switch.
> Maybe you can get some information about the switch config by
> 'lldptool get-tlv -n -i em1'
>
>
Another guess:
After the optional 'lldptool get-tlv -n -i em1'
'systemctl stop lldpad'
another try to migrate.
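
A rough per-host sequence (NIC names are only a guess) could be:

  lldptool get-tlv -n -i em1  > /tmp/lldp-em1.txt    # optional: keep what the switch announces
  lldptool get-tlv -n -i p1p1 > /tmp/lldp-p1p1.txt
  systemctl stop lldpad
  systemctl disable lldpad    # keep it off across reboots while testing
  # then start a test migration and watch whether the links still drop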



>
>
>> On Fri, Aug 23, 2019 at 1:50 PM Dominik Holler 
>> wrote:
>> >
>> >
>> >
>> > On Fri, Aug 23, 2019 at 6:45 PM Curtis E. Combs Jr. <
>> ej.alb...@gmail.com> wrote:
>> >>
>> >> Unfortunately, I can't check on the switch. Trust me, I've tried.
>> >> These servers are in a Co-Lo and I've put 5 tickets in asking about
>> >> the port configuration. They just get ignored - but that's par for the
>> >> course for IT here. Only about 2 out of 10 of our tickets get any
>> >> response and usually the response doesn't help. Then the system they
>> >> use auto-closes the ticket. That was why I was suspecting STP before.
>> >>
>> >> I can do ethtool. I do have root on these servers, though. Are you
>> >> trying to get me to turn off link-speed auto-negotiation? Would you
>> >> like me to try that?
>> >>
>> >
>> > It is just a suspicion that the reason is pause frames.
>> > Let's start on a NIC which is not used for ovirtmgmt, I guess em1.
>> > Does 'ethtool -S em1  | grep pause' show something?
>> > Does 'ethtool em1 | grep pause' indicate support for pause?
>> > The current config is shown by 'ethtool -a em1'.
>> > '-A autoneg' "Specifies whether pause autonegotiation should be
>> enabled." according to ethtool doc.
>> > Assuming flow control is enabled by default, I would try to  disable it
>> via
>> > 'ethtool -A em1 autoneg off rx off tx off'
>> > and check if it is applied via
>> > 'ethtool -a em1'
>> > and check if the behavior under load changes.
>> >
>> >
>> >
>> >>
>> >> On Fri, Aug 23, 2019 at 12:24 PM Dominik Holler 
>> wrote:
>> >> >
>> >> >
>> >> >
>> >> > On Fri, Aug 23, 2019 at 5:49 PM Curtis E. Combs Jr. <
>> ej.alb...@gmail.com> wrote:
>> >> >>
>> >> >> Sure! Right now, I only have a 500gb partition on each node shared
>> over NFS, added as storage domains. This is on each node - so, currently 3.
>> >> >>
>> >> >> How can the storage cause a node to drop out?
>> >> >>
>> >> >
>> >> > Thanks, I got it.
>> >> > All three links go down on load, which causes NFS to fail.
>> >> >
>> >> > Can you check in the switch port configuration if there is some kind
>> of Ethernet flow control enabled?
>> >> > Can you try to modify the behavior by changing the settings of your
>> host interfaces, e.g.
>> >> >
>> >> > ethtool -A em1 autoneg off rx off tx off
>> >> >
>> >> > or
>> >> > ethtool -A em1 autoneg on rx on tx on
>> >> > ?
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >>
>> >> >> On Fri, Aug 23, 2019, 11:46 AM Dominik Holler 
>> wrote:
>> >> >>>
>> >> >>>
>> >> >>>
>> >> >>> On Fri, Aug 23, 2019 at 5:41 PM Curtis E. Combs Jr. <
>> ej.alb...@gmail.com> wrote:
>> >> 
>> >>  Also, if it helps, the hosts will sit there, quietly, for hours or
>> >>  days before anything happens. They're up and working just fine.
>> But
>> >>  then, when I manually migrate a VM from one host to another, they
>> >>  become completely inaccessible.
>> >> 
>> >> >>>
>> >> >>> Can you share some details about your storage?
>> >> >>> Maybe there is a feature used during live migration, which
>> triggers the issue.
>> >> >>>
>> >> >>>
>> >> 
>> >>  These are vanilla-as-possible CentOS7 nodes. Very basic ovirt
>> install
>> >>  and configuration.
>> >> 
>> >>  On Fri, Aug 23, 2019 at 11:33 AM Curtis E. Combs Jr.
>> >>   wrote:
>> >>  >
>> >>  > Hey Dominik,
>> >>  >
>> >>  > Thanks for helping. I really want to try to use ovirt.
>> >>  >
>> >>  > When these events happen, I cannot even SSH to the nodes due to
>> the
>> >>  > link being down. After a little while, the hosts come back...
>> >>  >
>> >>  > On Fri, Aug 23, 2019 at 11:30 AM Dominik Holler <
>> dhol...@redhat.com> wrote:
>> >>  > >
>> >>  > > Is your storage connected via NFS?
>> >>  > > Can you manually access the storage on the host?
>> >>  > >
>> >>  > >
>> >>  > > On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr. <
>> ej.alb...@gmail.com> wrote:
>> >>  > >>
>> >>  > >> Sorry to dead bump this, but I'm beginning to suspect that
>> maybe it's
>> >>  

[ovirt-users] Re: Need to enable STP on ovirt bridges

2019-08-23 Thread Dominik Holler
On Fri, Aug 23, 2019 at 8:03 PM Curtis E. Combs Jr. 
wrote:

> This little cluster isn't in production or anything like that yet.
>
> So, I went ahead and used your ethtool commands to disable pause
> frames on both interfaces of each server. I then chose a few VMs to
> migrate around at random.
>
> swm-02 and swm-03 both went out again. Unreachable. Can't ping, can't
> ssh, and the SSH session that I had open was unresponsive.
>
> Any other ideas?
>
>
Sorry, no. Looks like two different NICs with different drivers and
firmware go down together.
This is a strong indication that the root cause is related to the switch.
Maybe you can get some information about the switch config by
'lldptool get-tlv -n -i em1'



> On Fri, Aug 23, 2019 at 1:50 PM Dominik Holler  wrote:
> >
> >
> >
> > On Fri, Aug 23, 2019 at 6:45 PM Curtis E. Combs Jr. 
> wrote:
> >>
> >> Unfortunately, I can't check on the switch. Trust me, I've tried.
> >> These servers are in a Co-Lo and I've put 5 tickets in asking about
> >> the port configuration. They just get ignored - but that's par for the
> >> course for IT here. Only about 2 out of 10 of our tickets get any
> >> response and usually the response doesn't help. Then the system they
> >> use auto-closes the ticket. That was why I was suspecting STP before.
> >>
> >> I can do ethtool. I do have root on these servers, though. Are you
> >> trying to get me to turn off link-speed auto-negotiation? Would you
> >> like me to try that?
> >>
> >
> > It is just a suspicion that the reason is pause frames.
> > Let's start on a NIC which is not used for ovirtmgmt, I guess em1.
> > Does 'ethtool -S em1  | grep pause' show something?
> > Does 'ethtool em1 | grep pause' indicate support for pause?
> > The current config is shown by 'ethtool -a em1'.
> > '-A autoneg' "Specifies whether pause autonegotiation should be
> enabled." according to ethtool doc.
> > Assuming flow control is enabled by default, I would try to  disable it
> via
> > 'ethtool -A em1 autoneg off rx off tx off'
> > and check if it is applied via
> > 'ethtool -a em1'
> > and check if the behavior under load changes.
> >
> >
> >
> >>
> >> On Fri, Aug 23, 2019 at 12:24 PM Dominik Holler 
> wrote:
> >> >
> >> >
> >> >
> >> > On Fri, Aug 23, 2019 at 5:49 PM Curtis E. Combs Jr. <
> ej.alb...@gmail.com> wrote:
> >> >>
> >> >> Sure! Right now, I only have a 500gb partition on each node shared
> over NFS, added as storage domains. This is on each node - so, currently 3.
> >> >>
> >> >> How can the storage cause a node to drop out?
> >> >>
> >> >
> >> > Thanks, I got it.
> >> > All three links go down on load, which causes NFS to fail.
> >> >
> >> > Can you check in the switch port configuration if there is some kind
> of Ethernet flow control enabled?
> >> > Can you try to modify the behavior by changing the settings of your
> host interfaces, e.g.
> >> >
> >> > ethtool -A em1 autoneg off rx off tx off
> >> >
> >> > or
> >> > ethtool -A em1 autoneg on rx on tx on
> >> > ?
> >> >
> >> >
> >> >
> >> >
> >> >>
> >> >> On Fri, Aug 23, 2019, 11:46 AM Dominik Holler 
> wrote:
> >> >>>
> >> >>>
> >> >>>
> >> >>> On Fri, Aug 23, 2019 at 5:41 PM Curtis E. Combs Jr. <
> ej.alb...@gmail.com> wrote:
> >> 
> >>  Also, if it helps, the hosts will sit there, quietly, for hours or
> >>  days before anything happens. They're up and working just fine. But
> >>  then, when I manually migrate a VM from one host to another, they
> >>  become completely inaccessible.
> >> 
> >> >>>
> >> >>> Can you share some details about your storage?
> >> >>> Maybe there is a feature used during live migration, which triggers
> the issue.
> >> >>>
> >> >>>
> >> 
> >>  These are vanilla-as-possible CentOS7 nodes. Very basic ovirt
> install
> >>  and configuration.
> >> 
> >>  On Fri, Aug 23, 2019 at 11:33 AM Curtis E. Combs Jr.
> >>   wrote:
> >>  >
> >>  > Hey Dominik,
> >>  >
> >>  > Thanks for helping. I really want to try to use ovirt.
> >>  >
> >>  > When these events happen, I cannot even SSH to the nodes due to
> the
> >>  > link being down. After a little while, the hosts come back...
> >>  >
> >>  > On Fri, Aug 23, 2019 at 11:30 AM Dominik Holler <
> dhol...@redhat.com> wrote:
> >>  > >
> >>  > > Is your storage connected via NFS?
> >>  > > Can you manually access the storage on the host?
> >>  > >
> >>  > >
> >>  > > On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr. <
> ej.alb...@gmail.com> wrote:
> >>  > >>
> >>  > >> Sorry to dead bump this, but I'm beginning to suspect that
> maybe it's
> >>  > >> not STP that's the problem.
> >>  > >>
> >>  > >> 2 of my hosts just went down when a few VMs tried to migrate.
> >>  > >>
> >>  > >> Do any of you have any idea what might be going on here? I
> don't even
> >>  > >> know where to start. I'm going to include the dmesg in case
> it helps.
> >>  > >> This happens on both 

[ovirt-users] Re: Need to enable STP on ovirt bridges

2019-08-23 Thread Curtis E. Combs Jr.
This little cluster isn't in production or anything like that yet.

So, I went ahead and used your ethtool commands to disable pause
frames on both interfaces of each server. I then chose a few VMs to
migrate around at random.

swm-02 and swm-03 both went out again. Unreachable. Can't ping, can't
ssh, and the SSH session that I had open was unresponsive.

Any other ideas?

On Fri, Aug 23, 2019 at 1:50 PM Dominik Holler  wrote:
>
>
>
> On Fri, Aug 23, 2019 at 6:45 PM Curtis E. Combs Jr.  
> wrote:
>>
>> Unfortunately, I can't check on the switch. Trust me, I've tried.
>> These servers are in a Co-Lo and I've put 5 tickets in asking about
>> the port configuration. They just get ignored - but that's par for the
>> course for IT here. Only about 2 out of 10 of our tickets get any
>> response and usually the response doesn't help. Then the system they
>> use auto-closes the ticket. That was why I was suspecting STP before.
>>
>> I can do ethtool. I do have root on these servers, though. Are you
>> trying to get me to turn off link-speed auto-negotiation? Would you
>> like me to try that?
>>
>
> It is just a suspicion that the reason is pause frames.
> Let's start on a NIC which is not used for ovirtmgmt, I guess em1.
> Does 'ethtool -S em1  | grep pause' show something?
> Does 'ethtool em1 | grep pause' indicate support for pause?
> The current config is shown by 'ethtool -a em1'.
> '-A autoneg' "Specifies whether pause autonegotiation should be enabled." 
> according to ethtool doc.
> Assuming flow control is enabled by default, I would try to  disable it via
> 'ethtool -A em1 autoneg off rx off tx off'
> and check if it is applied via
> 'ethtool -a em1'
> and check if the behavior under load changes.
>
>
>
>>
>> On Fri, Aug 23, 2019 at 12:24 PM Dominik Holler  wrote:
>> >
>> >
>> >
>> > On Fri, Aug 23, 2019 at 5:49 PM Curtis E. Combs Jr.  
>> > wrote:
>> >>
>> >> Sure! Right now, I only have a 500gb partition on each node shared over 
>> >> NFS, added as storage domains. This is on each node - so, currently 3.
>> >>
>> >> How can the storage cause a node to drop out?
>> >>
>> >
>> > Thanks, I got it.
>> > All three links go down on load, which causes NFS to fail.
>> >
>> > Can you check in the switch port configuration if there is some kind of 
>> > Ethernet flow control enabled?
>> > Can you try to modify the behavior by changing the settings of your host 
>> > interfaces, e.g.
>> >
>> > ethtool -A em1 autoneg off rx off tx off
>> >
>> > or
>> > ethtool -A em1 autoneg on rx on tx on
>> > ?
>> >
>> >
>> >
>> >
>> >>
>> >> On Fri, Aug 23, 2019, 11:46 AM Dominik Holler  wrote:
>> >>>
>> >>>
>> >>>
>> >>> On Fri, Aug 23, 2019 at 5:41 PM Curtis E. Combs Jr. 
>> >>>  wrote:
>> 
>>  Also, if it helps, the hosts will sit there, quietly, for hours or
>>  days before anything happens. They're up and working just fine. But
>>  then, when I manually migrate a VM from one host to another, they
>>  become completely inaccessible.
>> 
>> >>>
>> >>> Can you share some details about your storage?
>> >>> Maybe there is a feature used during live migration, which triggers the 
>> >>> issue.
>> >>>
>> >>>
>> 
>>  These are vanilla-as-possible CentOS7 nodes. Very basic ovirt install
>>  and configuration.
>> 
>>  On Fri, Aug 23, 2019 at 11:33 AM Curtis E. Combs Jr.
>>   wrote:
>>  >
>>  > Hey Dominik,
>>  >
>>  > Thanks for helping. I really want to try to use ovirt.
>>  >
>>  > When these events happen, I cannot even SSH to the nodes due to the
>>  > link being down. After a little while, the hosts come back...
>>  >
>>  > On Fri, Aug 23, 2019 at 11:30 AM Dominik Holler  
>>  > wrote:
>>  > >
>>  > > Is your storage connected via NFS?
>>  > > Can you manually access the storage on the host?
>>  > >
>>  > >
>>  > > On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr. 
>>  > >  wrote:
>>  > >>
>>  > >> Sorry to dead bump this, but I'm beginning to suspect that maybe 
>>  > >> it's
>>  > >> not STP that's the problem.
>>  > >>
>>  > >> 2 of my hosts just went down when a few VMs tried to migrate.
>>  > >>
>>  > >> Do any of you have any idea what might be going on here? I don't 
>>  > >> even
>>  > >> know where to start. I'm going to include the dmesg in case it 
>>  > >> helps.
>>  > >> This happens on both of the hosts whenever any migration attempts 
>>  > >> to start.
>>  > >>
>>  > >>
>>  > >>
>>  > >>
>>  > >>
>>  > >>
>>  > >>
>>  > >>
>>  > >>
>>  > >> [68099.245833] bnx2 :01:00.0 em1: NIC Copper Link is Down
>>  > >> [68099.246055] internal: port 1(em1) entered disabled state
>>  > >> [68184.177343] ixgbe :03:00.0 p1p1: NIC Link is Down
>>  > >> [68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state
>>  > >> [68184.177856] ovirtmgmt: topology change detected, propagating
>>  > >> 

[ovirt-users] Re: Need to enable STP on ovirt bridges

2019-08-23 Thread Dominik Holler
On Fri, Aug 23, 2019 at 6:45 PM Curtis E. Combs Jr. 
wrote:

> Unfortunately, I can't check on the switch. Trust me, I've tried.
> These servers are in a Co-Lo and I've put 5 tickets in asking about
> the port configuration. They just get ignored - but that's par for the
> course for IT here. Only about 2 out of 10 of our tickets get any
> response and usually the response doesn't help. Then the system they
> use auto-closes the ticket. That was why I was suspecting STP before.
>
> I can do ethtool. I do have root on these servers, though. Are you
> trying to get me to turn off link-speed auto-negotiation? Would you
> like me to try that?
>
>
It is just a suspicion that the reason is pause frames.
Let's start on a NIC which is not used for ovirtmgmt, I guess em1.
Does 'ethtool -S em1  | grep pause' show something?
Does 'ethtool em1 | grep pause' indicate support for pause?
The current config is shown by 'ethtool -a em1'.
'-A autoneg' "Specifies whether pause autonegotiation should be enabled."
according to ethtool doc.
Assuming flow control is enabled by default, I would try to  disable it via
'ethtool -A em1 autoneg off rx off tx off'
and check if it is applied via
'ethtool -a em1'
and check if the behavior under load changes.
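
A minimal way to run all of that on both NICs in one go (NIC names are
assumptions, adjust them to the host) might be:

  for nic in em1 p1p1; do
      echo "== $nic =="
      ethtool -S "$nic" | grep -i pause              # pause counters, if the driver exposes them
      ethtool "$nic" | grep -i pause                 # supported/advertised pause modes
      ethtool -a "$nic"                              # current flow-control settings
      ethtool -A "$nic" autoneg off rx off tx off    # try with flow control disabled
      ethtool -a "$nic"                              # confirm the change was applied
  done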




> On Fri, Aug 23, 2019 at 12:24 PM Dominik Holler 
> wrote:
> >
> >
> >
> > On Fri, Aug 23, 2019 at 5:49 PM Curtis E. Combs Jr. 
> wrote:
> >>
> >> Sure! Right now, I only have a 500gb partition on each node shared over
> NFS, added as storage domains. This is on each node - so, currently 3.
> >>
> >> How can the storage cause a node to drop out?
> >>
> >
> > Thanks, I got it.
> > All three links go down on load, which causes NFS to fail.
> >
> > Can you check in the switch port configuration if there is some kind of
> Ethernet flow control enabled?
> > Can you try to modify the behavior by changing the settings of your host
> interfaces, e.g.
> >
> > ethtool -A em1 autoneg off rx off tx off
> >
> > or
> > ethtool -A em1 autoneg on rx on tx on
> > ?
> >
> >
> >
> >
> >>
> >> On Fri, Aug 23, 2019, 11:46 AM Dominik Holler 
> wrote:
> >>>
> >>>
> >>>
> >>> On Fri, Aug 23, 2019 at 5:41 PM Curtis E. Combs Jr. <
> ej.alb...@gmail.com> wrote:
> 
>  Also, if it helps, the hosts will sit there, quietly, for hours or
>  days before anything happens. They're up and working just fine. But
>  then, when I manually migrate a VM from one host to another, they
>  become completely inaccessible.
> 
> >>>
> >>> Can you share some details about your storage?
> >>> Maybe there is a feature used during live migration, which triggers
> the issue.
> >>>
> >>>
> 
>  These are vanilla-as-possible CentOS7 nodes. Very basic ovirt install
>  and configuration.
> 
>  On Fri, Aug 23, 2019 at 11:33 AM Curtis E. Combs Jr.
>   wrote:
>  >
>  > Hey Dominik,
>  >
>  > Thanks for helping. I really want to try to use ovirt.
>  >
>  > When these events happen, I cannot even SSH to the nodes due to the
>  > link being down. After a little while, the hosts come back...
>  >
>  > On Fri, Aug 23, 2019 at 11:30 AM Dominik Holler 
> wrote:
>  > >
>  > > Is your storage connected via NFS?
>  > > Can you manually access the storage on the host?
>  > >
>  > >
>  > > On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr. <
> ej.alb...@gmail.com> wrote:
>  > >>
>  > >> Sorry to dead bump this, but I'm beginning to suspect that maybe
> it's
>  > >> not STP that's the problem.
>  > >>
>  > >> 2 of my hosts just went down when a few VMs tried to migrate.
>  > >>
>  > >> Do any of you have any idea what might be going on here? I don't
> even
>  > >> know where to start. I'm going to include the dmesg in case it
> helps.
>  > >> This happens on both of the hosts whenever any migration
> attempts to start.
>  > >>
>  > >>
>  > >>
>  > >>
>  > >>
>  > >>
>  > >>
>  > >>
>  > >>
>  > >> [68099.245833] bnx2 :01:00.0 em1: NIC Copper Link is Down
>  > >> [68099.246055] internal: port 1(em1) entered disabled state
>  > >> [68184.177343] ixgbe :03:00.0 p1p1: NIC Link is Down
>  > >> [68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state
>  > >> [68184.177856] ovirtmgmt: topology change detected, propagating
>  > >> [68277.078671] INFO: task qemu-kvm: blocked for more than
> 120 seconds.
>  > >> [68277.078700] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>  > >> disables this message.
>  > >> [68277.078723] qemu-kvmD 9db40c359040 0  
>   1 0x01a0
>  > >> [68277.078727] Call Trace:
>  > >> [68277.078738]  [] ?
> avc_has_perm_flags+0xdc/0x1c0
>  > >> [68277.078743]  [] schedule+0x29/0x70
>  > >> [68277.078746]  [] inode_dio_wait+0xd9/0x100
>  > >> [68277.078751]  [] ?
> wake_bit_function+0x40/0x40
>  > >> [68277.078765]  [] nfs_getattr+0x1b6/0x250
> [nfs]
> 

[ovirt-users] Re: Creating hosts via the REST API using SSH Public Key authentication

2019-08-23 Thread Julian Schill
> POST /ovirt-engine/api/hosts
> <host>
>   <name>myhost</name>
>   <address>myhost.example.com</address>
>   <ssh>
>     <authentication_method>publickey</authentication_method>
>   </ssh>
> </host>

It works, thank you very much!
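
In case it helps the next person searching the archives, the same request can
be sent with plain curl along these lines (engine host, credentials and CA
file are placeholders, and host.xml holds the body quoted above):

  curl -s --cacert ca.pem \
       -u admin@internal:PASSWORD \
       -H 'Content-Type: application/xml' -H 'Accept: application/xml' \
       -X POST -d @host.xml \
       https://engine.example.com/ovirt-engine/api/hosts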


[ovirt-users] Re: Evaluate oVirt to remove VMware

2019-08-23 Thread dsalade
Thanks Dominik,

Went through this documentation earlier and it looked as if this was done after 
the Engine is installed.
I am looking to get networking completely set up before the Engine so I have a 
template and can duplicate this effort across multiple hosts during install, 
possibly using a kickstart script for hosts.

FYI... I need to start with both NICs in a BOND as well as VLAN Tagging.

Went through the steps of "not your docs" that I mentioned earlier and got much 
further. I was able to ping out, but the bridge was based on br0 and not 
ovirtmgmt, so I could not ping in or access the host remotely. For some reason 
the gateway never really mattered in the ifcfg scripts.

I will thoroughly look at the doc again... but like I said, I am trying to get 
this down to something easily reproducible during the OS install.

If we end up going with oVirt, it would be approximately a 100-node install.
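
For the record, the kind of kickstart-able layout I have in mind is ifcfg
files roughly like the following (NIC names, VLAN ID and addresses are
placeholders; the bridge is pre-named ovirtmgmt so there is less for the
host-deploy step to change later):

  # /etc/sysconfig/network-scripts/ifcfg-em1   (second NIC looks the same)
  DEVICE=em1
  ONBOOT=yes
  MASTER=bond0
  SLAVE=yes

  # /etc/sysconfig/network-scripts/ifcfg-bond0
  DEVICE=bond0
  ONBOOT=yes
  BONDING_OPTS="mode=active-backup miimon=100"

  # /etc/sysconfig/network-scripts/ifcfg-bond0.100   (VLAN 100 tagged on the bond)
  DEVICE=bond0.100
  ONBOOT=yes
  VLAN=yes
  BRIDGE=ovirtmgmt

  # /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt
  DEVICE=ovirtmgmt
  TYPE=Bridge
  ONBOOT=yes
  BOOTPROTO=static
  IPADDR=10.0.100.11
  PREFIX=24
  GATEWAY=10.0.100.1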


[ovirt-users] Re: Need to enable STP on ovirt bridges

2019-08-23 Thread Curtis E. Combs Jr.
Unfortunately, I can't check on the switch. Trust me, I've tried.
These servers are in a Co-Lo and I've put 5 tickets in asking about
the port configuration. They just get ignored - but that's par for the
course for IT here. Only about 2 out of 10 of our tickets get any
response and usually the response doesn't help. Then the system they
use auto-closes the ticket. That was why I was suspecting STP before.

I can do ethtool. I do have root on these servers, though. Are you
trying to get me to turn off link-speed auto-negotiation? Would you
like me to try that?

On Fri, Aug 23, 2019 at 12:24 PM Dominik Holler  wrote:
>
>
>
> On Fri, Aug 23, 2019 at 5:49 PM Curtis E. Combs Jr.  
> wrote:
>>
>> Sure! Right now, I only have a 500gb partition on each node shared over NFS, 
>> added as storage domains. This is on each node - so, currently 3.
>>
>> How can the storage cause a node to drop out?
>>
>
> Thanks, I got it.
> All three links go down on load, which causes NFS to fail.
>
> Can you check in the switch port configuration if there is some kind of 
> Ethernet flow control enabled?
> Can you try to modify the behavior by changing the settings of your host 
> interfaces, e.g.
>
> ethtool -A em1 autoneg off rx off tx off
>
> or
> ethtool -A em1 autoneg on rx on tx on
> ?
>
>
>
>
>>
>> On Fri, Aug 23, 2019, 11:46 AM Dominik Holler  wrote:
>>>
>>>
>>>
>>> On Fri, Aug 23, 2019 at 5:41 PM Curtis E. Combs Jr.  
>>> wrote:

 Also, if it helps, the hosts will sit there, quietly, for hours or
 days before anything happens. They're up and working just fine. But
 then, when I manually migrate a VM from one host to another, they
 become completely inaccessible.

>>>
>>> Can you share some details about your storage?
>>> Maybe there is a feature used during live migration, which triggers the 
>>> issue.
>>>
>>>

 These are vanilla-as-possible CentOS7 nodes. Very basic ovirt install
 and configuration.

 On Fri, Aug 23, 2019 at 11:33 AM Curtis E. Combs Jr.
  wrote:
 >
 > Hey Dominik,
 >
 > Thanks for helping. I really want to try to use ovirt.
 >
 > When these events happen, I cannot even SSH to the nodes due to the
 > link being down. After a little while, the hosts come back...
 >
 > On Fri, Aug 23, 2019 at 11:30 AM Dominik Holler  
 > wrote:
 > >
 > > Is your storage connected via NFS?
 > > Can you manually access the storage on the host?
 > >
 > >
 > > On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr. 
 > >  wrote:
 > >>
 > >> Sorry to dead bump this, but I'm beginning to suspect that maybe it's
 > >> not STP that's the problem.
 > >>
 > >> 2 of my hosts just went down when a few VMs tried to migrate.
 > >>
 > >> Do any of you have any idea what might be going on here? I don't even
 > >> know where to start. I'm going to include the dmesg in case it helps.
 > >> This happens on both of the hosts whenever any migration attempts to 
 > >> start.
 > >>
 > >>
 > >>
 > >>
 > >>
 > >>
 > >>
 > >>
 > >>
 > >> [68099.245833] bnx2 :01:00.0 em1: NIC Copper Link is Down
 > >> [68099.246055] internal: port 1(em1) entered disabled state
 > >> [68184.177343] ixgbe :03:00.0 p1p1: NIC Link is Down
 > >> [68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state
 > >> [68184.177856] ovirtmgmt: topology change detected, propagating
 > >> [68277.078671] INFO: task qemu-kvm: blocked for more than 120 
 > >> seconds.
 > >> [68277.078700] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
 > >> disables this message.
 > >> [68277.078723] qemu-kvmD 9db40c359040 0    1 
 > >> 0x01a0
 > >> [68277.078727] Call Trace:
 > >> [68277.078738]  [] ? avc_has_perm_flags+0xdc/0x1c0
 > >> [68277.078743]  [] schedule+0x29/0x70
 > >> [68277.078746]  [] inode_dio_wait+0xd9/0x100
 > >> [68277.078751]  [] ? wake_bit_function+0x40/0x40
 > >> [68277.078765]  [] nfs_getattr+0x1b6/0x250 [nfs]
 > >> [68277.078768]  [] vfs_getattr+0x49/0x80
 > >> [68277.078769]  [] vfs_fstat+0x45/0x80
 > >> [68277.078771]  [] SYSC_newfstat+0x24/0x60
 > >> [68277.078774]  [] ? 
 > >> system_call_after_swapgs+0xae/0x146
 > >> [68277.078778]  [] ? 
 > >> __audit_syscall_entry+0xb4/0x110
 > >> [68277.078782]  [] ? syscall_trace_enter+0x16b/0x220
 > >> [68277.078784]  [] SyS_newfstat+0xe/0x10
 > >> [68277.078786]  [] tracesys+0xa3/0xc9
 > >> [68397.072384] INFO: task qemu-kvm: blocked for more than 120 
 > >> seconds.
 > >> [68397.072413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
 > >> disables this message.
 > >> [68397.072436] qemu-kvmD 9db40c359040 0    1 
 > >> 0x01a0
 > >> [68397.072439] Call Trace:
 > >> [68397.072453]  [] ? avc_has_perm_flags+0xdc/0x1c0
 > >> 

[ovirt-users] Re: Need to enable STP on ovirt bridges

2019-08-23 Thread Dominik Holler
On Fri, Aug 23, 2019 at 5:49 PM Curtis E. Combs Jr. 
wrote:

> Sure! Right now, I only have a 500gb partition on each node shared over
> NFS, added as storage domains. This is on each node - so, currently 3.
>
> How can the storage cause a node to drop out?
>
>
Thanks, I got it.
All three links go down on load, which causes NFS to fail.

Can you check in the switch port configuration if there is some kind of
Ethernet flow control enabled?
Can you try to modify the behavior by changing the settings of your host
interfaces, e.g.

ethtool -A em1 autoneg off rx off tx off

or
ethtool -A em1 autoneg on rx on tx on
?
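
Without access to the switch, one indirect check (NIC name is a guess, and
the exact counter names vary by driver) is to watch the pause counters while
a migration runs; counters ticking up would suggest the switch is asserting
flow control towards the host:

  watch -n1 'ethtool -S em1 | grep -i pause'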





> On Fri, Aug 23, 2019, 11:46 AM Dominik Holler  wrote:
>
>>
>>
>> On Fri, Aug 23, 2019 at 5:41 PM Curtis E. Combs Jr. 
>> wrote:
>>
>>> Also, if it helps, the hosts will sit there, quietly, for hours or
>>> days before anything happens. They're up and working just fine. But
>>> then, when I manually migrate a VM from one host to another, they
>>> become completely inaccessible.
>>>
>>>
>> Can you share some details about your storage?
>> Maybe there is a feature used during live migration, which triggers the
>> issue.
>>
>>
>>
>>> These are vanilla-as-possible CentOS7 nodes. Very basic ovirt install
>>> and configuration.
>>>
>>> On Fri, Aug 23, 2019 at 11:33 AM Curtis E. Combs Jr.
>>>  wrote:
>>> >
>>> > Hey Dominik,
>>> >
>>> > Thanks for helping. I really want to try to use ovirt.
>>> >
>>> > When these events happen, I cannot even SSH to the nodes due to the
>>> > link being down. After a little while, the hosts come back...
>>> >
>>> > On Fri, Aug 23, 2019 at 11:30 AM Dominik Holler 
>>> wrote:
>>> > >
>>> > > Is your storage connected via NFS?
>>> > > Can you manually access the storage on the host?
>>> > >
>>> > >
>>> > > On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr. <
>>> ej.alb...@gmail.com> wrote:
>>> > >>
>>> > >> Sorry to dead bump this, but I'm beginning to suspect that maybe
>>> it's
>>> > >> not STP that's the problem.
>>> > >>
>>> > >> 2 of my hosts just went down when a few VMs tried to migrate.
>>> > >>
>>> > >> Do any of you have any idea what might be going on here? I don't
>>> even
>>> > >> know where to start. I'm going to include the dmesg in case it
>>> helps.
>>> > >> This happens on both of the hosts whenever any migration attempts
>>> to start.
>>> > >>
>>> > >>
>>> > >>
>>> > >>
>>> > >>
>>> > >>
>>> > >>
>>> > >>
>>> > >>
>>> > >> [68099.245833] bnx2 :01:00.0 em1: NIC Copper Link is Down
>>> > >> [68099.246055] internal: port 1(em1) entered disabled state
>>> > >> [68184.177343] ixgbe :03:00.0 p1p1: NIC Link is Down
>>> > >> [68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state
>>> > >> [68184.177856] ovirtmgmt: topology change detected, propagating
>>> > >> [68277.078671] INFO: task qemu-kvm: blocked for more than 120
>>> seconds.
>>> > >> [68277.078700] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> > >> disables this message.
>>> > >> [68277.078723] qemu-kvmD 9db40c359040 0  
>>> 1 0x01a0
>>> > >> [68277.078727] Call Trace:
>>> > >> [68277.078738]  [] ? avc_has_perm_flags+0xdc/0x1c0
>>> > >> [68277.078743]  [] schedule+0x29/0x70
>>> > >> [68277.078746]  [] inode_dio_wait+0xd9/0x100
>>> > >> [68277.078751]  [] ? wake_bit_function+0x40/0x40
>>> > >> [68277.078765]  [] nfs_getattr+0x1b6/0x250 [nfs]
>>> > >> [68277.078768]  [] vfs_getattr+0x49/0x80
>>> > >> [68277.078769]  [] vfs_fstat+0x45/0x80
>>> > >> [68277.078771]  [] SYSC_newfstat+0x24/0x60
>>> > >> [68277.078774]  [] ?
>>> system_call_after_swapgs+0xae/0x146
>>> > >> [68277.078778]  [] ?
>>> __audit_syscall_entry+0xb4/0x110
>>> > >> [68277.078782]  [] ?
>>> syscall_trace_enter+0x16b/0x220
>>> > >> [68277.078784]  [] SyS_newfstat+0xe/0x10
>>> > >> [68277.078786]  [] tracesys+0xa3/0xc9
>>> > >> [68397.072384] INFO: task qemu-kvm: blocked for more than 120
>>> seconds.
>>> > >> [68397.072413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> > >> disables this message.
>>> > >> [68397.072436] qemu-kvmD 9db40c359040 0  
>>> 1 0x01a0
>>> > >> [68397.072439] Call Trace:
>>> > >> [68397.072453]  [] ? avc_has_perm_flags+0xdc/0x1c0
>>> > >> [68397.072458]  [] schedule+0x29/0x70
>>> > >> [68397.072462]  [] inode_dio_wait+0xd9/0x100
>>> > >> [68397.072467]  [] ? wake_bit_function+0x40/0x40
>>> > >> [68397.072480]  [] nfs_getattr+0x1b6/0x250 [nfs]
>>> > >> [68397.072485]  [] vfs_getattr+0x49/0x80
>>> > >> [68397.072486]  [] vfs_fstat+0x45/0x80
>>> > >> [68397.072488]  [] SYSC_newfstat+0x24/0x60
>>> > >> [68397.072491]  [] ?
>>> system_call_after_swapgs+0xae/0x146
>>> > >> [68397.072495]  [] ?
>>> __audit_syscall_entry+0xb4/0x110
>>> > >> [68397.072498]  [] ?
>>> syscall_trace_enter+0x16b/0x220
>>> > >> [68397.072500]  [] SyS_newfstat+0xe/0x10
>>> > >> [68397.072502]  [] tracesys+0xa3/0xc9
>>> > >> [68401.573141] bnx2 :01:00.0 em1: NIC Copper Link is Up, 1000
>>> Mbps
>>> > >> full duplex
>>> > >>
>>> > >> [68401.573247] 

[ovirt-users] Re: Update single node environment from 4.3.3 to 4.3.5 problem

2019-08-23 Thread Gianluca Cecchi
On Fri, Aug 23, 2019 at 5:06 PM Dominik Holler  wrote:

>
>
>
> Gianluca, can you please share the output of 'rpm -qa' of the affected
> host?
>

here is the output of "rpm -qa | sort":
https://drive.google.com/file/d/1JG8XfomPSgqp4Y40KOwTGsixnkqkMfml/view?usp=sharing

and here are the contents of yum.log-20190823, which contains logs since March
this year, in case it helps:
https://drive.google.com/file/d/1zKXbY2ySLPM4TSyzzZ1_AvUrA8sCMYOm/view?usp=sharing
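
If it helps to narrow things down, the recent install order can also be
pulled straight from the rpm database:

  rpm -qa --last | head -n 60   # most recently installed/updated packages first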

Thanks,
Gianluca


[ovirt-users] Re: Need to enable STP on ovirt bridges

2019-08-23 Thread Curtis E. Combs Jr.
Sure! Right now, I only have a 500gb partition on each node shared over
NFS, added as storage domains. This is on each node - so, currently 3.

How can the storage cause a node to drop out?
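
For reference, a quick way to check the NFS side manually from one of the
hosts (the mount path is just what I'd expect vdsm to use, so treat it as a
guess):

  showmount -e swm-01.hpc.moffitt.org   # is the export still visible from this host?
  ls /rhev/data-center/mnt/             # vdsm normally mounts the NFS domains here
  stat /rhev/data-center/mnt/*          # this hangs if the NFS server stops responding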

On Fri, Aug 23, 2019, 11:46 AM Dominik Holler  wrote:

>
>
> On Fri, Aug 23, 2019 at 5:41 PM Curtis E. Combs Jr. 
> wrote:
>
>> Also, if it helps, the hosts will sit there, quietly, for hours or
>> days before anything happens. They're up and working just fine. But
>> then, when I manually migrate a VM from one host to another, they
>> become completely inaccessible.
>>
>>
> Can you share some details about your storage?
> Maybe there is a feature used during live migration, which triggers the
> issue.
>
>
>
>> These are vanilla-as-possible CentOS7 nodes. Very basic ovirt install
>> and configuration.
>>
>> On Fri, Aug 23, 2019 at 11:33 AM Curtis E. Combs Jr.
>>  wrote:
>> >
>> > Hey Dominik,
>> >
>> > Thanks for helping. I really want to try to use ovirt.
>> >
>> > When these events happen, I cannot even SSH to the nodes due to the
>> > link being down. After a little while, the hosts come back...
>> >
>> > On Fri, Aug 23, 2019 at 11:30 AM Dominik Holler 
>> wrote:
>> > >
>> > > Is your storage connected via NFS?
>> > > Can you manually access the storage on the host?
>> > >
>> > >
>> > > On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr. <
>> ej.alb...@gmail.com> wrote:
>> > >>
>> > >> Sorry to dead bump this, but I'm beginning to suspect that maybe it's
>> > >> not STP that's the problem.
>> > >>
>> > >> 2 of my hosts just went down when a few VMs tried to migrate.
>> > >>
>> > >> Do any of you have any idea what might be going on here? I don't even
>> > >> know where to start. I'm going to include the dmesg in case it helps.
>> > >> This happens on both of the hosts whenever any migration attempts to
>> start.
>> > >>
>> > >>
>> > >>
>> > >>
>> > >>
>> > >>
>> > >>
>> > >>
>> > >>
>> > >> [68099.245833] bnx2 :01:00.0 em1: NIC Copper Link is Down
>> > >> [68099.246055] internal: port 1(em1) entered disabled state
>> > >> [68184.177343] ixgbe :03:00.0 p1p1: NIC Link is Down
>> > >> [68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state
>> > >> [68184.177856] ovirtmgmt: topology change detected, propagating
>> > >> [68277.078671] INFO: task qemu-kvm: blocked for more than 120
>> seconds.
>> > >> [68277.078700] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> > >> disables this message.
>> > >> [68277.078723] qemu-kvmD 9db40c359040 0    1
>> 0x01a0
>> > >> [68277.078727] Call Trace:
>> > >> [68277.078738]  [] ? avc_has_perm_flags+0xdc/0x1c0
>> > >> [68277.078743]  [] schedule+0x29/0x70
>> > >> [68277.078746]  [] inode_dio_wait+0xd9/0x100
>> > >> [68277.078751]  [] ? wake_bit_function+0x40/0x40
>> > >> [68277.078765]  [] nfs_getattr+0x1b6/0x250 [nfs]
>> > >> [68277.078768]  [] vfs_getattr+0x49/0x80
>> > >> [68277.078769]  [] vfs_fstat+0x45/0x80
>> > >> [68277.078771]  [] SYSC_newfstat+0x24/0x60
>> > >> [68277.078774]  [] ?
>> system_call_after_swapgs+0xae/0x146
>> > >> [68277.078778]  [] ?
>> __audit_syscall_entry+0xb4/0x110
>> > >> [68277.078782]  [] ?
>> syscall_trace_enter+0x16b/0x220
>> > >> [68277.078784]  [] SyS_newfstat+0xe/0x10
>> > >> [68277.078786]  [] tracesys+0xa3/0xc9
>> > >> [68397.072384] INFO: task qemu-kvm: blocked for more than 120
>> seconds.
>> > >> [68397.072413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> > >> disables this message.
>> > >> [68397.072436] qemu-kvmD 9db40c359040 0    1
>> 0x01a0
>> > >> [68397.072439] Call Trace:
>> > >> [68397.072453]  [] ? avc_has_perm_flags+0xdc/0x1c0
>> > >> [68397.072458]  [] schedule+0x29/0x70
>> > >> [68397.072462]  [] inode_dio_wait+0xd9/0x100
>> > >> [68397.072467]  [] ? wake_bit_function+0x40/0x40
>> > >> [68397.072480]  [] nfs_getattr+0x1b6/0x250 [nfs]
>> > >> [68397.072485]  [] vfs_getattr+0x49/0x80
>> > >> [68397.072486]  [] vfs_fstat+0x45/0x80
>> > >> [68397.072488]  [] SYSC_newfstat+0x24/0x60
>> > >> [68397.072491]  [] ?
>> system_call_after_swapgs+0xae/0x146
>> > >> [68397.072495]  [] ?
>> __audit_syscall_entry+0xb4/0x110
>> > >> [68397.072498]  [] ?
>> syscall_trace_enter+0x16b/0x220
>> > >> [68397.072500]  [] SyS_newfstat+0xe/0x10
>> > >> [68397.072502]  [] tracesys+0xa3/0xc9
>> > >> [68401.573141] bnx2 :01:00.0 em1: NIC Copper Link is Up, 1000
>> Mbps
>> > >> full duplex
>> > >>
>> > >> [68401.573247] internal: port 1(em1) entered blocking state
>> > >> [68401.573255] internal: port 1(em1) entered listening state
>> > >> [68403.576985] internal: port 1(em1) entered learning state
>> > >> [68405.580907] internal: port 1(em1) entered forwarding state
>> > >> [68405.580916] internal: topology change detected, propagating
>> > >> [68469.565589] nfs: server swm-01.hpc.moffitt.org not responding,
>> timed out
>> > >> [68469.565840] nfs: server swm-01.hpc.moffitt.org not responding,
>> timed out
>> > >> [68487.193932] ixgbe :03:00.0 p1p1: NIC Link is Up 

[ovirt-users] Re: Need to enable STP on ovirt bridges

2019-08-23 Thread Dominik Holler
On Fri, Aug 23, 2019 at 5:41 PM Curtis E. Combs Jr. 
wrote:

> Also, if it helps, the hosts will sit there, quietly, for hours or
> days before anything happens. They're up and working just fine. But
> then, when I manually migrate a VM from one host to another, they
> become completely inaccessible.
>
>
Can you share some details about your storage?
Maybe there is a feature used during live migration, which triggers the
issue.



> These are vanilla-as-possible CentOS7 nodes. Very basic ovirt install
> and configuration.
>
> On Fri, Aug 23, 2019 at 11:33 AM Curtis E. Combs Jr.
>  wrote:
> >
> > Hey Dominik,
> >
> > Thanks for helping. I really want to try to use ovirt.
> >
> > When these events happen, I cannot even SSH to the nodes due to the
> > link being down. After a little while, the hosts come back...
> >
> > On Fri, Aug 23, 2019 at 11:30 AM Dominik Holler 
> wrote:
> > >
> > > Is your storage connected via NFS?
> > > Can you manually access the storage on the host?
> > >
> > >
> > > On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr. <
> ej.alb...@gmail.com> wrote:
> > >>
> > >> Sorry to dead bump this, but I'm beginning to suspect that maybe it's
> > >> not STP that's the problem.
> > >>
> > >> 2 of my hosts just went down when a few VMs tried to migrate.
> > >>
> > >> Do any of you have any idea what might be going on here? I don't even
> > >> know where to start. I'm going to include the dmesg in case it helps.
> > >> This happens on both of the hosts whenever any migration attempts to
> start.
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> [68099.245833] bnx2 :01:00.0 em1: NIC Copper Link is Down
> > >> [68099.246055] internal: port 1(em1) entered disabled state
> > >> [68184.177343] ixgbe :03:00.0 p1p1: NIC Link is Down
> > >> [68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state
> > >> [68184.177856] ovirtmgmt: topology change detected, propagating
> > >> [68277.078671] INFO: task qemu-kvm: blocked for more than 120
> seconds.
> > >> [68277.078700] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >> disables this message.
> > >> [68277.078723] qemu-kvmD 9db40c359040 0    1
> 0x01a0
> > >> [68277.078727] Call Trace:
> > >> [68277.078738]  [] ? avc_has_perm_flags+0xdc/0x1c0
> > >> [68277.078743]  [] schedule+0x29/0x70
> > >> [68277.078746]  [] inode_dio_wait+0xd9/0x100
> > >> [68277.078751]  [] ? wake_bit_function+0x40/0x40
> > >> [68277.078765]  [] nfs_getattr+0x1b6/0x250 [nfs]
> > >> [68277.078768]  [] vfs_getattr+0x49/0x80
> > >> [68277.078769]  [] vfs_fstat+0x45/0x80
> > >> [68277.078771]  [] SYSC_newfstat+0x24/0x60
> > >> [68277.078774]  [] ?
> system_call_after_swapgs+0xae/0x146
> > >> [68277.078778]  [] ?
> __audit_syscall_entry+0xb4/0x110
> > >> [68277.078782]  [] ? syscall_trace_enter+0x16b/0x220
> > >> [68277.078784]  [] SyS_newfstat+0xe/0x10
> > >> [68277.078786]  [] tracesys+0xa3/0xc9
> > >> [68397.072384] INFO: task qemu-kvm: blocked for more than 120
> seconds.
> > >> [68397.072413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >> disables this message.
> > >> [68397.072436] qemu-kvmD 9db40c359040 0    1
> 0x01a0
> > >> [68397.072439] Call Trace:
> > >> [68397.072453]  [] ? avc_has_perm_flags+0xdc/0x1c0
> > >> [68397.072458]  [] schedule+0x29/0x70
> > >> [68397.072462]  [] inode_dio_wait+0xd9/0x100
> > >> [68397.072467]  [] ? wake_bit_function+0x40/0x40
> > >> [68397.072480]  [] nfs_getattr+0x1b6/0x250 [nfs]
> > >> [68397.072485]  [] vfs_getattr+0x49/0x80
> > >> [68397.072486]  [] vfs_fstat+0x45/0x80
> > >> [68397.072488]  [] SYSC_newfstat+0x24/0x60
> > >> [68397.072491]  [] ?
> system_call_after_swapgs+0xae/0x146
> > >> [68397.072495]  [] ?
> __audit_syscall_entry+0xb4/0x110
> > >> [68397.072498]  [] ? syscall_trace_enter+0x16b/0x220
> > >> [68397.072500]  [] SyS_newfstat+0xe/0x10
> > >> [68397.072502]  [] tracesys+0xa3/0xc9
> > >> [68401.573141] bnx2 :01:00.0 em1: NIC Copper Link is Up, 1000 Mbps
> > >> full duplex
> > >>
> > >> [68401.573247] internal: port 1(em1) entered blocking state
> > >> [68401.573255] internal: port 1(em1) entered listening state
> > >> [68403.576985] internal: port 1(em1) entered learning state
> > >> [68405.580907] internal: port 1(em1) entered forwarding state
> > >> [68405.580916] internal: topology change detected, propagating
> > >> [68469.565589] nfs: server swm-01.hpc.moffitt.org not responding,
> timed out
> > >> [68469.565840] nfs: server swm-01.hpc.moffitt.org not responding,
> timed out
> > >> [68487.193932] ixgbe :03:00.0 p1p1: NIC Link is Up 10 Gbps, Flow
> > >> Control: RX/TX
> > >> [68487.194105] ovirtmgmt: port 1(p1p1) entered blocking state
> > >> [68487.194114] ovirtmgmt: port 1(p1p1) entered listening state
> > >> [68489.196508] ovirtmgmt: port 1(p1p1) entered learning state
> > >> [68491.200400] ovirtmgmt: port 1(p1p1) entered forwarding state
> > >> [68491.200405] ovirtmgmt: topology change detected, sending 

[ovirt-users] Re: Need to enable STP on ovirt bridges

2019-08-23 Thread Curtis E. Combs Jr.
Also, if it helps, the hosts will sit there, quietly, for hours or
days before anything happens. They're up and working just fine. But
then, when I manually migrate a VM from one host to another, they
become completely inaccessible.

These are vanilla-as-possible CentOS7 nodes. Very basic ovirt install
and configuration.
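
A quick way to watch what the bridges and links are doing while reproducing
this (the bridge names ovirtmgmt and internal are taken from the dmesg above;
adjust to your setup):

# 1 = STP enabled on the bridge, 0 = disabled
grep "" /sys/class/net/{ovirtmgmt,internal}/bridge/stp_state
# per-port state of every bridge member
bridge link show
# follow link flaps and bridge state changes live while a test migration runs
journalctl -k -f | grep -E 'Link is (Up|Down)|entered .* state'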

On Fri, Aug 23, 2019 at 11:33 AM Curtis E. Combs Jr.
 wrote:
>
> Hey Dominik,
>
> Thanks for helping. I really want to try to use ovirt.
>
> When these events happen, I cannot even SSH to the nodes due to the
> link being down. After a little while, the hosts come back...
>
> On Fri, Aug 23, 2019 at 11:30 AM Dominik Holler  wrote:
> >
> > Is your storage connected via NFS?
> > Can you manually access the storage on the host?
> >
> >
> > On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr.  
> > wrote:
> >>
> >> Sorry to dead bump this, but I'm beginning to suspect that maybe it's
> >> not STP that's the problem.
> >>
> >> 2 of my hosts just went down when a few VMs tried to migrate.
> >>
> >> Do any of you have any idea what might be going on here? I don't even
> >> know where to start. I'm going to include the dmesg in case it helps.
> >> This happens on both of the hosts whenever any migration attempts to start.
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> [68099.245833] bnx2 :01:00.0 em1: NIC Copper Link is Down
> >> [68099.246055] internal: port 1(em1) entered disabled state
> >> [68184.177343] ixgbe :03:00.0 p1p1: NIC Link is Down
> >> [68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state
> >> [68184.177856] ovirtmgmt: topology change detected, propagating
> >> [68277.078671] INFO: task qemu-kvm: blocked for more than 120 seconds.
> >> [68277.078700] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >> disables this message.
> >> [68277.078723] qemu-kvmD 9db40c359040 0    1 
> >> 0x01a0
> >> [68277.078727] Call Trace:
> >> [68277.078738]  [] ? avc_has_perm_flags+0xdc/0x1c0
> >> [68277.078743]  [] schedule+0x29/0x70
> >> [68277.078746]  [] inode_dio_wait+0xd9/0x100
> >> [68277.078751]  [] ? wake_bit_function+0x40/0x40
> >> [68277.078765]  [] nfs_getattr+0x1b6/0x250 [nfs]
> >> [68277.078768]  [] vfs_getattr+0x49/0x80
> >> [68277.078769]  [] vfs_fstat+0x45/0x80
> >> [68277.078771]  [] SYSC_newfstat+0x24/0x60
> >> [68277.078774]  [] ? system_call_after_swapgs+0xae/0x146
> >> [68277.078778]  [] ? __audit_syscall_entry+0xb4/0x110
> >> [68277.078782]  [] ? syscall_trace_enter+0x16b/0x220
> >> [68277.078784]  [] SyS_newfstat+0xe/0x10
> >> [68277.078786]  [] tracesys+0xa3/0xc9
> >> [68397.072384] INFO: task qemu-kvm: blocked for more than 120 seconds.
> >> [68397.072413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >> disables this message.
> >> [68397.072436] qemu-kvmD 9db40c359040 0    1 
> >> 0x01a0
> >> [68397.072439] Call Trace:
> >> [68397.072453]  [] ? avc_has_perm_flags+0xdc/0x1c0
> >> [68397.072458]  [] schedule+0x29/0x70
> >> [68397.072462]  [] inode_dio_wait+0xd9/0x100
> >> [68397.072467]  [] ? wake_bit_function+0x40/0x40
> >> [68397.072480]  [] nfs_getattr+0x1b6/0x250 [nfs]
> >> [68397.072485]  [] vfs_getattr+0x49/0x80
> >> [68397.072486]  [] vfs_fstat+0x45/0x80
> >> [68397.072488]  [] SYSC_newfstat+0x24/0x60
> >> [68397.072491]  [] ? system_call_after_swapgs+0xae/0x146
> >> [68397.072495]  [] ? __audit_syscall_entry+0xb4/0x110
> >> [68397.072498]  [] ? syscall_trace_enter+0x16b/0x220
> >> [68397.072500]  [] SyS_newfstat+0xe/0x10
> >> [68397.072502]  [] tracesys+0xa3/0xc9
> >> [68401.573141] bnx2 :01:00.0 em1: NIC Copper Link is Up, 1000 Mbps
> >> full duplex
> >>
> >> [68401.573247] internal: port 1(em1) entered blocking state
> >> [68401.573255] internal: port 1(em1) entered listening state
> >> [68403.576985] internal: port 1(em1) entered learning state
> >> [68405.580907] internal: port 1(em1) entered forwarding state
> >> [68405.580916] internal: topology change detected, propagating
> >> [68469.565589] nfs: server swm-01.hpc.moffitt.org not responding, timed out
> >> [68469.565840] nfs: server swm-01.hpc.moffitt.org not responding, timed out
> >> [68487.193932] ixgbe :03:00.0 p1p1: NIC Link is Up 10 Gbps, Flow
> >> Control: RX/TX
> >> [68487.194105] ovirtmgmt: port 1(p1p1) entered blocking state
> >> [68487.194114] ovirtmgmt: port 1(p1p1) entered listening state
> >> [68489.196508] ovirtmgmt: port 1(p1p1) entered learning state
> >> [68491.200400] ovirtmgmt: port 1(p1p1) entered forwarding state
> >> [68491.200405] ovirtmgmt: topology change detected, sending tcn bpdu
> >> [68493.672423] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> >> [68494.777996] NFSD: client 10.15.28.22 testing state ID with
> >> incorrect client ID
> >> [68494.778580] NFSD: client 10.15.28.22 testing state ID with
> >> incorrect client ID
> >>
> >>
> >> On Thu, Aug 22, 2019 at 2:53 PM Curtis E. Combs Jr.  
> >> wrote:
> >> >
> >> > Thanks, I'm just going to revert back to bridges.
> >> >
> >> > 

[ovirt-users] Re: Need to enable STP on ovirt bridges

2019-08-23 Thread Curtis E. Combs Jr.
Hey Dominik,

Thanks for helping. I really want to try to use ovirt.

When these events happen, I cannot even SSH to the nodes due to the
link being down. After a little while, the hosts come back...

On Fri, Aug 23, 2019 at 11:30 AM Dominik Holler  wrote:
>
> Is your storage connected via NFS?
> Can you manually access the storage on the host?
>
>
> On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr.  
> wrote:
>>
>> Sorry to dead bump this, but I'm beginning to suspect that maybe it's
>> not STP that's the problem.
>>
>> 2 of my hosts just went down when a few VMs tried to migrate.
>>
>> Do any of you have any idea what might be going on here? I don't even
>> know where to start. I'm going to include the dmesg in case it helps.
>> This happens on both of the hosts whenever any migration attempts to start.
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> [68099.245833] bnx2 :01:00.0 em1: NIC Copper Link is Down
>> [68099.246055] internal: port 1(em1) entered disabled state
>> [68184.177343] ixgbe :03:00.0 p1p1: NIC Link is Down
>> [68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state
>> [68184.177856] ovirtmgmt: topology change detected, propagating
>> [68277.078671] INFO: task qemu-kvm: blocked for more than 120 seconds.
>> [68277.078700] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [68277.078723] qemu-kvmD 9db40c359040 0    1 
>> 0x01a0
>> [68277.078727] Call Trace:
>> [68277.078738]  [] ? avc_has_perm_flags+0xdc/0x1c0
>> [68277.078743]  [] schedule+0x29/0x70
>> [68277.078746]  [] inode_dio_wait+0xd9/0x100
>> [68277.078751]  [] ? wake_bit_function+0x40/0x40
>> [68277.078765]  [] nfs_getattr+0x1b6/0x250 [nfs]
>> [68277.078768]  [] vfs_getattr+0x49/0x80
>> [68277.078769]  [] vfs_fstat+0x45/0x80
>> [68277.078771]  [] SYSC_newfstat+0x24/0x60
>> [68277.078774]  [] ? system_call_after_swapgs+0xae/0x146
>> [68277.078778]  [] ? __audit_syscall_entry+0xb4/0x110
>> [68277.078782]  [] ? syscall_trace_enter+0x16b/0x220
>> [68277.078784]  [] SyS_newfstat+0xe/0x10
>> [68277.078786]  [] tracesys+0xa3/0xc9
>> [68397.072384] INFO: task qemu-kvm: blocked for more than 120 seconds.
>> [68397.072413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [68397.072436] qemu-kvmD 9db40c359040 0    1 
>> 0x01a0
>> [68397.072439] Call Trace:
>> [68397.072453]  [] ? avc_has_perm_flags+0xdc/0x1c0
>> [68397.072458]  [] schedule+0x29/0x70
>> [68397.072462]  [] inode_dio_wait+0xd9/0x100
>> [68397.072467]  [] ? wake_bit_function+0x40/0x40
>> [68397.072480]  [] nfs_getattr+0x1b6/0x250 [nfs]
>> [68397.072485]  [] vfs_getattr+0x49/0x80
>> [68397.072486]  [] vfs_fstat+0x45/0x80
>> [68397.072488]  [] SYSC_newfstat+0x24/0x60
>> [68397.072491]  [] ? system_call_after_swapgs+0xae/0x146
>> [68397.072495]  [] ? __audit_syscall_entry+0xb4/0x110
>> [68397.072498]  [] ? syscall_trace_enter+0x16b/0x220
>> [68397.072500]  [] SyS_newfstat+0xe/0x10
>> [68397.072502]  [] tracesys+0xa3/0xc9
>> [68401.573141] bnx2 :01:00.0 em1: NIC Copper Link is Up, 1000 Mbps
>> full duplex
>>
>> [68401.573247] internal: port 1(em1) entered blocking state
>> [68401.573255] internal: port 1(em1) entered listening state
>> [68403.576985] internal: port 1(em1) entered learning state
>> [68405.580907] internal: port 1(em1) entered forwarding state
>> [68405.580916] internal: topology change detected, propagating
>> [68469.565589] nfs: server swm-01.hpc.moffitt.org not responding, timed out
>> [68469.565840] nfs: server swm-01.hpc.moffitt.org not responding, timed out
>> [68487.193932] ixgbe :03:00.0 p1p1: NIC Link is Up 10 Gbps, Flow
>> Control: RX/TX
>> [68487.194105] ovirtmgmt: port 1(p1p1) entered blocking state
>> [68487.194114] ovirtmgmt: port 1(p1p1) entered listening state
>> [68489.196508] ovirtmgmt: port 1(p1p1) entered learning state
>> [68491.200400] ovirtmgmt: port 1(p1p1) entered forwarding state
>> [68491.200405] ovirtmgmt: topology change detected, sending tcn bpdu
>> [68493.672423] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
>> [68494.777996] NFSD: client 10.15.28.22 testing state ID with
>> incorrect client ID
>> [68494.778580] NFSD: client 10.15.28.22 testing state ID with
>> incorrect client ID
>>
>>
>> On Thu, Aug 22, 2019 at 2:53 PM Curtis E. Combs Jr.  
>> wrote:
>> >
>> > Thanks, I'm just going to revert back to bridges.
>> >
>> > On Thu, Aug 22, 2019 at 11:50 AM Dominik Holler  wrote:
>> > >
>> > >
>> > >
>> > > On Thu, Aug 22, 2019 at 3:06 PM Curtis E. Combs Jr. 
>> > >  wrote:
>> > >>
>> > >> Seems like the STP options are so common and necessary that it would
>> > >> be a priority over seldom-used bridge_opts. I know what STP is and I'm
>> > >> not even a networking guy - never even heard of half of the
>> > >> bridge_opts that have switches in the UI.
>> > >>
>> > >> Anyway. I wanted to try the openvswitches, so I reinstalled all of my
>> > >> nodes and used "openvswitch (Technology Preview)" as the 

[ovirt-users] Re: Need to enable STP on ovirt bridges

2019-08-23 Thread Dominik Holler
Is your storage connected via NFS?
Can you manually access the storage on the host?
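
A minimal manual check from the host could look like this (the mount point
under /rhev/data-center/mnt is only an illustration; use whatever path the
host actually shows):

mount | grep /rhev/data-center/mnt
# then time a simple access on the storage domain mount point, e.g.
time ls /rhev/data-center/mnt/<server>:<export>
# a hang or timeout here points at the NFS side rather than at the bridges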


On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr. 
wrote:

> Sorry to dead bump this, but I'm beginning to suspect that maybe it's
> not STP that's the problem.
>
> 2 of my hosts just went down when a few VMs tried to migrate.
>
> Do any of you have any idea what might be going on here? I don't even
> know where to start. I'm going to include the dmesg in case it helps.
> This happens on both of the hosts whenever any migration attempts to start.
>
>
>
>
>
>
>
>
>
> [68099.245833] bnx2 :01:00.0 em1: NIC Copper Link is Down
> [68099.246055] internal: port 1(em1) entered disabled state
> [68184.177343] ixgbe :03:00.0 p1p1: NIC Link is Down
> [68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state
> [68184.177856] ovirtmgmt: topology change detected, propagating
> [68277.078671] INFO: task qemu-kvm: blocked for more than 120 seconds.
> [68277.078700] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [68277.078723] qemu-kvmD 9db40c359040 0    1
> 0x01a0
> [68277.078727] Call Trace:
> [68277.078738]  [] ? avc_has_perm_flags+0xdc/0x1c0
> [68277.078743]  [] schedule+0x29/0x70
> [68277.078746]  [] inode_dio_wait+0xd9/0x100
> [68277.078751]  [] ? wake_bit_function+0x40/0x40
> [68277.078765]  [] nfs_getattr+0x1b6/0x250 [nfs]
> [68277.078768]  [] vfs_getattr+0x49/0x80
> [68277.078769]  [] vfs_fstat+0x45/0x80
> [68277.078771]  [] SYSC_newfstat+0x24/0x60
> [68277.078774]  [] ? system_call_after_swapgs+0xae/0x146
> [68277.078778]  [] ? __audit_syscall_entry+0xb4/0x110
> [68277.078782]  [] ? syscall_trace_enter+0x16b/0x220
> [68277.078784]  [] SyS_newfstat+0xe/0x10
> [68277.078786]  [] tracesys+0xa3/0xc9
> [68397.072384] INFO: task qemu-kvm: blocked for more than 120 seconds.
> [68397.072413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [68397.072436] qemu-kvmD 9db40c359040 0    1
> 0x01a0
> [68397.072439] Call Trace:
> [68397.072453]  [] ? avc_has_perm_flags+0xdc/0x1c0
> [68397.072458]  [] schedule+0x29/0x70
> [68397.072462]  [] inode_dio_wait+0xd9/0x100
> [68397.072467]  [] ? wake_bit_function+0x40/0x40
> [68397.072480]  [] nfs_getattr+0x1b6/0x250 [nfs]
> [68397.072485]  [] vfs_getattr+0x49/0x80
> [68397.072486]  [] vfs_fstat+0x45/0x80
> [68397.072488]  [] SYSC_newfstat+0x24/0x60
> [68397.072491]  [] ? system_call_after_swapgs+0xae/0x146
> [68397.072495]  [] ? __audit_syscall_entry+0xb4/0x110
> [68397.072498]  [] ? syscall_trace_enter+0x16b/0x220
> [68397.072500]  [] SyS_newfstat+0xe/0x10
> [68397.072502]  [] tracesys+0xa3/0xc9
> [68401.573141] bnx2 :01:00.0 em1: NIC Copper Link is Up, 1000 Mbps
> full duplex
>
> [68401.573247] internal: port 1(em1) entered blocking state
> [68401.573255] internal: port 1(em1) entered listening state
> [68403.576985] internal: port 1(em1) entered learning state
> [68405.580907] internal: port 1(em1) entered forwarding state
> [68405.580916] internal: topology change detected, propagating
> [68469.565589] nfs: server swm-01.hpc.moffitt.org not responding, timed
> out
> [68469.565840] nfs: server swm-01.hpc.moffitt.org not responding, timed
> out
> [68487.193932] ixgbe :03:00.0 p1p1: NIC Link is Up 10 Gbps, Flow
> Control: RX/TX
> [68487.194105] ovirtmgmt: port 1(p1p1) entered blocking state
> [68487.194114] ovirtmgmt: port 1(p1p1) entered listening state
> [68489.196508] ovirtmgmt: port 1(p1p1) entered learning state
> [68491.200400] ovirtmgmt: port 1(p1p1) entered forwarding state
> [68491.200405] ovirtmgmt: topology change detected, sending tcn bpdu
> [68493.672423] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> [68494.777996] NFSD: client 10.15.28.22 testing state ID with
> incorrect client ID
> [68494.778580] NFSD: client 10.15.28.22 testing state ID with
> incorrect client ID
>
>
> On Thu, Aug 22, 2019 at 2:53 PM Curtis E. Combs Jr. 
> wrote:
> >
> > Thanks, I'm just going to revert back to bridges.
> >
> > On Thu, Aug 22, 2019 at 11:50 AM Dominik Holler 
> wrote:
> > >
> > >
> > >
> > > On Thu, Aug 22, 2019 at 3:06 PM Curtis E. Combs Jr. <
> ej.alb...@gmail.com> wrote:
> > >>
> > >> Seems like the STP options are so common and necessary that it would
> > >> be a priority over seldom-used bridge_opts. I know what STP is and I'm
> > >> not even a networking guy - never even heard of half of the
> > >> bridge_opts that have switches in the UI.
> > >>
> > >> Anyway. I wanted to try the openvswitches, so I reinstalled all of my
> > >> nodes and used "openvswitch (Technology Preview)" as the engine-setup
> > >> option for the first host. I made a new Cluster for my nodes, added
> > >> them all to the new cluster, created a new "logical network" for the
> > >> internal network and attached it to the internal network ports.
> > >>
> > >> Now, when I go to create a new VM, I don't even have either the
> > >> ovirtmgmt switch OR the internal switch as an 

[ovirt-users] Re: oVirt 4.3.5 WARN no gluster network found in cluster

2019-08-23 Thread Alex K
On Thu, Aug 22, 2019, 04:47  wrote:

> Hi,
> I have a 4.3.5 hyperconverged setup with 3 hosts, each host has 2x10G NIC
> ports
>
> Host1:
> NIC1: 192.168.1.11
> NIC2: 192.168.0.67 (Gluster)
>
> Host2:
> NIC1: 10.10.1.12
> NIC2: 192.168.0.68 (Gluster)
>
> Host3:
> NIC1: 10.10.1.13
> NIC2: 192.168.0.69 (Gluster)
>
> I am able to ping all the gluster IPs from within the hosts. i.e  from
> host1 i can ping 192.168.0.68 and 192.168.0.69
> However, from the HostedEngine VM I can't ping any of those IPs
> [root@ovirt-engine ~]# ping 192.168.0.9
> PING 192.168.0.60 (192.168.0.60) 56(84) bytes of data.
> From 10.10.255.5 icmp_seq=1 Time to live exceeded
>
>
> and on the HostedEngine I see the following WARNINGS (only for host1)
> which make me think that I am not using a separate network exclusively for
> gluster.
>
> 2019-08-21 21:04:34,215-04 WARN
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn]
> (DefaultQuartzScheduler10) [397b6038] Could not associate brick
> 'host1.example.com:/gluster_bricks/engine/engine' of volume
> 'ac1f73ce-cdf0-4bb9-817d-xxcxxx' with correct network as no gluster
> network found in cluster '11e9-b8d3-00163e5d860d'
>
>
> Any ideas?
>
Have you designated the specific network as gluster? This is done under
Cluster -> network -> network setup
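
A quick way to confirm which addresses the bricks are actually registered with
(the volume name "engine" is just taken from the warning above):

gluster volume info engine | grep -i brick
gluster peer status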

>
> thank you!!
> 2019-08-21 21:04:34,220-04 WARN
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn]
> (DefaultQuartzScheduler10) [397b6038] Could not associate brick
> 'vmm01.virt.ord1d:/gluster_bricks/data/data' of volume
> 'bc26633a-9a0b-49de-b714-97e76f222a02' with correct network as no gluster
> network found in cluster 'e98e2c16-c31e-11e9-b8d3-00163e5d860d'
>
> 2019-08-21 21:04:34,224-04 WARN
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn]
> (DefaultQuartzScheduler10) [397b6038] Could not associate brick
> 'host1:/gluster_bricks/vmstore/vmstore' of volume
> 'x-ca96-45cc-9e0f-649055e0e07b' with correct network as no gluster
> network found in cluster 'e98e2c16-c31e-11e9-b8d3xxx'
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/ABYMLUNKBVCPDBGHSDFNCKMH7LOLVA7O/
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/2LZTSTADZBUR45ICDVCIT7ZXHORU6ZDJ/


[ovirt-users] Re: Need to enable STP on ovirt bridges

2019-08-23 Thread Curtis E. Combs Jr.
Sorry to dead bump this, but I'm beginning to suspect that maybe it's
not STP that's the problem.

2 of my hosts just went down when a few VMs tried to migrate.

Do any of you have any idea what might be going on here? I don't even
know where to start. I'm going to include the dmesg in case it helps.
This happens on both of the hosts whenever any migration attempts to start.









[68099.245833] bnx2 :01:00.0 em1: NIC Copper Link is Down
[68099.246055] internal: port 1(em1) entered disabled state
[68184.177343] ixgbe :03:00.0 p1p1: NIC Link is Down
[68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state
[68184.177856] ovirtmgmt: topology change detected, propagating
[68277.078671] INFO: task qemu-kvm: blocked for more than 120 seconds.
[68277.078700] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[68277.078723] qemu-kvmD 9db40c359040 0    1 0x01a0
[68277.078727] Call Trace:
[68277.078738]  [] ? avc_has_perm_flags+0xdc/0x1c0
[68277.078743]  [] schedule+0x29/0x70
[68277.078746]  [] inode_dio_wait+0xd9/0x100
[68277.078751]  [] ? wake_bit_function+0x40/0x40
[68277.078765]  [] nfs_getattr+0x1b6/0x250 [nfs]
[68277.078768]  [] vfs_getattr+0x49/0x80
[68277.078769]  [] vfs_fstat+0x45/0x80
[68277.078771]  [] SYSC_newfstat+0x24/0x60
[68277.078774]  [] ? system_call_after_swapgs+0xae/0x146
[68277.078778]  [] ? __audit_syscall_entry+0xb4/0x110
[68277.078782]  [] ? syscall_trace_enter+0x16b/0x220
[68277.078784]  [] SyS_newfstat+0xe/0x10
[68277.078786]  [] tracesys+0xa3/0xc9
[68397.072384] INFO: task qemu-kvm: blocked for more than 120 seconds.
[68397.072413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[68397.072436] qemu-kvmD 9db40c359040 0    1 0x01a0
[68397.072439] Call Trace:
[68397.072453]  [] ? avc_has_perm_flags+0xdc/0x1c0
[68397.072458]  [] schedule+0x29/0x70
[68397.072462]  [] inode_dio_wait+0xd9/0x100
[68397.072467]  [] ? wake_bit_function+0x40/0x40
[68397.072480]  [] nfs_getattr+0x1b6/0x250 [nfs]
[68397.072485]  [] vfs_getattr+0x49/0x80
[68397.072486]  [] vfs_fstat+0x45/0x80
[68397.072488]  [] SYSC_newfstat+0x24/0x60
[68397.072491]  [] ? system_call_after_swapgs+0xae/0x146
[68397.072495]  [] ? __audit_syscall_entry+0xb4/0x110
[68397.072498]  [] ? syscall_trace_enter+0x16b/0x220
[68397.072500]  [] SyS_newfstat+0xe/0x10
[68397.072502]  [] tracesys+0xa3/0xc9
[68401.573141] bnx2 :01:00.0 em1: NIC Copper Link is Up, 1000 Mbps
full duplex

[68401.573247] internal: port 1(em1) entered blocking state
[68401.573255] internal: port 1(em1) entered listening state
[68403.576985] internal: port 1(em1) entered learning state
[68405.580907] internal: port 1(em1) entered forwarding state
[68405.580916] internal: topology change detected, propagating
[68469.565589] nfs: server swm-01.hpc.moffitt.org not responding, timed out
[68469.565840] nfs: server swm-01.hpc.moffitt.org not responding, timed out
[68487.193932] ixgbe :03:00.0 p1p1: NIC Link is Up 10 Gbps, Flow
Control: RX/TX
[68487.194105] ovirtmgmt: port 1(p1p1) entered blocking state
[68487.194114] ovirtmgmt: port 1(p1p1) entered listening state
[68489.196508] ovirtmgmt: port 1(p1p1) entered learning state
[68491.200400] ovirtmgmt: port 1(p1p1) entered forwarding state
[68491.200405] ovirtmgmt: topology change detected, sending tcn bpdu
[68493.672423] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
[68494.777996] NFSD: client 10.15.28.22 testing state ID with
incorrect client ID
[68494.778580] NFSD: client 10.15.28.22 testing state ID with
incorrect client ID


On Thu, Aug 22, 2019 at 2:53 PM Curtis E. Combs Jr.  wrote:
>
> Thanks, I'm just going to revert back to bridges.
>
> On Thu, Aug 22, 2019 at 11:50 AM Dominik Holler  wrote:
> >
> >
> >
> > On Thu, Aug 22, 2019 at 3:06 PM Curtis E. Combs Jr.  
> > wrote:
> >>
> >> Seems like the STP options are so common and necessary that it would
> >> be a priority over seldom-used bridge_opts. I know what STP is and I'm
> >> not even a networking guy - never even heard of half of the
> >> bridge_opts that have switches in the UI.
> >>
> >> Anyway. I wanted to try the openvswitches, so I reinstalled all of my
> >> nodes and used "openvswitch (Technology Preview)" as the engine-setup
> >> option for the first host. I made a new Cluster for my nodes, added
> >> them all to the new cluster, created a new "logical network" for the
> >> internal network and attached it to the internal network ports.
> >>
> >> Now, when I go to create a new VM, I don't even have either the
> >> ovirtmgmt switch OR the internal switch as an option. The drop-down is
> >> empty, as if I don't have any vnic-profiles.
> >>
> >
> > openvswitch clusters are limited to ovn networks.
> > You can create one like described in
> > https://www.ovirt.org/documentation/admin-guide/chap-External_Providers.html#connecting-an-ovn-network-to-a-physical-network
> >
> >
> >>
> >> On Thu, Aug 22, 2019 at 7:34 AM Tony Pearce  

[ovirt-users] Re: Update single node environment from 4.3.3 to 4.3.5 problem

2019-08-23 Thread Dominik Holler
On Fri, Aug 23, 2019 at 2:25 PM Sandro Bonazzola 
wrote:

> Relevant error in the logs seems to be:
>
> MainThread::DEBUG::2016-04-30
> 19:45:56,428::unified_persistence::46::root::(run)
> upgrade-unified-persistence upgrade persisting networks {} and bondings {}
> MainThread::INFO::2016-04-30
> 19:45:56,428::netconfpersistence::187::root::(_clearDisk) Clearing
> /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
> MainThread::DEBUG::2016-04-30
> 19:45:56,428::netconfpersistence::195::root::(_clearDisk) No existent
> config to clear.
> MainThread::INFO::2016-04-30
> 19:45:56,428::netconfpersistence::187::root::(_clearDisk) Clearing
> /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
> MainThread::DEBUG::2016-04-30
> 19:45:56,428::netconfpersistence::195::root::(_clearDisk) No existent
> config to clear.
> MainThread::INFO::2016-04-30
> 19:45:56,428::netconfpersistence::131::root::(save) Saved new config
> RunningConfig({}, {}) to /var/run/vdsm/netconf/nets/ and
> /var/run/vdsm/netconf/bonds/
> MainThread::DEBUG::2016-04-30 19:45:56,428::utils::671::root::(execCmd)
> /usr/bin/taskset --cpu-list 0-3 /usr/share/vdsm/vdsm-store-net-config
> unified (cwd None)
> MainThread::DEBUG::2016-04-30 19:45:56,440::utils::689::root::(execCmd)
> SUCCESS:  = 'cp: cannot stat
> \xe2\x80\x98/var/run/vdsm/netconf\xe2\x80\x99: No such file or
> directory\n';  = 0
> MainThread::DEBUG::2016-04-30
> 19:45:56,441::upgrade::51::upgrade::(_upgrade_seal) Upgrade
> upgrade-unified-persistence successfully performed
> MainThread::DEBUG::2017-12-31
> 16:44:52,918::libvirtconnection::163::root::(get) trying to connect libvirt
> MainThread::INFO::2017-12-31
> 16:44:53,033::netconfpersistence::194::root::(_clearDisk) Clearing
> /var/lib/vdsm/persistence/netconf/nets/ and
> /var/lib/vdsm/persistence/netconf/bonds/
> MainThread::WARNING::2017-12-31
> 16:44:53,034::fileutils::96::root::(rm_tree) Directory:
> /var/lib/vdsm/persistence/netconf/bonds/ already removed
> MainThread::INFO::2017-12-31
> 16:44:53,034::netconfpersistence::139::root::(save) Saved new config
> PersistentConfig({'ovirtmgmt': {'ipv6autoconf': False, 'nameservers':
> ['192.168.1.1', '8.8.8.8'], u'nic': u'eth0', 'dhcpv6': False, u'ipaddr':
> u'192.168.1.211', 'switch': 'legacy', 'mtu': 1500, u'netmask':
> u'255.255.255.0', u'bootproto': u'static', 'stp': False, 'bridged': True,
> u'gateway': u'192.168.1.1', u'defaultRoute': True}}, {}) to
> /var/lib/vdsm/persistence/netconf/nets/ and
> /var/lib/vdsm/persistence/netconf/bonds/
> MainThread::DEBUG::2017-12-31
> 16:44:53,035::cmdutils::150::root::(exec_cmd)
> /usr/share/openvswitch/scripts/ovs-ctl status (cwd None)
> MainThread::DEBUG::2017-12-31
> 16:44:53,069::cmdutils::158::root::(exec_cmd) FAILED:  = '';  = 1
> MainThread::DEBUG::2018-02-16
> 23:59:17,968::libvirtconnection::167::root::(get) trying to connect libvirt
> MainThread::INFO::2018-02-16
> 23:59:18,500::netconfpersistence::198::root::(_clearDisk) Clearing netconf:
> /var/lib/vdsm/staging/netconf
> MainThread::ERROR::2018-02-16 23:59:18,501::fileutils::53::root::(rm_file)
> Removing file: /var/lib/vdsm/staging/netconf failed
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/common/fileutils.py", line
> 48, in rm_file
> os.unlink(file_to_remove)
> OSError: [Errno 21] Is a directory: '/var/lib/vdsm/staging/netconf'
>
> +Dominik Holler  can you please have a look?
>
>
>
Gianluca, can you please share the output of 'rpm -qa' of the affected host?


>
> Il giorno ven 23 ago 2019 alle ore 00:56 Gianluca Cecchi <
> gianluca.cec...@gmail.com> ha scritto:
>
>> Hello,
>> after updating hosted engine from 4.3.3 to 4.3.5 and then the only host
>> composing the environment (plain CentOS 7.6) it seems it is not able to
>> start vdsm daemons
>>
>> kernel installed with update is kernel-3.10.0-957.27.2.el7.x86_64
>> Same problem also if using previous running kernel
>> 3.10.0-957.12.2.el7.x86_64
>>
>> [root@ovirt01 vdsm]# uptime
>>  00:50:08 up 25 min,  3 users,  load average: 0.60, 0.67, 0.60
>> [root@ovirt01 vdsm]#
>>
>> [root@ovirt01 vdsm]# systemctl status vdsmd -l
>> ● vdsmd.service - Virtual Desktop Server Manager
>>Loaded: loaded (/etc/systemd/system/vdsmd.service; enabled; vendor
>> preset: enabled)
>>Active: failed (Result: start-limit) since Fri 2019-08-23 00:37:27
>> CEST; 7s ago
>>   Process: 25810 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh
>> --pre-start (code=exited, status=1/FAILURE)
>>
>> Aug 23 00:37:27 ovirt01.mydomain systemd[1]: Failed to start Virtual
>> Desktop Server Manager.
>> Aug 23 00:37:27 ovirt01.mydomain systemd[1]: Unit vdsmd.service entered
>> failed state.
>> Aug 23 00:37:27 ovirt01.mydomain systemd[1]: vdsmd.service failed.
>> Aug 23 00:37:27 ovirt01.mydomain systemd[1]: vdsmd.service holdoff time
>> over, scheduling restart.
>> Aug 23 00:37:27 ovirt01.mydomain systemd[1]: Stopped Virtual Desktop
>> Server Manager.
>> Aug 23 00:37:27 ovirt01.mydomain systemd[1]: start 

[ovirt-users] Re: attach untagged vlan internally on vm

2019-08-23 Thread Tony Pearce
Maybe I misunderstand, but there is no need for any tag on the same layer 2 network.

On Fri., 23 Aug. 2019, 22:15 Ernest Clyde Chua, 
wrote:

> Good day.
> yes, the VMs and the firewall are on the same L2 network; the firewall is
> also hosted in oVirt alongside the VMs. Currently there is no external switch
> connected to the nic and i would like to know if it is possible to pass tags
> internally.
>
>
> On Fri, Aug 23, 2019 at 9:21 PM Tony Pearce  wrote:
>
>> Have the VM and the firewall on the same L2 network. Configure the VM
>> with a default gateway of the interface of the firewall.
>>
>> Is it what you're looking for?
>>
>> On Fri., 23 Aug. 2019, 21:15 Ernest Clyde Chua, <
>> ernestclydeac...@gmail.com> wrote:
>>
>>> Good day.
>>> sorry if i got you guys confused.
>>> for clarity:
>>>
>>> i have a server with two nic, currently one nic is connected to public
>>> network and the other one is disconnected.
>>>
>>> And i have a vm that will be the firewall of other vm inside this
>>> standalone/selfhosted ovirt.
>>>
>>> then i am figuring out how i can pass the vlan ids to the vm, or whether it
>>> is possible.
>>>
>>>
>>>
>>>
>>>
>>> On Fri, 23 Aug 2019, 7:46 PM Dominik Holler  wrote:
>>>


 On Thu, Aug 22, 2019 at 1:18 PM Miguel Duarte de Mora Barroso <
 mdbarr...@redhat.com> wrote:

> On Wed, Aug 21, 2019 at 9:18 AM  wrote:
> >
> > good day
> > currently i am testing oVirt on a single box and setup some tagged
> vms and non tagged vm.
> > the non tagged vm is a firewall but it has limitations on the number
> of nic so i cannot attach tagged vnic and wish to handle vlan tagging on 
> it
> >
> > is it possible to pass untagged frames internally?
>
> I think it would fallback to the linux bridge default configuration,
> which internally tags untagged frames with vlanID 1, and untags them
> when exiting the port. Unless I'm wrong (for instance, we change the
> bridge defaults), this means you can pass untagged frames through the
> bridge.
>
> Adding Edward, to keep me honest.
>
>
>
 I am unsure if I got the problem.
 If you connect an untagged logical network to a vNIC (virtual NIC of a
 VM), all untagged Ethernet frames will be forwarded from the host interface
 (physical NIC or bond).
 If no tagged logical network is attached to this host interface, VLAN
 tag filtering is not activated and even tagged Frames would be forwarded to
 the vNIC.

 Does this answer the question?



>
>
> > ___
> > Users mailing list -- users@ovirt.org
> > To unsubscribe send an email to users-le...@ovirt.org
> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> > oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> > List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/HYFSLS5QM5DKBYWFF44NCB4E3CD5GKH4/
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/ME77W5PLKOQC5U3OXNZE3W7W27ZOPVIP/
>
 ___
>>> Users mailing list -- users@ovirt.org
>>> To unsubscribe send an email to users-le...@ovirt.org
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives:
>>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/UE3XZWUU5UMT4PGN6GEHH4KCAEDT4MN3/
>>>
>>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/22CK4OVY36OXGKZUYH6LUN5OBSLOJYM6/


[ovirt-users] VDSM Hooks during migration

2019-08-23 Thread Vrgotic, Marko
Dear oVirt,

Would you be so kind as to tell me, or point me to, how to find which Hooks,
and in which order, are triggered when a VM is being migrated?

Kindly awaiting your reply.
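
The hook points are plain directories on the host, so the migration-related
ones that are present can be listed directly (directory layout as shipped by
vdsm; hook points with no hooks installed may not all be pre-created):

ls -d /usr/libexec/vdsm/hooks/*migrat* 2>/dev/null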


— — —
Met vriendelijke groet / Kind regards,

Marko Vrgotic

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/L3YICX5BIW2HT4KW5CB75PPIJCU4EICX/


[ovirt-users] Re: attach untagged vlan internally on vm

2019-08-23 Thread Ernest Clyde Chua
Good day.
yes, the VMs and the firewall are on the same L2 network; the firewall is
also hosted in oVirt alongside the VMs. Currently there is no external switch
connected to the nic and i would like to know if it is possible to pass tags
internally.


On Fri, Aug 23, 2019 at 9:21 PM Tony Pearce  wrote:

> Have the VM and the firewall on the same L2 network. Configure the VM with
> a default gateway of the interface of the firewall.
>
> Is it what you're looking for?
>
> On Fri., 23 Aug. 2019, 21:15 Ernest Clyde Chua, <
> ernestclydeac...@gmail.com> wrote:
>
>> Good day.
>> sorry if i got you guys confused.
>> for clarity:
>>
>> i have a server with two nic, currently one nic is connected to public
>> network and the other one is disconnected.
>>
>> And i have a vm that will be the firewall of other vm inside this
>> standalone/selfhosted ovirt.
>>
>> then i am figuring out how i can pass the vlan ids to the vm, or whether it
>> is possible.
>>
>>
>>
>>
>>
>> On Fri, 23 Aug 2019, 7:46 PM Dominik Holler  wrote:
>>
>>>
>>>
>>> On Thu, Aug 22, 2019 at 1:18 PM Miguel Duarte de Mora Barroso <
>>> mdbarr...@redhat.com> wrote:
>>>
 On Wed, Aug 21, 2019 at 9:18 AM  wrote:
 >
 > good day
 > currently i am testing oVirt on a single box and setup some tagged
 vms and non tagged vm.
 > the non tagged vm is a firewall but it has limitations on the number
 of nic so i cannot attach tagged vnic and wish to handle vlan tagging on 
 it
 >
 > is it possible to pass untagged frames internally?

 I think it would fallback to the linux bridge default configuration,
 which internally tags untagged frames with vlanID 1, and untags them
 when exiting the port. Unless I'm wrong (for instance, we change the
 bridge defaults), this means you can pass untagged frames through the
 bridge.

 Adding Edward, to keep me honest.



>>> I am unsure if I got the problem.
>>> If you connect an untagged logical network to a vNIC (virtual NIC of a
>>> VM), all untagged Ethernet frames will be forwarded from the host interface
>>> (physical NIC or bond).
>>> If no tagged logical network is attached to this host interface, VLAN
>>> tag filtering is not activated and even tagged Frames would be forwarded to
>>> the vNC.
>>>
>>> Does this answer the question?
>>>
>>>
>>>


 > ___
 > Users mailing list -- users@ovirt.org
 > To unsubscribe send an email to users-le...@ovirt.org
 > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
 > oVirt Code of Conduct:
 https://www.ovirt.org/community/about/community-guidelines/
 > List Archives:
 https://lists.ovirt.org/archives/list/users@ovirt.org/message/HYFSLS5QM5DKBYWFF44NCB4E3CD5GKH4/
 ___
 Users mailing list -- users@ovirt.org
 To unsubscribe send an email to users-le...@ovirt.org
 Privacy Statement: https://www.ovirt.org/site/privacy-policy/
 oVirt Code of Conduct:
 https://www.ovirt.org/community/about/community-guidelines/
 List Archives:
 https://lists.ovirt.org/archives/list/users@ovirt.org/message/ME77W5PLKOQC5U3OXNZE3W7W27ZOPVIP/

>>> ___
>> Users mailing list -- users@ovirt.org
>> To unsubscribe send an email to users-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/UE3XZWUU5UMT4PGN6GEHH4KCAEDT4MN3/
>>
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/6CX3VHBFTXQMSP5RHZ4TOP33XHXBNVCF/


[ovirt-users] Re: Update single node environment from 4.3.3 to 4.3.5 problem

2019-08-23 Thread Gianluca Cecchi
On Fri, Aug 23, 2019 at 2:25 PM Sandro Bonazzola 
wrote:

> Relevant error in the logs seems to be:
>
> MainThread::DEBUG::2016-04-30
> 19:45:56,428::unified_persistence::46::root::(run)
> upgrade-unified-persistence upgrade persisting networks {} and bondings {}
> MainThread::INFO::2016-04-30
> 19:45:56,428::netconfpersistence::187::root::(_clearDisk) Clearing
> /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
> MainThread::DEBUG::2016-04-30
> 19:45:56,428::netconfpersistence::195::root::(_clearDisk) No existent
> config to clear.
> MainThread::INFO::2016-04-30
> 19:45:56,428::netconfpersistence::187::root::(_clearDisk) Clearing
> /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
> MainThread::DEBUG::2016-04-30
> 19:45:56,428::netconfpersistence::195::root::(_clearDisk) No existent
> config to clear.
> MainThread::INFO::2016-04-30
> 19:45:56,428::netconfpersistence::131::root::(save) Saved new config
> RunningConfig({}, {}) to /var/run/vdsm/netconf/nets/ and
> /var/run/vdsm/netconf/bonds/
> MainThread::DEBUG::2016-04-30 19:45:56,428::utils::671::root::(execCmd)
> /usr/bin/taskset --cpu-list 0-3 /usr/share/vdsm/vdsm-store-net-config
> unified (cwd None)
> MainThread::DEBUG::2016-04-30 19:45:56,440::utils::689::root::(execCmd)
> SUCCESS:  = 'cp: cannot stat
> \xe2\x80\x98/var/run/vdsm/netconf\xe2\x80\x99: No such file or
> directory\n';  = 0
> MainThread::DEBUG::2016-04-30
> 19:45:56,441::upgrade::51::upgrade::(_upgrade_seal) Upgrade
> upgrade-unified-persistence successfully performed
> MainThread::DEBUG::2017-12-31
> 16:44:52,918::libvirtconnection::163::root::(get) trying to connect libvirt
> MainThread::INFO::2017-12-31
> 16:44:53,033::netconfpersistence::194::root::(_clearDisk) Clearing
> /var/lib/vdsm/persistence/netconf/nets/ and
> /var/lib/vdsm/persistence/netconf/bonds/
> MainThread::WARNING::2017-12-31
> 16:44:53,034::fileutils::96::root::(rm_tree) Directory:
> /var/lib/vdsm/persistence/netconf/bonds/ already removed
> MainThread::INFO::2017-12-31
> 16:44:53,034::netconfpersistence::139::root::(save) Saved new config
> PersistentConfig({'ovirtmgmt': {'ipv6autoconf': False, 'nameservers':
> ['192.168.1.1', '8.8.8.8'], u'nic': u'eth0', 'dhcpv6': False, u'ipaddr':
> u'192.168.1.211', 'switch': 'legacy', 'mtu': 1500, u'netmask':
> u'255.255.255.0', u'bootproto': u'static', 'stp': False, 'bridged': True,
> u'gateway': u'192.168.1.1', u'defaultRoute': True}}, {}) to
> /var/lib/vdsm/persistence/netconf/nets/ and
> /var/lib/vdsm/persistence/netconf/bonds/
> MainThread::DEBUG::2017-12-31
> 16:44:53,035::cmdutils::150::root::(exec_cmd)
> /usr/share/openvswitch/scripts/ovs-ctl status (cwd None)
> MainThread::DEBUG::2017-12-31
> 16:44:53,069::cmdutils::158::root::(exec_cmd) FAILED:  = '';  = 1
> MainThread::DEBUG::2018-02-16
> 23:59:17,968::libvirtconnection::167::root::(get) trying to connect libvirt
> MainThread::INFO::2018-02-16
> 23:59:18,500::netconfpersistence::198::root::(_clearDisk) Clearing netconf:
> /var/lib/vdsm/staging/netconf
> MainThread::ERROR::2018-02-16 23:59:18,501::fileutils::53::root::(rm_file)
> Removing file: /var/lib/vdsm/staging/netconf failed
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/common/fileutils.py", line
> 48, in rm_file
> os.unlink(file_to_remove)
> OSError: [Errno 21] Is a directory: '/var/lib/vdsm/staging/netconf'
>
> +Dominik Holler  can you please have a look?
>
>
Just for reference: today I also updated, but from 4.3.4 (not 4.3.3) to
4.3.5, an environment composed of three plain CentOS 7.6 servers (using
iSCSI storage domains, not NFS as on the single-node one) and I had no
problems.
In their log I don't see these strange attempts to kind of "destroy" and
"regenerate" network config...
The single server is actually a NUC while the 3 servers are Dell M610
blades.
Both the single server and the three others are without NetworkManager.
The 3-server environment, though, has an external engine, while the single
server is hosted-engine based; I don't know if this can make any
difference.

Gianluca
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/2KTCFXNRNKH3TNGJTLHW4M3RX7HJZCPX/


[ovirt-users] Re: nodectl on plain CentOS hypervisors

2019-08-23 Thread Gianluca Cecchi
On Fri, Aug 23, 2019 at 2:32 PM Sandro Bonazzola 
wrote:

>
>
> Or any other alternative for plain OS nodes vs ovirt-node-ng ones?
>>
>
> what's the use case here? check host sanity? because nodectl is not
> checking that, it just checks that the node config matches the requirements
> for being able to perform a rollback if needed.
>
>

ok, understood.
Thanks for clarifying

Gianluca
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/6TYTUQI2BNUKYHAR6X2RG66JEN6VFSNB/


[ovirt-users] Re: Creating hosts via the REST API using SSH Public Key authentication

2019-08-23 Thread Andrej Krejcir
Hi,

the following request should work, but I didn't test it.


POST /ovirt-engine/api/hosts

<host>
  <name>myhost</name>
  <address>myhost.example.com</address>
  <ssh>
    <authentication_method>publickey</authentication_method>
  </ssh>
</host>

Here is the relevant API documentation:
http://ovirt.github.io/ovirt-engine-api-model/4.4/#types/ssh
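
For example, sending that request with curl could look roughly like this (the
engine URL, the credentials and the host.xml file holding the body above are
placeholders):

curl -k -u admin@internal:PASSWORD \
     -H 'Content-Type: application/xml' \
     --data-binary @host.xml \
     https://engine.example.com/ovirt-engine/api/hosts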


Regards,
Andrej

On Fri, 23 Aug 2019 at 15:01,  wrote:

> In the UI one can create hosts using two authentication methods:
> 'Password' and 'SSH Public Key'.
> I have only found the Password authentication in the API Docs
> (/ovirt-engine/apidoc/#/services/hosts/methods/add).
> My question is: How can I create hosts using SSH Public Key authentication
> via the REST API?
> I would appreciate an example POST request!
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/ACRORRHQRH54SJGN3QLEOKIYUMSZC2LQ/
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/LR3MQS3K5YAERTYHHBHYOIQQ6NATKPTX/


[ovirt-users] Re: attach untagged vlan internally on vm

2019-08-23 Thread Tony Pearce
Have the VM and the firewall on the same L2 network. Configure the VM with
a default gateway of the interface of the firewall.

Is it what you're looking for?

On Fri., 23 Aug. 2019, 21:15 Ernest Clyde Chua, 
wrote:

> Good day.
> sorry if i got you guys confused.
> for clarity:
>
> i have a server with two nic, currently one nic is connected to public
> network and the other one is disconnected.
>
> And i have a vm that will be the firewall of other vm inside this
> standalone/selfhosted ovirt.
>
> then i am figuring out how i can pass the vlan ids to the vm, or whether it
> is possible.
>
>
>
>
>
> On Fri, 23 Aug 2019, 7:46 PM Dominik Holler  wrote:
>
>>
>>
>> On Thu, Aug 22, 2019 at 1:18 PM Miguel Duarte de Mora Barroso <
>> mdbarr...@redhat.com> wrote:
>>
>>> On Wed, Aug 21, 2019 at 9:18 AM  wrote:
>>> >
>>> > good day
>>> > currently i am testing oVirt on a single box and setup some tagged vms
>>> and non tagged vm.
>>> > the non tagged vm is a firewall but it has limitations on the number
>>> of nic so i cannot attach tagged vnic and wish to handle vlan tagging on it
>>> >
>>> > is it possible to pass untagged frames internally?
>>>
>>> I think it would fallback to the linux bridge default configuration,
>>> which internally tags untagged frames with vlanID 1, and untags them
>>> when exiting the port. Unless I'm wrong (for instance, we change the
>>> bridge defaults), this means you can pass untagged frames through the
>>> bridge.
>>>
>>> Adding Edward, to keep me honest.
>>>
>>>
>>>
>> I am unsure if I got the problem.
>> If you connect an untagged logical network to a vNIC (virtual NIC of a
>> VM), all untagged Ethernet frames will be forwarded from the host interface
>> (physical NIC or bond).
>> If no tagged logical network is attached to this host interface, VLAN tag
>> filtering is not activated and even tagged Frames would be forwarded to the
>> vNIC.
>>
>> Does this answer the question?
>>
>>
>>
>>>
>>>
>>> > ___
>>> > Users mailing list -- users@ovirt.org
>>> > To unsubscribe send an email to users-le...@ovirt.org
>>> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> > oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> > List Archives:
>>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/HYFSLS5QM5DKBYWFF44NCB4E3CD5GKH4/
>>> ___
>>> Users mailing list -- users@ovirt.org
>>> To unsubscribe send an email to users-le...@ovirt.org
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives:
>>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/ME77W5PLKOQC5U3OXNZE3W7W27ZOPVIP/
>>>
>> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/UE3XZWUU5UMT4PGN6GEHH4KCAEDT4MN3/
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/RUWTFFWI77IQHGXXZ6STBZOBFOMMITSF/


[ovirt-users] Re: attach untagged vlan internally on vm

2019-08-23 Thread Ernest Clyde Chua
Good day.
sorry if i got you guys confused.
for clarity:

i have a server with two nic, currently one nic is connected to public
network and the other one is disconnected.

And i have a vm that will be the firewall of other vm inside this
standalone/selfhosted ovirt.

then i am figuring out how i can pass the vlan ids to the vm, or whether it
is possible.
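
If the host side passes the tagged frames through (see Dominik's note quoted
below about VLAN tag filtering not being activated), the VLANs can be
terminated inside the firewall VM with ordinary VLAN sub-interfaces, e.g.
(eth1 and the VLAN IDs 10/20 are placeholders):

ip link add link eth1 name eth1.10 type vlan id 10
ip link add link eth1 name eth1.20 type vlan id 20
ip link set eth1.10 up
ip link set eth1.20 up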





On Fri, 23 Aug 2019, 7:46 PM Dominik Holler  wrote:

>
>
> On Thu, Aug 22, 2019 at 1:18 PM Miguel Duarte de Mora Barroso <
> mdbarr...@redhat.com> wrote:
>
>> On Wed, Aug 21, 2019 at 9:18 AM  wrote:
>> >
>> > good day
>> > currently i am testing oVirt on a single box and setup some tagged vms
>> and non tagged vm.
>> > the non tagged vm is a firewall but it has limitations on the number of
>> nic so i cannot attach tagged vnic and wish to handle vlan tagging on it
>> >
>> > is it possible to pass untagged frames internally?
>>
>> I think it would fallback to the linux bridge default configuration,
>> which internally tags untagged frames with vlanID 1, and untags them
>> when exiting the port. Unless I'm wrong (for instance, we change the
>> bridge defaults), this means you can pass untagged frames through the
>> bridge.
>>
>> Adding Edward, to keep me honest.
>>
>>
>>
> I am unsure if I got the problem.
> If you connect an untagged logical network to a vNIC (virtual NIC of a
> VM), all untagged Ethernet frames will be forwarded from the host interface
> (physical NIC or bond).
> If no tagged logical network is attached to this host interface, VLAN tag
> filtering is not activated and even tagged Frames would be forwarded to the
> vNIC.
>
> Does this answer the question?
>
>
>
>>
>>
>> > ___
>> > Users mailing list -- users@ovirt.org
>> > To unsubscribe send an email to users-le...@ovirt.org
>> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> > oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> > List Archives:
>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/HYFSLS5QM5DKBYWFF44NCB4E3CD5GKH4/
>> ___
>> Users mailing list -- users@ovirt.org
>> To unsubscribe send an email to users-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/ME77W5PLKOQC5U3OXNZE3W7W27ZOPVIP/
>>
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/UE3XZWUU5UMT4PGN6GEHH4KCAEDT4MN3/


[ovirt-users] Creating hosts via the REST API using SSH Public Key authentication

2019-08-23 Thread schill-julian
In the UI one can create hosts using two authentication methods: 'Password' and 
'SSH Public Key'.
I have only found the Password authentication in the API Docs 
(/ovirt-engine/apidoc/#/services/hosts/methods/add).
My question is: How can I create hosts using SSH Public Key authentication via 
the REST API?
I would appreciate an example POST request!
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ACRORRHQRH54SJGN3QLEOKIYUMSZC2LQ/


[ovirt-users] Re: oVirt 3.6: Node went into Error state while migrations happening

2019-08-23 Thread Sandro Bonazzola
Il giorno lun 8 lug 2019 alle ore 21:44 Christopher Cox 
ha scritto:

> On the node in question, the metadata isn't coming across, state-wise.
> It shows VMs being in an unknown state (some are up and some are down),
> some show as migrating and there are 9 forever hung migrating tasks.  We
> tried to bring up some of the down VMs that had a state of Down, but
> that ended up getting them the state of "Wait for Launch", though those
> VMs are actually started.
>
> Right now, my plan is attempt a restart of vdsmd on the node in
> question.  Just trying to get the node to a working state again.  There
> a total of 9 nodes in our cluster, but we can't manage any VMs on the
> affected node right now.
>
> Is there a way in 3.6 to cancel the hung tasks?  I'm worried that if
> vdsmd is restarted on the node, the tasks might be "attempted"... I
> really need them to be forgotten if possible.
>
> Ideally want all "Unknown" to return to either an "up" or "down" state
> (depending if the VM is up or down) and for "Wait for Launch" for those,
> to go to "up" and for all the "Migrating" to go to "up" or "down" (I
> think only one is actually down).
>
> I'm concerned that any attempt to manually manipulate the state in the ovirt
> mgmt head db will be moot because the node will be queried for state and
> that state will be taken and override anything I attempt to do.
>
> Thoughts??
>

Hi, please note 3.6 reached End Of Life a long time ago.
While someone may still be able to help with this specific issue, I
would recommend planning an upgrade as soon as practical.




> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/Q3AG2HTDIUWLZIINI5OBDZ37A5PEM7N7/
>


-- 

Sandro Bonazzola

MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA 

sbona...@redhat.com
*Red Hat respects your work life balance.
Therefore there is no need to answer this email out of your office hours.
*
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/CFZCS5YWY2FFTJNQCCZMQL3P3SJIZNC7/


[ovirt-users] Re: nodectl on plain CentOS hypervisors

2019-08-23 Thread Sandro Bonazzola
Il giorno ven 23 ago 2019 alle ore 11:27 Gianluca Cecchi <
gianluca.cec...@gmail.com> ha scritto:

> Does it make sense to install nodectl utility on plain CentOS 7.x nodes?
>

No, it doesn't make sense: nodectl checks for oVirt Node-specific
configuration.

# nodectl check
Status: OK
Bootloader ... OK
  Layer boot entries ... OK
  Valid boot entries ... OK
Mount points ... OK
  Separate /var ... OK
  Discard is used ... OK
Basic storage ... OK
  Initialized VG ... OK
  Initialized Thin Pool ... OK
  Initialized LVs ... OK
Thin storage ... OK
  Checking available space in thinpool ... OK
  Checking thinpool auto-extend ... OK
vdsmd ... OK

# nodectl info
layers:
  ovirt-node-ng-4.3.2-0.20190313.0:
ovirt-node-ng-4.3.2-0.20190313.0+1
bootloader:
  default: ovirt-node-ng-4.3.2-0.20190313.0+1
  entries:
ovirt-node-ng-4.3.2-0.20190313.0+1:
  index: 0
  title: ovirt-node-ng-4.3.2-0.20190313.0
  kernel:
/boot/ovirt-node-ng-4.3.2-0.20190313.0+1/vmlinuz-3.10.0-957.5.1.el7.x86_64
  args: "ro crashkernel=auto rd.lvm.lv=onn_host/swap
rd.lvm.lv=onn_host/ovirt-node-ng-4.3.2-0.20190313.0+1
rhgb quiet LANG=en_US.UTF-8 img.bootid=ovirt-node-ng-4.3.2-0.20190313.0+1"
  initrd:
/boot/ovirt-node-ng-4.3.2-0.20190313.0+1/initramfs-3.10.0-957.5.1.el7.x86_64.img
  root: /dev/onn_host/ovirt-node-ng-4.3.2-0.20190313.0+1
current_layer: ovirt-node-ng-4.3.2-0.20190313.0+1




> Or any other alternative for plain OS nodes vs ovirt-node-ng ones?
>

what's the use case here? check host sanity? because nodectl is not
checking that, it just checks that the node config matches the requirements
for being able to perform a rollback if needed.
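
On a plain CentOS host there is no layered image to verify, so the closest
equivalent is simply checking the services and capabilities vdsm reports,
e.g. (generic checks, not an official nodectl replacement):

systemctl status vdsmd supervdsmd sanlock
vdsm-client Host getCapabilities | head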


> On my updated CentOS 7.6 oVirt node I have not the command; I think it is
> provided by the package ovirt-node-ng-nodectl, that is one of the available
> ones if I run "yum search" on the system.
>
> Thanks
> Gianluca
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/5I76UHJVK6ADY2P3CIZPJZW37TTZCL4B/
>


-- 

Sandro Bonazzola

MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA 

sbona...@redhat.com
*Red Hat respects your work life balance.
Therefore there is no need to answer this email out of your office hours.
*
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/2DMBA4RSEDVAHZXA55K73XFCHOYUCSK6/


[ovirt-users] Re: numa pinning and reserved hugepages (1G) Bug in scheduler calculation or decision ?

2019-08-23 Thread Andrej Krejcir
Hi,

this is a bug in the scheduler. Currently, it ignores hugepages when
evaluating NUMA pinning.

There is a bugzilla ticket[1] that was originally reported as a similar
case, but then later the reporter changed it.

Could you open a new bugzilla ticket and attach the details from this email?

As a workaround, if you don't want to migrate the VM or you are sure that
it can run on the target host, you can clone a cluster policy and remove
the 'NUMA' filter. (In Administration -> Configure -> Scheduling Policies).
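
Independent of the policy change, it may also be worth confirming on the host
that the pinned NUMA nodes really have enough free 1G hugepages before starting
the VM. A small sketch (sysfs paths as in the report below; the node numbers and
the 8-pages-per-node requirement are just this example's values):

# a 32 GB VM pinned to NUMA nodes 0-3 would need 8 free 1G pages on each node
for n in 0 1 2 3; do
    f=/sys/devices/system/node/node$n/hugepages/hugepages-1048576kB/free_hugepages
    echo "node $n: $(cat $f) free 1G hugepages"
done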


[1] - https://bugzilla.redhat.com/show_bug.cgi?id=1720558


Best regards,
Andrej



On Wed, 21 Aug 2019 at 12:16, Ralf Schenk  wrote:

> Hello List,
>
> i ran into problems using numa-pinning and reserved hugepages.
>
> - My EPYC 7281 based Servers (Dual Socket) have 8 Numa-Nodes each having
> 32 GB of memory for a total of 256 GB System Memory
>
> - I'm using 192 x 1 GB hugepages reserved on the kernel cmdline
> default_hugepagesz=1G hugepagesz=1G hugepages=192 This reserves 24
> hugepages on each numa-node.
>
> I wanted to pin a MariaDB VM using 32 GB (Custom Property
> hugepages=1048576) to numa-nodes 0-3 of CPU-Socket 1. Pinning in GUI etc.
> no problem.
>
> When trying to start the VM, this can't be done, since oVirt claims that the
> host can't fulfill the memory requirements - which is simply not correct,
> since there were > 164 hugepages free.
>
> It should have taken 8 hugepages from each NUMA node 0-3 to fulfill the
> 32 GB memory requirement.
>
> I also freed the system completely from other VM's but that didn't work
> either.
>
> Is it possible that the scheduler only takes into account the "free
> memory" (as seen in numactl -H below) *not reserved* by hugepages for its
> decisions ? Since the host has only < 8 GB of free mem per numa-node I can
> understand that VM was not able to start under that condition.
>
> VM is running and using 32 hugepages without pinning, but a warning states
> "VM dbserver01b does not fit to a single NUMA node on host
> myhost.mydomain.de. This may negatively impact its performance. Consider
> using vNUMA and NUMA pinning for this VM."
>
> This is the numa Hardware Layout and hugepages usage now with other VM's
> running:
>
> from cat /proc/meminfo
>
> HugePages_Total: 192
> HugePages_Free:  160
> HugePages_Rsvd:0
> HugePages_Surp:0
>
> I can confirm that also under the condition of running other VM's there
> are at least 8 hugepages free for each numa-node 0-3:
>
> grep ""
> /sys/devices/system/node/*/hugepages/hugepages-1048576kB/free_hugepages
>
> /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/free_hugepages:8
>
> /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/free_hugepages:23
>
> /sys/devices/system/node/node2/hugepages/hugepages-1048576kB/free_hugepages:20
>
> /sys/devices/system/node/node3/hugepages/hugepages-1048576kB/free_hugepages:22
>
> /sys/devices/system/node/node4/hugepages/hugepages-1048576kB/free_hugepages:16
>
> /sys/devices/system/node/node5/hugepages/hugepages-1048576kB/free_hugepages:5
>
> /sys/devices/system/node/node6/hugepages/hugepages-1048576kB/free_hugepages:19
>
> /sys/devices/system/node/node7/hugepages/hugepages-1048576kB/free_hugepages:24
>
> numactl -H:
>
> available: 8 nodes (0-7)
> node 0 cpus: 0 1 2 3 32 33 34 35
> node 0 size: 32673 MB
> node 0 free: 3779 MB
> node 1 cpus: 4 5 6 7 36 37 38 39
> node 1 size: 32767 MB
> node 1 free: 6162 MB
> node 2 cpus: 8 9 10 11 40 41 42 43
> node 2 size: 32767 MB
> node 2 free: 6698 MB
> node 3 cpus: 12 13 14 15 44 45 46 47
> node 3 size: 32767 MB
> node 3 free: 1589 MB
> node 4 cpus: 16 17 18 19 48 49 50 51
> node 4 size: 32767 MB
> node 4 free: 2630 MB
> node 5 cpus: 20 21 22 23 52 53 54 55
> node 5 size: 32767 MB
> node 5 free: 2487 MB
> node 6 cpus: 24 25 26 27 56 57 58 59
> node 6 size: 32767 MB
> node 6 free: 3279 MB
> node 7 cpus: 28 29 30 31 60 61 62 63
> node 7 size: 32767 MB
> node 7 free: 5513 MB
> node distances:
> node   0   1   2   3   4   5   6   7
>   0:  10  16  16  16  32  32  32  32
>   1:  16  10  16  16  32  32  32  32
>   2:  16  16  10  16  32  32  32  32
>   3:  16  16  16  10  32  32  32  32
>   4:  32  32  32  32  10  16  16  16
>   5:  32  32  32  32  16  10  16  16
>   6:  32  32  32  32  16  16  10  16
>   7:  32  32  32  32  16  16  16  10
>
> --
>
>
> *Ralf Schenk*
> fon +49 (0) 24 05 / 40 83 70
> fax +49 (0) 24 05 / 40 83 759
> mail *r...@databay.de* 
>
> *Databay AG*
> Jens-Otto-Krag-Straße 11
> D-52146 Würselen
> *www.databay.de* 
>
> Sitz/Amtsgericht Aachen • HRB:8437 • USt-IdNr.: DE 210844202
> Vorstand: Ralf Schenk, Dipl.-Ing. Jens Conze, Aresch Yavari, Dipl.-Kfm.
> Philipp Hermanns
> Aufsichtsratsvorsitzender: Wilhelm Dohmen
> --
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt 

[ovirt-users] Re: Update single node environment from 4.3.3 to 4.3.5 problem

2019-08-23 Thread Sandro Bonazzola
The relevant error in the logs seems to be:

MainThread::DEBUG::2016-04-30
19:45:56,428::unified_persistence::46::root::(run)
upgrade-unified-persistence upgrade persisting networks {} and bondings {}
MainThread::INFO::2016-04-30
19:45:56,428::netconfpersistence::187::root::(_clearDisk) Clearing
/var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
MainThread::DEBUG::2016-04-30
19:45:56,428::netconfpersistence::195::root::(_clearDisk) No existent
config to clear.
MainThread::INFO::2016-04-30
19:45:56,428::netconfpersistence::187::root::(_clearDisk) Clearing
/var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
MainThread::DEBUG::2016-04-30
19:45:56,428::netconfpersistence::195::root::(_clearDisk) No existent
config to clear.
MainThread::INFO::2016-04-30
19:45:56,428::netconfpersistence::131::root::(save) Saved new config
RunningConfig({}, {}) to /var/run/vdsm/netconf/nets/ and
/var/run/vdsm/netconf/bonds/
MainThread::DEBUG::2016-04-30 19:45:56,428::utils::671::root::(execCmd)
/usr/bin/taskset --cpu-list 0-3 /usr/share/vdsm/vdsm-store-net-config
unified (cwd None)
MainThread::DEBUG::2016-04-30 19:45:56,440::utils::689::root::(execCmd)
SUCCESS: <err> = 'cp: cannot stat
\xe2\x80\x98/var/run/vdsm/netconf\xe2\x80\x99: No such file or
directory\n'; <rc> = 0
MainThread::DEBUG::2016-04-30
19:45:56,441::upgrade::51::upgrade::(_upgrade_seal) Upgrade
upgrade-unified-persistence successfully performed
MainThread::DEBUG::2017-12-31
16:44:52,918::libvirtconnection::163::root::(get) trying to connect libvirt
MainThread::INFO::2017-12-31
16:44:53,033::netconfpersistence::194::root::(_clearDisk) Clearing
/var/lib/vdsm/persistence/netconf/nets/ and
/var/lib/vdsm/persistence/netconf/bonds/
MainThread::WARNING::2017-12-31
16:44:53,034::fileutils::96::root::(rm_tree) Directory:
/var/lib/vdsm/persistence/netconf/bonds/ already removed
MainThread::INFO::2017-12-31
16:44:53,034::netconfpersistence::139::root::(save) Saved new config
PersistentConfig({'ovirtmgmt': {'ipv6autoconf': False, 'nameservers':
['192.168.1.1', '8.8.8.8'], u'nic': u'eth0', 'dhcpv6': False, u'ipaddr':
u'192.168.1.211', 'switch': 'legacy', 'mtu': 1500, u'netmask':
u'255.255.255.0', u'bootproto': u'static', 'stp': False, 'bridged': True,
u'gateway': u'192.168.1.1', u'defaultRoute': True}}, {}) to
/var/lib/vdsm/persistence/netconf/nets/ and
/var/lib/vdsm/persistence/netconf/bonds/
MainThread::DEBUG::2017-12-31 16:44:53,035::cmdutils::150::root::(exec_cmd)
/usr/share/openvswitch/scripts/ovs-ctl status (cwd None)
MainThread::DEBUG::2017-12-31 16:44:53,069::cmdutils::158::root::(exec_cmd)
FAILED: <err> = ''; <rc> = 1
MainThread::DEBUG::2018-02-16
23:59:17,968::libvirtconnection::167::root::(get) trying to connect libvirt
MainThread::INFO::2018-02-16
23:59:18,500::netconfpersistence::198::root::(_clearDisk) Clearing netconf:
/var/lib/vdsm/staging/netconf
MainThread::ERROR::2018-02-16 23:59:18,501::fileutils::53::root::(rm_file)
Removing file: /var/lib/vdsm/staging/netconf failed
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/fileutils.py", line
48, in rm_file
os.unlink(file_to_remove)
OSError: [Errno 21] Is a directory: '/var/lib/vdsm/staging/netconf'
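
For what it is worth, the traceback shows rm_file() calling os.unlink() on a
path that turns out to be a directory. A quick way to confirm the state on the
host (just an inspection sketch, not a fix) would be:

ls -ld /var/lib/vdsm/staging/netconf    # confirm it is a directory, not a file
ls -l /var/lib/vdsm/staging/netconf/    # see what the staging config contains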

+Dominik Holler  can you please have a look?



On Fri, 23 Aug 2019 at 00:56, Gianluca Cecchi <gianluca.cec...@gmail.com> wrote:

> Hello,
> after updating the hosted engine from 4.3.3 to 4.3.5, and then the only host
> composing the environment (plain CentOS 7.6), it seems the host is not able to
> start the vdsm daemons
>
> kernel installed with update is kernel-3.10.0-957.27.2.el7.x86_64
> Same problem also if using previous running kernel
> 3.10.0-957.12.2.el7.x86_64
>
> [root@ovirt01 vdsm]# uptime
>  00:50:08 up 25 min,  3 users,  load average: 0.60, 0.67, 0.60
> [root@ovirt01 vdsm]#
>
> [root@ovirt01 vdsm]# systemctl status vdsmd -l
> ● vdsmd.service - Virtual Desktop Server Manager
>Loaded: loaded (/etc/systemd/system/vdsmd.service; enabled; vendor
> preset: enabled)
>Active: failed (Result: start-limit) since Fri 2019-08-23 00:37:27
> CEST; 7s ago
>   Process: 25810 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh
> --pre-start (code=exited, status=1/FAILURE)
>
> Aug 23 00:37:27 ovirt01.mydomain systemd[1]: Failed to start Virtual
> Desktop Server Manager.
> Aug 23 00:37:27 ovirt01.mydomain systemd[1]: Unit vdsmd.service entered
> failed state.
> Aug 23 00:37:27 ovirt01.mydomain systemd[1]: vdsmd.service failed.
> Aug 23 00:37:27 ovirt01.mydomain systemd[1]: vdsmd.service holdoff time
> over, scheduling restart.
> Aug 23 00:37:27 ovirt01.mydomain systemd[1]: Stopped Virtual Desktop
> Server Manager.
> Aug 23 00:37:27 ovirt01.mydomain systemd[1]: start request repeated too
> quickly for vdsmd.service
> Aug 23 00:37:27 ovirt01.mydomain systemd[1]: Failed to start Virtual
> Desktop Server Manager.
> Aug 23 00:37:27 ovirt01.mydomain systemd[1]: Unit vdsmd.service entered
> failed state.
> Aug 23 00:37:27 ovirt01.mydomain systemd[1]: vdsmd.service failed.
> 

[ovirt-users] Re: attach untagged vlan internally on vm

2019-08-23 Thread Dominik Holler
On Thu, Aug 22, 2019 at 1:18 PM Miguel Duarte de Mora Barroso <
mdbarr...@redhat.com> wrote:

> On Wed, Aug 21, 2019 at 9:18 AM  wrote:
> >
> > Good day.
> > Currently I am testing oVirt on a single box and set up some tagged VMs
> and a non-tagged VM.
> > The non-tagged VM is a firewall, but it has limitations on the number of
> NICs, so I cannot attach tagged vNICs and wish to handle VLAN tagging on it.
> >
> > Is it possible to pass untagged frames internally?
>
> I think it would fall back to the linux bridge default configuration,
> which internally tags untagged frames with vlanID 1, and untags them
> when exiting the port. Unless I'm wrong (for instance, we change the
> bridge defaults), this means you can pass untagged frames through the
> bridge.
>
> Adding Edward, to keep me honest.
>
>
>
I am unsure if I got the problem.
If you connect an untagged logical network to a vNIC (virtual NIC of a VM),
all untagged Ethernet frames will be forwarded from the host interface
(physical NIC or bond) to the vNIC.
If no tagged logical network is attached to this host interface, VLAN tag
filtering is not activated and even tagged frames would be forwarded to the
vNIC.

Does this answer the question?
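
If in doubt, the effective VLAN setup of the host bridges can be inspected with
standard iproute2 tools. A sketch (the bridge name is an example, based on the
usual ovirtmgmt naming):

bridge link show             # ports attached to the linux bridges
bridge vlan show             # per-port VLAN filtering, if it is enabled
ip -d link show ovirtmgmt    # bridge details, e.g. the vlan_filtering flag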



>
>
> > ___
> > Users mailing list -- users@ovirt.org
> > To unsubscribe send an email to users-le...@ovirt.org
> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> > oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> > List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/HYFSLS5QM5DKBYWFF44NCB4E3CD5GKH4/
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/ME77W5PLKOQC5U3OXNZE3W7W27ZOPVIP/
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/TBUPW6AWC6INH24Q5SWXCRHTZQYA77KJ/


[ovirt-users] Re: Evaluate oVirt to remove VMware

2019-08-23 Thread Dominik Holler
Allen, please create bonds as described in
https://www.ovirt.org/documentation/admin-guide/chap-Logical_Networks.html#creating-a-bond-device-using-the-administration-portal
and avoid manual steps on the host.

On Fri, Aug 23, 2019 at 11:37 AM Sandro Bonazzola 
wrote:

>
>
On Thu, 22 Aug 2019 at 22:41,  wrote:
>
>> Thanks Paul,
>>
>> Hey Paul,
>>
>> Thanks for the reply!
>>
>> Not really sure here, I read the oVirt 3.0 pdf and it says you need to
>> enable LACP for Cisco switches.
>>
>
This is only required if you would like to use bond mode 4 (LACP), which is
recommended if your switch supports it.
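
Once the bond has been created through the Administration Portal, its mode can
be verified on the host. A sketch (the bond name is an example):

grep -E 'Bonding Mode|MII Status' /proc/net/bonding/bond0
# mode 4 shows up as "IEEE 802.3ad Dynamic link aggregation"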


> This is really not becoming a learning setup any longer, just a headache.
>>
>> Tried multiple ways and still no luck... the next way I am trying is this:
>> https://www.ovirt.org/develop/networking/bonding-vlan-bridge.html
>>
>
>
This guide should still work, but the configuration via the Engine's web UI or
REST API is much more comfortable and safer.


> Please note this is not meant to be user documentation.
> +Dan Kenigsberg  , +Karli Sjöberg  ,
>  +Dominik Holler  , +Miguel Duarte de Mora Barroso
>  , can you please have a look at
> https://www.ovirt.org/develop/networking/bonding-vlan-bridge.html and see
> if it still makes sense with oVirt 4.3?
> If it makes sense, let's ensure this is properly documented in
> https://ovirt.org/documentation/admin-guide/administration-guide.html or
> https://ovirt.org/documentation/install-guide/Installation_Guide.html
>
>
>
>
>>
>> But not hoping any longer.
>> Not sure if it will work or not, but going to give it a go.
>>
>> Thanks!!!
>> Allen
>> ___
>> Users mailing list -- users@ovirt.org
>> To unsubscribe send an email to users-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/7ADQYAFSVXBGDMOL3VLF43W54HX3A7GT/
>>
>
>
> --
>
> Sandro Bonazzola
>
> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
>
> Red Hat EMEA 
>
> sbona...@redhat.com
> *Red Hat respects your work life balance.
> Therefore there is no need to answer this email out of your office hours.
> *
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/IBHABFL6XBFAOC2KORVDPXUY6RVVWHVX/


[ovirt-users] Re: When I create a new domain in storage , it report : VDSM command ActivateStorageDomainVDS failed: Unknown pool id, pool not connected: (u'b87012a1-8f7a-4af5-8884-e0fb8002e842', )

2019-08-23 Thread Sandro Bonazzola
On Fri, 23 Aug 2019 at 08:10,  wrote:

> The version of  ovirt-engine  is 4.2.8
> The version of  ovirt-node is 4.2.8
>
>
Hi, please note oVirt 4.2 reached End Of Life state a few months ago.
If this is a new deployment, please redeploy using 4.3 instead.
If this is an existing deployment, please upgrade to 4.3 and let us know if
the problem persists.



> When I create a new domain in storage, and the storage type is NFS, it
> reports:
>
> VDSM command ActivateStorageDomainVDS failed: Unknown pool id, pool not
> connected: (u'b87012a1-8f7a-4af5-8884-e0fb8002e842',)
>
> The error of vdsm.log is :
>
> 2019-08-23 11:02:14,740+0800 INFO  (jsonrpc/4) [vdsm.api] START
> connectStorageServer(domType=1,
> spUUID=u'----', conList=[{u'id':
> u'c6893a09-ab28-4328-b186-d2f88f2320d4', u'connection': 
> u'172.16.10.74:/ovirt-data',
> u'iqn': u'', u'user': u'', u'tpgt': u'1', u'protocol_version': u'auto',
> u'password': '', u'port': u''}], options=None)
> from=:::172.16.90.10,52962, flow_id=5d720a32,
> task_id=2af924ff-37a6-46a1-b79f-4251d21d5ff9 (api:46)
> 2019-08-23 11:02:14,743+0800 INFO  (jsonrpc/4) [vdsm.api] FINISH
> connectStorageServer return={'statuslist': [{'status': 0, 'id':
> u'c6893a09-ab28-4328-b186-d2f88f2320d4'}]} from=:::172.16.90.10,52962,
> flow_id=5d720a32, task_id=2af924ff-37a6-46a1-b79f-4251d21d5ff9 (api:52)
> 2019-08-23 11:02:14,743+0800 INFO  (jsonrpc/4) [jsonrpc.JsonRpcServer] RPC
> call StoragePool.connectStorageServer succeeded in 0.00 seconds
> (__init__:573)
> 2019-08-23 11:02:14,751+0800 INFO  (jsonrpc/6) [vdsm.api] START
> activateStorageDomain(sdUUID=u'3bcdc32c-040e-4a4c-90fb-de950f54f1b4',
> spUUID=u'b87012a1-8f7a-4af5-8884-e0fb8002e842', options=None)
> from=:::172.16.90.10,53080, flow_id=709ee722,
> task_id=9b8f32af-5fdd-4ffa-9520-c474af03db70 (api:46)
> 2019-08-23 11:02:14,752+0800 INFO  (jsonrpc/6) [vdsm.api] FINISH
> activateStorageDomain error=Unknown pool id, pool not connected:
> (u'b87012a1-8f7a-4af5-8884-e0fb8002e842',) from=:::172.16.90.10,53080,
> flow_id=709ee722, task_id=9b8f32af-5fdd-4ffa-9520-c474af03db70 (api:50)
> 2019-08-23 11:02:14,752+0800 ERROR (jsonrpc/6) [storage.TaskManager.Task]
> (Task='9b8f32af-5fdd-4ffa-9520-c474af03db70') Unexpected error (task:875)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882,
> in _run
> return fn(*args, **kargs)
>   File "", line 2, in activateStorageDomain
>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in
> method
> ret = func(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1262,
> in activateStorageDomain
> pool = self.getPool(spUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 350,
> in getPool
> raise se.StoragePoolUnknown(spUUID)
> StoragePoolUnknown: Unknown pool id, pool not connected:
> (u'b87012a1-8f7a-4af5-8884-e0fb8002e842',)
> 2019-08-23 11:02:14,752+0800 INFO  (jsonrpc/6) [storage.TaskManager.Task]
> (Task='9b8f32af-5fdd-4ffa-9520-c474af03db70') aborting: Task is aborted:
> "Unknown pool id, pool not connected:
> (u'b87012a1-8f7a-4af5-8884-e0fb8002e842',)" - code 309 (task:1181)
> 2019-08-23 11:02:14,752+0800 ERROR (jsonrpc/6) [storage.Dispatcher] FINISH
> activateStorageDomain error=Unknown pool id, pool not connected:
> (u'b87012a1-8f7a-4af5-8884-e0fb8002e842',) (dispatcher:82)
>
> How can I solve this problem?
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/43TIQPUM347YIGR6E7F3IMMCZXM6H4ZJ/
>


-- 

Sandro Bonazzola

MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA 

sbona...@redhat.com
*Red Hat respects your work life balance.
Therefore there is no need to answer this email out of your office hours.
*
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZP22SIZ7C3JTHA4RJHHMMHTOQXM35NQS/


[ovirt-users] Re: Evaluate oVirt to remove VMware

2019-08-23 Thread Sandro Bonazzola
On Thu, 22 Aug 2019 at 22:41,  wrote:

> Thanks Paul,
>
> Hey Paul,
>
> Thanks for the reply!
>
> Not really sure here, I read the oVirt 3.0 pdf and it says you need to
> enable LACP for Cisco switches.
> This is really not becoming a learning setup any longer, just a headache.
>
> Tried multiple ways and still no luck... the next way I am trying is this:
> https://www.ovirt.org/develop/networking/bonding-vlan-bridge.html
>

Please note this is not meant to be user documentation.
+Dan Kenigsberg  , +Karli Sjöberg  ,
 +Dominik Holler  , +Miguel Duarte de Mora Barroso
 , can you please have a look at
https://www.ovirt.org/develop/networking/bonding-vlan-bridge.html and see
if it still makes sense with oVirt 4.3?
If it makes sense, let's ensure this is properly documented in
https://ovirt.org/documentation/admin-guide/administration-guide.html or
https://ovirt.org/documentation/install-guide/Installation_Guide.html




>
> But not hoping any longer.
> Not sure if it will work or not, but going to give it a go.
>
> Thanks!!!
> Allen
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/7ADQYAFSVXBGDMOL3VLF43W54HX3A7GT/
>


-- 

Sandro Bonazzola

MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA 

sbona...@redhat.com
*Red Hat respects your work life balance.
Therefore there is no need to answer this email out of your office hours.
*
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/CZ2GPMPP4HFM3SMK3BSSMELUBCHTG47X/


[ovirt-users] nodectl on plain CentOS hypervisors

2019-08-23 Thread Gianluca Cecchi
Does it make sense to install the nodectl utility on plain CentOS 7.x nodes?
Or is there any other alternative for plain OS nodes vs ovirt-node-ng ones?
On my updated CentOS 7.6 oVirt node I don't have the command; I think it is
provided by the package ovirt-node-ng-nodectl, which is one of the packages
available if I run "yum search" on the system.

Thanks
Gianluca
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/5I76UHJVK6ADY2P3CIZPJZW37TTZCL4B/


[ovirt-users] When I create a new domain in storage , it report : VDSM command ActivateStorageDomainVDS failed: Unknown pool id, pool not connected: (u'b87012a1-8f7a-4af5-8884-e0fb8002e842', )

2019-08-23 Thread wangyu13476969128
The version of  ovirt-engine  is 4.2.8
The version of  ovirt-node is 4.2.8

When I create a new domain in storage, and the storage type is NFS, it reports:

VDSM command ActivateStorageDomainVDS failed: Unknown pool id, pool not 
connected: (u'b87012a1-8f7a-4af5-8884-e0fb8002e842',)

The error of vdsm.log is :

2019-08-23 11:02:14,740+0800 INFO  (jsonrpc/4) [vdsm.api] START 
connectStorageServer(domType=1, spUUID=u'----', 
conList=[{u'id': u'c6893a09-ab28-4328-b186-d2f88f2320d4', u'connection': 
u'172.16.10.74:/ovirt-data', u'iqn': u'', u'user': u'', u'tpgt': u'1', 
u'protocol_version': u'auto', u'password': '', u'port': u''}], 
options=None) from=:::172.16.90.10,52962, flow_id=5d720a32, 
task_id=2af924ff-37a6-46a1-b79f-4251d21d5ff9 (api:46)
2019-08-23 11:02:14,743+0800 INFO  (jsonrpc/4) [vdsm.api] FINISH 
connectStorageServer return={'statuslist': [{'status': 0, 'id': 
u'c6893a09-ab28-4328-b186-d2f88f2320d4'}]} from=:::172.16.90.10,52962, 
flow_id=5d720a32, task_id=2af924ff-37a6-46a1-b79f-4251d21d5ff9 (api:52)
2019-08-23 11:02:14,743+0800 INFO  (jsonrpc/4) [jsonrpc.JsonRpcServer] RPC call 
StoragePool.connectStorageServer succeeded in 0.00 seconds (__init__:573)
2019-08-23 11:02:14,751+0800 INFO  (jsonrpc/6) [vdsm.api] START 
activateStorageDomain(sdUUID=u'3bcdc32c-040e-4a4c-90fb-de950f54f1b4', 
spUUID=u'b87012a1-8f7a-4af5-8884-e0fb8002e842', options=None) 
from=:::172.16.90.10,53080, flow_id=709ee722, 
task_id=9b8f32af-5fdd-4ffa-9520-c474af03db70 (api:46)
2019-08-23 11:02:14,752+0800 INFO  (jsonrpc/6) [vdsm.api] FINISH 
activateStorageDomain error=Unknown pool id, pool not connected: 
(u'b87012a1-8f7a-4af5-8884-e0fb8002e842',) from=:::172.16.90.10,53080, 
flow_id=709ee722, task_id=9b8f32af-5fdd-4ffa-9520-c474af03db70 (api:50)
2019-08-23 11:02:14,752+0800 ERROR (jsonrpc/6) [storage.TaskManager.Task] 
(Task='9b8f32af-5fdd-4ffa-9520-c474af03db70') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in 
_run
return fn(*args, **kargs)
  File "", line 2, in activateStorageDomain
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1262, in 
activateStorageDomain
pool = self.getPool(spUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 350, in 
getPool
raise se.StoragePoolUnknown(spUUID)
StoragePoolUnknown: Unknown pool id, pool not connected: 
(u'b87012a1-8f7a-4af5-8884-e0fb8002e842',)
2019-08-23 11:02:14,752+0800 INFO  (jsonrpc/6) [storage.TaskManager.Task] 
(Task='9b8f32af-5fdd-4ffa-9520-c474af03db70') aborting: Task is aborted: 
"Unknown pool id, pool not connected: 
(u'b87012a1-8f7a-4af5-8884-e0fb8002e842',)" - code 309 (task:1181)
2019-08-23 11:02:14,752+0800 ERROR (jsonrpc/6) [storage.Dispatcher] FINISH 
activateStorageDomain error=Unknown pool id, pool not connected: 
(u'b87012a1-8f7a-4af5-8884-e0fb8002e842',) (dispatcher:82)

How can I solve this problem?
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/43TIQPUM347YIGR6E7F3IMMCZXM6H4ZJ/