[ovirt-users] Re: fresh ovirt node 4.4.6 fail on firewalld both host and engine deployment

2021-05-12 Thread Charles Kozler
Yep! I thought the error was reporting that the port was already configured
inside the seed engine and not on the actual host. I deleted the firewalld
6900 port addition and everything seems to be flowing through now
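For anyone hitting the same ALREADY_ENABLED error, the cleanup described above can be sketched roughly as follows. The port (6900/tcp) and the "public" zone are taken from the error messages in this thread; adjust them to whatever your log reports:

```shell
# Sketch: clear a port that firewalld reports as ALREADY_ENABLED so the
# deployment playbook can re-add it cleanly. 6900/tcp and the "public" zone
# come from the errors quoted in this thread; adjust to your log output.
PORT=6900/tcp
ZONE=public
if command -v firewall-cmd >/dev/null 2>&1; then
    firewall-cmd --zone="$ZONE" --list-ports                      # inspect current runtime ports
    firewall-cmd --zone="$ZONE" --remove-port="$PORT"             # drop from runtime config
    firewall-cmd --permanent --zone="$ZONE" --remove-port="$PORT" # drop from permanent config
else
    echo "firewall-cmd not found; commands shown for reference only"
fi
echo "cleanup attempted for $PORT in zone $ZONE"
```

Note the error text mentions both "Permanent and Non-Permanent(immediate)" operations, which is why both the runtime and --permanent removals are shown.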

On Wed, May 12, 2021 at 1:36 PM Patrick Lomakin 
wrote:

> Hello. I know this error. Please check which ports are already present in
> your firewalld configuration (6900 here). In the gluster wizard, click the
> "Edit" button and remove the gluster firewall config string like "port
> 6900". Save your configuration and try to deploy again. Regards!
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/DD2ATIFJEITEI5LZL7IVQS7ROD7HQOYX/
>

-- 
*Notice to Recipient*: https://www.fixflyer.com/disclaimer 

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/3MFZHREQGHOCAT44H6JUPZXPVJXWWIDZ/


[ovirt-users] Re: fresh ovirt node 4.4.6 fail on firewalld both host and engine deployment

2021-05-12 Thread Charles Kozler
I also just upgraded to 4.6.6.1 and it is still occurring




On Wed, May 12, 2021 at 12:36 PM Charles Kozler 
wrote:

> Hello -
>
> Deployed fresh ovirt node 4.4.6 and the only thing I did to the system was
> configure the NIC with nmtui
>
> During the gluster install, the installer errored out with:
>
> gluster-deployment-1620832547044.log:failed: [n2] (item=5900/tcp) =>
> {"ansible_loop_var": "item", "changed": false, "item": "5900/tcp", "msg":
> "ERROR: Exception caught: org.fedoraproject.FirewallD1.Exception:
> ALREADY_ENABLED: '5900:tcp' already in 'public' Permanent and
> Non-Permanent(immediate) operation"}
>
> The fix here was easy - I just deleted the port it was complaining about
> with firewall-cmd and restarted the installation and it was all fine
>
> During the hosted engine deployment when the VM is being deployed it dies
> here
>
> [ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Open a port on firewalld]
> [ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "ERROR:
> Exception caught: org.fedoraproject.FirewallD1.Exception: ALREADY_ENABLED:
> '6900:tcp' already in 'public' Non-permanent operation"}
>
> Now the issue here is that I do not have access to the engine VM: it is in
> a bit of a transient state, and when the deployment fails, the currently
> open image is discarded when the ansible playbook is kicked off again
>
> I cannot find any BZ on this and google is turning up nothing. I don't
> think firewalld failing due to the firewall rule already existing should be
> a reason to exit the installation
>
> The interesting part is that this only fails on certain ports, i.e. when I
> reran the gluster wizard after 5900 failed, the other ports were presumably
> still in the firewall, yet the installation completed
>
> Suggestions?
>
>
>

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/QTQ5GCPXWW6VFHP7Y2ADOTB2SNPGP6VZ/


[ovirt-users] fresh ovirt node 4.4.6 fail on firewalld both host and engine deployment

2021-05-12 Thread Charles Kozler
Hello -

Deployed fresh ovirt node 4.4.6 and the only thing I did to the system was
configure the NIC with nmtui

During the gluster install, the installer errored out with:

gluster-deployment-1620832547044.log:failed: [n2] (item=5900/tcp) =>
{"ansible_loop_var": "item", "changed": false, "item": "5900/tcp", "msg":
"ERROR: Exception caught: org.fedoraproject.FirewallD1.Exception:
ALREADY_ENABLED: '5900:tcp' already in 'public' Permanent and
Non-Permanent(immediate) operation"}

The fix here was easy - I just deleted the port it was complaining about
with firewall-cmd and restarted the installation and it was all fine

During the hosted engine deployment when the VM is being deployed it dies
here

[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Open a port on firewalld]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "ERROR:
Exception caught: org.fedoraproject.FirewallD1.Exception: ALREADY_ENABLED:
'6900:tcp' already in 'public' Non-permanent operation"}

Now the issue here is that I do not have access to the engine VM: it is in
a bit of a transient state, and when the deployment fails, the currently
open image is discarded when the ansible playbook is kicked off again

I cannot find any BZ on this and google is turning up nothing. I don't
think firewalld failing due to the firewall rule already existing should be
a reason to exit the installation

The interesting part is that this only fails on certain ports, i.e. when I
reran the gluster wizard after 5900 failed, the other ports were presumably
still in the firewall, yet the installation completed

Suggestions?


___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/5SEB6PJCFTLXKOIBFIECQVJOPBHZJWIR/


[ovirt-users] Re: Recent news & oVirt future

2020-12-11 Thread Charles Kozler
What goes into oVirt goes into RHV, if I understand correctly, right? If so,
sorry - I meant upstream

If I am understanding these changes correctly, then this move to Stream will
only serve to benefit oVirt, as it speeds up the pace of CentOS in the
ecosystem, so oVirt potentially won't have breaking changes dependent on and
waiting for RHEL to release so CentOS can be built

If I remember correctly (and I could be confusing this with another
application), oVirt requires CentOS 7.3 or higher right?




On Fri, Dec 11, 2020 at 10:08 AM Sandro Bonazzola 
wrote:

>
>
> Il giorno ven 11 dic 2020 alle ore 15:49 Charles Kozler <
> char...@fixflyer.com> ha scritto:
>
>> CentOS was the downstream of RHEL but has now become the upstream
>>
>> I guess oVirt was always downstream as well - yes?
>>
>
> No. oVirt is oVirt. It's downstream of nothing.
> And it used to work on, and be used on, Fedora, which is upstream of RHEL
> and of CentOS Stream.
> Fedora moved just way too fast and we had to drop the effort trying to
> keep the pace: https://blogs.ovirt.org/2020/05/ovirt-and-fedora/
> With CentOS Stream we are just moving the point at which CentOS breaks
> oVirt. Instead of waiting a couple of months after a RHEL release (CentOS
> 8.3 just broke oVirt: we can't build oVirt Node and the oVirt appliance
> anymore due to a bug in the lorax package, and oVirt 4.4.4 can't be
> released because the advanced virtualization build is missing a dependency
> which is in RHEL but not in CentOS due to a bug in the CentOS compose
> system), we'll have the fix in oVirt a month before RHEL is released.
>
>
>
>>
>> If so then yes, I can't see much changing in the ways of oVirt
>>
>
> As far as I can tell by looking at CentOS 8.3, it will change into
> something better.
>
>
>
>>
>>
>>
>>
>> On Fri, Dec 11, 2020 at 2:59 AM Sandro Bonazzola 
>> wrote:
>>
>>>
>>>
>>> Il giorno gio 10 dic 2020 alle ore 21:51 Charles Kozler <
>>> char...@fixflyer.com> ha scritto:
>>>
>>>> I guess this is probably a question for all current open source
>>>> projects that red hat runs but -
>>>>
>>>> Does this mean oVirt will effectively become a rolling release type
>>>> situation as well?
>>>>
>>>
>>> There's no plan to make oVirt a rolling release.
>>>
>>>
>>>>
>>>> How exactly is oVirt going to stay open source and stay in cadence with
>>>> all the other updates happening around it on packages/etc that it depends
>>>> on if the streams are rolling release? Do they now need to fork every piece
>>>> of dependency?
>>>>
>>>
>>> We are going to regularly test oVirt on CentOS Stream, releasing oVirt
>>> Node and the oVirt appliance after testing them, with no difference from
>>> what we are doing right now with CentOS Linux.
>>> Any raised issue will be handled as usual.
>>>
>>> What exactly does this mean for oVirt going forward and its overall
>>>> stability?
>>>>
>>>
>>> oVirt's plans for CentOS Stream were communicated a year ago,
>>> here: https://blogs.ovirt.org/2019/09/ovirt-and-centos-stream/
>>>
>>> That said, please note that the oVirt documentation mentions "Enterprise
>>> Linux" almost everywhere, not explicitly CentOS Linux.
>>> As far as I can tell, any RHEL binary-compatible rebuild should just work
>>> with oVirt, although I would recommend following what will be done within
>>> oVirt Node and oVirt Appliance.
>>>
>>>
>>>
>>>>
>>>> *Notice to Recipient*: https://www.fixflyer.com/disclaimer
>>>> ___
>>>> Users mailing list -- users@ovirt.org
>>>> To unsubscribe send an email to users-le...@ovirt.org
>>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>>>> oVirt Code of Conduct:
>>>> https://www.ovirt.org/community/about/community-guidelines/
>>>> List Archives:
>>>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/7IUWGES2IG4BELLUPMYGEKN3GC6XVCHA/
>>>>
>>>
>>>
>>> --
>>>
>>> Sandro Bonazzola
>>>
>>> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
>>>
>>> Red Hat EMEA <https://www.redhat.com/>
>>>
>>> sbona...@redhat.com
>>> <https://www.redhat.com/>
>>>
>>> *Red Hat respects your work life balance. Therefore there is no need to
>>> answer this email out of your office hours.*

[ovirt-users] Re: Recent news & oVirt future

2020-12-11 Thread Charles Kozler
CentOS was the downstream of RHEL but has now become the upstream

I guess oVirt was always downstream as well - yes?

If so then yes, I can't see much changing in the ways of oVirt




On Fri, Dec 11, 2020 at 2:59 AM Sandro Bonazzola 
wrote:

>
>
> Il giorno gio 10 dic 2020 alle ore 21:51 Charles Kozler <
> char...@fixflyer.com> ha scritto:
>
>> I guess this is probably a question for all current open source projects
>> that red hat runs but -
>>
>> Does this mean oVirt will effectively become a rolling release type
>> situation as well?
>>
>
> There's no plan to make oVirt a rolling release.
>
>
>>
>> How exactly is oVirt going to stay open source and stay in cadence with
>> all the other updates happening around it on packages/etc that it depends
>> on if the streams are rolling release? Do they now need to fork every piece
>> of dependency?
>>
>
> We are going to regularly test oVirt on CentOS Stream, releasing oVirt
> Node and the oVirt appliance after testing them, with no difference from
> what we are doing right now with CentOS Linux.
> Any raised issue will be handled as usual.
>
> What exactly does this mean for oVirt going forward and its overall
>> stability?
>>
>
> oVirt's plans for CentOS Stream were communicated a year ago, here:
> https://blogs.ovirt.org/2019/09/ovirt-and-centos-stream/
>
> That said, please note that the oVirt documentation mentions "Enterprise
> Linux" almost everywhere, not explicitly CentOS Linux.
> As far as I can tell, any RHEL binary-compatible rebuild should just work
> with oVirt, although I would recommend following what will be done within
> oVirt Node and oVirt Appliance.
>
>
>
>>
>> *Notice to Recipient*: https://www.fixflyer.com/disclaimer
>> ___
>> Users mailing list -- users@ovirt.org
>> To unsubscribe send an email to users-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/7IUWGES2IG4BELLUPMYGEKN3GC6XVCHA/
>>
>
>
> --
>
> Sandro Bonazzola
>
> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
>
> Red Hat EMEA <https://www.redhat.com/>
>
> sbona...@redhat.com
> <https://www.redhat.com/>
>
> *Red Hat respects your work life balance. Therefore there is no need to
> answer this email out of your office hours.*
>
>
>

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/6SOQ5MT5DZTHWF76UYP2ZGED2EIPCQS2/


[ovirt-users] Recent news & oVirt future

2020-12-10 Thread Charles Kozler
I guess this is probably a question for all current open source projects
that red hat runs but -

Does this mean oVirt will effectively become a rolling release type
situation as well?

How exactly is oVirt going to stay open source and stay in cadence with all
the other updates happening around it on packages/etc that it depends on if
the streams are rolling release? Do they now need to fork every piece of
dependency?

What exactly does this mean for oVirt going forward and its overall
stability?


___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/7IUWGES2IG4BELLUPMYGEKN3GC6XVCHA/


[ovirt-users] Re: stty: standard tty: inappropriate ioctl for device

2020-05-05 Thread Charles Kozler
>
> > What do you mean by that? Usually (I am not a native English speaker),
> > "red herring" means, for me, "something that made me look at the error
> > not in the place where it actually occurred". In this case:
>
> Yep! The installer would die after failing at the SSO step. In the log
> for this failure, it fails with the linked error "You must specify either
> url or hostname", like you see in that bug report. Nowhere was I able to
> deduce that it was a failing SSH remote command. I just happened to decide
> to look at the other log and saw it
>
> The funny thing is, it was right in front of my face the whole time (stty)
> but it never dawned on me until I decided to try running a command myself
> like you see with uptime below
>
> > The error message does mention 'stty', and I guess this is
> > more-or-less the best the engine could have done, other than trying to
> > analyze your .bashrc.
> {...}
> > A common idiom for such cases is to check PS1 - bash should set it
> > only for interactive shells. E.g.:
> >
> > if [ "$PS1" ]; then
> >    stty erase ^?
> > fi
> >
> > But you might better fix your terminal emulator instead.
> {...}
> > :-(
>
> > (Used to be more common 20-30 years ago, when people actually had many
> > more different physical terminals, terminal emulators, and unix-like
> > OSes. Sadly our industry failed to fix this, requiring such
> > workarounds, so far. See also, likely way-out-of-date:
> > http://www.tldp.org/HOWTO/BackspaceDelete/ )
>
> Yup, going forward I will be putting it like that by checking for PS1.
> This was just such an edge case that it had completely slipped my mind that
> it would be the problem
>
> I don't know why or when this started occurring - I want to say around
> RHEL 7.5 or so - but vim reverted back to behavior I hadn't seen in
> almost 15 years
>
> I run Fedora on all my machines, but colleagues on Win10 also started to
> see it
>
> The only fix is the stty thing
>
> > Checking the current 4.3 code, I see that we fail if there is anything
> > at all on stderr.
> > Checking the git log to find out why we do that, I fail to find a
> > concrete reason - although, if you ask me, it might make sense.
> > Changing that is easy, but then it might make real errors, ones you
> > should actually notice, go unnoticed.
>
> > In 4.4 we do not have this code anymore, and use ansible instead for
> > host-deploy. Since 4.4 GA is expected soon (a few weeks?), and 4.3
> > will be EOLed soon after that, I do not see much point in investing in
> > this anyway.
>
> I agree. I can see the benefit in it being a little more decisive;
> however, the trade-off means the developer has to try to account for every
> case, and this, of course, was a complete edge case, so it likely would
> have been missed if parsing individual stderr output
>
> However, you can run SSH in a way that does not load the .bashrc
> environment - so I was thinking something more along those lines:
>
> [root@node01 ~]# ssh root@localhost uptime
> root@localhost's password:
> stty: standard input: Inappropriate ioctl for device
>  07:50:39 up 1 day,  8:05,  2 users,  load average: 0.08, 0.07, 0.06
>
> [root@node01 ~]# ssh -t root@localhost uptime
> root@localhost's password:
>  07:51:47 up 1 day,  8:07,  3 users,  load average: 0.06, 0.07, 0.06
> Connection to localhost closed.
>
> > Well, I wouldn't say "110%", but the fact that we try to do things
> > automatically, on remote machines, using tools that were mainly
> > intended for manual/interactive use, does mean that we are limited.
>
> I guess 110 is a bit of a biased over-exaggeration :-)
>
> The only reason I say that is because I love ovirt a lot, but every
> experience I have had with getting it installed has ended with a different
> failure each time, where there isn't an obvious reason why there was a
> problem, and it deters me from installations - I only do one if I have to
>
> The reason I have found, specifically now with this, with such a minor
> change to .bashrc, is that in an enterprise environment we have a standard
> base configuration that we work off - this involves changing SSH
> parameters, .bashrc environment changes, and security changes with sysctl
> and others
>
> Every time I have had a problem, it is apparent now, it was because of one
> of those changes. My suggestion was more or less to have it called out
> somewhere obvious that the oVirt installer/ansible generally wants a
> completely fresh install, and that making any changes to the system prior
> could have a negative impact. Example here: a simple convenience change to
> .bashrc that we have done all the time led to a 2-day time sink
>
> Going forward, my standard operating procedure for installs is going to be
> to install immediately after a bare minimal-ISO install and bypass any
> bootstrapping changes we do to our systems. Then, once it is up, I will
> apply our changes, as I have never had issues with ovirt after it is up -
> it is the install that is always a very rigid, time-consuming problem

[ovirt-users] stty: standard tty: inappropriate ioctl for device

2020-05-04 Thread Charles Kozler
I'd like to share this with the list because it's something that I changed
for convenience, in .bashrc, but it had a not-so-obvious rippling impact on
the ovirt self-hosted installer. I can imagine a few others doing this too,
and I'd rather save their future selves hours of Google time

Every install failed, no matter what, and always at the SSO step to revoke
the token (here:
https://github.com/machacekondra/ansible/blob/71547905fab67a876450f95c9ca714522ca1031c/lib/ansible/modules/cloud/ovirt/ovirt_auth.py#L262-L268)
and then reissue a new one - but the engine log yielded different
information. The SSO failure was a red herring

engine.log:2020-05-04 22:09:17,150-04 ERROR
[org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand]
(EE-ManagedThreadFactory-engine-Thread-1) [61038b68] Host installation
failed for host '672be551-9259-4d2d-823d-07f586b4e0f1', 'node1': Unexpected
error during execution: stty: standard input: Inappropriate ioctl for device

engine.log:2020-05-04 22:09:17,145-04 ERROR
[org.ovirt.engine.core.uutils.ssh.SSHDialog]
(EE-ManagedThreadFactory-engine-Thread-1) [61038b68] SSH error running
command root@node1:'umask 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d
-t ovirt-XX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1;
rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning=no-timestamp -C
"${MYTMP}" -x &&  "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine
DIALOG/customization=bool:True': RuntimeException: Unexpected error during
execution: stty: standard input: Inappropriate ioctl for device

I was on the hunt for this for the better part of 2 days or so (because who
else has anything to do during quarantine?), wracking my brain and trying
everything to figure out what was going on

Well, it was my own fault

# cat .bashrc | grep -i stty
stty erase ^?

With this set in the .bashrc of the node I was running the installer from
(via cockpit), the ovirt installer will fail to install

This was set for convenience to have backspace work in vim since at some
point it stopped working for me

Should I file this as a bug? The message generated is more of a warning
than a failure, but I do not know the internals of ovirt like that. Commands
still actually execute fine

ovirt-engine@192.168.222.84) # ssh 10.0.16.221 "uptime"
root@10.0.16.221's password:
stty: standard input: Inappropriate ioctl for device
 22:30:01 up 22:45,  2 users,  load average: 0.80, 0.62, 0.80

One thing I think should be called out in the docs, and called out very
loudly, is that the entire ovirt installer expects a 110% clean machine,
straight after OS install, provided only an IP and hostname. It's not that
obvious, but it is now
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/F4E4XI4WFKP2XTEM7M3FRCCAZLA2YOFE/


[ovirt-users] documentation change suggestion / question

2019-05-14 Thread Charles Kozler
Googling "ovirt windows agent" this is the first link that pops up:
http://www.ovirt.org/documentation/internal/guest-agent/guest-agent-for-windows/

This doc seems non-intuitive and over complicated

Specifically, the Red Hat documentation that is 4 links below is as simple
as "install this package and mount the iso":
https://community.redhat.com/blog/2015/05/how-to-install-and-use-ovirts-windows-guest-tools/

The former was updated as early as June of this year

1.) The Red Hat document worked for me, so I see no reason oVirt 4.x can't
use the same approach
2.) Why have separate areas of documentation that are so different from
each other? This has caused me issues in the past, where I found RH docs
that were much clearer and more concise
3.) Is there anyone who might want to merge the RH docs with the oVirt docs
where the RH docs are better?

Thanks so much for providing such a great product! I just feel that
sometimes the docs are a little more developer-centric, whereas the Red Hat
docs are more easily readable

--
IMPORTANT!
This message has been scanned for viruses and phishing links.
However, it is your responsibility to evaluate the links and attachments you 
choose to click.
If you are uncertain, we always try to help.
Greetings helpd...@actnet.se





___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/DSRACHYKKDXQDJKWEBKBOQT2WW7RZVHZ/


[ovirt-users] upgrading 4.0.1 to latest 4.2

2018-08-14 Thread Charles Kozler
Hello -

I am kicking around different ideas of how I could achieve this

I have been bitten, hard, by in-place upgrades before, so I am not really
wanting to do that unless it's a complete last resort... and even then I'm
iffy

Really, the only configuration I have is 1 VDSM hook and about 20 networks.
The physical hardware is not going anywhere. Out of my three nodes, I can
push all VMs to the other 2 nodes and do a fresh build on 1 node

I am hoping there may be a way for me to do a fresh install of 4.2, export
the engine config from 4.0.1, and then import it in some way - is this
possible? A quick Google doesn't yield much, and searches for 'export'
usually return storage-related results
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/TKRWHYJGUUC52SWTCGAX43TTM5DDJJIJ/


Re: [ovirt-users] Juniper vSRX Cluster on oVirt/RHEV

2018-03-23 Thread Charles Kozler
Truth be told, I don't really know. What I am going to be doing with it is
pretty much mostly some lab stuff, and getting working with VRFs a bit
There is a known limitation: the virtio backend driver uses interrupt mode
to receive packets, while vSRX uses DPDK -
https://dpdk.readthedocs.io/en/stable/nics/virtio.html - which in turn
creates a bottleneck into the guest VM. It is more ideal to use something
like SR-IOV instead and remove as many buffer layers as possible with PCI
passthrough

One easier way, too, is to use DPDK OVS. I know ovirt supports OVS more
natively in later versions, so I just didn't go after it, and I don't know
if there is any difference between regular OVS and DPDK OVS. I don't have a
huge requirement for insane throughput; I just need to get packets from
Amazon back to my lab and support overlapping subnets

This exercise was somewhat of a POC for me to see if it could be done. A
lot of Juniper's documentation does not take into account things such as
ovirt or proxmox or any Linux overlay on hypervisors, like it does for
vmware/vcenter - which is no fault of their own. They assume a flat KVM
host (or 2 if clustered), whereas stuff like ovirt can introduce variables
(e.g. no MAC spoofing)

On Fri, Mar 23, 2018 at 3:27 PM, FERNANDO FREDIANI <
fernando.fredi...@upx.com> wrote:

> Out of curiosity how much traffic can it handle running in these Virtual
> Machines on the top of reasonable hardware ?
>
> Fernando
>
> 2018-03-23 4:58 GMT-03:00 Joop <jvdw...@xs4all.nl>:
>
>> On 22-3-2018 10:17, Yaniv Kaul wrote:
>>
>>
>>
>> On Wed, Mar 21, 2018 at 10:37 PM, Charles Kozler < <ckozler...@gmail.com>
>> ckozler...@gmail.com> wrote:
>>
>>> Hi All -
>>>
>>> Recently did this and thought it would be worth documenting. I couldnt
>>> find any solid information on vsrx with kvm outside of flat KVM. This
>>> outlines some of the things I hit along the way and how to fix. This is my
>>> one small way of giving back to such an incredible open source tool
>>>
>>> https://ckozler.net/vsrx-cluster-on-ovirtrhev/
>>>
>>
>> Thanks for sharing!
>> Why didn't you just upload the qcow2 disk via the UI/API though?
>> There's quite a bit of manual work that I hope is not needed?
>>
>> @Work we're using Juniper too and out of curiosity I downloaded the qcow2
>> image and used the UI to upload it and add it to a VM. It just works :-)
>> oVirt++
>>
>> Joop
>>
>>
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Juniper vSRX Cluster on oVirt/RHEV

2018-03-23 Thread Charles Kozler
I hit a lot of errors when I tried to upload through the web UI. I tried
both a remote URI and a local file, and both failed for me. I can't
remember exactly what the errors were, but I recall that this is where I
spent a lot of time initially. I think it had something to do with the
ovirt-imageio function... something around that I couldn't get working
right. Also, doing it the way I did allowed me to quickly restart if I
needed to, by creating an alias around the dd command. I had to restart a
bunch, so it was useful. I did this all on 4.0.1.1-1.el7.centos
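The restart alias mentioned above is not spelled out in the thread; a hypothetical reconstruction might look like this. The source image path and target device are made up for illustration and were not given in the original message:

```shell
# Hypothetical reconstruction of the restart convenience alias described
# above: re-copy the vSRX image onto the VM's disk so a failed bring-up can
# be retried quickly. Source path and target device are made up for
# illustration; they were not given in the thread.
alias vsrx-restore='dd if=/root/junos-vsrx.qcow2 of=/dev/vg_vms/vsrx_disk bs=4M'
VSRX_ALIAS_SET=yes   # marker so the definition above can be checked
alias vsrx-restore   # print the definition back
```

The point of the alias is just that a multi-argument dd invocation only has to be typed once; each retry is then a single command.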

On Fri, Mar 23, 2018 at 3:58 AM, Joop <jvdw...@xs4all.nl> wrote:

> On 22-3-2018 10:17, Yaniv Kaul wrote:
>
>
>
> On Wed, Mar 21, 2018 at 10:37 PM, Charles Kozler < <ckozler...@gmail.com>
> ckozler...@gmail.com> wrote:
>
>> Hi All -
>>
>> Recently did this and thought it would be worth documenting. I couldnt
>> find any solid information on vsrx with kvm outside of flat KVM. This
>> outlines some of the things I hit along the way and how to fix. This is my
>> one small way of giving back to such an incredible open source tool
>>
>> https://ckozler.net/vsrx-cluster-on-ovirtrhev/
>>
>
> Thanks for sharing!
> Why didn't you just upload the qcow2 disk via the UI/API though?
> There's quite a bit of manual work that I hope is not needed?
>
> @Work we're using Juniper too and out of curiosity I downloaded the qcow2
> image and used the UI to upload it and add it to a VM. It just works :-)
> oVirt++
>
> Joop
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Juniper vSRX Cluster on oVirt/RHEV

2018-03-21 Thread Charles Kozler
Hi All -

Recently did this and thought it would be worth documenting. I couldnt find
any solid information on vsrx with kvm outside of flat KVM. This outlines
some of the things I hit along the way and how to fix. This is my one small
way of giving back to such an incredible open source tool

https://ckozler.net/vsrx-cluster-on-ovirtrhev/
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Why Node was rebooting?

2017-11-25 Thread Charles Kozler
Did you set up fencing?

I've also seen this behavior with a stressed CPU and the NMI watchdog in
the BIOS rebooting a server, but that was on FreeBSD. I have not seen it on
Linux
On Nov 25, 2017 2:07 PM, "Jonathan Baecker"  wrote:

> Hello community,
>
> yesterday evening one of our nodes was rebooted, but I have not found out
> why. The engine only reports this:
>
> 24.11.2017 22:01:43 Storage Pool Manager runs on Host onode-1 (Address:
> onode-1.worknet.lan).
> 24.11.2017 21:58:50 Failed to verify Host onode-1 power management.
> 24.11.2017 21:58:50 Status of host onode-1 was set to Up.
> 24.11.2017 21:58:41 Successfully refreshed the capabilities of host
> onode-1.
> 24.11.2017 21:58:37 VDSM onode-1 command GetCapabilitiesVDS failed: Client
> close
> 24.11.2017 21:58:37 VDSM onode-1 command HSMGetAllTasksStatusesVDS failed:
> Not SPM: ()
> 24.11.2017 21:58:22 Host onode-1 is rebooting.
> 24.11.2017 21:58:22 Kdump flow is not in progress on host onode-1.
> 24.11.2017 21:57:51 Host onode-1 is non responsive.
> 24.11.2017 21:57:51 VM playout was set to the Unknown status.
> 24.11.2017 21:57:51 VM gogs was set to the Unknown status.
> 24.11.2017 21:57:51 VM Windows2008 was set to the Unknown status.
> [...]
>
> There is no crash report, and no relevant errors in dmesg.
>
> Does the engine send a reboot command to the node when it gets no
> response? Is there any other way to find out why the node was rebooting?
> The node is on a UPS and all the other servers were running well...
>
> At the time the reboot happened, I had a bigger video compression
> job running in one of the VMs, so maybe the CPUs got a bit stressed, but
> they are not over-committed.
>
>
> Regards
>
> Jonathan
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>


Re: [ovirt-users] HEVM root reset?

2017-10-28 Thread Charles Kozler
You might be better off rebuilding. It shouldn't be that hard, albeit
slightly time consuming. Is this a production environment or is it a
lab/test area?

Two possible ways for this, I think...

1.) Not sure if HE will use it, but if you right-click on the HostedEngine
VM in the web UI, click Edit, then "Initial Run", you may be able to set a
new root password there and then reboot it. oVirt HE should pick up the new
settings; however, I have only used this on Glance VM templates and had
moderate success there, so I'm not sure if this will work for HE
or
2.) And be very cautious, as this could be a complete hack: shut down the
hosted engine and navigate to its disk storage. Take a backup of the image
with qemu-img or rsync --sparse. You can then mount the HE raw disk with a
loopback device, activate the LVs, and then remove the root password from
there. Not sure if RHEL 7, systemd, or SELinux will block this on reboot;
it'd be my last-ditch resort, and only after rebuilding the cluster is
out of the question
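The loopback/LVM route in option 2 can be sketched roughly like this. The device paths are placeholders, the live mount commands are shown only as comments, and the actual password edit is demonstrated on a scratch copy of a shadow-style file rather than a real /etc/shadow:

```shell
# Live steps (run on a host; all paths here are hypothetical):
#   hosted-engine --vm-shutdown
#   qemu-img convert -O raw /path/to/he_disk he_backup.raw   # or rsync --sparse
#   losetup -fP --show he_backup.raw     # exposes partitions as /dev/loopXpN
#   vgscan && vgchange -ay               # activate the engine VM's LVs
#   mount /dev/mapper/<vg>-root /mnt

# The edit itself, demonstrated on a scratch shadow-style file:
cat > shadow.demo <<'EOF'
root:$6$salt$somehash:18000:0:99999:7:::
EOF
sed -i 's/^root:[^:]*:/root::/' shadow.demo   # blank root's password field
grep '^root::' shadow.demo                    # confirm the field is empty

# Then: umount /mnt, vgchange -an, losetup -d /dev/loopX, boot the engine.
```

If SELinux is enforcing, the edited file may end up with a stale context; touching `.autorelabel` in the mounted root before unmounting forces a relabel on the next boot.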

On Sat, Oct 28, 2017 at 5:25 PM, Chas Ecomm  wrote:

> TIA for any guidance you can give me here, I’m in a bit of a pickle and
> have spent many hours reading many posts and blogs that don’t seem to get
> me where I need to be, and despite my gut feeling that someone has had to
> have dealt with this somewhere else, my google-fu is failing me in finding
> anyone who has dealt with this particular problem.
>
>
>
> My issue:
>
>
>
> I’ve inherited a 4.1 hosted-engine setup.  This is for a non-profit, so
> the previous consultant did what she could to save capital expense by using
> older gear and oVirt rather than VMware/Hyper-V and newer equipment.  For
> the most part it has worked quite well as I understand it.  This particular
> setup has gone from 3.5 with a standalone engine to 4.1 with a hosted
> engine, in case that matters.
>
>
>
> The VMs hosted on the 2 clusters associated with this engine are currently
> working fine, but I am trying to get into the hosted-engine VM, and either
> there was a problem with the root password during setup and the
> hosted-engine-setup script didn’t catch it or I’ve been given a bad
> password.  The previous admin didn’t setup any alternative users, which is
> a major no-no in my book, and so I’m trying to do that – but I can’t login
> to the VM.  I can log into the portal and manage the hosts, storage, VMs,
> etc., just not the HEVM.  As I understand it, even if I could set aside my
> need for alternate users, when it comes time to upgrade I will need access
> to the HEVM, so I have to solve this at some point.
>
>
>
> If this were a physical machine, I’d reboot in single user mode and work
> at things from there, but it’s not, and I’ve not found a good guide to get
> to the engine console to put it into single user mode.  I’ve found lots
> (and lots, and lots) of links on the oVirt docs site regarding hosted
> engine setups that simply don’t exist as pages anymore and others that
> don’t address this issue, as well as a couple of links on the RedHat site
> that made me think I could connect via ssh to the host running the VM, and
> do some sort of X forwarding, but I haven’t come anywhere close to success
> with that, and since I can’t log into the VM, I’m not sure how that would
> work anyway.  I’ve always struggled with X forwarding, too, so that doesn’t
> help, I’m sure.
>
>
>
> I read a few posts from what look like the early days of the hosted-engine
> OVA implying you could launch a console from the management portal and
> maybe try to reboot and set single user mode there, but they all
> dead-ended.  Also, I may be confused on how the hosted engine works (I find
> this much more confusing than either VMware or Hyper-V), but if I’m
> connected to the console via the oVirt management portal and I reboot the
> VM to try and get into single user mode, wouldn’t I lose my connection and
> still not be able to get into single user mode?
>
>
>
> I can connect via hosted-engine –console, but that asks me for the root
> password, which of course I don’t have.
>
>
>
> Am I just doomed to have to rebuild this whole set of servers from scratch
> or is there some way I could either re-run hosted-engine –deploy so I can
> set the root password and not lose my current config; or alternatively is
> there a way to get the VM into single user mode and accessible so I can use
> normal Linux practices for a lost root password?
>
>
>
> Thanks for listening and for any help you can possibly give.  I’m sure
> there’s some simple thing that I’ve overlooked, but after hours and hours
> of trying to solve this one on my own, I have to admit the need for help.
>
>
>
> Thank you
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>

[ovirt-users] 4.1 to 4.2 upgrade

2017-10-14 Thread Charles Kozler
hello,

What would be the process to go from current 4.1 to 4.2-pre?

I have been demoing 4.2 in my own lab environment, and the web UI, for my
user base, is far more intuitive than the previous one

My 4.1 is relatively new so there isn't much config done on it. That said,
I'm wondering what it would take to update an oVirt HE with 3 nodes +
GlusterFS (not managed via oVirt) to 4.2-pre, and what it would look like to
move back from a -pre channel to mainline

Alternatively, I could wait for 4.2 to go main, do you have an estimated
date for that?

Thanks!


Re: [ovirt-users] Help - Host unresponsive!

2017-10-07 Thread Charles Kozler
Most answers reside in agent.log under /var/log/ovirt-hosted-engine-ha/. Tail
that for a bit and see what pops up. You can also review the vdsm log under
/var/log/vdsm/

I've found that looking at these two logs and just trying a few things
yields results
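A quick triage loop over those two logs might look like the following. The grep pattern is only a suggestion, demonstrated here on a couple of sample agent.log-style lines so the filter itself can be seen working:

```shell
# Sample lines standing in for /var/log/ovirt-hosted-engine-ha/agent.log
cat > agent.sample <<'EOF'
MainThread::INFO::2017-10-07 ... Engine vm running on localhost
MainThread::ERROR::2017-10-07 ... Failed to start necessary monitors
MainThread::INFO::2017-10-07 ... Current state EngineDown (score: 0)
EOF
# Pull out errors plus the HA score lines, which show why a host won't start
grep -E 'ERROR|score' agent.sample

# On the host itself, the live equivalents would be:
#   tail -f /var/log/ovirt-hosted-engine-ha/agent.log | grep -E 'ERROR|score'
#   tail -f /var/log/vdsm/vdsm.log
```

The score lines are often the fastest way to see whether the agent considers the host eligible to run the engine at all.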

On Sat, Oct 7, 2017 at 5:18 PM, Wesley Stewart  wrote:

> I have a single server that is running ovirt as a host and a storage
> domain.  I ran an update when the GUI told me that there was one.
>
> I shutdown all of the VM's, placed the host in maintenance mode and did
> the update.  Upon rebooting the server, I was able to pull my VM's online
> without any issue, but then, all of a sudden, vm's stopped responding as
> well as the host.
>
> The host is showing as "Down" and I cannot "Activate" it.  Also
> "Maintenance Mode" is greyed out.
>
> Storage pools are "down" but eventually come back online, but the host
> still will not come online.
>
> Every few minutes, the host goes to "UP" right before failing and going
> back "Down".  To make it more confusing, like I mentioned before, this is
> all in one box, so it isn't like their is a network issue between the
> storage and the host boxes or anything.
>
> Currently trying to reinstall the host and see if that helps, but I would
> very much appreciate any sort of guidance, support or ideas!
>
>
> Fencing failed on Storage Pool Manager OVIRT-Host for Data Center
> OVIRT-Datacenter. Setting status to Non-Operational.
> Host OVIRT-Host failed to recover.
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>


Re: [ovirt-users] Question about cold start

2017-10-04 Thread Charles Kozler
I believe you would accomplish this by setting a VM to be highly available
(like the engine). Then engine makes sure this VM is up on at least one
node through lease agreements (IIRC). In either case, I think this is what
you want

On Wed, Oct 4, 2017 at 10:25 AM, Chris Adams  wrote:

> I have an oVirt cluster that was hard shutdown last night (fire is bad,
> and firemen killed the generators for their safety).  When it came back
> up, it did not start any VMs other than the hosted engine.
>
> Is that expected?  I know this is not a normal use case, but is there a
> way to set VMs to start on cluster boot?
>
> --
> Chris Adams 
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>


Re: [ovirt-users] Hosted engine setup question

2017-10-02 Thread Charles Kozler
I did a 3.6 to 4.1 move like this. I moved all of my VMs to a new storage
domain (the other was hyperconverged Gluster) and then took a full outage:
shut down all of my VMs, detached the domain from 3.6, and imported it on
4.1. I had no issues other than expected MAC address changes, but I think
you can manually override those in the engine somewhere

If you are worried, do it with one VM first. Create a new storage domain that
both clusters can "see", move one VM to that domain on 3.6, detach it, and
import it into 4.1. Bring the VM up

If they are Linux VMs older than systemd, still using sysvinit, you will hit
issues where the MAC address changes and udev moves the interface to eth#,
where # is the next available NIC number in your VM

On Mon, Oct 2, 2017 at 12:54 PM, Demeter Tibor  wrote:

> Hi,
> Can anyone answer my questions?
>
> Thanks in advance,
> R,
>
> Tibor
>
> - 2017. szept.. 19., 8:31, Demeter Tibor  írta:
>
>
> - I have a productive ovirt cluster based on 3.5 series. This using a
> shared nfs storage.  Is it possible to migrate VMs from 3.5 to 4.1 with
> detach shared storage from the old cluster and attach it to the new
> cluster?
> - If yes what will happend with the VM properies? For example mac
> addresses, limits, etc. Those will be migrated or not?
>
> Thanks in advance,
> Regard
>
>
> Tibor
>
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>


Re: [ovirt-users] deprecating export domain?

2017-10-02 Thread Charles Kozler
Thank you for clearing this up for me, everyone. My concern was that something
like the export domain wasn't going to exist anymore and that it was just
going to be deprecated with no alternative. Glad to hear all the news about
the backup SD

On Mon, Oct 2, 2017 at 8:31 AM, Pavel Gashev  wrote:

> Maor,
>
> Could you please clarify, what would be the process of making backup of a
> running VM to an existing backup storage domain?
>
> I’m asking because it looks like the process is going to be quite the same:
> 1. Clone VM from a snapshot
> 2. Move the cloned VM to a backup storage domain
>
> An ability of choosing destination storage for cloned VMs would increase
> backup efficiency. On the other hand, an ability of exporting VM from a
> snapshot would increase the efficiency in the same way even without
> creating new entity.
>
> Indeed, Backup SDs would increase efficiency of disaster recovery. But the
> same would be achieved by converting Export SDs to Data SDs using a small
> CLI utility.
>
>
> On 01/10/2017, 15:32, "users-boun...@ovirt.org on behalf of Maor Lipchuk"
>  wrote:
>
> On Sun, Oct 1, 2017 at 2:50 PM, Nir Soffer  wrote:
> >
> > Attaching and detaching data domain was not designed for backing up
> vms.
> > How would you use it for backup?
> >
> > How do you ensure that a backup clone of a vm is not started by
> mistake,
> > changing the backup contents?
>
> That is a good question.
> We recently introduced a new feature called "backup storage domain"
> which you can mark the storage domain as backup storage domain.
> That can guarantee that no VMs will run with disks/leases reside on
> the storage domain.
> The feature should already exist in oVirt 4.2 (despite a bug that
> should be handled with this patch https://gerrit.ovirt.org/#/c/81290/)
> You can find more information on this here:
>   https://github.com/shubham0d/ovirt-site/blob/
> 41dcb0f1791d90d1ae0ac43cd34a399cfedf54d8/source/develop/
> release-management/features/storage/backup-storage-domain.html.md
>
> Basically the OVF that is being saved in the export domain should be
> similar to the same one that is being saved in the OVF_STORE disk in
> the storage domain.
> If the user manages replication on that storage domain it can be
> re-used for backup purposes by importing it to a setup.
> Actually it is much more efficient to use a data storage domain than
> to use the copy operation to/from the export storage domain.
>
> >
> > Nir
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>


[ovirt-users] deprecating export domain?

2017-09-30 Thread Charles Kozler
Hello,

I recently read on this list, from a Red Hat member, that the export domain
is either being deprecated or being considered for deprecation

To that end, can you share details? Can you share any notes/postings/BZs
that document this? I would imagine something like this would be discussed
with a larger audience

This seems like a fairly significant change to make, and I am curious where
it is scheduled. Currently, a lot of my backups rely explicitly on an export
domain for online snapshots, so I'd like to plan accordingly

Thanks!


Re: [ovirt-users] Hyper converged network setup

2017-09-26 Thread Charles Kozler
Bharat -

Can you supply the following

grep -i processor /proc/cpuinfo
free -m
gluster volume info

From each node. Maybe post to a pastebin or something so it's easier to read

On Tue, Sep 26, 2017 at 3:54 AM, Tailor, Bharat <
bha...@synergysystemsindia.com> wrote:

> Hi,
>
> I've reinstalled Centos minimal on 3 hosts.
> Configured replica 3 gluster storage for engine & data.
> Configured dns entry in /etc/hosts for all three host and ovirt engine
> also.
> *yum install ovirt-engine-appliance*
> *hosted-engine --deploy*
>
> Choose glusterfs storage for engine. Installation failed at last step
> "This system is not reliable".
>
> My host have low memory & processor so don't know it might be the issue.
>
>
> Please help.
>
>
> Regrards
> Bharat Kumar
>
> G15- Vinayak Nagar complex,Opp.Maa Satiya, ayad
> Udaipur (Raj.)
> 313001
> Mob: +91-9950-9960-25
>
>
>
>
>
> On Fri, Sep 15, 2017 at 12:51 PM, Tailor, Bharat <
> bha...@synergysystemsindia.com> wrote:
>
>> Hi,
>>
>> Any help from you guys would be appreciated.
>>
>> Regrards
>> Bharat Kumar
>>
>> G15- Vinayak Nagar complex,Opp.Maa Satiya, ayad
>> Udaipur (Raj.)
>> 313001
>> Mob: +91-9950-9960-25
>>
>>
>>
>>
>>
>> On Thu, Sep 14, 2017 at 7:02 PM, Tailor, Bharat <
>> bha...@synergysystemsindia.com> wrote:
>>
>>> Hi,
>>>
>>> I've reinstall Centos7 Minimal on all three hosts. Installed Ovirt self
>>> hosted ova 4.1 on test1.localdomain.
>>> After installation completed, I got an error like engine vm is
>>> unreachable. I am able to ping engine from both IP and FQDN but unable to
>>> access it in browser. When I check engine VM status I got an error "Failed
>>> to connect to broker".
>>> I've enclosed engine-vm config snap and Error snap for more details.
>>> Kind help to resolve it.
>>>
>>>
>>> Regrards
>>> Bharat Kumar
>>>
>>> G15- Vinayak Nagar complex,Opp.Maa Satiya, ayad
>>> Udaipur (Raj.)
>>> 313001
>>> Mob: +91-9950-9960-25
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Sep 13, 2017 at 5:14 PM, Tailor, Bharat <
>>> bha...@synergysystemsindia.com> wrote:
>>>
>>>> I can't. I've destroyed that VM and clean all files. Now I am trying to
>>>> reinstall new engine VM.
>>>>
>>>> Regrards
>>>> Bharat Kumar
>>>>
>>>> G15- Vinayak Nagar complex,Opp.Maa Satiya, ayad
>>>> Udaipur (Raj.)
>>>> 313001
>>>> Mob: +91-9950-9960-25
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Sep 13, 2017 at 5:12 PM, Donny Davis <do...@fortnebula.com>
>>>> wrote:
>>>>
>>>>> Can you ping the VM when it comes up?
>>>>>
>>>>> On Wed, Sep 13, 2017 at 7:40 AM, Tailor, Bharat <
>>>>> bha...@synergysystemsindia.com> wrote:
>>>>>
>>>>>> I am not using any DNS server. I have made entries in /etc/hosts for
>>>>>> all Nodes and for engine VM also.
>>>>>>
>>>>>> Regrards
>>>>>> Bharat Kumar
>>>>>>
>>>>>> G15- Vinayak Nagar complex,Opp.Maa Satiya, ayad
>>>>>> Udaipur (Raj.)
>>>>>> 313001
>>>>>> Mob: +91-9950-9960-25
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Sep 13, 2017 at 5:02 PM, Donny Davis <do...@fortnebula.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Bharat,
>>>>>>>
>>>>>>> Are you using DNS for host names or /etc/hosts
>>>>>>>
>>>>>>> I personally place the engines hostname with ip address in
>>>>>>> /etc/hosts on all the hypervisors in case my DNS services go down.
>>>>>>> I also put the hypervisors in /etc/hosts too
>>>>>>>
>>>>>>> Hope this helps.
>>>>>>>
>>>>>>> On Wed, Sep 13, 2017 at 6:04 AM, Yaniv Kaul <yk...@redhat.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Sep 13, 2017 at 8:42 AM, Tailor, Bharat <
>>>>>&

Re: [ovirt-users] trunk port for vm guest

2017-09-21 Thread Charles Kozler
So I tried to set this up. I configured an untagged network and attached it
to all of my hosts. I set up a VM and correctly configured a trunk port
inside it, but no traffic is passed

Can anyone assist?

On Wed, Sep 20, 2017 at 9:09 AM, Charles Kozler <ckozler...@gmail.com>
wrote:

> Hello,
>
> I have seen mixed results for this search on this list so I'd like to
> clear it up
>
> I have a VM that I need to configure with a trunk port so that all of my
> VLANs can be configured on it but not as separate NICs (eth0, eth1, eth2,
> etc)
>
> I have seen on this list about a year ago someone said to setup a network
> and leave it untagged and ovirt will pass all tagged and untagged packets
> there. Then attach this to the guest VM and configure the trunk port inside
> the VM (eth0:)
>
> However, I saw also about 6 months ago someone suggest something like VLAN
> tag 4095 so wondering if the support came in for this...or I should use
> untagged VM network
>
> So - how can I configure a trunk port for a guest VM?
>
> Thanks
>


Re: [ovirt-users] Where can we get emergency support for oVirt ?

2017-09-21 Thread Charles Kozler
https://www.ovirt.org/community/user-stories/users-and-providers/

BobCares is based out of India, so be aware of that. They seemed responsive
and helpful and eager for business, and generally knowledgeable in initial
talks

I'd recommend you check out CornerStone. They are US-based (I think NC or
SC) and Gary is a great guy (I am not affiliated with them). The link is
broken on that page, so use https://cornerstonets.net/

On Thu, Sep 21, 2017 at 4:33 PM, netad  wrote:

> We have been unable to find anyone who can help fix an urgent problem.
>
> Please advise.
>
> Thank you/
>
>
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>


[ovirt-users] trunk port for vm guest

2017-09-20 Thread Charles Kozler
Hello,

I have seen mixed results searching for this on this list, so I'd like to
clear it up

I have a VM that I need to configure with a trunk port so that all of my
VLANs can be configured on it but not as separate NICs (eth0, eth1, eth2,
etc)

I saw on this list, about a year ago, someone say to set up a network and
leave it untagged, and oVirt will pass all tagged and untagged packets
there; then attach it to the guest VM and configure the trunk port inside
the VM (eth0:)

However, I also saw, about 6 months ago, someone suggest something like VLAN
tag 4095, so I'm wondering if support came in for that, or whether I should
use an untagged VM network

So - how can I configure a trunk port for a guest VM?

Thanks
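For reference, assuming the untagged pass-through network behaves as described above, the trunk is then realized inside the guest with VLAN subinterfaces. A CentOS 7-style ifcfg sketch, where the interface name, VLAN ID, and addressing are all just examples:

```
# /etc/sysconfig/network-scripts/ifcfg-eth0.100  (hypothetical VLAN 100)
DEVICE=eth0.100
VLAN=yes
BOOTPROTO=none
ONBOOT=yes
IPADDR=10.0.100.10
PREFIX=24
```

The runtime equivalent, for testing a tag without writing config, is `ip link add link eth0 name eth0.100 type vlan id 100` followed by `ip link set eth0.100 up`; one such subinterface per VLAN carried on the trunk.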


Re: [ovirt-users] engine randomly updated 1 package on all my hosts overnight

2017-09-17 Thread Charles Kozler
Yedidyah - yes, an update would be best, of course. I've had this HIDS
running for a little over a year and never saw this before, so I was a
little wary

On Sun, Sep 17, 2017 at 3:00 AM, Yedidyah Bar David <d...@redhat.com> wrote:

> On Fri, Sep 15, 2017 at 5:12 PM, Charles Kozler <ckozler...@gmail.com>
> wrote:
> > Hello -
> >
> > Can anyone just briefly tell me if this is expected behavior or not?
> >
> > I know you can tell the engine to update hosts, but nobody was using the
> > engine and I see the engine logging in and the yum command being run so
> I am
> > curious if this is expected or not?
>
> It is, unless you have otopi-1.6.2 or later:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1405838
>
> >
> > On Thu, Sep 14, 2017 at 10:54 AM, Charles Kozler <ckozler...@gmail.com>
> > wrote:
> >>
> >> I received an alert from OSSEC HIDS that a package was installed at
> 00:59.
> >> Nobody uses this infrastructure but me
> >>
> >> Upon investigation I find this
> >>
> >> Sep 14 00:59:18 ovirthost1 sshd[93263]: Accepted publickey for root from
> >> 10.0.16.50 port 50197 ssh2: RSA
> >> 1c:fc:0d:b8:40:2c:bf:87:f7:8f:b2:52:0b:c4:f6:4d
> >> Sep 14 00:59:18 ovirthost1 sshd[93263]: pam_unix(sshd:session): session
> >> opened for user root by (uid=0)
> >> Sep 14 00:59:46 ovirthost1 sshd[93263]: pam_unix(sshd:session): session
> >> closed for user root
> >>
> >> 10.0.16.50 is my ovirt engine
> >>
> >> And the yum log
> >>
> >> Sep 14 00:59:28 Updated: iproute-3.10.0-87.el7.x86_64
> >>
> >> However, what is baffling to me is that this is a cluster I setup about
> 9
> >> months ago and have not updated at all (its a testing env for VM
> systems)
> >>
> >> Why would ovirt seemingly randomly update and install a package? I know
> >> the engine checks for updates on hosts but this is the first time in my
> time
> >> using ovirt that ovirt instructed a host to install a package. This
> occurred
> >> on all of my ovirt nodes in this infrastructure (3)
>
> Probably the reason this didn't happen before is a mere coincidence -
> there are not many updates to 'iproute', or you did update it manually
> in other cases, or something like that.
>
> >>
> >> ovirt Version 4.0.1.1-1.el7.centos
>
> Most likely you have otopi-1.5.2, which does not have above bug fixed.
>
> You might consider upgrading to oVirt 4.1.
>
> Best,
> --
> Didi
>


Re: [ovirt-users] engine randomly updated 1 package on all my hosts overnight

2017-09-17 Thread Charles Kozler
Thanks for confirming

On Sun, Sep 17, 2017 at 11:05 AM, Christopher Cox <c...@endlessnow.com>
wrote:

> On 09/17/2017 02:00 AM, Yedidyah Bar David wrote:
>
>> On Fri, Sep 15, 2017 at 5:12 PM, Charles Kozler <ckozler...@gmail.com>
>> wrote:
>>
>>> Hello -
>>>
>>> Can anyone just briefly tell me if this is expected behavior or not?
>>>
>>> I know you can tell the engine to update hosts, but nobody was using the
>>> engine and I see the engine logging in and the yum command being run so
>>> I am
>>> curious if this is expected or not?
>>>
>>
> My iproute mysteriously updated as well.
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>


Re: [ovirt-users] centos 7 kernel 693 extremely slow boot on ovirt version 4.1.5.2-1.el7.centos

2017-09-15 Thread Charles Kozler
Removing all cloud-* services from boot seems to have fixed this. Also, for
some reason, after doing a yum update all of my network configurations in
the VM were lost. I guess I missed the supporting documentation on Glance
images and what to expect from using them? Having removed those services
from boot I still get a brief hang on qxl, but it's about 5-10 seconds and
then it goes to login

Where can I follow up on reporting issues with Glance images?

On Fri, Sep 15, 2017 at 11:57 PM, Charles Kozler <ckozler...@gmail.com>
wrote:

> I have tried this multiple times
>
> I imported glance image latest centos 7 as a template. Made a VM from it
> and rebooted it multiple times testing some apps. Reboot was always 30
> seconds or less
>
> I did a full yum update and then reboot
>
> on reboot I am hung up at "[drm] initalized qxl 0.1.0 20120117 for
> 000:00:02.0 on minor 0" for between 3-5 minutes (varies, but usually
> averages there)
>
> https://i.imgur.com/ZAhf4je.png for reference
>
> QXL I know is the console. I tried setting it to VGA and it hung on about
> the same part except it doesnt reference QXL, it references another device
> driver. I am guessing there is some issue with the display drivers?
>
> Not really sure what to say about this...is this a KVM issue? A centos 7
> kernel issue? Or is ovirt booting the VM via KVM with a parameter that is
> new-ish and causing some conflict with new kernel? or did something update
> in the image that I wasnt aware that is causing slow boot now?
>
> Host(s) kernel is 3.10.0-514.26.2.el7.x86_64
>


[ovirt-users] centos 7 kernel 693 extremely slow boot on ovirt version 4.1.5.2-1.el7.centos

2017-09-15 Thread Charles Kozler
I have tried this multiple times

I imported glance image latest centos 7 as a template. Made a VM from it
and rebooted it multiple times testing some apps. Reboot was always 30
seconds or less

I did a full yum update and then reboot

on reboot I am hung up at "[drm] initalized qxl 0.1.0 20120117 for
000:00:02.0 on minor 0" for between 3-5 minutes (varies, but usually
averages there)

https://i.imgur.com/ZAhf4je.png for reference

QXL, I know, is the console driver. I tried setting it to VGA and it hung at
about the same point, except it doesn't reference QXL, it references another
device driver. I am guessing there is some issue with the display drivers?

I'm not really sure what to say about this. Is this a KVM issue? A CentOS 7
kernel issue? Or is oVirt booting the VM via KVM with a parameter that is
new-ish and causing some conflict with the new kernel? Or did something
update in the image that I wasn't aware of that is causing the slow boot?

Host(s) kernel is 3.10.0-514.26.2.el7.x86_64


Re: [ovirt-users] engine randomly updated 1 package on all my hosts overnight

2017-09-15 Thread Charles Kozler
Hello -

Can anyone just briefly tell me if this is expected behavior or not?

I know you can tell the engine to update hosts, but nobody was using the
engine and I see the engine logging in and the yum command being run so I
am curious if this is expected or not?

On Thu, Sep 14, 2017 at 10:54 AM, Charles Kozler <ckozler...@gmail.com>
wrote:

> I received an alert from OSSEC HIDS that a package was installed at 00:59.
> Nobody uses this infrastructure but me
>
> Upon investigation I find this
>
> Sep 14 00:59:18 ovirthost1 sshd[93263]: Accepted publickey for root from
> 10.0.16.50 port 50197 ssh2: RSA 1c:fc:0d:b8:40:2c:bf:87:f7:8f:
> b2:52:0b:c4:f6:4d
> Sep 14 00:59:18 ovirthost1 sshd[93263]: pam_unix(sshd:session): session
> opened for user root by (uid=0)
> Sep 14 00:59:46 ovirthost1 sshd[93263]: pam_unix(sshd:session): session
> closed for user root
>
> 10.0.16.50 is my ovirt engine
>
> And the yum log
>
> Sep 14 00:59:28 Updated: iproute-3.10.0-87.el7.x86_64
>
> However, what is baffling to me is that this is a cluster I setup about 9
> months ago and have not updated at all (its a testing env for VM systems)
>
> Why would ovirt seemingly randomly update and install a package? I know
> the engine checks for updates on hosts but this is the first time in my
> time using ovirt that ovirt instructed a host to install a package. This
> occurred on all of my ovirt nodes in this infrastructure (3)
>
> ovirt Version 4.0.1.1-1.el7.centos
>


Re: [ovirt-users] Different link speeds in LACP LAG?

2017-09-14 Thread Charles Kozler
Sorry, I meant "remove the static configuration of 1G full duplex and set
the 10G back to auto-negotiation when ready"

On Thu, Sep 14, 2017 at 6:19 PM, Charles Kozler <ckozler...@gmail.com>
wrote:

> You could, I believe, turn off auto-negotiation and set it 1G full duplex
> on both sides and then add the links in and then remove the old 1G's when
> ready then remove auto-negotiation from the 10G
>
> But, it would ultimately be much, much easier to create a new LAG and then
> take an outage on ovirt and move everything to the new one
>
> On Thu, Sep 14, 2017 at 6:51 AM, Misak Khachatryan <kmi...@gmail.com>
> wrote:
>
>> Hi,
>>
>> JunOS supports LAG over links with different speeds, so if you have MX
>> series routers in-between, you can try to accomplish that.
>> But it's always better to be on safe side, it's very risky to use in
>> production, IMHO.
>>
>> Best regards,
>> Misak Khachatryan
>>
>>
>> On Thu, Sep 14, 2017 at 2:41 PM, Yaniv Kaul <yk...@redhat.com> wrote:
>> >
>> >
>> > On Thu, Sep 14, 2017 at 1:21 AM, Chris Adams <c...@cmadams.net> wrote:
>> >>
>> >> I have a small oVirt setup for one customer, with two servers each
>> >> connected to a two-switch stack with 1G links.  Now the customer would
>> >> like to upgrade the server links to 10G.  My question is this: can I
>> add
>> >> a 10G NIC and do this with minimal "fuss" by just adding the 10G links
>> >> to the same LAG, then removing the 1G links?  I would have the host in
>> >> maintenance mode no matter what.
>> >
>> >
>> > I highly doubt that's feasible. They usually are in the same speeds...
>> > Y.
>> >
>> >>
>> >>
>> >> I haven't checked the switch to see if it'll support that yet, figured
>> >> I'd start on the oVirt side.
>> >>
>> >> --
>> >> Chris Adams <c...@cmadams.net>
>> >> ___
>> >> Users mailing list
>> >> Users@ovirt.org
>> >> http://lists.ovirt.org/mailman/listinfo/users
>> >
>> >
>> >
>> > ___
>> > Users mailing list
>> > Users@ovirt.org
>> > http://lists.ovirt.org/mailman/listinfo/users
>> >
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>
>


Re: [ovirt-users] Different link speeds in LACP LAG?

2017-09-14 Thread Charles Kozler
You could, I believe, turn off auto-negotiation and set 1G full duplex on
both sides, then add the 10G links in, remove the old 1G links when ready,
and finally re-enable auto-negotiation on the 10G

But it would ultimately be much, much easier to create a new LAG, then take
an outage on oVirt and move everything to the new one

On Thu, Sep 14, 2017 at 6:51 AM, Misak Khachatryan  wrote:

> Hi,
>
> JunOS supports LAG over links with different speeds, so if you have MX
> series routers in-between, you can try to accomplish that.
> But it's always better to be on safe side, it's very risky to use in
> production, IMHO.
>
> Best regards,
> Misak Khachatryan
>
>
> On Thu, Sep 14, 2017 at 2:41 PM, Yaniv Kaul  wrote:
> >
> >
> > On Thu, Sep 14, 2017 at 1:21 AM, Chris Adams  wrote:
> >>
> >> I have a small oVirt setup for one customer, with two servers each
> >> connected to a two-switch stack with 1G links.  Now the customer would
> >> like to upgrade the server links to 10G.  My question is this: can I add
> >> a 10G NIC and do this with minimal "fuss" by just adding the 10G links
> >> to the same LAG, then removing the 1G links?  I would have the host in
> >> maintenance mode no matter what.
> >
> >
> > I highly doubt that's feasible. They usually are in the same speeds...
> > Y.
> >
> >>
> >>
> >> I haven't checked the switch to see if it'll support that yet, figured
> >> I'd start on the oVirt side.
> >>
> >> --
> >> Chris Adams 


[ovirt-users] engine randomly updated 1 package on all my hosts overnight

2017-09-14 Thread Charles Kozler
I received an alert from OSSEC HIDS that a package was installed at 00:59.
Nobody uses this infrastructure but me

Upon investigation I find this

Sep 14 00:59:18 ovirthost1 sshd[93263]: Accepted publickey for root from
10.0.16.50 port 50197 ssh2: RSA
1c:fc:0d:b8:40:2c:bf:87:f7:8f:b2:52:0b:c4:f6:4d
Sep 14 00:59:18 ovirthost1 sshd[93263]: pam_unix(sshd:session): session
opened for user root by (uid=0)
Sep 14 00:59:46 ovirthost1 sshd[93263]: pam_unix(sshd:session): session
closed for user root

10.0.16.50 is my ovirt engine

And the yum log

Sep 14 00:59:28 Updated: iproute-3.10.0-87.el7.x86_64

However, what is baffling to me is that this is a cluster I set up about 9
months ago and have not updated at all (it's a testing env for VM systems)

Why would oVirt seemingly randomly update and install a package? I know the
engine checks for updates on hosts, but this is the first time in my time
using oVirt that it instructed a host to install a package. This
occurred on all of my oVirt nodes in this infrastructure (3)

ovirt Version 4.0.1.1-1.el7.centos
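
For anyone hitting something similar: yum records every transaction, so you
can see exactly which command (and which login) pulled the package in. A
quick sketch using the package from the log above (the transaction ID is a
placeholder you get from the first command):

```shell
yum history list iproute   # find the transaction that touched iproute
yum history info <ID>      # shows the command line, user, and packages altered
grep 'iproute' /var/log/yum.log
```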


[ovirt-users] why force bond* naming?

2017-09-13 Thread Charles Kozler
Hello -

I had set up my server somewhat before installing oVirt. I left em1 alone
for the ovirtmgmt, but I had bonded two NICs on a separate card and named the
bonds storage0 and storage1 accordingly.

Everything set up fine, but when I went to add more networks for em3+em4 in
LACP named bond0, the web UI would not let me proceed because of my two NICs
named storage0 and storage1. It didn't say directly it was those, but it said
it needed "bond" in the name followed by a number. I shut everything down,
renamed them bond100 and bond101, and all was well

However, I can't stop thinking that I was not trying to set those NICs
(storage0/1) up in oVirt, so why would the oVirt UI care about NICs in the
server that it wasn't even responsible for managing and that I had never
modified from the UI?

I get forcing it when you're adding through the UI, but making the UI force
the user to use bond* for all of their NICs doesn't seem right

Thoughts?
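
In case it helps anyone doing the same rename, this is roughly what the
storage0 -> bond100 change looks like with CentOS 7 network scripts; the
slave interface names (em5/em6) and file contents here are assumptions for
illustration, not taken from the original setup:

```shell
cd /etc/sysconfig/network-scripts
mv ifcfg-storage0 ifcfg-bond100
sed -i 's/^DEVICE=storage0$/DEVICE=bond100/' ifcfg-bond100
# repoint the slave NICs at the renamed bond (hypothetical slave names)
sed -i 's/^MASTER=storage0$/MASTER=bond100/' ifcfg-em5 ifcfg-em6
systemctl restart network
```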


Re: [ovirt-users] hyperconverged question

2017-09-12 Thread Charles Kozler
The same applies on my engine storage domain. Shouldn't we see the mount
options in the mount -l output? It appears fault tolerance worked (sort of -
see more below) during my test

[root@appovirtp01 ~]# grep -i mnt_options
/etc/ovirt-hosted-engine/hosted-engine.conf
mnt_options=backup-volfile-servers=n2:n3

[root@appovirtp02 ~]# grep -i mnt_options
/etc/ovirt-hosted-engine/hosted-engine.conf
mnt_options=backup-volfile-servers=n2:n3

[root@appovirtp03 ~]# grep -i mnt_options
/etc/ovirt-hosted-engine/hosted-engine.conf
mnt_options=backup-volfile-servers=n2:n3

Meanwhile not visible in mount -l output:

[root@appovirtp01 ~]# mount -l | grep -i n1:/engine
n1:/engine on /rhev/data-center/mnt/glusterSD/n1:_engine type
fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

[root@appovirtp02 ~]# mount -l | grep -i n1:/engine
n1:/engine on /rhev/data-center/mnt/glusterSD/n1:_engine type
fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

[root@appovirtp03 ~]# mount -l | grep -i n1:/engine
n1:/engine on /rhev/data-center/mnt/glusterSD/n1:_engine type
fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

So, since everything is "pointed" at node 1 for engine storage, I decided to
hard shut down node 1 while the hosted engine VM was running on node 3

The result was that after ~30 seconds the engine crashed, likely because of
the gluster 42-second timeout. The hosted engine VM came back up (with node
1 still down) after about 5-7 minutes

Is it expected for the VM to go down? I thought the gluster FUSE client
mounted all bricks in the volume
http://lists.gluster.org/pipermail/gluster-users/2015-May/021989.html so I
would imagine this to be more seamless?
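
A quick way to confirm whether the option actually made it into the live
mount is to filter the mount -l output for it. The sketch below runs that
filter over one of the captured lines above so the check itself is shown; a
live check would pipe mount -l itself through the same grep:

```shell
# One mount line captured above; a real check would use: mount -l | grep glusterfs
sample='n1:/engine on /rhev/data-center/mnt/glusterSD/n1:_engine type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)'

if printf '%s\n' "$sample" | grep -q 'backup-volfile-servers'; then
    echo 'backup-volfile-servers: present'
else
    echo 'backup-volfile-servers: missing'
fi
```

which prints "backup-volfile-servers: missing" for the line above, matching
what the thread observes.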




On Tue, Sep 12, 2017 at 7:04 PM, Charles Kozler <ckozler...@gmail.com>
wrote:

> Hey All -
>
> So I havent tested this yet but what I do know is that I did setup
> backupvol option when I added the data gluster volume, however, mount
> options on mount -l do not show it as being used
>
> n1:/data on /rhev/data-center/mnt/glusterSD/n1:_data type fuse.glusterfs
> (rw,relatime,user_id=0,group_id=0,default_permissions,
> allow_other,max_read=131072)
>
> I will delete it and re-add it, but I think this might be part of the
> problem. Perhaps me and Jim have the same issue because oVirt is actually
> not passing the additional mount options from the web UI to the backend to
> mount with said parameters?
>
> Thoughts?
>
> On Mon, Sep 4, 2017 at 10:51 AM, FERNANDO FREDIANI <
> fernando.fredi...@upx.com> wrote:
>
>> I had the very same impression. It doesn't look like that it works then.
>> So for a fully redundant where you can loose a complete host you must have
>> at least 3 nodes then ?
>>
>> Fernando
>>
>> On 01/09/2017 12:53, Jim Kusznir wrote:
>>
>> Huh...Ok., how do I convert the arbitrar to full replica, then?  I was
>> misinformed when I created this setup.  I thought the arbitrator held
>> enough metadata that it could validate or refudiate  any one replica (kinda
>> like the parity drive for a RAID-4 array).  I was also under the impression
>> that one replica  + Arbitrator is enough to keep the array online and
>> functional.
>>
>> --Jim
>>
>> On Fri, Sep 1, 2017 at 5:22 AM, Charles Kozler <ckozler...@gmail.com>
>> wrote:
>>
>>> @ Jim - you have only two data volumes and lost quorum. Arbitrator only
>>> stores metadata, no actual files. So yes, you were running in degraded mode
>>> so some operations were hindered.
>>>
>>> @ Sahina - Yes, this actually worked fine for me once I did that.
>>> However, the issue I am still facing, is when I go to create a new gluster
>>> storage domain (replica 3, hyperconverged) and I tell it "Host to use" and
>>> I select that host. If I fail that host, all VMs halt. I do not recall this
>>> in 3.6 or early 4.0. This to me makes it seem like this is "pinning" a node
>>> to a volume and vice versa like you could, for instance, for a singular
>>> hyperconverged to ex: export a local disk via NFS and then mount it via
>>> ovirt domain. But of course, this has its caveats. To that end, I am using
>>> gluster replica 3, when configuring it I say "host to use: " node 1, then
>>> in the connection details I give it node1:/data. I fail node1, all VMs
>>> halt. Did I miss something?
>>>
>>> On Fri, Sep 1, 2017 at 2:13 AM, Sahina Bose <sab...@redhat.com> wrote:
>>>
>>>> To the OP question, when you set up a gluster storage domain, you need
>>>> to specify backup-volfile-servers=<server2>:<server3>, where server2
>>>>

Re: [ovirt-users] hyperconverged question

2017-09-12 Thread Charles Kozler
Hey All -

So I haven't tested this yet, but what I do know is that I did set the
backupvol option when I added the data gluster volume; however, the mount
options in mount -l do not show it as being used

n1:/data on /rhev/data-center/mnt/glusterSD/n1:_data type fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

I will delete it and re-add it, but I think this might be part of the
problem. Perhaps Jim and I have the same issue because oVirt is actually
not passing the additional mount options from the web UI to the backend to
mount with said parameters?

Thoughts?

On Mon, Sep 4, 2017 at 10:51 AM, FERNANDO FREDIANI <
fernando.fredi...@upx.com> wrote:

> I had the very same impression. It doesn't look like that it works then.
> So for a fully redundant where you can loose a complete host you must have
> at least 3 nodes then ?
>
> Fernando
>
> On 01/09/2017 12:53, Jim Kusznir wrote:
>
> Huh...Ok., how do I convert the arbitrar to full replica, then?  I was
> misinformed when I created this setup.  I thought the arbitrator held
> enough metadata that it could validate or refudiate  any one replica (kinda
> like the parity drive for a RAID-4 array).  I was also under the impression
> that one replica  + Arbitrator is enough to keep the array online and
> functional.
>
> --Jim
>
> On Fri, Sep 1, 2017 at 5:22 AM, Charles Kozler <ckozler...@gmail.com>
> wrote:
>
>> @ Jim - you have only two data volumes and lost quorum. Arbitrator only
>> stores metadata, no actual files. So yes, you were running in degraded mode
>> so some operations were hindered.
>>
>> @ Sahina - Yes, this actually worked fine for me once I did that.
>> However, the issue I am still facing, is when I go to create a new gluster
>> storage domain (replica 3, hyperconverged) and I tell it "Host to use" and
>> I select that host. If I fail that host, all VMs halt. I do not recall this
>> in 3.6 or early 4.0. This to me makes it seem like this is "pinning" a node
>> to a volume and vice versa like you could, for instance, for a singular
>> hyperconverged to ex: export a local disk via NFS and then mount it via
>> ovirt domain. But of course, this has its caveats. To that end, I am using
>> gluster replica 3, when configuring it I say "host to use: " node 1, then
>> in the connection details I give it node1:/data. I fail node1, all VMs
>> halt. Did I miss something?
>>
>> On Fri, Sep 1, 2017 at 2:13 AM, Sahina Bose <sab...@redhat.com> wrote:
>>
>>> To the OP question, when you set up a gluster storage domain, you need
>>> to specify backup-volfile-servers=: where server2 and
>>> server3 also have bricks running. When server1 is down, and the volume is
>>> mounted again - server2 or server3 are queried to get the gluster volfiles.
>>>
>>> @Jim, if this does not work, are you using 4.1.5 build with libgfapi
>>> access? If not, please provide the vdsm and gluster mount logs to analyse
>>>
>>> If VMs go to paused state - this could mean the storage is not
>>> available. You can check "gluster volume status <volname>" to see if
>>> at least 2 bricks are running.
>>>
>>> On Fri, Sep 1, 2017 at 11:31 AM, Johan Bernhardsson <jo...@kafit.se>
>>> wrote:
>>>
>>>> If gluster drops in quorum so that it has less votes than it should it
>>>> will stop file operations until quorum is back to normal.If i rember it
>>>> right you need two bricks to write for quorum to be met and that the
>>>> arbiter only is a vote to avoid split brain.
>>>>
>>>>
>>>> Basically what you have is a raid5 solution without a spare. And when
>>>> one disk dies it will run in degraded mode. And some raid systems will stop
>>>> the raid until you have removed the disk or forced it to run anyway.
>>>>
>>>> You can read up on it here: https://gluster.readthed
>>>> ocs.io/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
>>>>
>>>> /Johan
>>>>
>>>> On Thu, 2017-08-31 at 22:33 -0700, Jim Kusznir wrote:
>>>>
>>>> Hi all:
>>>>
>>>> Sorry to hijack the thread, but I was about to start essentially the
>>>> same thread.
>>>>
>>>> I have a 3 node cluster, all three are hosts and gluster nodes (replica
>>>> 2 + arbitrar).  I DO have the mnt_options=backup-volfile-servers= set:
>>>>
>>>> storage=192.168.8.11:/engine
>>>> mnt_options=backup-volfile-servers=192.168.8.12:192.168.8

Re: [ovirt-users] Hyper converged network setup

2017-09-12 Thread Charles Kozler
Bharat -

1. Yes. You will need to configure the switch port as a trunk and set up your
VLANs and VLAN IDs
2. Yes
3. You can still access the hosts. The engine itself crashing or being down
won't stop your VMs or hosts or anything (unless fencing kicks in). You can use
virsh
4. My suggestion here is to start immediately after a fresh server install and
yum update. The installer does a lot and checks a lot, and won't like it if,
for example, you set up the ovirtmgmt bridged network yourself
5. Yes. See #1; usually what I do is give each oVirt node an IP of
.5, then .6, and so on. This way I can be sure my network itself is working
before adding a VM and attaching that NIC to it
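
On point 3, a minimal sketch of poking at VMs directly on a host while the
engine is down. Read-only virsh works without credentials; read-write needs
libvirt SASL auth, so treat the last lines as assumptions about your local
setup (the user name is made up):

```shell
virsh -r list --all        # read-only: list VMs running on this host
virsh -r dominfo <vm>      # read-only: basic state of one VM
# Read-write operations need libvirt SASL credentials, e.g.:
saslpasswd2 -a libvirt admin   # create a libvirt user (hypothetical name)
virsh list --all               # then authenticate when prompted
```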

On Tue, Sep 12, 2017 at 4:41 PM, Donny Davis <do...@fortnebula.com> wrote:

> 1. Yes, you can do this
> 2. Yes, In linux it's called bonding and this can be done from the UI
> 3. You can get around using the Engine machine if required with virsh or
> virt-manager - however I would just wait for the manager to migrate and
> start on another host in the cluster
> 4.  The deployment will take care of everything for you. You just need an
> IP
> 5. Yes, you can use vlans or virtual networking(NSXish) called OVS in
> oVirt.
>
> I noticed on your deployment machines 2 and 3 have the same IP. Might want
> to fix that before deploying
>
> Happy trails
> ~D
>
>
> On Tue, Sep 12, 2017 at 2:00 PM, Tailor, Bharat <
> bha...@synergysystemsindia.com> wrote:
>
>> Hi Charles,
>>
>> Thank you so much to share a cool stuff with us.
>>
>> My doubts are still not cleared.
>>
>>
>>1. What If I have only single Physical network adaptor? Can't I use
>>it for management network & production network both.
>>2. If I have two Physical network adaptor, Can I configure NIC
>>teaming as like Vmware ESXi.
>>3. What If my ovirt machine fails during production period? In vmware
>>we can access ESXi hosts and VM without Vcenter and do all the stuffs. Can
>>we do the same with Ovirt & KVM.
>>4. To deploy ovirt engine VM, what kind of configuration I'll have to
>>do on network adaptors? (eg. just configure IP on physical network or have
>>to create br0 for it.)
>>5. Can I make multiple VM networks for vlan configuration?
>>
>>
>> Regrards
>> Bharat Kumar
>>
>> G15- Vinayak Nagar complex,Opp.Maa Satiya, ayad
>> Udaipur (Raj.)
>> 313001
>> Mob: +91-9950-9960-25
>>
>>
>>
>>
>>
>> On Tue, Sep 12, 2017 at 9:30 PM, Charles Kozler <ckozler...@gmail.com>
>> wrote:
>>
>>>
>>> Interestingly enough I literally just went through this same thing with
>>> a slight variation.
>>>
>>> Note to the below: I am not sure if this would be considerd best
>>> practice or good for something long term support but I made due with what I
>>> had
>>>
>>> I had 10Gb cards for my storage network but no 10Gb switch, so I direct
>>> connected them with some fun routing and /etc/hosts settings. I also didnt
>>> want my storage network on a routed network (have firewalls in the way of
>>> VLANs) and I wanted the network separate from my ovirtmgmt - and, as I
>>> said, had no switches for 10Gb. Here is what you need at a bare minimum.
>>> Adapt / change it as you need
>>>
>>> 1 dedicated NIC on each node for ovirtmgmt. Ex: eth0
>>>
>>> 1 dedicated NIC to direct connect node 1 and node 2 - eth1 node1
>>> 1 dedicated NIC to direct connect node 1 and node 3 - eth2 node1
>>>
>>> 1 dedicated NIC to direct connect node 2 and node 1 - eth1 node2
>>> 1 dedicated NIC to direct connect node 2 and node 3 - eth2 node2
>>>
>>> 1 dedicated NIC to direct connect node 3 and node 1 - eth1 node3
>>> 1 dedicated NIC to direct connect node 3 and node 2 - eth2 node3
>>>
>>> You'll need custom routes too:
>>>
>>> Route to node 3 from node 1 via eth2
>>> Route to node 3 from node 2 via eth2
>>> Route to node 2 from node 3 via eth2
>>>
>>> Finally, entries in your /etc/hosts which match to your routes above
>>>
>>> Then, advisably, a dedicated NIC per box for VM network but you can
>>> leverage ovirtmgmt if you are just proofing this out
>>>
>>> At this point if you can reach all of your nodes via this direct connect
>>> IPs then you setup gluster as you normally would referencing your entries
>>> in /etc/hosts when you call "gluster volume create"
>>>
>>> In my setup, as I said, I had 2x 2 port PCIe 10Gb cards per server so I
>>&

Re: [ovirt-users] Hyper converged network setup

2017-09-12 Thread Charles Kozler
Interestingly enough I literally just went through this same thing with a
slight variation.

Note on the below: I am not sure if this would be considered best practice
or good for long-term support, but I made do with what I had

I had 10Gb cards for my storage network but no 10Gb switch, so I direct
connected them with some fun routing and /etc/hosts settings. I also didn't
want my storage network on a routed network (there are firewalls in the way of
VLANs), and I wanted the network separate from my ovirtmgmt - and, as I
said, I had no switches for 10Gb. Here is what you need at a bare minimum.
Adapt / change it as you need

1 dedicated NIC on each node for ovirtmgmt. Ex: eth0

1 dedicated NIC to direct connect node 1 and node 2 - eth1 node1
1 dedicated NIC to direct connect node 1 and node 3 - eth2 node1

1 dedicated NIC to direct connect node 2 and node 1 - eth1 node2
1 dedicated NIC to direct connect node 2 and node 3 - eth2 node2

1 dedicated NIC to direct connect node 3 and node 1 - eth1 node3
1 dedicated NIC to direct connect node 3 and node 2 - eth2 node3

You'll need custom routes too:

Route to node 3 from node 1 via eth2
Route to node 3 from node 2 via eth2
Route to node 2 from node 3 via eth2

Finally, entries in your /etc/hosts which match to your routes above

Then, advisably, a dedicated NIC per box for VM network but you can
leverage ovirtmgmt if you are just proofing this out

At this point if you can reach all of your nodes via this direct connect
IPs then you setup gluster as you normally would referencing your entries
in /etc/hosts when you call "gluster volume create"
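
Concretely, the per-node plumbing above can be sketched like this for node1;
all addresses, interface names, and the /30 point-to-point subnets are made
up for illustration:

```shell
# node1: eth1 goes straight to node2, eth2 goes straight to node3
ip addr add 172.16.12.1/30 dev eth1   # node1 <-> node2 link
ip addr add 172.16.13.1/30 dev eth2   # node1 <-> node3 link
ip route add 172.16.13.2/32 dev eth2  # explicit route to node3 over the 2nd link

# /etc/hosts entries so "gluster volume create" references the direct links
cat >> /etc/hosts <<'EOF'
172.16.12.2  n2
172.16.13.2  n3
EOF
```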

In my setup, as I said, I had 2x 2 port PCIe 10Gb cards per server so I
setup LACP as well as you can see below

This is what my Frankenstein POC looked like: http://i.imgur.com/iURL9jv.png


You can optionally choose to set up this network in oVirt as well (and add
the NICs to each host), but don't configure it as a VM network. Then you can
also, with some other minor tweaks, use these direct connects as migration
networks rather than ovirtmgmt or the VM network

On Tue, Sep 12, 2017 at 9:12 AM, Tailor, Bharat <
bha...@synergysystemsindia.com> wrote:

> Hi,
>
> I am trying to deploy 3 hosts hyper converged setup.
> I am using Centos and installed KVM on all hosts.
>
> Host-1
> Hostname - test1.localdomain
>  eth0 - 192.168.100.15/24
> GW - 192.168.100.1
>
> Hoat-2
> Hostname - test2.localdomain
> eth0 - 192.168.100.16/24
> GW - 192.168.100.1
>
> Host-3
> Hostname - test3.localdomain
> eth0 - 192.168.100.16/24
> GW - 192.168.100.1
>
> I have created two gluster volume "engine" & "data" with replica 3.
> I have add fqdn entry in /etc/hosts for all host for DNS resolution.
>
> I want to deploy Ovirt engine self hosted OVA to manage all the hosts and
> production VM and my ovirt-engine VM should have HA enabled.
>
> I found multiple docs over internet to deply Self-hosted-engine-ova but I
> don't what kind of network configuration I've to do on Centos network card
> & KVM. As KVM docs suggest that I've to create a bridge network for Pnic to
> Vnic bridge. If I configure a bridge br0 for eth0 bridge that I can't see
> eth0 while deploying ovirt-engine setup at NIC card choice.
>
> Kindly help me to do correct configuration for Centos hosts, KVM &
> ovirt-engine-vm for HA enabled DC.
> Regrards
> Bharat Kumar
>
> G15- Vinayak Nagar complex,Opp.Maa Satiya, ayad
> Udaipur (Raj.)
> 313001
> Mob: +91-9950-9960-25
>
>
>
>
>


Re: [ovirt-users] hyperconverged question

2017-09-01 Thread Charles Kozler
Jim -

The result of this test: the engine crashed, but all VMs on the gluster domain
(backed by the same physical nodes/hardware/gluster processes/etc) stayed up
fine

I guess there is some functional difference between 'backupvolfile-server'
and 'backup-volfile-servers'?

Perhaps try the latter and see what happens. My next test is going to be to
configure hosted-engine.conf with backupvolfile-server=node2:node3 and see
if the engine VM still shuts down. It seems odd that the engine VM would shut
itself down (or vdsm would shut it down) but not other VMs. Perhaps built-in
HA functionality of sorts

On Fri, Sep 1, 2017 at 7:38 PM, Charles Kozler <ckozler...@gmail.com> wrote:

> Jim -
>
> One thing I noticed is that, by accident, I used
> 'backupvolfile-server=node2:node3' which is apparently a supported
> setting. It would appear, by reading the man page of mount.glusterfs, the
> syntax is slightly different. not sure if my setting being different has
> different impacts
>
> hosted-engine.conf:
>
> # cat /etc/ovirt-hosted-engine/hosted-engine.conf | grep -i option
> mnt_options=backup-volfile-servers=node2:node3
>
> And for my datatest gluster domain I have:
>
> backupvolfile-server=node2:node3
>
> I am now curious what happens when I move everything to node1 and drop
> node2
>
> To that end, will follow up with that test
>
>
>
>
> On Fri, Sep 1, 2017 at 7:20 PM, Charles Kozler <ckozler...@gmail.com>
> wrote:
>
>> Jim -
>>
>> here is my test:
>>
>> - All VM's on node2: hosted engine and 1 test VM
>> - Test VM on gluster storage domain (with mount options set)
>> - hosted engine is on gluster as well, with settings persisted to
>> hosted-engine.conf for backupvol
>>
>> All VM's stayed up. Nothing in dmesg of the test vm indicating a pause or
>> an issue or anything
>>
>> However, what I did notice during this, is my /datatest volume doesnt
>> have quorum set. So I will set that now and report back what happens
>>
>> # gluster volume info datatest
>>
>> Volume Name: datatest
>> Type: Replicate
>> Volume ID: 229c25f9-405e-4fe7-b008-1d3aea065069
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: node1:/gluster/data/datatest/brick1
>> Brick2: node2:/gluster/data/datatest/brick1
>> Brick3: node3:/gluster/data/datatest/brick1
>> Options Reconfigured:
>> transport.address-family: inet
>> nfs.disable: on
>>
>> Perhaps quorum may be more trouble than its worth when you have 3 nodes
>> and/or 2 nodes + arbiter?
>>
>> Since I am keeping my 3rd node out of ovirt, I am more content on keeping
>> it as a warm spare if I **had** to swap it in to ovirt cluster, but keeps
>> my storage 100% quorum
>>
>> On Fri, Sep 1, 2017 at 5:18 PM, Jim Kusznir <j...@palousetech.com> wrote:
>>
>>> I can confirm that I did set it up manually, and I did specify
>>> backupvol, and in the "manage domain" storage settings, I do have under
>>> mount options, backup-volfile-servers=192.168.8.12:192.168.8.13  (and
>>> this was done at initial install time).
>>>
>>> The "used managed gluster" checkbox is NOT checked, and if I check it
>>> and save settings, next time I go in it is not checked.
>>>
>>> --Jim
>>>
>>> On Fri, Sep 1, 2017 at 2:08 PM, Charles Kozler <ckozler...@gmail.com>
>>> wrote:
>>>
>>>> @ Jim - here is my setup which I will test in a few (brand new cluster)
>>>> and report back what I found in my tests
>>>>
>>>> - 3x servers direct connected via 10Gb
>>>> - 2 of those 3 setup in ovirt as hosts
>>>> - Hosted engine
>>>> - Gluster replica 3 (no arbiter) for all volumes
>>>> - 1x engine volume gluster replica 3 manually configured (not using
>>>> ovirt managed gluster)
>>>> - 1x datatest volume (20gb) replica 3 manually configured (not using
>>>> ovirt managed gluster)
>>>> - 1x nfstest domain served from some other server in my infrastructure
>>>> which, at the time of my original testing, was master domain
>>>>
>>>> I tested this earlier and all VMs stayed online. However, ovirt cluster
>>>> reported DC/cluster down, all VM's stayed up
>>>>
>>>> As I am now typing this, can you confirm you setup your gluster storage
>>>> domain with backupvol? Also, confirm you updated hosted-engine.conf with
>>>> backupvol mount option as well?
>>>>
>>>> On Fri, Sep 

Re: [ovirt-users] hyperconverged question

2017-09-01 Thread Charles Kozler
Jim -

One thing I noticed is that, by accident, I used
'backupvolfile-server=node2:node3', which is apparently a supported setting.
It would appear, from reading the man page of mount.glusterfs, that the syntax
is slightly different. Not sure if my setting being different has different
impacts

hosted-engine.conf:

# cat /etc/ovirt-hosted-engine/hosted-engine.conf | grep -i option
mnt_options=backup-volfile-servers=node2:node3

And for my datatest gluster domain I have:

backupvolfile-server=node2:node3

I am now curious what happens when I move everything to node1 and drop node2

To that end, will follow up with that test
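
For what it's worth, the mount.glusterfs man page does list both spellings,
with different shapes: backupvolfile-server takes a single fallback host,
while backup-volfile-servers takes a colon-separated list. Roughly (host
names from this thread, mount point illustrative):

```shell
# older single-server form
mount -t glusterfs -o backupvolfile-server=node2 node1:/data /mnt/data
# newer list form - the shape hosted-engine.conf uses above
mount -t glusterfs -o backup-volfile-servers=node2:node3 node1:/data /mnt/data
```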




On Fri, Sep 1, 2017 at 7:20 PM, Charles Kozler <ckozler...@gmail.com> wrote:

> Jim -
>
> here is my test:
>
> - All VM's on node2: hosted engine and 1 test VM
> - Test VM on gluster storage domain (with mount options set)
> - hosted engine is on gluster as well, with settings persisted to
> hosted-engine.conf for backupvol
>
> All VM's stayed up. Nothing in dmesg of the test vm indicating a pause or
> an issue or anything
>
> However, what I did notice during this, is my /datatest volume doesnt have
> quorum set. So I will set that now and report back what happens
>
> # gluster volume info datatest
>
> Volume Name: datatest
> Type: Replicate
> Volume ID: 229c25f9-405e-4fe7-b008-1d3aea065069
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: node1:/gluster/data/datatest/brick1
> Brick2: node2:/gluster/data/datatest/brick1
> Brick3: node3:/gluster/data/datatest/brick1
> Options Reconfigured:
> transport.address-family: inet
> nfs.disable: on
>
> Perhaps quorum may be more trouble than its worth when you have 3 nodes
> and/or 2 nodes + arbiter?
>
> Since I am keeping my 3rd node out of ovirt, I am more content on keeping
> it as a warm spare if I **had** to swap it in to ovirt cluster, but keeps
> my storage 100% quorum
>
> On Fri, Sep 1, 2017 at 5:18 PM, Jim Kusznir <j...@palousetech.com> wrote:
>
>> I can confirm that I did set it up manually, and I did specify backupvol,
>> and in the "manage domain" storage settings, I do have under mount
>> options, backup-volfile-servers=192.168.8.12:192.168.8.13  (and this was
>> done at initial install time).
>>
>> The "used managed gluster" checkbox is NOT checked, and if I check it and
>> save settings, next time I go in it is not checked.
>>
>> --Jim
>>
>> On Fri, Sep 1, 2017 at 2:08 PM, Charles Kozler <ckozler...@gmail.com>
>> wrote:
>>
>>> @ Jim - here is my setup which I will test in a few (brand new cluster)
>>> and report back what I found in my tests
>>>
>>> - 3x servers direct connected via 10Gb
>>> - 2 of those 3 setup in ovirt as hosts
>>> - Hosted engine
>>> - Gluster replica 3 (no arbiter) for all volumes
>>> - 1x engine volume gluster replica 3 manually configured (not using
>>> ovirt managed gluster)
>>> - 1x datatest volume (20gb) replica 3 manually configured (not using
>>> ovirt managed gluster)
>>> - 1x nfstest domain served from some other server in my infrastructure
>>> which, at the time of my original testing, was master domain
>>>
>>> I tested this earlier and all VMs stayed online. However, ovirt cluster
>>> reported DC/cluster down, all VM's stayed up
>>>
>>> As I am now typing this, can you confirm you setup your gluster storage
>>> domain with backupvol? Also, confirm you updated hosted-engine.conf with
>>> backupvol mount option as well?
>>>
>>> On Fri, Sep 1, 2017 at 4:22 PM, Jim Kusznir <j...@palousetech.com> wrote:
>>>
>>>> So, after reading the first document twice and the 2nd link thoroughly
>>>> once, I believe that the arbitrator volume should be sufficient and count
>>>> for replica / split brain.  EG, if any one full replica is down, and the
>>>> arbitrator and the other replica is up, then it should have quorum and all
>>>> should be good.
>>>>
>>>> I think my underlying problem has to do more with config than the
>>>> replica state.  That said, I did size the drive on my 3rd node planning to
>>>> have an identical copy of all data on it, so I'm still not opposed to
>>>> making it a full replica.
>>>>
>>>> Did I miss something here?
>>>>
>>>> Thanks!
>>>>
>>>> On Fri, Sep 1, 2017 at 11:59 AM, Charles Kozler <ckozler...@gmail.com>
>>>> wrote:
>>>>
>>>>> These can get a little co

Re: [ovirt-users] hyperconverged question

2017-09-01 Thread Charles Kozler
Jim -

here is my test:

- All VM's on node2: hosted engine and 1 test VM
- Test VM on gluster storage domain (with mount options set)
- hosted engine is on gluster as well, with settings persisted to
hosted-engine.conf for backupvol

All VM's stayed up. Nothing in dmesg of the test vm indicating a pause or
an issue or anything

However, what I did notice during this is that my /datatest volume doesn't have
quorum set. So I will set that now and report back what happens

# gluster volume info datatest

Volume Name: datatest
Type: Replicate
Volume ID: 229c25f9-405e-4fe7-b008-1d3aea065069
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: node1:/gluster/data/datatest/brick1
Brick2: node2:/gluster/data/datatest/brick1
Brick3: node3:/gluster/data/datatest/brick1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on

Perhaps quorum may be more trouble than it's worth when you have 3 nodes
and/or 2 nodes + arbiter?

Since I am keeping my 3rd node out of oVirt, I am more content keeping it as
a warm spare that I could swap into the oVirt cluster if I **had** to, while
it keeps my storage at 100% quorum
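
For reference, the quorum knobs are per-volume gluster settings; a sketch
against the volume above (option semantics are described in the Gluster
arbiter/quorum guide linked in this thread):

```shell
gluster volume set datatest cluster.quorum-type auto            # client-side quorum
gluster volume set datatest cluster.server-quorum-type server   # server-side quorum
gluster volume set all cluster.server-quorum-ratio 51%          # cluster-wide ratio
```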

On Fri, Sep 1, 2017 at 5:18 PM, Jim Kusznir <j...@palousetech.com> wrote:

> I can confirm that I did set it up manually, and I did specify backupvol,
> and in the "manage domain" storage settings, I do have under mount
> options, backup-volfile-servers=192.168.8.12:192.168.8.13  (and this was
> done at initial install time).
>
> The "used managed gluster" checkbox is NOT checked, and if I check it and
> save settings, next time I go in it is not checked.
>
> --Jim
>
> On Fri, Sep 1, 2017 at 2:08 PM, Charles Kozler <ckozler...@gmail.com>
> wrote:
>
>> @ Jim - here is my setup which I will test in a few (brand new cluster)
>> and report back what I found in my tests
>>
>> - 3x servers direct connected via 10Gb
>> - 2 of those 3 setup in ovirt as hosts
>> - Hosted engine
>> - Gluster replica 3 (no arbiter) for all volumes
>> - 1x engine volume gluster replica 3 manually configured (not using ovirt
>> managed gluster)
>> - 1x datatest volume (20gb) replica 3 manually configured (not using
>> ovirt managed gluster)
>> - 1x nfstest domain served from some other server in my infrastructure
>> which, at the time of my original testing, was master domain
>>
>> I tested this earlier and all VMs stayed online. However, ovirt cluster
>> reported DC/cluster down, all VM's stayed up
>>
>> As I am now typing this, can you confirm you setup your gluster storage
>> domain with backupvol? Also, confirm you updated hosted-engine.conf with
>> backupvol mount option as well?
>>
>> On Fri, Sep 1, 2017 at 4:22 PM, Jim Kusznir <j...@palousetech.com> wrote:
>>
>>> So, after reading the first document twice and the 2nd link thoroughly
>>> once, I believe that the arbitrator volume should be sufficient and count
>>> for replica / split brain.  EG, if any one full replica is down, and the
>>> arbitrator and the other replica is up, then it should have quorum and all
>>> should be good.
>>>
>>> I think my underlying problem has to do more with config than the
>>> replica state.  That said, I did size the drive on my 3rd node planning to
>>> have an identical copy of all data on it, so I'm still not opposed to
>>> making it a full replica.
>>>
>>> Did I miss something here?
>>>
>>> Thanks!
>>>
>>> On Fri, Sep 1, 2017 at 11:59 AM, Charles Kozler <ckozler...@gmail.com>
>>> wrote:
>>>
>>>> These can get a little confusing but this explains it best:
>>>> https://gluster.readthedocs.io/en/latest/Administrator
>>>> %20Guide/arbiter-volumes-and-quorum/#replica-2-and-replica-3-volumes
>>>>
>>>> Basically in the first paragraph they are explaining why you can't have
>>>> HA with quorum for 2 nodes. Here is another overview doc that explains some
>>>> more
>>>>
>>>> http://openmymind.net/Does-My-Replica-Set-Need-An-Arbiter/
>>>>
>>>> From my understanding arbiter is good for resolving split brains.
>>>> Quorum and arbiter are two different things though quorum is a mechanism to
>>>> help you **avoid** split brain and the arbiter is to help gluster resolve
>>>> split brain by voting and other internal mechanics (as outlined in link 1).
>>>> How did you create the volume exactly - what command? It looks to me like
>>>> you created it with 'gluster volume create replica 2 arbiter 1 {}' per
>>>> your earlier mention of "replica 2 

Re: [ovirt-users] hyperconverged question

2017-09-01 Thread Charles Kozler
@ Jim - here is my setup which I will test in a few (brand new cluster) and
report back what I found in my tests

- 3x servers direct connected via 10Gb
- 2 of those 3 setup in ovirt as hosts
- Hosted engine
- Gluster replica 3 (no arbiter) for all volumes
- 1x engine volume gluster replica 3 manually configured (not using ovirt
managed gluster)
- 1x datatest volume (20gb) replica 3 manually configured (not using ovirt
managed gluster)
- 1x nfstest domain served from some other server in my infrastructure
which, at the time of my original testing, was master domain

I tested this earlier and all VMs stayed online. However, ovirt cluster
reported DC/cluster down, all VM's stayed up

As I am now typing this, can you confirm you setup your gluster storage
domain with backupvol? Also, confirm you updated hosted-engine.conf with
backupvol mount option as well?

On Fri, Sep 1, 2017 at 4:22 PM, Jim Kusznir <j...@palousetech.com> wrote:

> So, after reading the first document twice and the 2nd link thoroughly
> once, I believe that the arbitrator volume should be sufficient and count
> for replica / split brain.  EG, if any one full replica is down, and the
> arbitrator and the other replica is up, then it should have quorum and all
> should be good.
>
> I think my underlying problem has to do more with config than the replica
> state.  That said, I did size the drive on my 3rd node planning to have an
> identical copy of all data on it, so I'm still not opposed to making it a
> full replica.
>
> Did I miss something here?
>
> Thanks!
>
> On Fri, Sep 1, 2017 at 11:59 AM, Charles Kozler <ckozler...@gmail.com>
> wrote:
>
>> These can get a little confusing but this explains it best:
>> https://gluster.readthedocs.io/en/latest/Administrator
>> %20Guide/arbiter-volumes-and-quorum/#replica-2-and-replica-3-volumes
>>
>> Basically in the first paragraph they are explaining why you can't have HA
>> with quorum for 2 nodes. Here is another overview doc that explains some
>> more
>>
>> http://openmymind.net/Does-My-Replica-Set-Need-An-Arbiter/
>>
>> From my understanding arbiter is good for resolving split brains. Quorum
>> and arbiter are two different things though quorum is a mechanism to help
>> you **avoid** split brain and the arbiter is to help gluster resolve split
>> brain by voting and other internal mechanics (as outlined in link 1). How
>> did you create the volume exactly - what command? It looks to me like you
>> created it with 'gluster volume create replica 2 arbiter 1 {}' per your
>> earlier mention of "replica 2 arbiter 1". That being said, if you did that
>> and then set up quorum in the volume configuration, this would cause your
>> gluster to halt since quorum was lost (as you saw until you recovered
>> node 1)
>>
>> As you can see from the docs, there is still a corner case for getting in
>> to split brain with replica 3, which again, is where arbiter would help
>> gluster resolve it
>>
>> I need to amend my previous statement: I was told that arbiter volume
>> does not store data, only metadata. I cannot find anything in the docs
>> backing this up however it would make sense for it to be. That being said,
>> in my setup, I would not include my arbiter or my third node in my ovirt VM
>> cluster component. I would keep it completely separate
>>
>>
>> On Fri, Sep 1, 2017 at 2:46 PM, Jim Kusznir <j...@palousetech.com> wrote:
>>
>>> I'm now also confused as to what the point of an arbiter is / what it
>>> does / why one would use it.
>>>
>>> On Fri, Sep 1, 2017 at 11:44 AM, Jim Kusznir <j...@palousetech.com>
>>> wrote:
>>>
>>>> Thanks for the help!
>>>>
>>>> Here's my gluster volume info for the data export/brick (I have 3:
>>>> data, engine, and iso, but they're all configured the same):
>>>>
>>>> Volume Name: data
>>>> Type: Replicate
>>>> Volume ID: e670c488-ac16-4dd1-8bd3-e43b2e42cc59
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 1 x (2 + 1) = 3
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: ovirt1.nwfiber.com:/gluster/brick2/data
>>>> Brick2: ovirt2.nwfiber.com:/gluster/brick2/data
>>>> Brick3: ovirt3.nwfiber.com:/gluster/brick2/data (arbiter)
>>>> Options Reconfigured:
>>>> performance.strict-o-direct: on
>>>> nfs.disable: on
>>>> user.cifs: off
>>>> network.ping-timeout: 30
>>>> cluster.shd-max-threads: 8
>>>> cluster.shd-wait-qlength: 1
>>

Re: [ovirt-users] hyperconverged question

2017-09-01 Thread Charles Kozler
These can get a little confusing but this explains it best:
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/#replica-2-and-replica-3-volumes

Basically in the first paragraph they are explaining why you can't have HA
with quorum for 2 nodes. Here is another overview doc that explains some
more

http://openmymind.net/Does-My-Replica-Set-Need-An-Arbiter/

From my understanding, arbiter is good for resolving split brains. Quorum
and arbiter are two different things though quorum is a mechanism to help
you **avoid** split brain and the arbiter is to help gluster resolve split
brain by voting and other internal mechanics (as outlined in link 1). How
did you create the volume exactly - what command? It looks to me like you
created it with 'gluster volume create replica 2 arbiter 1 {}' per your
earlier mention of "replica 2 arbiter 1". That being said, if you did that
and then set up quorum in the volume configuration, this would cause your
gluster to halt since quorum was lost (as you saw until you recovered
node 1)

As you can see from the docs, there is still a corner case for getting in
to split brain with replica 3, which again, is where arbiter would help
gluster resolve it

I need to amend my previous statement: I was told that arbiter volume does
not store data, only metadata. I cannot find anything in the docs backing
this up, however it would make sense for it to be. That being said, in my
setup, I would not include my arbiter or my third node in my ovirt VM
cluster component. I would keep it completely separate
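
For reference, the CLI difference between a plain replica 3 and a replica 3 arbiter volume can be sketched like this (hostnames and brick paths below are made up for illustration, and these obviously need a running gluster trusted pool, so treat it as a sketch of the syntax rather than something to paste verbatim):

```shell
# Plain replica 3: all three bricks hold full copies of the data
gluster volume create data replica 3 \
    node1:/gluster/brick/data \
    node2:/gluster/brick/data \
    node3:/gluster/brick/data

# Replica 3 with arbiter: the last brick stores only metadata and
# acts as a tie-breaker vote, not a third data copy
gluster volume create data-arb replica 3 arbiter 1 \
    node1:/gluster/brick/data-arb \
    node2:/gluster/brick/data-arb \
    node3:/gluster/brick/data-arb

gluster volume start data
```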


On Fri, Sep 1, 2017 at 2:46 PM, Jim Kusznir <j...@palousetech.com> wrote:

> I'm now also confused as to what the point of an arbiter is / what it does
> / why one would use it.
>
> On Fri, Sep 1, 2017 at 11:44 AM, Jim Kusznir <j...@palousetech.com> wrote:
>
>> Thanks for the help!
>>
>> Here's my gluster volume info for the data export/brick (I have 3: data,
>> engine, and iso, but they're all configured the same):
>>
>> Volume Name: data
>> Type: Replicate
>> Volume ID: e670c488-ac16-4dd1-8bd3-e43b2e42cc59
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: ovirt1.nwfiber.com:/gluster/brick2/data
>> Brick2: ovirt2.nwfiber.com:/gluster/brick2/data
>> Brick3: ovirt3.nwfiber.com:/gluster/brick2/data (arbiter)
>> Options Reconfigured:
>> performance.strict-o-direct: on
>> nfs.disable: on
>> user.cifs: off
>> network.ping-timeout: 30
>> cluster.shd-max-threads: 8
>> cluster.shd-wait-qlength: 1
>> cluster.locking-scheme: granular
>> cluster.data-self-heal-algorithm: full
>> performance.low-prio-threads: 32
>> features.shard-block-size: 512MB
>> features.shard: on
>> storage.owner-gid: 36
>> storage.owner-uid: 36
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> network.remote-dio: enable
>> cluster.eager-lock: enable
>> performance.stat-prefetch: off
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> performance.readdir-ahead: on
>> server.allow-insecure: on
>> [root@ovirt1 ~]#
>>
>>
>> all 3 of my brick nodes ARE also members of the virtualization cluster
>> (including ovirt3).  How can I convert it into a full replica instead of
>> just an arbiter?
>>
>> Thanks!
>> --Jim
>>
>> On Fri, Sep 1, 2017 at 9:09 AM, Charles Kozler <ckozler...@gmail.com>
>> wrote:
>>
>>> @Kasturi - Looks good now. Cluster showed down for a moment but VM's
>>> stayed up in their appropriate places. Thanks!
>>>
>>> < Anyone on this list please feel free to correct my response to Jim if
>>> it's wrong>
>>>
>>> @ Jim - If you can share your gluster volume info / status I can confirm
>>> (to the best of my knowledge). From my understanding, if you set up the
>>> volume with something like 'gluster volume set <volname> group virt' this will
>>> configure some quorum options as well, Ex: http://i.imgur.com/Mya4N5o.png
>>>
>>> While, yes, you are configured for an arbiter node, you're still losing
>>> quorum by dropping from 2 -> 1. You would need 4 nodes with 1 being the
>>> arbiter to configure quorum, which is in effect 3 writable nodes and 1
>>> arbiter. If one gluster node drops, you still have 2 up. Although in this
>>> case, you probably wouldn't need the arbiter at all
>>>
>>> If you are configured, you can drop quorum settings and just let arbiter
>>> run since you're not using arbiter node in your VM cluster p

Re: [ovirt-users] hyperconverged question

2017-09-01 Thread Charles Kozler
@Kasturi - Looks good now. Cluster showed down for a moment but VM's stayed
up in their appropriate places. Thanks!

< Anyone on this list please feel free to correct my response to Jim if it's
wrong>

@ Jim - If you can share your gluster volume info / status I can confirm
(to the best of my knowledge). From my understanding, if you set up the
volume with something like 'gluster volume set <volname> group virt' this will
configure some quorum options as well, Ex: http://i.imgur.com/Mya4N5o.png

While, yes, you are configured for an arbiter node, you're still losing quorum
by dropping from 2 -> 1. You would need 4 nodes with 1 being the arbiter to
configure quorum, which is in effect 3 writable nodes and 1 arbiter. If one
gluster node drops, you still have 2 up. Although in this case, you
probably wouldn't need the arbiter at all

If you are configured that way, you can drop the quorum settings and just let
the arbiter run, since you're not using the arbiter node in your VM cluster
part (I believe), just the storage cluster part. When using quorum, you need
> 50% of the cluster being up at one time. Since you have 3 nodes with 1
arbiter, losing one leaves you at 1/2, which is only 50%, which means a
degraded / hindered gluster

Again, this is to the best of my knowledge based on other quorum-backed
software, and this is what I understand from testing with gluster and
ovirt thus far
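
The > 50% arithmetic above can be written down as a tiny check (my own illustration of the quorum rule, not anything gluster ships):

```python
def has_quorum(nodes_total: int, nodes_up: int) -> bool:
    """Server-side quorum requires strictly more than 50% of peers up."""
    return nodes_up * 2 > nodes_total

# A 3-node trusted pool (2 data nodes + 1 arbiter vote) survives losing one node:
assert has_quorum(3, 2)        # 2 of 3 up -> 66% > 50%
# But counting only the 2 writable nodes, losing one is exactly 50%, not more:
assert not has_quorum(2, 1)
print("quorum math checks out")
```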

On Fri, Sep 1, 2017 at 11:53 AM, Jim Kusznir <j...@palousetech.com> wrote:

> Huh... OK, how do I convert the arbiter to a full replica, then?  I was
> misinformed when I created this setup.  I thought the arbiter held
> enough metadata that it could validate or repudiate any one replica (kinda
> like the parity drive for a RAID-4 array).  I was also under the impression
> that one replica  + Arbitrator is enough to keep the array online and
> functional.
>
> --Jim
>
> On Fri, Sep 1, 2017 at 5:22 AM, Charles Kozler <ckozler...@gmail.com>
> wrote:
>
>> @ Jim - you have only two data volumes and lost quorum. Arbitrator only
>> stores metadata, no actual files. So yes, you were running in degraded mode
>> so some operations were hindered.
>>
>> @ Sahina - Yes, this actually worked fine for me once I did that.
>> However, the issue I am still facing, is when I go to create a new gluster
>> storage domain (replica 3, hyperconverged) and I tell it "Host to use" and
>> I select that host. If I fail that host, all VMs halt. I do not recall this
>> in 3.6 or early 4.0. This to me makes it seem like this is "pinning" a node
>> to a volume and vice versa like you could, for instance, for a singular
>> hyperconverged to ex: export a local disk via NFS and then mount it via
>> ovirt domain. But of course, this has its caveats. To that end, I am using
>> gluster replica 3, when configuring it I say "host to use: " node 1, then
>> in the connection details I give it node1:/data. I fail node1, all VMs
>> halt. Did I miss something?
>>
>> On Fri, Sep 1, 2017 at 2:13 AM, Sahina Bose <sab...@redhat.com> wrote:
>>
>>> To the OP question, when you set up a gluster storage domain, you need
>>> to specify backup-volfile-servers=<server2>:<server3>, where server2 and
>>> server3 also have bricks running. When server1 is down, and the volume is
>>> mounted again - server2 or server3 are queried to get the gluster volfiles.
>>>
>>> @Jim, if this does not work, are you using 4.1.5 build with libgfapi
>>> access? If not, please provide the vdsm and gluster mount logs to analyse
>>>
>>> If VMs go to paused state - this could mean the storage is not
>>> available. You can check "gluster volume status <volname>" to see if
>>> at least 2 bricks are running.
>>>
>>> On Fri, Sep 1, 2017 at 11:31 AM, Johan Bernhardsson <jo...@kafit.se>
>>> wrote:
>>>
>>>> If gluster drops in quorum so that it has fewer votes than it should, it
>>>> will stop file operations until quorum is back to normal. If I remember it
>>>> right, you need two bricks to write for quorum to be met, and the
>>>> arbiter is only a vote to avoid split brain.
>>>>
>>>>
>>>> Basically what you have is a raid5 solution without a spare. And when
>>>> one disk dies it will run in degraded mode. And some raid systems will stop
>>>> the raid until you have removed the disk or forced it to run anyway.
>>>>
>>>> You can read up on it here: https://gluster.readthed
>>>> ocs.io/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
>>>>
>>>> /Johan
>>>>
>>>> On Thu, 2017-08-31 at 22:33 -0700, Jim Kusznir wrote:
>>>>
>>>> Hi all:

Re: [ovirt-users] hyperconverged question

2017-09-01 Thread Charles Kozler
Are you referring to "Mount Options" -> http://i.imgur.com/bYfbyzz.png

Then no, but that would explain why it wasn't working :-). I guess I had a
silly assumption that oVirt would have detected it and automatically taken
up the redundancy that was configured inside the replica set / brick
detection.

I will test and let you know

Thanks!

On Fri, Sep 1, 2017 at 8:52 AM, Kasturi Narra <kna...@redhat.com> wrote:

> Hi Charles,
>
>   One question, while configuring a storage domain  you are saying
> "host to use: " node1,  then in the connection details you say node1:/data.
> What about the backup-volfile-servers option in the UI while configuring
> storage domain? Are you specifying that too?
>
> Thanks
> kasturi
>
>
> On Fri, Sep 1, 2017 at 5:52 PM, Charles Kozler <ckozler...@gmail.com>
> wrote:
>
>> @ Jim - you have only two data volumes and lost quorum. Arbitrator only
>> stores metadata, no actual files. So yes, you were running in degraded mode
>> so some operations were hindered.
>>
>> @ Sahina - Yes, this actually worked fine for me once I did that.
>> However, the issue I am still facing, is when I go to create a new gluster
>> storage domain (replica 3, hyperconverged) and I tell it "Host to use" and
>> I select that host. If I fail that host, all VMs halt. I do not recall this
>> in 3.6 or early 4.0. This to me makes it seem like this is "pinning" a node
>> to a volume and vice versa like you could, for instance, for a singular
>> hyperconverged to ex: export a local disk via NFS and then mount it via
>> ovirt domain. But of course, this has its caveats. To that end, I am using
>> gluster replica 3, when configuring it I say "host to use: " node 1, then
>> in the connection details I give it node1:/data. I fail node1, all VMs
>> halt. Did I miss something?
>>
>> On Fri, Sep 1, 2017 at 2:13 AM, Sahina Bose <sab...@redhat.com> wrote:
>>
>>> To the OP question, when you set up a gluster storage domain, you need
>>> to specify backup-volfile-servers=<server2>:<server3>, where server2 and
>>> server3 also have bricks running. When server1 is down, and the volume is
>>> mounted again - server2 or server3 are queried to get the gluster volfiles.
>>>
>>> @Jim, if this does not work, are you using 4.1.5 build with libgfapi
>>> access? If not, please provide the vdsm and gluster mount logs to analyse
>>>
>>> If VMs go to paused state - this could mean the storage is not
>>> available. You can check "gluster volume status <volname>" to see if
>>> at least 2 bricks are running.
>>>
>>> On Fri, Sep 1, 2017 at 11:31 AM, Johan Bernhardsson <jo...@kafit.se>
>>> wrote:
>>>
>>>> If gluster drops in quorum so that it has fewer votes than it should, it
>>>> will stop file operations until quorum is back to normal. If I remember it
>>>> right, you need two bricks to write for quorum to be met, and the
>>>> arbiter is only a vote to avoid split brain.
>>>>
>>>>
>>>> Basically what you have is a raid5 solution without a spare. And when
>>>> one disk dies it will run in degraded mode. And some raid systems will stop
>>>> the raid until you have removed the disk or forced it to run anyway.
>>>>
>>>> You can read up on it here: https://gluster.readthed
>>>> ocs.io/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
>>>>
>>>> /Johan
>>>>
>>>> On Thu, 2017-08-31 at 22:33 -0700, Jim Kusznir wrote:
>>>>
>>>> Hi all:
>>>>
>>>> Sorry to hijack the thread, but I was about to start essentially the
>>>> same thread.
>>>>
>>>> I have a 3 node cluster, all three are hosts and gluster nodes (replica
>>>> 2 + arbiter).  I DO have the mnt_options=backup-volfile-servers= set:
>>>>
>>>> storage=192.168.8.11:/engine
>>>> mnt_options=backup-volfile-servers=192.168.8.12:192.168.8.13
>>>>
>>>> I had an issue today where 192.168.8.11 went down.  ALL VMs immediately
>>>> paused, including the engine (all VMs were running on host2:192.168.8.12).
>>>> I couldn't get any gluster stuff working until host1 (192.168.8.11) was
>>>> restored.
>>>>
>>>> What's wrong / what did I miss?
>>>>
>>>> (this was set up "manually" through the article on setting up
>>>> self-hosted gluster cluster back when 4.0 was new..I've upgraded it to 4.1
>>>

Re: [ovirt-users] hyperconverged question

2017-09-01 Thread Charles Kozler
@ Jim - you have only two data volumes and lost quorum. Arbitrator only
stores metadata, no actual files. So yes, you were running in degraded mode
so some operations were hindered.

@ Sahina - Yes, this actually worked fine for me once I did that. However,
the issue I am still facing, is when I go to create a new gluster storage
domain (replica 3, hyperconverged) and I tell it "Host to use" and I select
that host. If I fail that host, all VMs halt. I do not recall this in 3.6
or early 4.0. This makes it seem like it is "pinning" a node to a
volume and vice versa - like you could, for instance, on a single
hyperconverged host, export a local disk via NFS and then mount it via an
ovirt domain. But of course, this has its caveats. To that end, I am using
gluster replica 3, when configuring it I say "host to use: " node 1, then
in the connection details I give it node1:/data. I fail node1, all VMs
halt. Did I miss something?
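
For anyone hitting the same thing: the fix that comes up repeatedly in this thread is adding backup volfile servers to the mount options, both for the hosted engine and for regular gluster storage domains. Roughly (file path and option name as quoted elsewhere in the thread; hostnames are placeholders for your own nodes):

```shell
# Hosted engine: the conf should carry a backup-volfile-servers mount option,
# for example:
#   storage=node1:/engine
#   mnt_options=backup-volfile-servers=node2:node3
grep -E '^(storage|mnt_options)=' /etc/hosted-engine/hosted-engine.conf

# After editing, restart the HA services on every host
systemctl restart ovirt-ha-agent ovirt-ha-broker

# For a regular gluster storage domain, put the equivalent string in the
# "Mount Options" field of the storage domain dialog:
#   backup-volfile-servers=node2:node3
```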

On Fri, Sep 1, 2017 at 2:13 AM, Sahina Bose <sab...@redhat.com> wrote:

> To the OP question, when you set up a gluster storage domain, you need to
> specify backup-volfile-servers=<server2>:<server3>, where server2 and
> server3 also have bricks running. When server1 is down, and the volume is
> mounted again - server2 or server3 are queried to get the gluster volfiles.
>
> @Jim, if this does not work, are you using 4.1.5 build with libgfapi
> access? If not, please provide the vdsm and gluster mount logs to analyse
>
> If VMs go to paused state - this could mean the storage is not available.
> You can check "gluster volume status <volname>" to see if at least 2 bricks
> are running.
>
> On Fri, Sep 1, 2017 at 11:31 AM, Johan Bernhardsson <jo...@kafit.se>
> wrote:
>
>> If gluster drops in quorum so that it has fewer votes than it should, it
>> will stop file operations until quorum is back to normal. If I remember it
>> right, you need two bricks to write for quorum to be met, and the
>> arbiter is only a vote to avoid split brain.
>>
>>
>> Basically what you have is a raid5 solution without a spare. And when one
>> disk dies it will run in degraded mode. And some raid systems will stop the
>> raid until you have removed the disk or forced it to run anyway.
>>
>> You can read up on it here: https://gluster.readthed
>> ocs.io/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
>>
>> /Johan
>>
>> On Thu, 2017-08-31 at 22:33 -0700, Jim Kusznir wrote:
>>
>> Hi all:
>>
>> Sorry to hijack the thread, but I was about to start essentially the same
>> thread.
>>
>> I have a 3 node cluster, all three are hosts and gluster nodes (replica 2
>> + arbiter).  I DO have the mnt_options=backup-volfile-servers= set:
>>
>> storage=192.168.8.11:/engine
>> mnt_options=backup-volfile-servers=192.168.8.12:192.168.8.13
>>
>> I had an issue today where 192.168.8.11 went down.  ALL VMs immediately
>> paused, including the engine (all VMs were running on host2:192.168.8.12).
>> I couldn't get any gluster stuff working until host1 (192.168.8.11) was
>> restored.
>>
>> What's wrong / what did I miss?
>>
>> (this was set up "manually" through the article on setting up self-hosted
>> gluster cluster back when 4.0 was new..I've upgraded it to 4.1 since).
>>
>> Thanks!
>> --Jim
>>
>>
>> On Thu, Aug 31, 2017 at 12:31 PM, Charles Kozler <ckozler...@gmail.com>
>> wrote:
>>
>> Typo..."Set it up and then failed that **HOST**"
>>
>> And upon that host going down, the storage domain went down. I only have
>> hosted storage domain and this new one - is this why the DC went down and
>> no SPM could be elected?
>>
>> I don't recall this working this way in early 4.0 or 3.6
>>
>> On Thu, Aug 31, 2017 at 3:30 PM, Charles Kozler <ckozler...@gmail.com>
>> wrote:
>>
>> So I've tested this today and I failed a node. Specifically, I setup a
>> glusterfs domain and selected "host to use: node1". Set it up and then
>> failed that VM
>>
>> However, this did not work and the datacenter went down. My engine stayed
>> up, however, it seems configuring a domain to pin to a host to use will
>> obviously cause it to fail
>>
>> This seems counter-intuitive to the point of glusterfs or any redundant
>> storage. If a single host has to be tied to its function, this introduces a
>> single point of failure
>>
>> Am I missing something obvious?
>>
>> On Thu, Aug 31, 2017 at 9:43 AM, Kasturi Narra <kna...@redhat.com> wrote:
>>
>> yes, right.  What you can do is edit the hosted-e

Re: [ovirt-users] hyperconverged question

2017-08-31 Thread Charles Kozler
Typo..."Set it up and then failed that **HOST**"

And upon that host going down, the storage domain went down. I only have
the hosted storage domain and this new one - is this why the DC went down and
no SPM could be elected?

I don't recall this working this way in early 4.0 or 3.6

On Thu, Aug 31, 2017 at 3:30 PM, Charles Kozler <ckozler...@gmail.com>
wrote:

> So I've tested this today and I failed a node. Specifically, I setup a
> glusterfs domain and selected "host to use: node1". Set it up and then
> failed that VM
>
> However, this did not work and the datacenter went down. My engine stayed
> up, however, it seems configuring a domain to pin to a host to use will
> obviously cause it to fail
>
> This seems counter-intuitive to the point of glusterfs or any redundant
> storage. If a single host has to be tied to its function, this introduces a
> single point of failure
>
> Am I missing something obvious?
>
> On Thu, Aug 31, 2017 at 9:43 AM, Kasturi Narra <kna...@redhat.com> wrote:
>
>> yes, right.  What you can do is edit the hosted-engine.conf file and
>> there is a parameter as shown below [1] and replace h2 and h3 with your
>> second and third storage servers. Then you will need to restart
>> ovirt-ha-agent and ovirt-ha-broker services in all the nodes .
>>
>> [1] 'mnt_options=backup-volfile-servers=h2:h3'
>>
>> On Thu, Aug 31, 2017 at 5:54 PM, Charles Kozler <ckozler...@gmail.com>
>> wrote:
>>
>>> Hi Kasturi -
>>>
>>> Thanks for feedback
>>>
>>> > If the cockpit+gdeploy plugin had been used then that would
>>> have automatically detected glusterfs replica 3 volume created during
>>> Hosted Engine deployment and this question would not have been asked
>>>
>>> Actually, hosted-engine --deploy also auto-detects
>>> glusterfs.  I know glusterfs fuse client has the ability to failover
>>> between all nodes in cluster, but I am still curious given the fact that I
>>> see in ovirt config node1:/engine (being node1 I set it to in hosted-engine
>>> --deploy). So my concern was to ensure and find out exactly how engine
>>> works when one node goes away and the fuse client moves over to the other
>>> node in the gluster cluster
>>>
>>> But you did somewhat answer my question, the answer seems to be no (as
>>> default) and I will have to use hosted-engine.conf and change the parameter
>>> as you list
>>>
>>> So I need to do something manual to create HA for engine on gluster? Yes?
>>>
>>> Thanks so much!
>>>
>>> On Thu, Aug 31, 2017 at 3:03 AM, Kasturi Narra <kna...@redhat.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>>During Hosted Engine setup question about glusterfs volume is being
>>>> asked because you have set up the volumes yourself. If the cockpit+gdeploy
>>>> plugin had been used then that would have automatically detected
>>>> glusterfs replica 3 volume created during Hosted Engine deployment and this
>>>> question would not have been asked.
>>>>
>>>>During new storage domain creation when glusterfs is selected there
>>>> is a feature called 'use managed gluster volumes' and upon checking this
>>>> all glusterfs volumes managed will be listed and you could choose the
>>>> volume of your choice from the dropdown list.
>>>>
>>>> There is a conf file called /etc/hosted-engine/hosted-engine.conf
>>>> where there is a parameter called backup-volfile-servers="h1:h2" and if one
>>>> of the gluster node goes down engine uses this parameter to provide ha /
>>>> failover.
>>>>
>>>>  Hope this helps !!
>>>>
>>>> Thanks
>>>> kasturi
>>>>
>>>>
>>>>
>>>> On Wed, Aug 30, 2017 at 8:09 PM, Charles Kozler <ckozler...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello -
>>>>>
>>>>> I have successfully created a hyperconverged hosted engine setup
>>>>> consisting of 3 nodes - 2 for VM's and the third purely for storage. I
>>>>> manually configured it all, did not use ovirt node or anything. Built the
>>>>> gluster volumes myself
>>>>>
>>>>> However, I noticed that when setting up the hosted engine and even
>>>>> when adding a new storage domain with glusterfs type, it still asks for
>>>>> hostname:/volumename
>>>>>
>>>>> This leads me to believe that if that one node goes down (ex:
>>>>> node1:/data), then ovirt engine won't be able to communicate with that
>>>>> volume because it's trying to reach it on node 1 and thus, go down
>>>>>
>>>>> I know glusterfs fuse client can connect to all nodes to provide
>>>>> failover/ha but how does the engine handle this?
>>>>>
>>>>> ___
>>>>> Users mailing list
>>>>> Users@ovirt.org
>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>
>>>>>
>>>>
>>>
>>
>


Re: [ovirt-users] hyperconverged question

2017-08-31 Thread Charles Kozler
So I've tested this today and I failed a node. Specifically, I setup a
glusterfs domain and selected "host to use: node1". Set it up and then
failed that VM

However, this did not work and the datacenter went down. My engine stayed
up, however, it seems configuring a domain to pin to a host to use will
obviously cause it to fail

This seems counter-intuitive to the point of glusterfs or any redundant
storage. If a single host has to be tied to its function, this introduces a
single point of failure

Am I missing something obvious?

On Thu, Aug 31, 2017 at 9:43 AM, Kasturi Narra <kna...@redhat.com> wrote:

> yes, right.  What you can do is edit the hosted-engine.conf file and there
> is a parameter as shown below [1] and replace h2 and h3 with your second
> and third storage servers. Then you will need to restart ovirt-ha-agent and
> ovirt-ha-broker services in all the nodes .
>
> [1] 'mnt_options=backup-volfile-servers=h2:h3'
>
> On Thu, Aug 31, 2017 at 5:54 PM, Charles Kozler <ckozler...@gmail.com>
> wrote:
>
>> Hi Kasturi -
>>
>> Thanks for feedback
>>
>> > If the cockpit+gdeploy plugin had been used then that would have
>> automatically detected glusterfs replica 3 volume created during Hosted
>> Engine deployment and this question would not have been asked
>>
>> Actually, hosted-engine --deploy also auto-detects
>> glusterfs.  I know glusterfs fuse client has the ability to failover
>> between all nodes in cluster, but I am still curious given the fact that I
>> see in ovirt config node1:/engine (being node1 I set it to in hosted-engine
>> --deploy). So my concern was to ensure and find out exactly how engine
>> works when one node goes away and the fuse client moves over to the other
>> node in the gluster cluster
>>
>> But you did somewhat answer my question, the answer seems to be no (as
>> default) and I will have to use hosted-engine.conf and change the parameter
>> as you list
>>
>> So I need to do something manual to create HA for engine on gluster? Yes?
>>
>> Thanks so much!
>>
>> On Thu, Aug 31, 2017 at 3:03 AM, Kasturi Narra <kna...@redhat.com> wrote:
>>
>>> Hi,
>>>
>>>During Hosted Engine setup question about glusterfs volume is being
>>> asked because you have set up the volumes yourself. If the cockpit+gdeploy
>>> plugin had been used then that would have automatically detected
>>> glusterfs replica 3 volume created during Hosted Engine deployment and this
>>> question would not have been asked.
>>>
>>>During new storage domain creation when glusterfs is selected there
>>> is a feature called 'use managed gluster volumes' and upon checking this
>>> all glusterfs volumes managed will be listed and you could choose the
>>> volume of your choice from the dropdown list.
>>>
>>>     There is a conf file called /etc/hosted-engine/hosted-engine.conf
>>> where there is a parameter called backup-volfile-servers="h1:h2" and if one
>>> of the gluster node goes down engine uses this parameter to provide ha /
>>> failover.
>>>
>>>  Hope this helps !!
>>>
>>> Thanks
>>> kasturi
>>>
>>>
>>>
>>> On Wed, Aug 30, 2017 at 8:09 PM, Charles Kozler <ckozler...@gmail.com>
>>> wrote:
>>>
>>>> Hello -
>>>>
>>>> I have successfully created a hyperconverged hosted engine setup
>>>> consisting of 3 nodes - 2 for VM's and the third purely for storage. I
>>>> manually configured it all, did not use ovirt node or anything. Built the
>>>> gluster volumes myself
>>>>
>>>> However, I noticed that when setting up the hosted engine and even when
>>>> adding a new storage domain with glusterfs type, it still asks for
>>>> hostname:/volumename
>>>>
>>>> This leads me to believe that if that one node goes down (ex:
>>>> node1:/data), then ovirt engine won't be able to communicate with that
>>>> volume because it's trying to reach it on node 1 and thus, go down
>>>>
>>>> I know glusterfs fuse client can connect to all nodes to provide
>>>> failover/ha but how does the engine handle this?
>>>>
>>>>
>>>>
>>>
>>
>


Re: [ovirt-users] unsupported configuration: Unable to find security driver for model selinux

2017-08-31 Thread Charles Kozler
Also, to add to this: I figured all nodes now need to be "equal" in terms
of selinux, so I went on node 1, set selinux to permissive, and rebooted,
and then vdsmd wouldn't start, which showed the host as nonresponsive in the
engine UI. Upon inspection of the log, it was because of the missing sebool
module, so I ran 'vdsm-tool configure --force' and then vdsmd started fine.
Once I did this, the host came up in the web UI

Tested migrating a VM to it and it worked with no issue

Hope this helps someone else who lands in this situation. However, I'd like
to know what the expected environment for ovirt is; it would be helpful to
have some checks along the way for this condition if it is a blocker for
functions like migration
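The asymmetry I hit (node1 -> node2 works, node2 -> node1 fails) can be summarized in a small sketch. This is only an illustration of the failure condition as I understand it, not anything oVirt or libvirt ships:

```shell
# Illustration only: libvirt's 'selinux' security driver is loaded when
# SELinux is permissive or enforcing, and absent when it is disabled.
# Migration fails only when the source VM carries an selinux label
# (source not disabled) while the destination lacks the driver
# (destination disabled).
migratable() {
  src=$1; dst=$2
  if [ "$src" != disabled ] && [ "$dst" = disabled ]; then
    echo fails
  else
    echo works
  fi
}
migratable disabled permissive   # node1 -> node2 in this thread: prints "works"
migratable permissive disabled   # node2 -> node1 in this thread: prints "fails"
```

Permissive and enforcing both load the driver, so those two should migrate either way; disabled is the odd one out.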

On Thu, Aug 31, 2017 at 9:09 AM, Charles Kozler <ckozler...@gmail.com>
wrote:

> Hello,
>
> I recently installed ovirt cluster on 3 nodes and saw that I could only
> migrate one way
>
> Reviewing the logs I found this
>
> 2017-08-31 09:04:30,685-0400 ERROR (migsrc/1eca84bd) [virt.vm]
> (vmId='1eca84bd-2796-469d-a071-6ba2b21d82f4') unsupported configuration:
> Unable to find security driver for model selinux (migration:287)
> 2017-08-31 09:04:30,698-0400 ERROR (migsrc/1eca84bd) [virt.vm]
> (vmId='1eca84bd-2796-469d-a071-6ba2b21d82f4') Failed to migrate
> (migration:429)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line
> 411, in run
> self._startUnderlyingMigration(time.time())
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line
> 487, in _startUnderlyingMigration
> self._perform_with_conv_schedule(duri, muri)
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line
> 563, in _perform_with_conv_schedule
> self._perform_migration(duri, muri)
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line
> 529, in _perform_migration
> self._vm._dom.migrateToURI3(duri, params, flags)
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line
> 69, in f
> ret = attr(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line
> 123, in wrapper
> ret = f(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 944, in
> wrapper
> return func(inst, *args, **kwargs)
>   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1939, in
> migrateToURI3
> if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed',
> dom=self)
> libvirtError: unsupported configuration: Unable to find security driver
> for model selinux
>
>
> Which led me to this
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1013617
>
> I could migrate from node1 -> node 2 but not node2 -> node1, so obviously
> I had something different with node 1. In this case, it was selinux
>
> On node 1 it is set to disabled but on node 2 it is set to permissive. I
> am not sure how they got different but I wanted to update this list with
> this finding
>
> Node 2 was setup directly via web UI in the engine with host -> new.
> Perhaps I manually set node 1 to disabled
>
> Does ovirt / libvirt expect permissive? Or does it expect enforcing? Or
> does it need to be both the same matching?
>
> thanks!
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] unsupported configuration: Unable to find security driver for model selinux

2017-08-31 Thread Charles Kozler
Hello,

I recently installed ovirt cluster on 3 nodes and saw that I could only
migrate one way

Reviewing the logs I found this

2017-08-31 09:04:30,685-0400 ERROR (migsrc/1eca84bd) [virt.vm]
(vmId='1eca84bd-2796-469d-a071-6ba2b21d82f4') unsupported configuration:
Unable to find security driver for model selinux (migration:287)
2017-08-31 09:04:30,698-0400 ERROR (migsrc/1eca84bd) [virt.vm]
(vmId='1eca84bd-2796-469d-a071-6ba2b21d82f4') Failed to migrate
(migration:429)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 411,
in run
self._startUnderlyingMigration(time.time())
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 487,
in _startUnderlyingMigration
self._perform_with_conv_schedule(duri, muri)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 563,
in _perform_with_conv_schedule
self._perform_migration(duri, muri)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 529,
in _perform_migration
self._vm._dom.migrateToURI3(duri, params, flags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 69,
in f
ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line
123, in wrapper
ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 944, in
wrapper
return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1939, in
migrateToURI3
if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed',
dom=self)
libvirtError: unsupported configuration: Unable to find security driver for
model selinux


Which led me to this

https://bugzilla.redhat.com/show_bug.cgi?id=1013617

I could migrate from node1 -> node 2 but not node2 -> node1, so obviously I
had something different with node 1. In this case, it was selinux

On node 1 it is set to disabled but on node 2 it is set to permissive. I am
not sure how they ended up different, but I wanted to update this list with
this finding

Node 2 was set up directly via the web UI in the engine with host -> new.
Perhaps I manually set node 1 to disabled

Does ovirt / libvirt expect permissive? Or does it expect enforcing? Or do
the two hosts just need to match?

thanks!
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] hyperconverged question

2017-08-31 Thread Charles Kozler
Hi Kasturi -

Thanks for feedback

> If cockpit+gdeploy plugin would be have been used then that would have
automatically detected glusterfs replica 3 volume created during Hosted
Engine deployment and this question would not have been asked

Actually, hosted-engine --deploy also auto-detects glusterfs. I know the
glusterfs fuse client has the ability to fail over between all nodes in the
cluster, but I am still curious, given that I see node1:/engine in the ovirt
config (node1 being what I set it to in hosted-engine --deploy). So my
concern was to ensure and find out exactly how the engine behaves when one
node goes away and the fuse client moves over to the other node in the
gluster cluster

But you did somewhat answer my question: the answer seems to be no (by
default) and I will have to use hosted-engine.conf and change the parameter
as you describe

So I need to do something manually to create HA for the engine on gluster,
yes?

Thanks so much!
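A sketch of what that hosted-engine.conf change could look like (h2/h3 are placeholder hostnames for the other gluster nodes; the path is as given in this thread, and mnt_options is my recollection of the option that carries the mount options, so verify both against your version):

```ini
# /etc/hosted-engine/hosted-engine.conf (sketch, hypothetical values)
storage=node1:/engine
# extra volfile servers the fuse client can fall back to if node1 is down
mnt_options=backup-volfile-servers=h2:h3
```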

On Thu, Aug 31, 2017 at 3:03 AM, Kasturi Narra <kna...@redhat.com> wrote:

> Hi,
>
>During Hosted Engine setup question about glusterfs volume is being
> asked because you have setup the volumes yourself. If cockpit+gdeploy
> plugin would be have been used then that would have automatically detected
> glusterfs replica 3 volume created during Hosted Engine deployment and this
> question would not have been asked.
>
>During new storage domain creation when glusterfs is selected there is
> a feature called 'use managed gluster volumes' and upon checking this all
> glusterfs volumes managed will be listed and you could choose the volume of
> your choice from the dropdown list.
>
> There is a conf file called /etc/hosted-engine/hosted-engine.conf
> where there is a parameter called backup-volfile-servers="h1:h2" and if one
> of the gluster node goes down engine uses this parameter to provide ha /
> failover.
>
>  Hope this helps !!
>
> Thanks
> kasturi
>
>
>
> On Wed, Aug 30, 2017 at 8:09 PM, Charles Kozler <ckozler...@gmail.com>
> wrote:
>
>> Hello -
>>
>> I have successfully created a hyperconverged hosted engine setup
>> consisting of 3 nodes - 2 for VM's and the third purely for storage. I
>> manually configured it all, did not use ovirt node or anything. Built the
>> gluster volumes myself
>>
>> However, I noticed that when setting up the hosted engine and even when
>> adding a new storage domain with glusterfs type, it still asks for
>> hostname:/volumename
>>
>> This leads me to believe that if that one node goes down (ex:
>> node1:/data), then ovirt engine wont be able to communicate with that
>> volume because its trying to reach it on node 1 and thus, go down
>>
>> I know glusterfs fuse client can connect to all nodes to provide
>> failover/ha but how does the engine handle this?
>>
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] hyperconverged question

2017-08-30 Thread Charles Kozler
Hello -

I have successfully created a hyperconverged hosted engine setup consisting
of 3 nodes - 2 for VM's and the third purely for storage. I manually
configured it all, did not use ovirt node or anything. Built the gluster
volumes myself

However, I noticed that when setting up the hosted engine and even when
adding a new storage domain with glusterfs type, it still asks for
hostname:/volumename

This leads me to believe that if that one node goes down (ex: node1:/data),
then the ovirt engine won't be able to communicate with that volume, because
it is trying to reach it on node 1, and thus go down

I know glusterfs fuse client can connect to all nodes to provide
failover/ha but how does the engine handle this?
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] live migration between datacenters with shared storage

2017-06-01 Thread Charles Kozler
My only real concern with a detach and attach is the "what if": if the
upgrade of the storage domain does not go well, I will have to recover my
entire storage from backup

On Thu, Jun 1, 2017 at 2:50 PM, Yaniv Kaul <yk...@redhat.com> wrote:

>
>
> On Thu, Jun 1, 2017 at 4:55 PM, Adam Litke <ali...@redhat.com> wrote:
>
>> You cannot migrate VMs between Datacenters.  I think an export domain
>> will be your easiest option but there may be a way to upgrade in-place (ie.
>> upgrade engine while vms are running, then upgrade cluster) but I am not an
>> expert in this area.
>>
>
> Why is an export domain better than detach and attach a storage domain?
> Y.
>
>
>>
>> On Wed, May 31, 2017 at 4:08 PM, Charles Kozler <ckozler...@gmail.com>
>> wrote:
>>
>>> I couldnt find a definitive on this so I would like to inquire here
>>>
>>> I have gluster on my storage backend exporting the volume from a single
>>> node via NFS
>>>
>>> I have a DC of 4.0 and I would like to upgrade to 4.1. I would ideally
>>> like to take one node out of the cluster and build a 4.1 datacenter. Then
>>> live migrate VMs from the 4.0 DC over to the 4.1 DC with zero downtime to
>>> the VMs
>>>
>>> Is this possible? Or would I be safer to export/import VMs?
>>>
>>> Thanks!
>>>
>>> ___
>>> Users mailing list
>>> Users@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>>
>>
>>
>> --
>> Adam Litke
>>
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] live migration between datacenters with shared storage

2017-05-31 Thread Charles Kozler
I couldn't find a definitive answer on this, so I would like to inquire here

I have gluster on my storage backend exporting the volume from a single
node via NFS

I have a DC on 4.0 and I would like to upgrade to 4.1. I would ideally like
to take one node out of the cluster and build a 4.1 datacenter, then live
migrate VMs from the 4.0 DC over to the 4.1 DC with zero downtime to the VMs

Is this possible? Or would I be safer to export/import VMs?

Thanks!
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] storage redundancy in Ovirt

2017-04-15 Thread Charles Kozler
I think he means the SPM? I've seen that when the node holding SPM goes
down it isn't really seamless and takes a minute or two to catch up, unless
of course you put that node in maintenance mode first, but that isn't
possible if it crashes or something
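On the glusterfs option raised downthread: the reason it avoids this single-host dependency is the fuse client's backup volfile servers. A sketch with hypothetical hostnames and mount point (not from this thread):

```
# /etc/fstab sketch (hypothetical names): if node1 is unreachable, the
# gluster fuse client fetches the volfile from node2 or node3 instead,
# unlike an NFS mount pinned to a single server.
node1:/data  /mnt/data  glusterfs  defaults,backup-volfile-servers=node2:node3  0 0
```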

On Apr 15, 2017 1:38 PM, "FERNANDO FREDIANI" 
wrote:

> Hello Konstantin.
>
> That doesn`t make much sense make a whole cluster depend on a single host.
> From what I know any host talk directly to NFS Storage Array or whatever
> other Shared Storage you have.
> Have you tested that host going down if that affects the other with the
> NFS mounted directlly in a NFS Storage array ?
>
> Fernando
>
> 2017-04-15 12:42 GMT-03:00 Konstantin Raskoshnyi :
>
>> In ovirt you have to attach storage through specific host.
>> If host goes down storage is not available.
>>
>> On Sat, Apr 15, 2017 at 7:31 AM FERNANDO FREDIANI <
>> fernando.fredi...@upx.com> wrote:
>>
>>> Well, make it not go through host1 and dedicate a storage server for
>>> running NFS and make both hosts connect to it.
>>> In my view NFS is much easier to manage than any other type of storage,
>>> specially FC and iSCSI and performance is pretty much the same, so you
>>> won`t get better results other than management going to other type.
>>>
>>> Fernando
>>>
>>> 2017-04-15 5:25 GMT-03:00 Konstantin Raskoshnyi :
>>>
 Hi guys,
 I have one nfs storage,
 it's connected through host1.
 host2 also has access to it, I can easily migrate vms between them.

 The question is - if host1 is down - all infrastructure is down, since
 all traffic goes through host1,
 is there any way in oVirt to use redundant storage?

 Only glusterfs?

 Thanks


 ___
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users


>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] OSSEC reporting hidden processes

2017-03-22 Thread Charles Kozler
Well, yes, I am sure about the research I did :-)

However, to your point, I didn't actually consider that, and of course now
it clearly makes the most sense. Thanks!
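Didi's short-lived-process explanation is easy to reproduce; a minimal sketch (plain shell on Linux, nothing oVirt-specific):

```shell
# A process is visible in /proc only while it runs; a scanner that notes
# a PID and comes back after the process has exited sees it as "hidden".
sleep 0.2 &
pid=$!
[ -d "/proc/$pid" ] && echo "during: present"
wait "$pid"
[ -d "/proc/$pid" ] || echo "after: gone"
```

A rootcheck racing against vdsm's many short-lived 'dd' children would hit exactly this window.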

On Wed, Mar 22, 2017 at 9:57 AM, Yedidyah Bar David <d...@redhat.com> wrote:

> On Wed, Mar 22, 2017 at 3:43 PM, Charles Kozler <ckozler...@gmail.com>
> wrote:
> > Thanks for the feedback!
> >
> > From my research though, it seems like it would take some effort to start a
> > process and not register it in /proc, or at least it would be by intention
> > to do so for that desired effect.
>
> Are you sure about that?
>
> What if the process is simply very short-lived? Wouldn't something like
> OSSEC
> wrongly count it as "hiding" simply because it died in the middle of being
> investigated? E.g. vdsm runs many times instances of 'dd' to check that
> some
> file is readable (or writable), IIRC.
>
> > I guess my ask here would be why would
> > ovirt do that? Is there a relative performance gain? What processes
> inside
> > ovirt would do such a thing?
> >
> > Appreciate the help
> >
> > On Wed, Mar 22, 2017 at 3:32 AM, Yedidyah Bar David <d...@redhat.com>
> wrote:
> >>
> >> On Tue, Mar 21, 2017 at 7:54 PM, Charles Kozler <ckozler...@gmail.com>
> >> wrote:
> >> > Unfortunately by the time I am able to SSH to the server and start
> >> > looking
> >> > around, that PID is no where to be found
> >>
> >> Even if you do this immediately when OSSEC finishes?
> >> Do you get from it only a single pid?
> >>
> >> >
> >> > So it seems something winds up in ovirt, runs, doesnt register in
> /proc
> >> > (I
> >> > think even threads register themself in /proc),
> >>
> >> Now did some tests. Seems like they do, but are only "visible" if you
> >> access them directly, not if you e.g. 'ls -l /proc'.
> >>
> >> > and then dies off
> >> >
> >> > Any ideas?
> >>
> >> No idea about your specific issue. Based on your above question, did
> this:
> >>
> >> # for pid in $(seq 32768); do if kill -0 $pid 2>/dev/null && ! ls -1
> >> /proc | grep -qw $pid; then ps -e -T | grep -w $pid | awk '{print
> >> $1}'; fi; done | sort -u | while read ppid; do echo number of threads:
> >> $(ps -e -T | grep -w $ppid | wc -l) ps $ppid: $(ps -h -p $ppid); done
> >> number of threads: 5 ps 1149: 1149 ? Ssl 0:23 /usr/bin/python -Es
> >> /usr/sbin/tuned -l -P
> >> number of threads: 3 ps 1151: 1151 ? Ssl 0:55 /usr/sbin/rsyslogd -n
> >> number of threads: 2 ps 1155: 1155 ? Ssl 0:00 /usr/bin/ruby
> >> /usr/bin/fluentd -c /etc/fluentd/fluent.conf
> >> number of threads: 12 ps 1156: 1156 ? Ssl 4:49 /usr/sbin/collectd
> >> number of threads: 16 ps 1205: 1205 ? Ssl 0:08 /usr/sbin/libvirtd
> --listen
> >> number of threads: 6 ps 1426: 1426 ? Sl 23:57 /usr/bin/ruby
> >> /usr/bin/fluentd -c /etc/fluentd/fluent.conf
> >> number of threads: 32 ps 3171: 3171 ? S >> /usr/share/vdsm/vdsmd
> >> number of threads: 6 ps 3173: 3173 ? Ssl 8:48 python /usr/sbin/momd -c
> >> /etc/vdsm/mom.conf
> >> number of threads: 7 ps 575: 575 ? SLl 0:14 /sbin/multipathd
> >> number of threads: 3 ps 667: 667 ? SLsl 0:09 /usr/sbin/dmeventd -f
> >> number of threads: 2 ps 706: 706 ? S >> number of threads: 6 ps 730: 730 ? Ssl 0:00 /usr/lib/polkit-1/polkitd
> >> --no-debug
> >> number of threads: 3 ps 735: 735 ? Ssl 0:31 /usr/bin/python
> >> /usr/bin/ovirt-imageio-daemon
> >> number of threads: 4 ps 741: 741 ? S >> /usr/share/vdsm/supervdsmd --sockfile /var/run/vdsm/svdsm.sock
> >> number of threads: 2 ps 743: 743 ? Ssl 0:00 /bin/dbus-daemon --system
> >> --address=systemd: --nofork --nopidfile --systemd-activation
> >> number of threads: 6 ps 759: 759 ? Ssl 0:00 /usr/sbin/gssproxy -D
> >> number of threads: 5 ps 790: 790 ? SLsl 0:09 /usr/sbin/sanlock daemon
> >>
> >> (There are probably more efficient ways to do this, nvm).
> >>
> >> >
> >> > On Tue, Mar 21, 2017 at 3:10 AM, Yedidyah Bar David <d...@redhat.com>
> >> > wrote:
> >> >>
> >> >> On Mon, Mar 20, 2017 at 5:59 PM, Charles Kozler <
> ckozler...@gmail.com>
> >> >> wrote:
> >> >> > Hi -
> >> >> >
> >> >> > I am wondering why OSSEC would be reporting hidden processes on my
> >> >> > ovirt

Re: [ovirt-users] OSSEC reporting hidden processes

2017-03-22 Thread Charles Kozler
Thanks for the feedback!

From my research though, it seems like it would take some effort to start a
process and not register it in /proc, or at least it would be by intention
to do so for that desired effect. I guess my ask here would be: why would
ovirt do that? Is there a relative performance gain? What processes inside
ovirt would do such a thing?

Appreciate the help

On Wed, Mar 22, 2017 at 3:32 AM, Yedidyah Bar David <d...@redhat.com> wrote:

> On Tue, Mar 21, 2017 at 7:54 PM, Charles Kozler <ckozler...@gmail.com>
> wrote:
> > Unfortunately by the time I am able to SSH to the server and start
> looking
> > around, that PID is no where to be found
>
> Even if you do this immediately when OSSEC finishes?
> Do you get from it only a single pid?
>
> >
> > So it seems something winds up in ovirt, runs, doesnt register in /proc
> (I
> > think even threads register themself in /proc),
>
> Now did some tests. Seems like they do, but are only "visible" if you
> access them directly, not if you e.g. 'ls -l /proc'.
>
> > and then dies off
> >
> > Any ideas?
>
> No idea about your specific issue. Based on your above question, did this:
>
> # for pid in $(seq 32768); do if kill -0 $pid 2>/dev/null && ! ls -1
> /proc | grep -qw $pid; then ps -e -T | grep -w $pid | awk '{print
> $1}'; fi; done | sort -u | while read ppid; do echo number of threads:
> $(ps -e -T | grep -w $ppid | wc -l) ps $ppid: $(ps -h -p $ppid); done
> number of threads: 5 ps 1149: 1149 ? Ssl 0:23 /usr/bin/python -Es
> /usr/sbin/tuned -l -P
> number of threads: 3 ps 1151: 1151 ? Ssl 0:55 /usr/sbin/rsyslogd -n
> number of threads: 2 ps 1155: 1155 ? Ssl 0:00 /usr/bin/ruby
> /usr/bin/fluentd -c /etc/fluentd/fluent.conf
> number of threads: 12 ps 1156: 1156 ? Ssl 4:49 /usr/sbin/collectd
> number of threads: 16 ps 1205: 1205 ? Ssl 0:08 /usr/sbin/libvirtd --listen
> number of threads: 6 ps 1426: 1426 ? Sl 23:57 /usr/bin/ruby
> /usr/bin/fluentd -c /etc/fluentd/fluent.conf
> number of threads: 32 ps 3171: 3171 ? S /usr/share/vdsm/vdsmd
> number of threads: 6 ps 3173: 3173 ? Ssl 8:48 python /usr/sbin/momd -c
> /etc/vdsm/mom.conf
> number of threads: 7 ps 575: 575 ? SLl 0:14 /sbin/multipathd
> number of threads: 3 ps 667: 667 ? SLsl 0:09 /usr/sbin/dmeventd -f
> number of threads: 2 ps 706: 706 ? S number of threads: 6 ps 730: 730 ? Ssl 0:00 /usr/lib/polkit-1/polkitd
> --no-debug
> number of threads: 3 ps 735: 735 ? Ssl 0:31 /usr/bin/python
> /usr/bin/ovirt-imageio-daemon
> number of threads: 4 ps 741: 741 ? S /usr/share/vdsm/supervdsmd --sockfile /var/run/vdsm/svdsm.sock
> number of threads: 2 ps 743: 743 ? Ssl 0:00 /bin/dbus-daemon --system
> --address=systemd: --nofork --nopidfile --systemd-activation
> number of threads: 6 ps 759: 759 ? Ssl 0:00 /usr/sbin/gssproxy -D
> number of threads: 5 ps 790: 790 ? SLsl 0:09 /usr/sbin/sanlock daemon
>
> (There are probably more efficient ways to do this, nvm).
>
> >
> > On Tue, Mar 21, 2017 at 3:10 AM, Yedidyah Bar David <d...@redhat.com>
> wrote:
> >>
> >> On Mon, Mar 20, 2017 at 5:59 PM, Charles Kozler <ckozler...@gmail.com>
> >> wrote:
> >> > Hi -
> >> >
> >> > I am wondering why OSSEC would be reporting hidden processes on my
> ovirt
> >> > nodes? I run OSSEC across the infrastructure and multiple ovirt
> clusters
> >> > have assorted nodes that will report a process is running but does not
> >> > have
> >> > an entry in /proc and thus "possible rootkit" alert is fired
> >> >
> >> > I am well aware that I do not have rootkits on these systems but am
> >> > wondering what exactly inside ovirt is causing this to trigger? Or any
> >> > ideas? Below is sample alert. All my google-fu turns up is that a
> >> > process
> >> > would have to **try** to hide itself from /proc, so curious what this
> is
> >> > inside ovirt. Thanks!
> >> >
> >> > -
> >> >
> >> > OSSEC HIDS Notification.
> >> > 2017 Mar 20 11:54:47
> >> >
> >> > Received From: (ovirtnode2.mydomain.com2) any->rootcheck
> >> > Rule: 510 fired (level 7) -> "Host-based anomaly detection event
> >> > (rootcheck)."
> >> > Portion of the log(s):
> >> >
> >> > Process '24574' hidden from /proc. Possible kernel level rootkit.
> >>
> >> What do you get from:
> >>
> >> ps -eLf | grep -w 24574
> >>
> >> Thanks,
> >> --
> >> Didi
> >
> >
>
>
>
> --
> Didi
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] OSSEC reporting hidden processes

2017-03-21 Thread Charles Kozler
Unfortunately, by the time I am able to SSH to the server and start looking
around, that PID is nowhere to be found

So it seems something winds up in ovirt, runs, doesn't register in /proc (I
think even threads register themselves in /proc), and then dies off

Any ideas?

On Tue, Mar 21, 2017 at 3:10 AM, Yedidyah Bar David <d...@redhat.com> wrote:

> On Mon, Mar 20, 2017 at 5:59 PM, Charles Kozler <ckozler...@gmail.com>
> wrote:
> > Hi -
> >
> > I am wondering why OSSEC would be reporting hidden processes on my ovirt
> > nodes? I run OSSEC across the infrastructure and multiple ovirt clusters
> > have assorted nodes that will report a process is running but does not
> have
> > an entry in /proc and thus "possible rootkit" alert is fired
> >
> > I am well aware that I do not have rootkits on these systems but am
> > wondering what exactly inside ovirt is causing this to trigger? Or any
> > ideas? Below is sample alert. All my google-fu turns up is that a process
> > would have to **try** to hide itself from /proc, so curious what this is
> > inside ovirt. Thanks!
> >
> > -
> >
> > OSSEC HIDS Notification.
> > 2017 Mar 20 11:54:47
> >
> > Received From: (ovirtnode2.mydomain.com2) any->rootcheck
> > Rule: 510 fired (level 7) -> "Host-based anomaly detection event
> > (rootcheck)."
> > Portion of the log(s):
> >
> > Process '24574' hidden from /proc. Possible kernel level rootkit.
>
> What do you get from:
>
> ps -eLf | grep -w 24574
>
> Thanks,
> --
> Didi
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] OSSEC reporting hidden processes

2017-03-20 Thread Charles Kozler
Hi -

I am wondering why OSSEC would be reporting hidden processes on my ovirt
nodes. I run OSSEC across the infrastructure, and multiple ovirt clusters
have assorted nodes that will report a process is running but does not have
an entry in /proc, and thus a "possible rootkit" alert is fired

I am well aware that I do not have rootkits on these systems, but am
wondering what exactly inside ovirt is causing this to trigger. Any ideas?
Below is a sample alert. All my google-fu turns up is that a process would
have to **try** to hide itself from /proc, so I am curious what this is
inside ovirt. Thanks!

-

OSSEC HIDS Notification.
2017 Mar 20 11:54:47

Received From: (ovirtnode2.mydomain.com2) any->rootcheck
Rule: 510 fired (level 7) -> "Host-based anomaly detection event
(rootcheck)."
Portion of the log(s):

Process '24574' hidden from /proc. Possible kernel level rootkit.



 --END OF NOTIFICATION


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Storage domains not found by vdsm after a reboot

2016-12-03 Thread Charles Kozler
I am facing the same issue here as well. The engine comes up and the web UI
is reachable. The initial login takes about 6 minutes to finally let me in,
and then once I am in, under the events tab there are events for "storage
domain  does not exist", yet they are all there. After this comes
'reconstructing master domain' and it tries to cycle through my 2 storage
domains, not including the ISO_UPLOAD and hosted_engine domains. Eventually
it will either 1.) settle on one and actually be able to bring it up as
master domain, or 2.) they all stay down and I have to manually activate one

It's not really an issue, since on three tests now I have recovered fine,
but it required some manual intervention on at least one occasion;
otherwise it just flaps about until it can settle on one and actually bring
it up

Clocking it today, it's usually like this:

7 minutes for the HE to come up on node 1 and access to the web UI
+6 minutes hanging while logging in to the web UI
+9 minutes for one of the two storage domains to get activated as master

Total: around 20 minutes before the entire cluster is usable.

On Sat, Dec 3, 2016 at 11:14 AM, Yoann Laissus 
wrote:

> Hello,
>
> I'm running into some weird issues with vdsm and my storage domains
> after a reboot or a shutdown. I can't manage to figure out what's
> going on...
>
> Currently, my cluster (4.0.5 with hosted engine) is composed of one
> main node. (and another inactive one but unrelated to this issue).
> It has local storage exposed to oVirt via 3 NFS exports (one specific
> for the hosted engine vm) reachable from my local network.
>
> When I wan't to shutdown or reboot my main host (and so the whole
> cluster), I use a custom script :
> 1. Shutdown all VM
> 2. Shutdown engine VM
> 3. Stop HA agent and broker
> 4. Stop vdsmd
> 5. Release the sanlock on the hosted engine SD
> 6. Shutdown / Reboot
>
> It works just fine, but at the next boot, VDSM takes at least 10-15
> minutes to find storage domains, except the hosted engine one. The
> engine loops trying to reconstruct the SPM.
> During this time, vdsClient getConnectedStoragePoolsList returns nothing.
> getStorageDomainsList returns only the hosted engine domain.
> NFS exports are mountable from another server.
>
> But when I restart vdsm manually after the boot, it seems to detect
> immediately the storage domains.
>
> Is there some kind of staled storage data used by vdsm and a timeout
> to invalidate them ?
> Am I missing something on the vdsm side in my shutdown procedure ?
>
> Thanks !
>
> Engine and vdsm logs are attached.
>
>
> --
> Yoann Laissus
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] deleted then rebuilt node still showing up in status

2016-11-29 Thread Charles Kozler
I put the node into maintenance mode, removed all VMs from it, and then
deleted it from the web UI. I then rebuilt it with hosted-engine --deploy
and indicated it is now node 4 (it was previously #3), and now in
--vm-status it is still showing up as stale data

What can I do?

In fact, it looks like ovirt thinks it is still in maintenance mode, I am
presuming because I used the same host name?


--== Host 1 status ==--

Status up-to-date  : True
Hostname   : node03
Host ID: 1
Engine status  : {"reason": "vm not running on this
host", "health": "bad", "vm": "down", "detail": "unknown"}
Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 87e08365
Host timestamp : 3270683
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=3270683 (Tue Nov 29 08:43:52 2016)
host-id=1
score=3400
maintenance=False
state=EngineDown
stopped=False


--== Host 2 status ==--

Status up-to-date  : True
Hostname   : node01
Host ID: 2
Engine status  : {"health": "good", "vm": "up",
"detail": "up"}
Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 66c7154a
Host timestamp : 3267326
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=3267326 (Tue Nov 29 08:43:44 2016)
host-id=2
score=3400
maintenance=False
state=EngineUp
stopped=False


--== Host 3 status ==--

Status up-to-date  : False
Hostname   : node02
Host ID: 3
Engine status  : unknown stale-data
Score  : 0
stopped: False
Local maintenance  : True
crc32  : 040b4192
Host timestamp : 211984
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=211984 (Mon Nov 28 20:28:06 2016)
host-id=3
score=0
maintenance=True
state=LocalMaintenance
stopped=False


--== Host 4 status ==--

Status up-to-date  : True
Hostname   : node02
Host ID: 4
Engine status  : {"reason": "vm not running on this
host", "health": "bad", "vm": "down", "detail": "unknown"}
Score  : 0
stopped: False
Local maintenance  : True
crc32  : 6aa7d568
Host timestamp : 43683
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=43683 (Tue Nov 29 08:44:11 2016)
host-id=4
score=0
maintenance=True
state=LocalMaintenance
stopped=False



In the web UI it does not show the node (node02 / Host ID #4) as being in
maintenance mode
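Not covered in this thread, but the way I'd expect to purge a stale slot like Host 3 above is hosted-engine's clean-metadata action. A dry-run sketch; the flag names are from memory and should be checked against `hosted-engine --help` on your build:

```shell
# Dry run: prints each command instead of executing it. Swap the body of
# run() for "$@" on a real host. --host-id targets the stale entry seen
# in --vm-status; --force-cleanup is for when the agent isn't running on
# that host. Both flags are assumptions to verify on your version.
run() { echo "+ $*"; }
run hosted-engine --vm-status
run hosted-engine --clean-metadata --host-id=3 --force-cleanup
```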
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] ovirtmgmt manual bridge cannot be used in ovirt 4.0

2016-11-28 Thread Charles Kozler
What happens when you configure the bond and then build the bridge manually
over the bond? The oVirt installer should skip over it and not do anything.
Just make sure you have DEFROUTE set, or the routes configuration file set
up as you expect (this is what used to screw me up)
On Mon, Nov 28, 2016 at 10:06 AM, <jo...@familiealbers.nl> wrote:

> Thanks for your responses but the ui is not an option for me as i am
> dealing with loads of systems.
> in 3.5 ovirt used to just accept the bridge as it was and incorporate it,
> i am just wondering if i am facing a bug or a feature at the moment.
>
>
> Charles Kozler schreef op 2016-11-28 15:48:
>
>> Thats what I used to do as well then on oVirt 4 it started screwing
>> with the the bond as well so I ended up just dumbing it down and
>> figured using the UI after the fact would be OK. I cant remember
>> exactly what would happen but it would be stupid little things like
>> routing would break or something.
>>
>> On Mon, Nov 28, 2016 at 9:43 AM, Simone Tiraboschi
>> <stira...@redhat.com [8]> wrote:
>>
>> On Mon, Nov 28, 2016 at 3:42 PM, Charles Kozler
>>> <ckozler...@gmail.com [7]> wrote:
>>>
>>> What Ive been doing since oVirt 4 is just configuring one NIC
>>>> manually when I provision the server (eg: eth0, em1, etc) and then
>>>> let oVirt do the bridge setup. Once the engine is up I login to
>>>> the UI and I use it to bond the NICs in whatever fashion I need
>>>> (LACP or active-backup). Any time I tried to configure ovirtmgmt
>>>> manually it seemed to "annoy" the hosted-engine --deploy script
>>>>
>>>
>>> This is fine.
>>> Another thing you could do is manually creating the bond and then
>>> having hosted-engine-setup creating the management bridge over your
>>> bond.
>>>
>>>
>>>
>>> On Mon, Nov 28, 2016 at 9:33 AM, Simone Tiraboschi
>>>> <stira...@redhat.com [6]> wrote:
>>>>
>>>> On Mon, Nov 28, 2016 at 12:24 PM, <jo...@familiealbers.nl [3]>
>>>>> wrote:
>>>>>
>>>>> Hi All,
>>>>>>
>>>>>> In our ovirt 3.5 setup. i have always setup the ovirtmgmt
>>>>>> bridge manually .
>>>>>> The bridge consisted of 2 nics
>>>>>>
>>>>>> I'd have /etc/vdsm/vdsm.conf list net_persist = ifcfg
>>>>>>
>>>>>>
>>>>>> When i then deployed the host from the ovirt ui or api it
>>>>>> would install and would display the network setup correctly in
>>>>>> the ui.
>>>>>>
>>>>>> On ovirt 4. (vdsm-4.18.15.3-1.el7.centos.x86_64)
>>>>>> I seem unable to follow the same approach.
>>>>>>
>>>>>> In the engine logs i get among other things
>>>>>>
>>>>>> If the interface ovirtmgmt is a bridge, it should be
>>>>>> torn-down manually.
>>>>>>
>>>>>> the interface is indeed a bridge with two nics which i would
>>>>>> like to keep this way.
>>>>>>
>>>>>> On the host vdsm.log i get limited info,
>>>>>>
>>>>>> when start a python terminal to obtain netinfo i get this
>>>>>>
>>>>>> >>> from vdsm.tool import unified_persistence
>>>>>> >>> unified_persistence.netswitch.netinfo()
>>>>>> Traceback (most recent call last):
>>>>>>   File "<stdin>", line 1, in <module>
>>>>>>   File
>>>>>> "/usr/lib/python2.7/site-packages/vdsm/network/netswitch.py",
>>>>>> line 298, in netinfo
>>>>>> _netinfo = netinfo_get(compatibility=compatibility)
>>>>>>   File
>>>>>>
>>>>>>
>>>>> "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/cache.py",
>>>>
>>>>> line 109, in get
>>>>>> return _get(vdsmnets)
>>>>>>   File
>>>>>>
>>>>>>
>>>>> "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/cache.py",
>>>>
>>>>> line 101, in _get
>>>>>> report_network_qos(networking)
>>>>>>   File
>>>>>>
>>>>>> "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/qos.py",
>>>>>
>>>>>> line 46, in report_network_qos
>>>>>> iface, = host_ports
>>>>>> ValueError: too many values to unpack
>>>>>>
>>>>>> As it appears the line in question does not like to deal with
>>>>>> a list of nics i think.
>>>>>> but either way.
>>>>>>
>>>>>> Is in ovirt 4 the ability to use the ovirtmgmt bridge with
>>>>>> multiple nics removed?
>>>>>>
>>>>>
>>>>> But do you need a bridge or a bond?
>>>>>
>>>>>
>>>>> If so what can i do to stick to what we have done in the past.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> ___
>>>>>> Users mailing list
>>>>>> Users@ovirt.org [1]
>>>>>> http://lists.ovirt.org/mailman/listinfo/users [2]
>>>>>>
>>>>>
>>>>> ___
>>>>> Users mailing list
>>>>> Users@ovirt.org [4]
>>>>> http://lists.ovirt.org/mailman/listinfo/users [5]
>>>>>
>>>>
>>
>>
>> Links:
>> --
>> [1] mailto:Users@ovirt.org
>> [2] http://lists.ovirt.org/mailman/listinfo/users
>> [3] mailto:jo...@familiealbers.nl
>> [4] mailto:Users@ovirt.org
>> [5] http://lists.ovirt.org/mailman/listinfo/users
>> [6] mailto:stira...@redhat.com
>> [7] mailto:ckozler...@gmail.com
>> [8] mailto:stira...@redhat.com
>>
>
>


Re: [ovirt-users] ovirtmgmt manual bridge cannot be used in ovirt 4.0

2016-11-28 Thread Charles Kozler
That's what I used to do as well, then on oVirt 4 it started screwing with
the bond as well so I ended up just dumbing it down and figured using
the UI after the fact would be OK. I can't remember exactly what would
happen but it would be stupid little things like routing would break or
something.

On Mon, Nov 28, 2016 at 9:43 AM, Simone Tiraboschi <stira...@redhat.com>
wrote:

>
>
> On Mon, Nov 28, 2016 at 3:42 PM, Charles Kozler <ckozler...@gmail.com>
> wrote:
>
>> What I've been doing since oVirt 4 is just configuring one NIC manually
>> when I provision the server (eg: eth0, em1, etc) and then let oVirt do the
>> bridge setup. Once the engine is up I login to the UI and I use it to bond
>> the NICs in whatever fashion I need (LACP or active-backup). Any time I
>> tried to configure ovirtmgmt manually it seemed to "annoy" the
>> hosted-engine --deploy script
>>
>
> This is fine.
> Another thing you could do is manually creating the bond and then having
> hosted-engine-setup creating the management bridge over your bond.
>
>
>>
>> On Mon, Nov 28, 2016 at 9:33 AM, Simone Tiraboschi <stira...@redhat.com>
>> wrote:
>>
>>>
>>>
>>> On Mon, Nov 28, 2016 at 12:24 PM, <jo...@familiealbers.nl> wrote:
>>>
>>>> Hi All,
>>>>
>>>> In our ovirt 3.5 setup. i have always setup the ovirtmgmt bridge
>>>> manually .
>>>> The bridge consisted of 2 nic's
>>>>
>>>> I'd have /etc/vdsm/vdsm.conf list net_persist = ifcfg
>>>>
>>>> When i then deployed the host from the ovirt ui or api it would install
>>>> and would display the network setup correctly in the ui.
>>>>
>>>> On ovirt 4. (vdsm-4.18.15.3-1.el7.centos.x86_64)
>>>> I seem unable to follow the same approach.
>>>>
>>>> In the engine logs i get among other things
>>>>
>>>> 'If the interface ovirtmgmt is a bridge, it should be torn-down
>>>> manually. '
>>>>
>>>> the interface is indeed a bridge with two nics which i would like to
>>>> keep this way.
>>>>
>>>> On the host vdsm.log i get limited info,
>>>>
>>>>
>>>>
>>>> when start a python terminal to obtain netinfo i get this
>>>>
>>>> >>> from vdsm.tool import unified_persistence
>>>> >>> unified_persistence.netswitch.netinfo()
>>>> Traceback (most recent call last):
>>>>   File "<stdin>", line 1, in <module>
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/network/netswitch.py",
>>>> line 298, in netinfo
>>>> _netinfo = netinfo_get(compatibility=compatibility)
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/cache.py",
>>>> line 109, in get
>>>> return _get(vdsmnets)
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/cache.py",
>>>> line 101, in _get
>>>> report_network_qos(networking)
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/qos.py",
>>>> line 46, in report_network_qos
>>>> iface, = host_ports
>>>> ValueError: too many values to unpack
>>>>
>>>>
>>>> As it appears the line in question does not like to deal with a list of
>>>> nics i think.
>>>> but either way.
>>>>
>>>> Is in ovirt 4 the ability to use the ovirtmgmt bridge with multiple
>>>> nics removed?
>>>>
>>>
>>> But do you need a bridge or a bond?
>>>
>>>
>>>> If so what can i do to stick to what we have done in the past.
>>>>
>>>>
>>>> Thanks.
>>>>
>>>> ___
>>>> Users mailing list
>>>> Users@ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>
>>>
>>>
>>> ___
>>> Users mailing list
>>> Users@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>>
>>
>


Re: [ovirt-users] ovirtmgmt manual bridge cannot be used in ovirt 4.0

2016-11-28 Thread Charles Kozler
What I've been doing since oVirt 4 is just configuring one NIC manually
when I provision the server (eg: eth0, em1, etc) and then let oVirt do the
bridge setup. Once the engine is up I login to the UI and I use it to bond
the NICs in whatever fashion I need (LACP or active-backup). Any time I
tried to configure ovirtmgmt manually it seemed to "annoy" the
hosted-engine --deploy script

On Mon, Nov 28, 2016 at 9:33 AM, Simone Tiraboschi 
wrote:

>
>
> On Mon, Nov 28, 2016 at 12:24 PM,  wrote:
>
>> Hi All,
>>
>> In our ovirt 3.5 setup. i have always setup the ovirtmgmt bridge manually
>> .
>> The bridge consisted of 2 nic's
>>
>> I'd have /etc/vdsm/vdsm.conf list net_persist = ifcfg
>>
>> When i then deployed the host from the ovirt ui or api it would install
>> and would display the network setup correctly in the ui.
>>
>> On ovirt 4. (vdsm-4.18.15.3-1.el7.centos.x86_64)
>> I seem unable to follow the same approach.
>>
>> In the engine logs i get among other things
>>
>> 'If the interface ovirtmgmt is a bridge, it should be torn-down manually.
>> '
>>
>> the interface is indeed a bridge with two nics which i would like to keep
>> this way.
>>
>> On the host vdsm.log i get limited info,
>>
>>
>>
>> when start a python terminal to obtain netinfo i get this
>>
>> >>> from vdsm.tool import unified_persistence
>> >>> unified_persistence.netswitch.netinfo()
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "/usr/lib/python2.7/site-packages/vdsm/network/netswitch.py",
>> line 298, in netinfo
>> _netinfo = netinfo_get(compatibility=compatibility)
>>   File "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/cache.py",
>> line 109, in get
>> return _get(vdsmnets)
>>   File "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/cache.py",
>> line 101, in _get
>> report_network_qos(networking)
>>   File "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/qos.py",
>> line 46, in report_network_qos
>> iface, = host_ports
>> ValueError: too many values to unpack
>>
>>
>> As it appears the line in question does not like to deal with a list of
>> nics i think.
>> but either way.
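The traceback bottoms out in a one-line assumption that is easy to reproduce outside vdsm. A minimal sketch of just the unpacking idiom (interface names are made up; this is not vdsm code):

```python
# vdsm's report_network_qos does `iface, = host_ports`: single-element
# tuple unpacking, which assumes each QoS-reporting network sits on
# exactly one host port. An oVirt-built bridge satisfies that; a
# hand-built bridge enslaving two NICs hands it a two-item list.
host_ports = ["em1"]             # oVirt-created bridge: one port
iface, = host_ports              # unpacks fine
assert iface == "em1"

host_ports = ["em1", "em2"]      # manually built bridge over two NICs
try:
    iface, = host_ports          # the same statement vdsm executes
except ValueError as err:        # matches the ValueError in the traceback
    print("vdsm fails here:", err)
```

So the failure is not that bridges are forbidden as such; this QoS reporting path simply never expected more than one port under the bridge.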
>>
>> Is in ovirt 4 the ability to use the ovirtmgmt bridge with multiple nics
>> removed?
>>
>
> But do you need a bridge or a bond?
>
>
>> If so what can i do to stick to what we have done in the past.
>>
>>
>> Thanks.
>>
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>


Re: [ovirt-users] GFS2 and OCFS2 for Shared Storage

2016-11-23 Thread Charles Kozler
Hey Fernando -

I've had success using OCFS2 with both oVirt and Xen, although you still
need something to replicate the blocks, and this is where DRBD comes in. The
premise was simple - configure two DRBD devices and then set up OCFS2 as
desired (very straightforward compared to GFS2). Start the cluster
and export via NFS. From there you create an oVirt storage domain with an NFS
backend and it's good to go

On your note about using the network traffic for better stuff - eg: VM
traffic - it's usually wise, when you have the capability, to keep your
storage network separate from your VM network so that you do not have any
latency between your VM nodes and their backend storage. If, for instance,
one VM starts crippling the network (in whatever scenario), then your
oVirt nodes and engine cannot contact storage. oVirt will begin to take
corrective action and will pause all of your VMs

On Wed, Nov 23, 2016 at 9:08 AM, Fernando Frediani <
fernando.fredi...@upx.com.br> wrote:

> Right Pavel. Then where is it or where is the reference to it ?
>
> The only way I heard of is using Thinprovisioning in the SAN level.
>
> With regards to OCFS2 if anyone has any experience with I would like to
> hear about its sucess or not using it.
>
> Thanks
>
> Fernando
>
>
>
> On 23/11/2016 11:46, Pavel Gashev wrote:
>
>> Fernando,
>>
>> Clustered LVM doesn’t support lvmthin(7) http://man7.org/linux/man-page
>> s/man7/lvmthin.7.html
>> There is an oVirt LVM-based thin provisioning implementation.
>>
>> -Original Message-
>> From: Fernando Frediani 
>> Date: Wednesday 23 November 2016 at 16:31
>> To: Pavel Gashev , "users@ovirt.org" 
>> Subject: Re: [ovirt-users] GFS2 and OCFS2 for Shared Storage
>>
>> Are you sure Pavel ?
>>
>> As far as I know and it has been discussed in this list before, the
>> limitation is in CLVM which doesn't support Thinprovisioning yet. LVM2
>> does, but it is not in Clustered mode. I tried to use GFS2 in the past
>> for other non-virtualization related stuff and didn't have much success
>> either.
>>
>> What about OCFS2 ? Has anyone ?
>>
>> Fernando
>>
>>
>> On 23/11/2016 11:26, Pavel Gashev wrote:
>>
>>> Fernando,
>>>
>>> oVirt supports thin provisioning for shared block storages (DAS or
>>> iSCSI). It works using QCOW2 disk images directly on LVM volumes. oVirt
>>> extends volumes when QCOW2 is growing.
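The extension mechanism Pavel describes can be pictured with a toy watermark check (the threshold and chunk numbers below are made up for illustration, not oVirt's actual values):

```python
def needs_extension(used_gb, lv_size_gb, threshold_gb=0.5, chunk_gb=1.0):
    """Toy watermark check for QCOW2-on-LV thin provisioning.

    When the data written into the QCOW2 image approaches the end of
    the underlying logical volume, the host extends the LV by a fixed
    chunk. All numbers here are illustrative, not oVirt defaults.
    """
    free_gb = lv_size_gb - used_gb
    return chunk_gb if free_gb < threshold_gb else 0.0

print(needs_extension(9.6, 10.0))  # near the watermark: extend by 1.0
print(needs_extension(5.0, 10.0))  # plenty of headroom: 0.0
```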
>>>
>>> I tried GFS2. It's slow, and blocks other hosts on a host failure.
>>>
>>> -Original Message-
>>> From:  on behalf of Fernando Frediani <
>>> fernando.fredi...@upx.com.br>
>>> Date: Wednesday 23 November 2016 at 15:03
>>> To: "users@ovirt.org" 
>>> Subject: [ovirt-users] GFS2 and OCFS2 for Shared Storage
>>>
>>> Has anyone managed to use GFS2 or OCFS2 for Shared Block Storage between
>>> hosts ? How scalable was it and which of the two work better ?
>>>
>>> Using traditional CLVM is far from good starting because of the lack of
>>> Thinprovision so I'm willing to consider either of the Filesystems.
>>>
>>> Thanks
>>>
>>> Fernando
>>>
>>> ___
>>> Users mailing list
>>> Users@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>>
>>>
>>
>>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>


Re: [ovirt-users] oVirt and Ubuntu>=14.04?

2016-10-20 Thread Charles Kozler
Probably not. There are a lot of differences between CentOS/RHEL and
Ubuntu/Debian in this regard - paths like /etc/sysconfig and similar that
it expects. You can run Ubuntu VMs, of course, which is why you found the
guest utilities. If you want a Debian-based KVM solution you can look at
Proxmox

On Thu, Oct 20, 2016 at 7:22 PM, Jon Forrest <jon.forr...@locationlabs.com>
wrote:

>
>
> On 10/20/16 4:11 PM, Charles Kozler wrote:
>
>> oVirt is the upstream project for Red Hat Enterprise
>> Virtualization (RHEV). As expected, it's only supported on CentOS 7 (and
>> older versions on 6)
>>
>
> This makes sense. But, do either of these components work on Ubuntu,
> and, if so, how well?
>
> Thanks for any information.
>
> Jon Forrest
>
>


Re: [ovirt-users] oVirt and Ubuntu>=14.04?

2016-10-20 Thread Charles Kozler
oVirt is the upstream project for Red Hat Enterprise Virtualization
(RHEV). As expected, it's only supported on CentOS 7 (and older versions on
6)

On Thu, Oct 20, 2016 at 6:46 PM, Jon Forrest 
wrote:

> I've done some looking around to find the status of oVirt running on
> Ubuntu 14.04 and later. I found the announcement of the
> ovirt-guest-agent for Ubuntu but I see nothing about the oVirt
> Engine on Ubuntu. Given that KVM works fine on Ubuntu, I'm curious
> what's preventing all of oVirt from working on Ubuntu.
>
> Cordially,
> Jon Forrest
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>


Re: [ovirt-users] Failed to read hardware information

2016-10-09 Thread Charles Kozler
Possibly stupid question but are you doing this on a base empty centos/rhel
7?

On Oct 9, 2016 9:48 PM, "David Pinkerton"  wrote:

>
> I've spent the weekend trying to get to the bottom of this issue.
>
> Adding a Host fails:
>
> From RHVM
>
>
> VDSM rhv1 command failed: Connection reset by peer
> Could not get hardware information for host rhv1
> VDSM rhv1 command failed: Failed to read hardware information
> Host rhv1 installed
> Network changes were saved on host rhv1
> Installing Host rhv1. Stage: Termination.
> Installing Host rhv1. Retrieving installation logs to:
> '/var/log/ovirt-engine/host-deploy/ovirt-host-deploy-
> 20161010115606-192.168.21.71-24d39274.log'.
> Installing Host rhv1. Stage: Pre-termination.
> Installing Host rhv1. Starting ovirt-vmconsole-host-sshd.
> Installing Host rhv1. Starting vdsm.
> Installing Host rhv1. Stopping libvirtd.
> Installing Host rhv1. Stage: Closing up.
> Installing Host rhv1. Setting kernel arguments.
> Installing Host rhv1. Stage: Transaction commit.
> Installing Host rhv1. Enrolling serial console certificate.
> Installing Host rhv1. Enrolling certificate.
> Installing Host rhv1. Stage: Misc configuration.
>
>
>
> This was in the /var/log/vdsm/vdsm.log on the host trying to be added:
>
> jsonrpc.Executor/2::ERROR::2016-10-10 
> 11:57:10,276::API::1340::vds::(getHardwareInfo)
> failed to retrieve hardware info
> Traceback (most recent call last):
>   File "/usr/share/vdsm/API.py", line 1337, in getHardwareInfo
> hw = supervdsm.getProxy().getHardwareInfo()
>   File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 53, in
> __call__
> return callMethod()
>   File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 51, in
> <lambda>
> **kwargs)
>   File "<string>", line 2, in getHardwareInfo
>   File "/usr/lib64/python2.7/multiprocessing/managers.py", line 759, in
> _callmethod
> kind, result = conn.recv()
> EOFError
>
>
> and then VDSM fails to start.
>
>
>
> Looking at the source code...
>
> def getHardwareInfoStructure():
> dmiInfo = getAllDmidecodeInfo()
> sysStruct = {}
> for k1, k2 in (('system', 'Manufacturer'),
>('system', 'Product Name'),
>('system', 'Version'),
>('system', 'Serial Number'),
>('system', 'UUID'),
>('system', 'Family')):
> val = dmiInfo.get(k1, {}).get(k2, None)
> if val not in [None, 'Not Specified']:
> sysStruct[(k1 + k2).replace(' ', '')] = val
>
> return sysStruct
>
>
>
> Running dmidecode from command line I get..
>
> System Information
> Manufacturer: Supermicro
> Product Name: H8DM8-2
> Version: 1234567890
> Serial Number: 1234567890
> UUID: 00020003-0004-0005-0006-000700080009
> Wake-up Type: Power Switch
> SKU Number: To Be Filled By O.E.M.
> Family: To Be Filled By O.E.M.
>
>
> Q: Is the string in Family the source of my problems??
>
> Q: Any work arounds??
>
>
>
>
>
>
>
>
> --
>
> David Pinkerton
> Consultant
> Red Hat Asia Pacific Pty. Ltd.
> Level 11, Canberra House
> 40 Marcus Clarke Street
> Canberra 2600 ACT
>
> Mobile: +61-488-904-232
> Email: david.pinker...@redhat.com
> Web: http://apac.redhat.com/ 
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
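Replaying vdsm's filter (copied from the getHardwareInfoStructure snippet above) against the dmidecode values in this message shows the O.E.M. placeholder is not filtered out — only None and 'Not Specified' are dropped — so the Family string does reach the caller. Whether that is what breaks the supervdsm call (the EOFError) is a separate question; the dict here is hand-built to mimic the host's values:

```python
# Hand-built dict mimicking getAllDmidecodeInfo() output for this host;
# the loop below is the same filter as getHardwareInfoStructure above.
dmi_info = {"system": {
    "Manufacturer": "Supermicro",
    "Product Name": "H8DM8-2",
    "Version": "1234567890",
    "Serial Number": "1234567890",
    "UUID": "00020003-0004-0005-0006-000700080009",
    "Family": "To Be Filled By O.E.M.",
}}

sys_struct = {}
for k1, k2 in (("system", "Manufacturer"),
               ("system", "Product Name"),
               ("system", "Version"),
               ("system", "Serial Number"),
               ("system", "UUID"),
               ("system", "Family")):
    val = dmi_info.get(k1, {}).get(k2, None)
    if val not in [None, "Not Specified"]:
        sys_struct[(k1 + k2).replace(" ", "")] = val

print(sys_struct["systemFamily"])  # To Be Filled By O.E.M.
```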


[ovirt-users] documentation change suggestion / question

2016-10-08 Thread Charles Kozler
Googling "ovirt windows agent" this is the first link that pops up:
http://www.ovirt.org/documentation/internal/guest-agent/guest-agent-for-windows/

This doc seems unintuitive and overcomplicated

Specifically, the Red Hat documentation that is 4 links below is as simple as
"install this package and mount the iso":
https://community.redhat.com/blog/2015/05/how-to-install-and-use-ovirts-windows-guest-tools/

The former was updated as recently as June of this year

1.) The Red Hat document worked for me, so I see no reason oVirt 4.x can't use
the same approach
2.) Why have separate areas of documentation that are so different from
each other? This has caused me issues in the past whereby I found RH docs
that were much clearer and more concise
3.) Is there anyone who might want to merge RH docs with oVirt docs where
the RH docs are better than oVirt?

Thanks so much for providing such a great product! I just feel sometimes
the docs are a little more developer-centric whereas the RedHat docs are
more easily readable


[ovirt-users] proper update steps for HE (same major branch)

2016-09-14 Thread Charles Kozler
I have my entire cluster running on oVirt 4. As updates become available,
what is the correct method to upgrade within the same major version?

My initial thought is:

1. From the oVirt web UI, update my individual nodes that are marked as "a new
version is available"

2. Log on to the engine VM through SSH and then yum update -y there

Or is it backwards? I just want to be sure so that I do not break
hosted-engine, because I had to rebuild entirely on oVirt 4 after I could
not get the engines out of 'stale data' in hosted-engine --vm-status, so I
am very cautious about updating

I've read a couple posts on this list before but I want to be absolutely
sure before I do this

Thanks!


Re: [ovirt-users] Moving Hosted Engine Disk

2016-09-06 Thread Charles Kozler
Hi Everyone - can anyone assist me? I'd really like to move HE off of the
location it is on right now without having to rebuild my entire datacenter. Is
there anything I can do? Any documentation you can point me to? Thanks!

On Sun, Aug 14, 2016 at 2:51 PM, Charles Kozler <ckozler...@gmail.com>
wrote:

> Hi - I followed this doc successfully https://www.
> ovirt.org/documentation/how-to/hosted-engine/ with no real issues
>
> In oVirt I have the hosted_storage storage domain and the one I created
> (Datastore_VM_NFS) to be a master domain - this is all fine
>
> My hosted_storage domain is on a SAN that is being deprecated. I am
> wondering what I need to do to **move** the disks for Hosted Engine VM off
> of that storage domain and on to another one. I can create a new volume on
> the new SAN where the VMs are stored but from all that I have found it
> doesn't look like I can actually move the hosted engine's disks without
> possibly rerunning the entire hosted-engine --deploy, but I am not sure if
> this will cause any other conflicting problems
>
> Thanks
>
>


Re: [ovirt-users] Suddenly all VM's down including HostedEngine & NFSshares (except HE) unmounted

2016-08-21 Thread Charles Kozler
This usually happens when the SPM falls off or the master storage domain was
unreachable for a brief period of time in some capacity. Your logs should
say something about an underlying storage problem, so oVirt offlined or
paused the VMs to avoid problems. I'd check the pathway to your master
storage domain. You're probably right that something had a conflicting
IP. This happened to me one time when someone brought up a system on an IP
that matched my SPM

On Aug 21, 2016 3:33 PM, "Matt ."  wrote:

> HI All,
>
> I'm trying to tackle an issues on 4.0.2 that sunddenly all VM's
> including the HostedEngine are just down at once.
>
> I have also seen that all NFS shares are unmounted except the
> HostedEngine Storage, which is on the same NFS device as well.
>
> I have checked the logs, nothing strange to see there, but as I run a
> vrrp setup and do some tests also I wonder if there is a duplicate IP
> brought up, could this make happen the whole system to go down and the
> Engine or VDSM unmounts the NFS shares ? My switches don't complain.
>
> It's strange that the HE share is only available after it happens.
>
> If so, this would be quite fragile and we should tackle where it goes
> wrong.
>
> Anyone seen this bahaviour ?
>
> Thanks,
>
> Matt
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>


Re: [ovirt-users] hosted-engine vm-status stale data and cluster seems "broken"

2016-06-15 Thread Charles Kozler
 agent
> >> > MainThread::WARNING::2016-06-15
>
> and connection timeout between agent and broker.
>
> > Thread-482175::INFO::2016-06-14
> >
> 12:59:30,429::storage_backends::120::ovirt_hosted_engine_ha.lib.storage_backends::(_check_symlinks)
> > Cleaning up stale LV link
> '/rhev/data-center/mnt/nas01:_volume1_vm__os_ovirt
> >
> 36__engine/c6323975-2966-409d-b9e0-48370a513a98/ha_agent/hosted-engine.metadata'
>
> This is also not normal, it means the storage disappeared.
>
>
> This seems to indicate there is some kind of issue with your network..
> are you sure that your firewall allows connections over lo interface
> and to the storage server?
>
>
> Martin
>
> On Wed, Jun 15, 2016 at 4:11 PM, Charles Kozler <char...@fixflyer.com>
> wrote:
> > Marin -
> >
> > Anything I should be looking for specifically? The only errors I see are
> > smtp errors when it tries to send a notification but nothing indicating
> what
> > the notification is / might be. I see this repeated about every minute
> >
> > Thread-482115::INFO::2016-06-14
> >
> 12:58:54,431::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
> > Connection established
> > Thread-482109::INFO::2016-06-14
> >
> 12:58:54,491::storage_backends::120::ovirt_hosted_engine_ha.lib.storage_backends::(_check_symlinks)
> > Cleaning up stale LV link
> '/rhev/data-center/mnt/nas01:_volume1_vm__os_ovirt
> >
> 36__engine/c6323975-2966-409d-b9e0-48370a513a98/ha_agent/hosted-engine.lockspace'
> > Thread-482109::INFO::2016-06-14
> >
> 12:58:54,515::storage_backends::120::ovirt_hosted_engine_ha.lib.storage_backends::(_check_symlinks)
> > Cleaning up stale LV link
> '/rhev/data-center/mnt/nas01:_volume1_vm__os_ovirt
> >
> 36__engine/c6323975-2966-409d-b9e0-48370a513a98/ha_agent/hosted-engine.metadata'
> >
> > nas01 is the primary storage for the engine (as previously noted)
> >
> > Thread-482175::INFO::2016-06-14
> >
> 12:59:30,398::storage_backends::120::ovirt_hosted_engine_ha.lib.storage_backends::(_check_symlinks)
> > Cleaning up stale LV link
> '/rhev/data-center/mnt/nas01:_volume1_vm__os_ovirt
> >
> 36__engine/c6323975-2966-409d-b9e0-48370a513a98/ha_agent/hosted-engine.lockspace'
> > Thread-482175::INFO::2016-06-14
> >
> 12:59:30,429::storage_backends::120::ovirt_hosted_engine_ha.lib.storage_backends::(_check_symlinks)
> > Cleaning up stale LV link
> '/rhev/data-center/mnt/nas01:_volume1_vm__os_ovirt
> >
> 36__engine/c6323975-2966-409d-b9e0-48370a513a98/ha_agent/hosted-engine.metadata'
> >
> >
> > But otherwise the broker looks like its accepting and handling
> connections
> >
> > Thread-481980::INFO::2016-06-14
> > 12:59:33,105::mem_free::53::mem_free.MemFree::(action) memFree: 26491
> > Thread-482193::INFO::2016-06-14
> >
> 12:59:33,977::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
> > Connection established
> > Thread-482193::INFO::2016-06-14
> >
> 12:59:34,033::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
> > Connection closed
> > Thread-482194::INFO::2016-06-14
> >
> 12:59:34,034::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
> > Connection established
> > Thread-482194::INFO::2016-06-14
> >
> 12:59:34,035::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
> > Connection closed
> > Thread-482195::INFO::2016-06-14
> >
> 12:59:34,035::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
> > Connection established
> > Thread-482195::INFO::2016-06-14
> >
> 12:59:34,036::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
> > Connection closed
> > Thread-482196::INFO::2016-06-14
> >
> 12:59:34,037::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
> > Connection established
> > Thread-482196::INFO::2016-06-14
> >
> 12:59:34,037::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
> > Connection closed
> > Thread-482197::INFO::2016-06-14
> >
> 12:59:38,544::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
> > Connection established
> > Thread-482197::INFO::2016-06-14
> >
> 12:59:38,598::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
> > Connection closed
> > Thread-482198::INFO::2016-06-14
> >
> 12:59:38,598::listener::134::ovirt_hosted_engine_ha.broker.

Re: [ovirt-users] hosted-engine vm-status stale data and cluster seems "broken"

2016-06-15 Thread Charles Kozler
  File "/usr/lib64/python2.7/smtplib.py", line 315, in connect
self.sock = self._get_socket(host, port, self.timeout)
  File "/usr/lib64/python2.7/smtplib.py", line 290, in _get_socket
return socket.create_connection((host, port), timeout)
  File "/usr/lib64/python2.7/socket.py", line 571, in create_connection
raise err
error: [Errno 110] Connection timed out
Thread-481977::INFO::2016-06-14
12:59:50,264::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
Connection closed
Thread-481977::INFO::2016-06-14
12:59:50,264::monitor::90::ovirt_hosted_engine_ha.broker.monitor.Monitor::(stop_submonitor)
Stopping submonitor ping, id 140681792007632
Thread-481977::INFO::2016-06-14
12:59:50,264::monitor::99::ovirt_hosted_engine_ha.broker.monitor.Monitor::(stop_submonitor)
Stopped submonitor ping, id 140681792007632
Thread-481977::INFO::2016-06-14
12:59:50,264::monitor::90::ovirt_hosted_engine_ha.broker.monitor.Monitor::(stop_submonitor)
Stopping submonitor mgmt-bridge, id 140681925896272
Thread-481977::INFO::2016-06-14
12:59:50,264::monitor::99::ovirt_hosted_engine_ha.broker.monitor.Monitor::(stop_submonitor)
Stopped submonitor mgmt-bridge, id 140681925896272
Thread-481977::INFO::2016-06-14
12:59:50,264::monitor::90::ovirt_hosted_engine_ha.broker.monitor.Monitor::(stop_submonitor)
Stopping submonitor mem-free, id 140681926005456
Thread-481977::INFO::2016-06-14
12:59:50,264::monitor::99::ovirt_hosted_engine_ha.broker.monitor.Monitor::(stop_submonitor)
Stopped submonitor mem-free, id 140681926005456
Thread-481977::INFO::2016-06-14
12:59:50,264::monitor::90::ovirt_hosted_engine_ha.broker.monitor.Monitor::(stop_submonitor)
Stopping submonitor cpu-load-no-engine, id 140681926012880
Thread-481977::INFO::2016-06-14
12:59:50,264::monitor::99::ovirt_hosted_engine_ha.broker.monitor.Monitor::(stop_submonitor)
Stopped submonitor cpu-load-no-engine, id 140681926012880
Thread-481977::INFO::2016-06-14
12:59:50,264::monitor::90::ovirt_hosted_engine_ha.broker.monitor.Monitor::(stop_submonitor)
Stopping submonitor engine-health, id 140681926011984
Thread-481977::INFO::2016-06-14
12:59:50,264::monitor::99::ovirt_hosted_engine_ha.broker.monitor.Monitor::(stop_submonitor)
Stopped submonitor engine-health, id 140681926011984
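The notifier traceback above ends inside socket.create_connection, i.e. a plain TCP connect to the SMTP server never completes. A quick standalone check of the same call smtplib makes internally — the host and port below are placeholders, substitute whatever smtp-server/smtp-port your broker notification config actually uses:

```python
import socket

def smtp_reachable(host, port, timeout=5):
    """Return True if a TCP connection to the SMTP server succeeds.

    Mirrors what smtplib's connect() does internally
    (socket.create_connection), so it fails the same way the
    broker's notifier does when the server is unreachable.
    """
    try:
        socket.create_connection((host, port), timeout=timeout).close()
        return True
    except OSError:  # covers refusals, timeouts, unreachable hosts
        return False

# e.g. smtp_reachable("localhost", 25)
```

If this returns False from the host running the broker, the repeated notification errors (and the resulting submonitor restarts) point at network or firewall trouble rather than the broker itself.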





On Wed, Jun 15, 2016 at 10:04 AM, Martin Sivak <msi...@redhat.com> wrote:

> Charles, check the broker log too please. It is possible that the
> broker process is running, but is not accepting connections for
> example.
>
> Martin
>
> On Wed, Jun 15, 2016 at 3:32 PM, Charles Kozler <char...@fixflyer.com>
> wrote:
> > Actually, broker is the only thing acting "right" between broker and
> agent.
> > Broker is up when I bring the system up but agent is restarting all the
> > time. Have a look
> >
> > The 11th is when I restarted this node after doing 'reinstall' in the
> web UI
> >
> > ● ovirt-ha-broker.service - oVirt Hosted Engine High Availability
> > Communications Broker
> >Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service;
> enabled;
> > vendor preset: disabled)
> >Active: active (running) since Sat 2016-06-11 13:09:51 EDT; 3 days ago
> >  Main PID: 1285 (ovirt-ha-broker)
> >CGroup: /system.slice/ovirt-ha-broker.service
> >└─1285 /usr/bin/python
> > /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon
> >
> > Jun 15 09:23:56 njsevcnp01 ovirt-ha-broker[1285]:
> > INFO:mgmt_bridge.MgmtBridge:Found bridge ovirtmgmt with ports
> > Jun 15 09:23:58 njsevcnp01 ovirt-ha-broker[1285]:
> > INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection
> > established
> > Jun 15 09:23:58 njsevcnp01 ovirt-ha-broker[1285]:
> > INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection
> > closed
> > Jun 15 09:23:58 njsevcnp01 ovirt-ha-broker[1285]:
> > INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection
> > established
> > Jun 15 09:23:58 njsevcnp01 ovirt-ha-broker[1285]:
> > INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection
> > closed
> > Jun 15 09:23:58 njsevcnp01 ovirt-ha-broker[1285]:
> > INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection
> > established
> > Jun 15 09:23:58 njsevcnp01 ovirt-ha-broker[1285]:
> > INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection
> > closed
> > Jun 15 09:23:58 njsevcnp01 ovirt-ha-broker[1285]:
> > INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection
> > established
> > Jun 15 09:23:58 njsevcnp01 ovirt-ha-broker[1285]:
> > INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection
> > closed
> > Jun 15 09:23:58 njsevcnp01 ovirt-ha-broker[1285

Re: [ovirt-users] hosted-engine vm-status stale data and cluster seems "broken"

2016-06-15 Thread Charles Kozler
vcnp01'}: Connection timed out' - trying to restart agent
MainThread::WARNING::2016-06-15
09:26:17,136::agent::208::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
Restarting agent, attempt '0'
MainThread::ERROR::2016-06-15
09:26:48,058::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
Error: 'Failed to start monitor , options {'hostname':
'njsevcnp01'}: Connection timed out' - trying to restart agent
MainThread::WARNING::2016-06-15
09:26:53,063::agent::208::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
Restarting agent, attempt '1'
MainThread::ERROR::2016-06-15
09:27:23,969::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
Error: 'Failed to start monitor , options {'hostname':
'njsevcnp01'}: Connection timed out' - trying to restart agent
MainThread::WARNING::2016-06-15
09:27:28,973::agent::208::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
Restarting agent, attempt '2'


Storage is also completely fine. No logs stating anything "going away" or
having issues. The engine has a dedicated NFS NAS device, while VM storage is
a completely separate storage cluster. Storage has a 100% dedicated backend
network with no changes being made



On Wed, Jun 15, 2016 at 7:42 AM, Martin Sivak <msi...@redhat.com> wrote:

> > Jun 14 08:11:11 njsevcnp01 ovirt-ha-agent[15713]: ovirt-ha-agent
> > ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Connection closed:
> > Connection timed out
> > Jun 14 08:11:11 njsevcnp01.fixflyer.com ovirt-ha-agent[15713]:
> > ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Error:
> 'Failed
> > to start monitor , options {'hostname': 'njsevcnp01'}:
> > Connection timed out' - trying to restart agent
>
> Broker is broken or down. Check the status of ovirt-ha-broker service.
>
> > The other interesting thing is this log from node01. The odd thing is
> that
> > it seems there is some split brain somewhere in oVirt because this log is
> > from node02 but it is asking the engine and its getting back "vm not
> running
> > on this host' rather than 'stale data'. But I dont know engine internals
>
> This is another piece that points to broker or storage issues. Agent
> collects local data and then publishes them to other nodes through
> broker. So it is possible for the agent to know the status of the VM
> locally, but not be able to publish it.
>
> hosted-engine command line tool then reads the synchronization
> whiteboard too, but it does not see anything that was not published
> and ends up reporting stale data.
>
> >> What is the status of the hosted engine services? systemctl status
> >> ovirt-ha-agent ovirt-ha-broker
>
> Please check the services.
>
> Best regards
>
> Martin
>
> On Tue, Jun 14, 2016 at 2:16 PM, Charles Kozler <char...@fixflyer.com>
> wrote:
> > Martin -
> >
> > One thing I noticed on all of the nodes is this:
> >
> > Jun 14 08:11:11 njsevcnp01 ovirt-ha-agent[15713]: ovirt-ha-agent
> > ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Connection closed:
> > Connection timed out
> > Jun 14 08:11:11 njsevcnp01.fixflyer.com ovirt-ha-agent[15713]:
> > ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Error:
> 'Failed
> > to start monitor , options {'hostname': 'njsevcnp01'}:
> > Connection timed out' - trying to restart agent
> >
> > Then the agent is restarted
> >
> > [root@njsevcnp01 ~]# ps -Aef | grep -i ovirt-ha-agent | grep -iv grep
> > vdsm  15713  1  0 08:09 ?00:00:01 /usr/bin/python
> > /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon
> >
> > I dont know why the connection would time out because as you can see that
> > log is from node01 and I cant figure out why its timing out on the
> > connection
> >
> > The other interesting thing is this log from node01. The odd thing is
> that
> > it seems there is some split brain somewhere in oVirt because this log is
> > from node02 but it is asking the engine and its getting back "vm not
> running
> > on this host' rather than 'stale data'. But I dont know engine internals
> >
> > MainThread::INFO::2016-06-14
> >
> 08:13:05,163::state_machine::171::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> > Host njsevcnp02 (id 2): {hostname: njsevcnp02, host-id: 2, engine-status:
> > {reason: vm not running on this host, health: bad, vm: down, detail:
> > unknown}, score: 0, stopped: True, maintenance: False, crc32: 25da07df,
> > host-ts: 3030}
> > MainThread::INFO::2016-06-14
> >
> 08:13:05,163::state_machine::171::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> > Host njs

Re: [ovirt-users] which NIC/network NFS storage is using

2016-06-14 Thread Charles Kozler
Set a static route to the storage to go through the NIC(s) you want it to

ip route add x.x.x.x/32 dev <NIC> via <GATEWAY>

where x.x.x.x/32 is the IP of the NFS server
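The one-off `ip route add` above is lost at reboot. Here is a minimal sketch of making the host route persistent on EL7 via an initscripts `route-<iface>` file; every address and interface name below is an illustrative placeholder, not a value from this thread:

```shell
# Sketch: pin NFS traffic to a chosen NIC with a /32 host route.
# All values are illustrative placeholders -- substitute your own.
NFS_IP="192.168.50.10"   # NFS server address (example)
NIC="bond0.50"           # storage-VLAN interface on the 10GbE bond (example)
GW="192.168.50.1"        # next hop on the storage VLAN (example)

# One-off (non-persistent) form, as suggested above:
#   ip route add "${NFS_IP}/32" dev "$NIC" via "$GW"

# Persistent form: initscripts read ip-route syntax from
# /etc/sysconfig/network-scripts/route-<iface> at ifup time.
route_file_contents() {
    printf '%s/32 via %s dev %s\n' "$NFS_IP" "$GW" "$NIC"
}
route_file_contents   # redirect this into route-$NIC to persist
# -> 192.168.50.10/32 via 192.168.50.1 dev bond0.50
```

Writing that line into /etc/sysconfig/network-scripts/route-<iface> should reapply the route at ifup; `ip route get <nfs-server-ip>` then confirms which NIC the NFS traffic egresses.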

On Tue, Jun 14, 2016 at 1:30 PM, Ryan Mahoney <
r...@beaconhillentertainment.com> wrote:

> Yes, Fernando this is exactly what I'm asking for
>
> On Tue, Jun 14, 2016 at 1:25 PM, Fernando Frediani <
> fernando.fredi...@upx.com.br> wrote:
>
>> I guess what the colleague wants to know is how to specify a interface in
>> a different VLAN on the top of the 10Gb LACP in order for the NFS traffic
>> to flow.
>> In VMware world that would be vmkernel interface, so a new
>> network/interface with an different IP address than Management (ovirtmgmt).
>>
>> Fernando
>>
>>
>> Em 14/06/2016 13:52, Ryan Mahoney escreveu:
>>
>> Right, but how do you specify which network the nfs traffic is using?
>>
>> On Tue, Jun 14, 2016 at 12:41 PM, Nir Soffer <nsof...@redhat.com> wrote:
>>
>>> On Tue, Jun 14, 2016 at 5:26 PM, Ryan Mahoney
>>> <r...@beaconhillentertainment.com> wrote:
>>> > On my hosts, I have configured a 1gbe nic for ovirtmgmt whose usage is
>>> > currently setup for Management, Display, VM and Migration. I also have
>>> a 2
>>> > 10gbe nics bonded LACP which are VLAN tagged and assigned the dozen or
>>> so
>>> > VLANS needed for the various VM's to access.  I have NFS storage
>>> mounted to
>>> > the Data Center, and I would like to know how I check/specify which
>>> network
>>> > connection ovirt is using for that NFS storage.  I want to make sure
>>> it is
>>> > utilizing the 10gbe bond on each host vs using the 1gbe connection.
>>>
>>> We don't configured anything regarding network used for nfs storage, so
>>> it works
>>> just like any other nfs mount you create yourself.
>>>
>>> Nir
>>>
>>
>>
>>
>>
>
>


-- 

*Charles Kozler*
*Vice President, IT Operations*

FIX Flyer, LLC
225 Broadway | Suite 1600 | New York, NY 10007
1-888-349-3593
http://www.fixflyer.com <http://fixflyer.com>

NOTICE TO RECIPIENT: THIS E-MAIL IS MEANT ONLY FOR THE INTENDED
RECIPIENT(S) OF THE TRANSMISSION, AND CONTAINS CONFIDENTIAL INFORMATION
WHICH IS PROPRIETARY TO FIX FLYER LLC.  ANY UNAUTHORIZED USE, COPYING,
DISTRIBUTION, OR DISSEMINATION IS STRICTLY PROHIBITED.  ALL RIGHTS TO THIS
INFORMATION IS RESERVED BY FIX FLYER LLC.  IF YOU ARE NOT THE INTENDED
RECIPIENT, PLEASE CONTACT THE SENDER BY REPLY E-MAIL AND PLEASE DELETE THIS
E-MAIL FROM YOUR SYSTEM AND DESTROY ANY COPIES.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] hosted-engine vm-status stale data and cluster seems "broken"

2016-06-14 Thread Charles Kozler
291-856c-81875ba4e264', 'path':
'/rhev/data-center/mnt/nas01:_volume1_vm__os_ovirt36__engine/c6323975-2966-409d-b9e0-48370a513a98/images/8518ef4a-7b17-4291-856c-81875ba4e264/aa66d378-5a5f-490c-b0ab-993b79838d95'}]},
{'index': '2', 'iface': 'ide', 'name': 'hdc', 'alias': 'ide0-1-0',
'readonly': 'True', 'deviceId': '8c3179ac-b322-4f5c-9449-c52e3665e0ae',
'address': {'bus': '1', 'controller': '0', 'type': 'drive', 'target': '0',
'unit': '0'}, 'device': 'cdrom', 'shared': 'false', 'path': '', 'type':
'disk'}, {'device': 'unix', 'alias': 'channel0', 'type': 'channel',
'address': {'bus': '0', 'controller': '0', 'type': 'virtio-serial', 'port':
'1'}}, {'device': 'unix', 'alias': 'channel1', 'type': 'channel',
'address': {'bus': '0', 'controller': '0', 'type': 'virtio-serial', 'port':
'2'}}, {'device': 'unix', 'alias': 'channel2', 'type': 'channel',
'address': {'bus': '0', 'controller': '0', 'type': 'virtio-serial', 'port':
'3'}}, {'device': '', 'alias': 'video0', 'type': 'video', 'address':
{'slot': '0x02', 'bus': '0x00', 'domain': '0x', 'type': 'pci',
'function': '0x0'}}]
guestDiskMapping = {'8518ef4a-7b17-4291-8': {'name': '/dev/vda'},
'QEMU_DVD-ROM_QM3': {'name': '/dev/sr0'}}
vmType = kvm
displaySecurePort = -1
memSize = 4096
displayPort = 5900
clientIp =
spiceSecureChannels =
smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir
smp = 4
displayIp = 0
display = vnc
pauseCode = NOERR




On Mon, Jun 13, 2016 at 8:25 AM, Charles Kozler <char...@fixflyer.com>
wrote:

> It is up. I can do "ps -Aef | grep -i qemu-kvm | grep -i hosted" and see
> it running. I also forcefully shut it down with hosted-engine --vm-stop
> when it was on node1 and then did --vm-start on node 2 and it came up. Also
> the Web UI is reachable so thats how I also know the hosted engine VM is
> running
>
> On Mon, Jun 13, 2016 at 8:24 AM, Alexis HAUSER <
> alexis.hau...@telecom-bretagne.eu> wrote:
>
>>
>> > http://imgur.com/a/6xkaS
>>
>> I had similar errors with one single host and a hosted-engine VM.
>> My case should be totally different, but one thing you could try first is
>> to check VM is really up.
>> In my issues, VM was shown by hosted-engine command as up, but was down.
>> with vdsClient command, you can check if it's status with more details.
>>
>> What is the result for you of the following command ?
>>
>>  vdsClient -s 0 list
>>
>
>
>





Re: [ovirt-users] hosted-engine vm-status stale data and cluster seems "broken"

2016-06-13 Thread Charles Kozler
It is up. I can do "ps -Aef | grep -i qemu-kvm | grep -i hosted" and see it
running. I also forcefully shut it down with hosted-engine --vm-stop when
it was on node1 and then did --vm-start on node 2 and it came up. Also the
Web UI is reachable, so that's how I also know the hosted engine VM is running

On Mon, Jun 13, 2016 at 8:24 AM, Alexis HAUSER <
alexis.hau...@telecom-bretagne.eu> wrote:

>
> > http://imgur.com/a/6xkaS
>
> I had similar errors with one single host and a hosted-engine VM.
> My case should be totally different, but one thing you could try first is
> to check VM is really up.
> In my issues, VM was shown by hosted-engine command as up, but was down.
> with vdsClient command, you can check if it's status with more details.
>
> What is the result for you of the following command ?
>
>  vdsClient -s 0 list
>





[ovirt-users] hosted-engine vm-status stale data and cluster seems "broken"

2016-06-11 Thread Charles Kozler
See linked images please. As you can see all three nodes are reporting
stale data. The results of this are:

1. Not all VM's migrate seamlessly in the cluster. Sometimes I have to shut
them down to get them to be able to migrate again

2. Hosted engine refuses to move due to constraints (image). This part
doesn't make sense to me because I can forcefully shut it down and then go
directly on a hosted engine node and bring it back up. Also, the Web UI
shows all nodes under the cluster, yet it then thinks it's not a part of
the cluster

3. Time is in sync (image)

4. Storage is 100% fine. Gluster back end reports mirroring and status
'started'. No split brain has occurred and ovirt nodes have never lost
connectivity to storage

5. I reinstalled all three nodes. For some reason only node 3 still shows
as having updates available. (image). For clarity, I did not click
"upgrade"; I simply did 'reinstall' from the Web UI. Having looked at the
output and yum.log from /var/log it almost looks like it did do an update.
All package versions across all three nodes are the same (respective to
ovirt/vdsm) (image). For some reason
though ovirt-engine-appliance-3.6-20160126.1.el7.centos.noarch exists on
node 1 but not on node 2 or 3. Could this be related? I don't recall
installing that specifically on node 1, but I may have

Been slamming my head on this so I am hoping you can provide some assistance

http://imgur.com/a/6xkaS

Thanks!



[ovirt-users] spacewalk integration

2016-02-20 Thread Charles Kozler
Hi Guys -

Can you use spacewalk as an external provider in oVirt? I see it allows
Satelilte/Foreman but I know the new Satellite product is much different
than Spacewalk as I believe Spacewalk is still technically Satellite 5
whereas Satellite 6 is an entire rewrite

If Spacewalk can be used, what are some pre-reqs needed to get it working?
Also, what are the benefits of using this? I was looking at it for the
errata information

Thanks!



Re: [ovirt-users] matching / mapping ovirt storage domain images to VMs

2016-02-07 Thread Charles Kozler
Thanks!

On Sun, Feb 7, 2016 at 6:00 PM, Arman Khalatyan <arm2...@gmail.com> wrote:

> Hi,
> If engine is down you can check directly on the hosts with virsh in
> readonly mode.
> virsh -r list
> virsh -r dumpxml vmid
> Then parse xml to get the disks.
>
> Cheers,
> Arman
> On 07.02.2016 at 02:15, "Charles Kozler" <char...@fixflyer.com> wrote:
>
>> Pavel -
>>
>> This works if engine is up. What about if it is down? Is there no way to
>> easily correlate? Like maybe through vdsClient to check with VDSM on the
>> engine directly?
>>
>> On Fri, Feb 5, 2016 at 3:11 AM, Pavel Gashev <p...@acronis.com> wrote:
>>
>>> You can use ovirt-shell:
>>>
>>> [oVirt shell (connected)]# list disks --parent-vm-name VM1 --show-all
>>>
>>> id  :
>>> e7a1f91c-4196-4e04-8936-bbc37daff393
>>> name: W2K12_Disk1
>>> active  : True
>>> actual_size : 327680
>>> alias   : W2K12_Disk1
>>> bootable: True
>>> format  : cow
>>> image_id:
>>> 48b748b6-dc20-43f5-8c51-e0f984e2fd00
>>> interface   : virtio
>>> propagate_errors: False
>>> provisioned_size: 85899345920
>>> quota-id:
>>> c224e50a-de46-461e-bd32-94a921151355
>>> read_only   : False
>>> shareable   : False
>>> size: 85899345920
>>> sparse  : True
>>> status-state: ok
>>> storage_domains-storage_domain-id   :
>>> 75801b3b-d9ce-4b62-aa36-6b6519ecc04e
>>> storage_type: image
>>> vm-id   :
>>> 8f163624-823f-4ac2-8964-7aa473c41de2
>>> wipe_after_delete   : False
>>>
>>> Where id is the directory name, and image_id is the file name.
>>>
>>>
>>> From: <users-boun...@ovirt.org> on behalf of Charles Kozler <
>>> char...@fixflyer.com>
>>> Date: Thursday 4 February 2016 at 23:37
>>> To: users <users@ovirt.org>
>>> Subject: [ovirt-users] matching / mapping ovirt storage domain images
>>> to VMs
>>>
>>> is there an easy / intuitive way to find out the underlying image
>>> associated to a VM? for instance, looking at a storage domain from the
>>> server, it is not easy to figure out what VM it actually belongs to
>>>
>>> [storage[root@snode01 images]$ find -type f | grep -iv meta | grep -iv
>>> lease  | xargs du -sch
>>> 20K
>>> ./bd765364-064d-487c-a6f8-a290249edca1/4f6dcb0e-e4c9-4ab6-af6d-d49f89228fa1
>>> 20K
>>> ./e69a0128-fddc-4ee7-b91c-04caf8bdd540/2ce9d1aa-70e3-4063-895d-c9848ec122e5
>>> 10G
>>> ./2d1eab4a-df47-4e8e-8a0c-c58ca9c0d6cf/1ab0bebd-57b4-45f3-8e77-7c1973282766
>>> 10G
>>> ./9ed4a196-bc18-4d6a-b7b6-f38bee01e102/32953e4c-8a3c-4252-96d9-9ecbb7c2a603
>>> 0
>>> ./1c3129dc-56fa-4e41-bd5b-3313f9f1aa86/d4221e11-bf3b-4226-a8a8-b53ff0189592
>>> 21G total
>>>
>>> How can I find out what VMs these belong to?
>>>
>>>
>>
>>
>>
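Arman's tip above (`virsh -r dumpxml`, then parse the XML for the disks) can be sketched with plain sed. This is a quick-look hack that assumes each `<source .../>` element sits on its own line, not a real XML parser; the sample domain XML below is hypothetical and trimmed for illustration:

```shell
# Sketch: pull disk source paths out of `virsh -r dumpxml <vmid>` output.
# Matches <source file="..."/> (file-backed) and <source dev="..."/> (block).
disk_sources() {
    sed -nE 's/.*<source (file|dev)="([^"]*)".*/\2/p'
}

# Hypothetical, trimmed libvirt domain XML for illustration only:
SAMPLE='<domain type="kvm">
  <devices>
    <disk type="file" device="disk">
      <source file="/rhev/data-center/mnt/nas01/example-image"/>
      <target dev="vda" bus="virtio"/>
    </disk>
    <disk type="file" device="cdrom">
      <target dev="hdc" bus="ide"/>
    </disk>
  </devices>
</domain>'

printf '%s\n' "$SAMPLE" | disk_sources
# -> /rhev/data-center/mnt/nas01/example-image
```

On a host this would be fed real output instead: `virsh -r dumpxml <vmid> | disk_sources`.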

Re: [ovirt-users] matching / mapping ovirt storage domain images to VMs

2016-02-06 Thread Charles Kozler
Pavel -

This works if the engine is up. What about when it is down? Is there no way to
easily correlate? Like maybe through vdsClient to check with VDSM on the
engine directly?

On Fri, Feb 5, 2016 at 3:11 AM, Pavel Gashev <p...@acronis.com> wrote:

> You can use ovirt-shell:
>
> [oVirt shell (connected)]# list disks --parent-vm-name VM1 --show-all
>
> id  :
> e7a1f91c-4196-4e04-8936-bbc37daff393
> name: W2K12_Disk1
> active  : True
> actual_size : 327680
> alias   : W2K12_Disk1
> bootable: True
> format  : cow
> image_id:
> 48b748b6-dc20-43f5-8c51-e0f984e2fd00
> interface   : virtio
> propagate_errors: False
> provisioned_size: 85899345920
> quota-id:
> c224e50a-de46-461e-bd32-94a921151355
> read_only   : False
> shareable   : False
> size: 85899345920
> sparse  : True
> status-state: ok
> storage_domains-storage_domain-id   :
> 75801b3b-d9ce-4b62-aa36-6b6519ecc04e
> storage_type: image
> vm-id   :
> 8f163624-823f-4ac2-8964-7aa473c41de2
> wipe_after_delete   : False
>
> Where id is the directory name, and image_id is the file name.
>
>
> From: <users-boun...@ovirt.org> on behalf of Charles Kozler <
> char...@fixflyer.com>
> Date: Thursday 4 February 2016 at 23:37
> To: users <users@ovirt.org>
> Subject: [ovirt-users] matching / mapping ovirt storage domain images to
> VMs
>
> is there an easy / intuitive way to find out the underlying image
> associated to a VM? for instance, looking at a storage domain from the
> server, it is not easy to figure out what VM it actually belongs to
>
> [storage[root@snode01 images]$ find -type f | grep -iv meta | grep -iv
> lease  | xargs du -sch
> 20K
> ./bd765364-064d-487c-a6f8-a290249edca1/4f6dcb0e-e4c9-4ab6-af6d-d49f89228fa1
> 20K
> ./e69a0128-fddc-4ee7-b91c-04caf8bdd540/2ce9d1aa-70e3-4063-895d-c9848ec122e5
> 10G
> ./2d1eab4a-df47-4e8e-8a0c-c58ca9c0d6cf/1ab0bebd-57b4-45f3-8e77-7c1973282766
> 10G
> ./9ed4a196-bc18-4d6a-b7b6-f38bee01e102/32953e4c-8a3c-4252-96d9-9ecbb7c2a603
> 0
> ./1c3129dc-56fa-4e41-bd5b-3313f9f1aa86/d4221e11-bf3b-4226-a8a8-b53ff0189592
> 21G total
>
> How can I find out what VMs these belong to?
>
>
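For the engine-down case asked about above, one possible shortcut (hedged: storage-domain layout and metadata contents vary by oVirt version) is the sidecar `.meta` file kept beside each file-based volume, whose `DESCRIPTION` field may carry the disk alias. The mount path and field format here are assumptions, not confirmed in this thread:

```shell
# Sketch: map storage-domain images to disk aliases via volume .meta files.
# Assumed layout: <mount>/<sd_uuid>/images/<img_uuid>/<vol_uuid>.meta
map_images() {
    base="${1:-/rhev/data-center/mnt}"   # default oVirt mount root (assumption)
    for meta in "$base"/*/images/*/*.meta; do
        [ -e "$meta" ] || continue       # glob matched nothing
        desc=$(grep -m1 '^DESCRIPTION=' "$meta" || echo 'DESCRIPTION=(none)')
        printf '%s => %s\n' "$meta" "${desc#DESCRIPTION=}"
    done
}

map_images   # on a host, walks the default mount root
```

This only helps if the oVirt version in use actually writes an alias into DESCRIPTION; otherwise a virsh dump from a running host, or Pavel's ovirt-shell listing once the engine is back, remains the reliable mapping.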





[ovirt-users] packet loss ingress to engine VM

2016-02-06 Thread Charles Kozler
I cannot figure this out for the life of me, but it's been an issue. It
eventually resolves itself once everything has "settled" after I bring the
hosted-engine VM up but it takes anywhere between 15 minutes and 1 hour to
completely settle

http://imgur.com/a/o4S5m

The picture is the best way I can describe it. From the node the engine is
running on, I ping the engine VM and the packet loss is severe. When I ping FROM
the VM out to the outside world (or any IP really) there is no packet loss

I have ruled out physical connectivity issues and RSTP/STP is not enabled
on the switches that the ovirtmgmt network is connected to

This causes a problem because the inbound checks from the nodes fail (I
guess you use ICMP?) and then it starts to enter a constant flap state
where the engine keeps migrating around



Re: [ovirt-users] Dumb question: exclamation mark next to VM?

2016-02-04 Thread Charles Kozler
Matt -

Same issue here!

On Thu, Feb 4, 2016 at 12:08 PM, Matthew Trent <
matthew.tr...@lewiscountywa.gov> wrote:

> ​When I upgraded to 3.6.1 (I think), I had exclamation points on several
> VMs, and hovering over (or looking at the bottom of the VM's General
> tab) gave a message about time zone mis-match. After 3.6.2, the message
> about time zone mis-match is gone, but the exclamation points remain.
>
>
> --
> Matthew Trent
> Network Engineer
> Lewis County IT Services
> 360.740.1247 - Helpdesk
> 360.740.3343 - Direct line
> --
> *From:* users-boun...@ovirt.org <users-boun...@ovirt.org> on behalf of
> Charles Kozler <char...@fixflyer.com>
> *Sent:* Thursday, February 4, 2016 7:46 AM
> *To:* Joe DiTommasso
> *Cc:* users
> *Subject:* Re: [ovirt-users] Dumb question: exclamation mark next to VM?
>
> You cant see my mouse (because scrot removes it when you take a picture)
> but it is hovering over the ! and it says up (almost like it thinks I'm over
> the green arrow but I'm not) http://i.imgur.com/5u2Yvay.png
>
> To that end I cannot see what the issue is
>
> On Thu, Feb 4, 2016 at 10:43 AM, Joe DiTommasso <jd...@domeyard.com>
> wrote:
>
>> If you mouse over the exclamation mark, you should get a tooltip that
>> tells you what it's complaining about. I've got it on pretty much all my
>> VMs, it's an issue with the timezone for me.
>>
>> On Thu, Feb 4, 2016 at 10:41 AM, Charles Kozler <char...@fixflyer.com>
>> wrote:
>>
>>> I have this too. Thank you, I was going to email about this as well
>>> http://i.imgur.com/cZ6P5dp.png
>>>
>>> On Thu, Feb 4, 2016 at 10:38 AM, Chris Adams <c...@cmadams.net> wrote:
>>>
>>>> I set up a new oVirt 3.6.2 cluster on CentOS 7.2 (everything up to date
>>>> as of yesterday).  I created a basic CentOS 7.2 VM with my local
>>>> customizations, created a template from it, and then created a VM from
>>>> that template.
>>>>
>>>> That new VM has an exclamation mark next to it in the web GUI (between
>>>> the up arrow for "running" and the "server" icon).  Usually I would
>>>> expect that means something is wrong or needs attention, but I can't
>>>> find anything to fix/address/etc. (no messages in the Alerts, nothing
>>>> odd in the Events, etc.).  What does the exclamation mark mean, and how
>>>> do I clear it?
>>>>
>>>> --
>>>> Chris Adams <c...@cmadams.net>
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>
>
>





Re: [ovirt-users] Dumb question: exclamation mark next to VM?

2016-02-04 Thread Charles Kozler
You cant see my mouse (because scrot removes it when you take a picture)
but it is hovering over the ! and it says up (almost like it thinks I'm over
the green arrow but I'm not) http://i.imgur.com/5u2Yvay.png

To that end I cannot see what the issue is

On Thu, Feb 4, 2016 at 10:43 AM, Joe DiTommasso <jd...@domeyard.com> wrote:

> If you mouse over the exclamation mark, you should get a tooltip that
> tells you what it's complaining about. I've got it on pretty much all my
> VMs, it's an issue with the timezone for me.
>
> On Thu, Feb 4, 2016 at 10:41 AM, Charles Kozler <char...@fixflyer.com>
> wrote:
>
>> I have this too. Thank you, I was going to email about this as well
>> http://i.imgur.com/cZ6P5dp.png
>>
>> On Thu, Feb 4, 2016 at 10:38 AM, Chris Adams <c...@cmadams.net> wrote:
>>
>>> I set up a new oVirt 3.6.2 cluster on CentOS 7.2 (everything up to date
>>> as of yesterday).  I created a basic CentOS 7.2 VM with my local
>>> customizations, created a template from it, and then created a VM from
>>> that template.
>>>
>>> That new VM has an exclamation mark next to it in the web GUI (between
>>> the up arrow for "running" and the "server" icon).  Usually I would
>>> expect that means something is wrong or needs attention, but I can't
>>> find anything to fix/address/etc. (no messages in the Alerts, nothing
>>> odd in the Events, etc.).  What does the exclamation mark mean, and how
>>> do I clear it?
>>>
>>> --
>>> Chris Adams <c...@cmadams.net>
>>>
>>
>>
>>
>>
>




[ovirt-users] matching / mapping ovirt storage domain images to VMs

2016-02-04 Thread Charles Kozler
Is there an easy / intuitive way to find out the underlying image associated
with a VM? For instance, looking at a storage domain from the server, it is
not easy to figure out which VM an image actually belongs to.

[storage[root@snode01 images]$ find -type f | grep -iv meta | grep -iv
lease  | xargs du -sch
20K
./bd765364-064d-487c-a6f8-a290249edca1/4f6dcb0e-e4c9-4ab6-af6d-d49f89228fa1
20K
./e69a0128-fddc-4ee7-b91c-04caf8bdd540/2ce9d1aa-70e3-4063-895d-c9848ec122e5
10G
./2d1eab4a-df47-4e8e-8a0c-c58ca9c0d6cf/1ab0bebd-57b4-45f3-8e77-7c1973282766
10G
./9ed4a196-bc18-4d6a-b7b6-f38bee01e102/32953e4c-8a3c-4252-96d9-9ecbb7c2a603
0
./1c3129dc-56fa-4e41-bd5b-3313f9f1aa86/d4221e11-bf3b-4226-a8a8-b53ff0189592
21G total

How can I find out what VMs these belong to?
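For what it's worth, the directory names themselves encode half the answer: under a file-based storage domain the layout is `images/<image-group-uuid>/<volume-uuid>`, and the image-group UUID is the same ID the engine uses for the disk, so it can be looked up in the admin portal's Disks tab (or via the REST API) to find the attached VM. A rough, unofficial sketch for pulling those pairs out of the `find` output above:

```python
import re

# Layout assumption: images/<image-group-uuid>/<volume-uuid>, where the
# image-group UUID equals the disk ID shown by the engine.
UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
PAIR = re.compile(r"\./(%s)/(%s)" % (UUID, UUID))

def image_groups(find_output):
    """Group volume UUIDs under their image-group (disk) UUID."""
    groups = {}
    for disk_id, vol_id in PAIR.findall(find_output):
        groups.setdefault(disk_id, []).append(vol_id)
    return groups

sample = """
./bd765364-064d-487c-a6f8-a290249edca1/4f6dcb0e-e4c9-4ab6-af6d-d49f89228fa1
./2d1eab4a-df47-4e8e-8a0c-c58ca9c0d6cf/1ab0bebd-57b4-45f3-8e77-7c1973282766
"""

for disk_id, vols in image_groups(sample).items():
    # Each disk_id can then be looked up in the engine (Disks tab, or the
    # REST API) to see its alias and which VM it is attached to.
    print(disk_id, vols)
```

The REST lookup itself is left as a comment since the exact endpoint differs between oVirt versions.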



[ovirt-users] deleted host still showing up in hosted engine vm-status

2016-02-02 Thread Charles Kozler
I am using a hosted engine on 3.6 and deleted a host I recently added, but
the host is still showing up in hosted-engine --vm-status.

How can I ensure this system is fully deleted?



Re: [ovirt-users] importing existing storage domain

2016-02-02 Thread Charles Kozler
I did read that, but that doc doesn't seem to have any real actionable items.
It seems more like a design spec than a walkthrough / README / HOWTO on how
to recover from a backed-up storage domain.

I tried 'An import screen for NFS Storage Domain' by doing the following :

system -> "storage tab" -> import existing domain

Below you can see the CLI of the backed up domain

[root@node01 t]# ll
total 4
drwxr-xr-x. 5 vdsm kvm 4096 Dec  8 10:05
c2ce3235-1e09-46b6-9dcd-3fffd6238879
-rwxr-xr-x. 1 vdsm kvm    0 Feb  1 23:14 __DIRECT_IO_TEST__
[root@njsevcnp01 t]# du -sch ./* | grep G
29G ./c2ce3235-1e09-46b6-9dcd-3fffd6238879
29G total


and this is the result I received (before/after) http://imgur.com/a/UvGmr

On Tue, Feb 2, 2016 at 9:20 AM, Elad Ben Aharon <ebena...@redhat.com> wrote:

> Hi,
>
> Did you try to import the data domain as an existing one to the DC [1]?
>
> [1] http://www.ovirt.org/Features/ImportStorageDomain
>
> On Tue, Feb 2, 2016 at 3:25 PM, Charles Kozler <char...@fixflyer.com>
> wrote:
>
>> Say I have backed up a storage domain to another NFS. I now want to
>> restore that backup. What would be the best method to restore all the disk
>> images on that domain? I have tried export domains and importing existing
>> domains but every time it says they are not empty and they "may be from
>> somewhere else". They are in fact from somewhere else but that is somewhat
>> the point of the exercise
>>
>> I am attempting to simulate a restore-from-backup (basically just an
>> rsync of the entire storage domain to an offsite location) and its not
>> going as planned. I have all the VM images and can ensure they are all in
>> tact but ovirt refuses to acknowledge it
>>
>> I have also tried setting up a new storage domain and then copying the
>> backup images directory to the new images directory under the newly
>> established storage domain. ovirt sees the used space go down but does not
>> reference / acknowledge / view any of the data actually there unless it
>> creates it itself
>>
>> How can I restore an rsync backup of a storage domain? I am feeling like
>> there isnt a way to readily do it so if not, how can I restore these backed
>> up images?
>>
>>
>>
>




[ovirt-users] importing existing storage domain

2016-02-02 Thread Charles Kozler
Say I have backed up a storage domain to another NFS. I now want to restore
that backup. What would be the best method to restore all the disk images
on that domain? I have tried export domains and importing existing domains
but every time it says they are not empty and they "may be from somewhere
else". They are in fact from somewhere else but that is somewhat the point
of the exercise

I am attempting to simulate a restore-from-backup (basically just an rsync
of the entire storage domain to an offsite location) and it's not going as
planned. I have all the VM images and can verify they are all intact, but
ovirt refuses to acknowledge them.

I have also tried setting up a new storage domain and then copying the
backup images directory into the new images directory under the newly
established storage domain. ovirt sees the used space go down but does not
reference / acknowledge / view any of the data actually there unless it
creates it itself.

How can I restore an rsync backup of a storage domain? I am feeling like
there isn't a way to readily do it, so if not, how can I restore these backed
up images?
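One thing worth checking with an rsync-style copy: the import flow expects the whole domain directory tree (including `dom_md/` with its metadata and lease files, not just `images/`), with vdsm:kvm ownership preserved (e.g. `rsync -a` run as root). A minimal sanity check, assuming the usual file-domain layout; the exact file list here is illustrative, not authoritative:

```python
import os
import tempfile

# Paths a file-based storage domain is expected to carry under its
# <domain-uuid> directory; illustrative list, not an exhaustive spec.
EXPECTED = ("dom_md/metadata", "dom_md/ids", "dom_md/leases", "images")

def looks_like_domain(root):
    """True if `root` (the <domain-uuid> directory) has the expected layout."""
    return all(os.path.exists(os.path.join(root, p)) for p in EXPECTED)

# Demo against a throwaway tree that only has images/ rsynced over.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "images"))
    partial = looks_like_domain(tmp)      # dom_md/ is missing
    for p in EXPECTED[:-1]:
        os.makedirs(os.path.dirname(os.path.join(tmp, p)), exist_ok=True)
        open(os.path.join(tmp, p), "w").close()
    complete = looks_like_domain(tmp)     # layout now complete
    print(partial, complete)
```

If a backup only captured `images/`, the engine has no domain metadata to import, which would explain the "may be from somewhere else" style refusals.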



Re: [ovirt-users] importing existing storage domain

2016-02-02 Thread Charles Kozler
Source was ovirt 3.5. Now using ovirt 3.6 on a new cluster
On Feb 2, 2016 12:16 PM, "Joop" <jvdw...@xs4all.nl> wrote:

> On 2-2-2016 15:30, Charles Kozler wrote:
>
> I did read that but that doc doesnt seem to have any real actionable
> items. It seems more like a design spec rather than a walkthrough / README
> or HOWTO on how to recover from a backed up storage domain
>
> I tried 'An import screen for NFS Storage Domain' by doing the following :
>
> system -> "storage tab" -> import existing domain
>
> Below you can see the CLI of the backed up domain
>
> [root@node01 t]# ll
> total 4
> drwxr-xr-x. 5 vdsm kvm 4096 Dec  8 10:05
> c2ce3235-1e09-46b6-9dcd-3fffd6238879
> -rwxr-xr-x. 1 vdsm kvm0 Feb  1 23:14 __DIRECT_IO_TEST__
> [root@njsevcnp01 t]# du -sch ./* | grep G
> 29G ./c2ce3235-1e09-46b6-9dcd-3fffd6238879
> 29G total
>
>
> and this is the result I received (before/after) http://imgur.com/a/UvGmr
>
> What version of oVirt did create this storage domain? and what version are
> you using to import it?
> I have imported NFS storage domains multiple times and didn't have
> problems. oVirt-3.5 or higher should be able to import a storage domain.
>
> Joop
>
>
>
>


Re: [ovirt-users] importing existing storage domain

2016-02-02 Thread Charles Kozler
But to add to that I've tried 3.5 to 3.5 and I received the same issue
On Feb 2, 2016 12:29 PM, char...@fixflyer.com wrote:

> Source was ovirt 3.5. Now using ovirt 3.6 on a new cluster
> On Feb 2, 2016 12:16 PM, "Joop" <jvdw...@xs4all.nl> wrote:
>
>> On 2-2-2016 15:30, Charles Kozler wrote:
>>
>> I did read that but that doc doesnt seem to have any real actionable
>> items. It seems more like a design spec rather than a walkthrough / README
>> or HOWTO on how to recover from a backed up storage domain
>>
>> I tried 'An import screen for NFS Storage Domain' by doing the following :
>>
>> system -> "storage tab" -> import existing domain
>>
>> Below you can see the CLI of the backed up domain
>>
>> [root@node01 t]# ll
>> total 4
>> drwxr-xr-x. 5 vdsm kvm 4096 Dec  8 10:05
>> c2ce3235-1e09-46b6-9dcd-3fffd6238879
>> -rwxr-xr-x. 1 vdsm kvm0 Feb  1 23:14 __DIRECT_IO_TEST__
>> [root@njsevcnp01 t]# du -sch ./* | grep G
>> 29G ./c2ce3235-1e09-46b6-9dcd-3fffd6238879
>> 29G total
>>
>>
>> and this is the result I received (before/after) http://imgur.com/a/UvGmr
>>
>>
>> What version of oVirt did create this storage domain? and what version
>> are you using to import it?
>> I have imported NFS storage domains multiple times and didn't have
>> problems. oVirt-3.5 or higher should be able to import a storage domain.
>>
>> Joop
>>
>>
>>
>>


Re: [ovirt-users] memory leak in 3.5.6 - not vdsm

2016-02-01 Thread Charles Kozler
Nir -

At this time I can't do anything. I successfully upgraded the CentOS 6.7
engine VM to 3.6, but after migrating it to a new node the packet loss on the
VM is significant. I was finally able to get access to the VM by running the
SSH client with -C and running 'service ovirt-engine stop', at which point
the packet loss went away.

What can I do? What can I send to attach here? As of now I cannot add any
new systems because the ovirt-engine is pretty much dead. I am leaning
towards my only option being to rebuild everything on CentOS 7. Is it
possible for me to configure an EL7 hosted engine alongside the EL6 ones
while I migrate and upgrade, or should I just restart from scratch?

On Mon, Feb 1, 2016 at 3:14 PM, Charles Kozler <char...@fixflyer.com> wrote:

> Thank you. I am in the process of upgrading my nodes to 7 although since
> the upgrade of my engine to ovirt 3.6 the network on the VM itself has
> become very instable. Ping packets are unresponsive from time to time and I
> cannot get a reliable SSH connection. My guess is something inside ovirt
> 3.6 does not like to run on a 3.5 node (expected I guess) and something is
> occurring inside of the brains somewhere. My hope now is to get my EL7 node
> up and running and hopefully join it to my cluster then start my hosted
> engine there
>
> On Mon, Feb 1, 2016 at 3:12 PM, Nir Soffer <nsof...@redhat.com> wrote:
>
>> On Mon, Feb 1, 2016 at 7:52 PM, Charles Kozler <char...@fixflyer.com>
>> wrote:
>> > Sorry to be clear..the only way to resolve the memory leak I am facing
>> now
>> > is to upgrade to el7?
>>
>> If you want to use official packages, yes.
>>
>> But this is free software, so *you* are free to build your own package.
>>
>> The patches you need are in the ovirt-3.5 branch:
>> https://github.com/oVirt/vdsm/tree/ovirt-3.5
>>
>> You clone vdsm, checkout this branch, and build it. But you will have to
>> support
>> it yourself, because we are focusing on refining 3.6 and working on 4.0.
>>
>> Unless you cannot upgrade to el 7 or fedora >= 22, I recommend to upgrade
>> your
>> hosts and use official packages.
>>
>> >
>> > Also the engine can stay running on el6 yes?
>> Yes
>>
>> > I successfully upgraded my
>> > engine to ovirt 3.6 in el6. Do I need to make my engine vm el7 too?
>> No.
>>
>> Nir
>> >
>> > On Feb 1, 2016 12:49 PM, "Charles Kozler" <char...@fixflyer.com> wrote:
>> >>
>> >> So I will still have the memory leak?
>> >>
>> >> On Feb 1, 2016 12:39 PM, "Simone Tiraboschi" <stira...@redhat.com>
>> wrote:
>> >>>
>> >>>
>> >>>
>> >>> On Mon, Feb 1, 2016 at 6:33 PM, Charles Kozler <char...@fixflyer.com>
>> >>> wrote:
>> >>>>
>> >>>> So what about the bug that I hit for vdsm as listed above by Nir?
>> Will I
>> >>>> have that patch to avoid the memory leak or no? Upgrading an entire
>> node to
>> >>>> centos 7 is not actually feasible and was previously outlined above
>> that I
>> >>>> just needed to upgrade to ovirt 3.6 and no mention of OS change ...
>> >>>
>> >>>
>> >>> You cannot install VDSM from 3.6 on el6:
>> >>>
>> >>>
>> http://www.ovirt.org/OVirt_3.6_Release_Notes#RHEL_6.7_-_CentOS_6.7_and_similar
>> >>> and there is no plan for a 3.5.8.
>> >>>
>> >>>
>> >>>>
>> >>>> On Feb 1, 2016 12:30 PM, "Simone Tiraboschi" <stira...@redhat.com>
>> >>>> wrote:
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On Mon, Feb 1, 2016 at 5:40 PM, Charles Kozler <
>> char...@fixflyer.com>
>> >>>>> wrote:
>> >>>>>>
>> >>>>>> Sandro / Nir -
>> >>>>>>
>> >>>>>> I followed your steps plus
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> http://www.ovirt.org/OVirt_3.6_Release_Notes#Fedora_.2F_CentOS_.2F_RHEL
>> >>>>>>
>> >>>>>> Engine upgraded fine but then when I got to upgrading a node I did:
>> >>>>>>
>> >>>>>> $ yum install
>> >>>>>> http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
>> >>>>>> $ yum update -y
>> >>&

Re: [ovirt-users] memory leak in 3.5.6 - not vdsm

2016-02-01 Thread Charles Kozler
Thank you. I am in the process of upgrading my nodes to 7 although since
the upgrade of my engine to ovirt 3.6 the network on the VM itself has
become very unstable. Pings go unanswered from time to time and I
cannot get a reliable SSH connection. My guess is something inside ovirt
3.6 does not like to run on a 3.5 node (expected I guess) and something is
occurring inside of the brains somewhere. My hope now is to get my EL7 node
up and running and hopefully join it to my cluster then start my hosted
engine there

On Mon, Feb 1, 2016 at 3:12 PM, Nir Soffer <nsof...@redhat.com> wrote:

> On Mon, Feb 1, 2016 at 7:52 PM, Charles Kozler <char...@fixflyer.com>
> wrote:
> > Sorry to be clear..the only way to resolve the memory leak I am facing
> now
> > is to upgrade to el7?
>
> If you want to use official packages, yes.
>
> But this is free software, so *you* are free to build your own package.
>
> The patches you need are in the ovirt-3.5 branch:
> https://github.com/oVirt/vdsm/tree/ovirt-3.5
>
> You clone vdsm, checkout this branch, and build it. But you will have to
> support
> it yourself, because we are focusing on refining 3.6 and working on 4.0.
>
> Unless you cannot upgrade to el 7 or fedora >= 22, I recommend to upgrade
> your
> hosts and use official packages.
>
> >
> > Also the engine can stay running on el6 yes?
> Yes
>
> > I successfully upgraded my
> > engine to ovirt 3.6 in el6. Do I need to make my engine vm el7 too?
> No.
>
> Nir
> >
> > On Feb 1, 2016 12:49 PM, "Charles Kozler" <char...@fixflyer.com> wrote:
> >>
> >> So I will still have the memory leak?
> >>
> >> On Feb 1, 2016 12:39 PM, "Simone Tiraboschi" <stira...@redhat.com>
> wrote:
> >>>
> >>>
> >>>
> >>> On Mon, Feb 1, 2016 at 6:33 PM, Charles Kozler <char...@fixflyer.com>
> >>> wrote:
> >>>>
> >>>> So what about the bug that I hit for vdsm as listed above by Nir?
> Will I
> >>>> have that patch to avoid the memory leak or no? Upgrading an entire
> node to
> >>>> centos 7 is not actually feasible and was previously outlined above
> that I
> >>>> just needed to upgrade to ovirt 3.6 and no mention of OS change ...
> >>>
> >>>
> >>> You cannot install VDSM from 3.6 on el6:
> >>>
> >>>
> http://www.ovirt.org/OVirt_3.6_Release_Notes#RHEL_6.7_-_CentOS_6.7_and_similar
> >>> and there is no plan for a 3.5.8.
> >>>
> >>>
> >>>>
> >>>> On Feb 1, 2016 12:30 PM, "Simone Tiraboschi" <stira...@redhat.com>
> >>>> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Mon, Feb 1, 2016 at 5:40 PM, Charles Kozler <char...@fixflyer.com
> >
> >>>>> wrote:
> >>>>>>
> >>>>>> Sandro / Nir -
> >>>>>>
> >>>>>> I followed your steps plus
> >>>>>>
> >>>>>>
> >>>>>>
> http://www.ovirt.org/OVirt_3.6_Release_Notes#Fedora_.2F_CentOS_.2F_RHEL
> >>>>>>
> >>>>>> Engine upgraded fine but then when I got to upgrading a node I did:
> >>>>>>
> >>>>>> $ yum install
> >>>>>> http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
> >>>>>> $ yum update -y
> >>>>>>
> >>>>>> And then rebooted the node. I noticed libvirt was updated by a .1
> >>>>>> release number but vdsm (where the memory leak issue I thought
> was?) was not
> >>>>>> upgraded. In fact, very little of ovirt packages on the node were
> noticeably
> >>>>>> not updated
> >>>>>>
> >>>>>
> >>>>> We are not building vdsm for el6 in 3.6, you need also to upgrade to
> >>>>> el7 if you want that.
> >>>>>
> >>>>>>
> >>>>>> Updated node received the following updated packages during the
> >>>>>> install:
> >>>>>>
> >>>>>> http://pastebin.ca/3362714
> >>>>>>
> >>>>>> Note specifically the only packages updated via the ovirt3.6
> >>>>>> repository was ioprocess, otopi, ovirt-engine-sdk-python,
> ovirt-host-deploy,
> >>>>>> ovirt-release36, and python-iop

Re: [ovirt-users] memory leak in 3.5.6 - not vdsm

2016-02-01 Thread Charles Kozler
So I will still have the memory leak?
On Feb 1, 2016 12:39 PM, "Simone Tiraboschi" <stira...@redhat.com> wrote:

>
>
> On Mon, Feb 1, 2016 at 6:33 PM, Charles Kozler <char...@fixflyer.com>
> wrote:
>
>> So what about the bug that I hit for vdsm as listed above by Nir? Will I
>> have that patch to avoid the memory leak or no? Upgrading an entire node to
>> centos 7 is not actually feasible and was previously outlined above that I
>> just needed to upgrade to ovirt 3.6 and no mention of OS change ...
>>
>
> You cannot install VDSM from 3.6 on el6:
>
> http://www.ovirt.org/OVirt_3.6_Release_Notes#RHEL_6.7_-_CentOS_6.7_and_similar
> and there is no plan for a 3.5.8.
>
>
>
>> On Feb 1, 2016 12:30 PM, "Simone Tiraboschi" <stira...@redhat.com> wrote:
>>
>>>
>>>
>>> On Mon, Feb 1, 2016 at 5:40 PM, Charles Kozler <char...@fixflyer.com>
>>> wrote:
>>>
>>>> Sandro / Nir -
>>>>
>>>> I followed your steps plus
>>>>
>>>> http://www.ovirt.org/OVirt_3.6_Release_Notes#Fedora_.2F_CentOS_.2F_RHEL
>>>>
>>>> Engine upgraded fine but then when I got to upgrading a node I did:
>>>>
>>>> $ yum install
>>>> http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
>>>> $ yum update -y
>>>>
>>>> And then rebooted the node. I noticed libvirt was updated by a .1
>>>> release number but vdsm (where the memory leak issue I thought was?) was
>>>> not upgraded. In fact, very little of ovirt packages on the node were
>>>> noticeably not updated
>>>>
>>>>
>>> We are not building vdsm for el6 in 3.6, you need also to upgrade to el7
>>> if you want that.
>>>
>>>
>>>> Updated node received the following updated packages during the install:
>>>>
>>>> http://pastebin.ca/3362714
>>>>
>>>> Note specifically the only packages updated via the ovirt3.6 repository
>>>> was ioprocess, otopi, ovirt-engine-sdk-python, ovirt-host-deploy,
>>>> ovirt-release36, and python-ioprocess. I had expected to see some packages
>>>> like vdsm and the likes updated - or was this not the case?
>>>>
>>>> Upgraded node:
>>>>
>>>> [compute[root@node02 yum.repos.d]$ rpm -qa | grep -i vdsm
>>>> vdsm-4.16.30-0.el6.x86_64
>>>> vdsm-python-zombiereaper-4.16.30-0.el6.noarch
>>>> vdsm-cli-4.16.30-0.el6.noarch
>>>> vdsm-yajsonrpc-4.16.30-0.el6.noarch
>>>> vdsm-jsonrpc-4.16.30-0.el6.noarch
>>>> vdsm-xmlrpc-4.16.30-0.el6.noarch
>>>> vdsm-python-4.16.30-0.el6.noarch
>>>>
>>>> Nonupgraded node
>>>>
>>>> [compute[root@node01 ~]$ rpm -qa | grep -i vdsm
>>>> vdsm-cli-4.16.30-0.el6.noarch
>>>> vdsm-jsonrpc-4.16.30-0.el6.noarch
>>>> vdsm-python-zombiereaper-4.16.30-0.el6.noarch
>>>> vdsm-xmlrpc-4.16.30-0.el6.noarch
>>>> vdsm-yajsonrpc-4.16.30-0.el6.noarch
>>>> vdsm-4.16.30-0.el6.x86_64
>>>> vdsm-python-4.16.30-0.el6.noarch
>>>>
>>>> Also, the docs stated that the engine VM would migrate to the freshly
>>>> upgraded node since it would have a higher number but it did not
>>>>
>>>> So I cant really confirm whether or not my issue will be resolved? Or
>>>> that if the node was actually updated properly?
>>>>
>>>> Please advise on how to confirm
>>>>
>>>> Thank you!
>>>>
>>>> On Sat, Jan 23, 2016 at 12:55 AM, Charles Kozler <char...@fixflyer.com>
>>>> wrote:
>>>>
>>>>> Thanks Sandro. Should clarify my storage is external on a redundant
>>>>> SAN. The steps I was concerned about was the actual upgrade. I tried to
>>>>> upgrade before and it brought my entire stack crumbling down so I'm
>>>>> hesitant. This bug seems like a huge bug that should at least somehow
>>>>> backported if at all possible because, to me, it renders the entire 3.5.6
>>>>> branch unusable as no VMs can be deployed since OOM will eventually kill
>>>>> them. In any case that's just my opinion and I'm a new user to ovirt. The
>>>>> docs I followed originally got me going how I need and somehow didn't work
>>>>> for 3.6 in the same fashion so naturally I'm hesitant to upgrade but
>>>>> clearly have no option if I want to continue my infr

Re: [ovirt-users] memory leak in 3.5.6 - not vdsm

2016-02-01 Thread Charles Kozler
Sorry, to be clear: the only way to resolve the memory leak I am facing now
is to upgrade to el7?

Also the engine can stay running on el6 yes? I successfully upgraded my
engine to ovirt 3.6 in el6. Do I need to make my engine vm el7 too?
On Feb 1, 2016 12:49 PM, "Charles Kozler" <char...@fixflyer.com> wrote:

> So I will still have the memory leak?
> On Feb 1, 2016 12:39 PM, "Simone Tiraboschi" <stira...@redhat.com> wrote:
>
>>
>>
>> On Mon, Feb 1, 2016 at 6:33 PM, Charles Kozler <char...@fixflyer.com>
>> wrote:
>>
>>> So what about the bug that I hit for vdsm as listed above by Nir? Will I
>>> have that patch to avoid the memory leak or no? Upgrading an entire node to
>>> centos 7 is not actually feasible and was previously outlined above that I
>>> just needed to upgrade to ovirt 3.6 and no mention of OS change ...
>>>
>>
>> You cannot install VDSM from 3.6 on el6:
>>
>> http://www.ovirt.org/OVirt_3.6_Release_Notes#RHEL_6.7_-_CentOS_6.7_and_similar
>> and there is no plan for a 3.5.8.
>>
>>
>>
>>> On Feb 1, 2016 12:30 PM, "Simone Tiraboschi" <stira...@redhat.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Feb 1, 2016 at 5:40 PM, Charles Kozler <char...@fixflyer.com>
>>>> wrote:
>>>>
>>>>> Sandro / Nir -
>>>>>
>>>>> I followed your steps plus
>>>>>
>>>>> http://www.ovirt.org/OVirt_3.6_Release_Notes#Fedora_.2F_CentOS_.2F_RHEL
>>>>>
>>>>> Engine upgraded fine but then when I got to upgrading a node I did:
>>>>>
>>>>> $ yum install
>>>>> http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
>>>>> $ yum update -y
>>>>>
>>>>> And then rebooted the node. I noticed libvirt was updated by a .1
>>>>> release number but vdsm (where the memory leak issue I thought was?) was
>>>>> not upgraded. In fact, very little of ovirt packages on the node were
>>>>> noticeably not updated
>>>>>
>>>>>
>>>> We are not building vdsm for el6 in 3.6, you need also to upgrade to
>>>> el7 if you want that.
>>>>
>>>>
>>>>> Updated node received the following updated packages during the
>>>>> install:
>>>>>
>>>>> http://pastebin.ca/3362714
>>>>>
>>>>> Note specifically the only packages updated via the ovirt3.6
>>>>> repository
>>>>> was ioprocess, otopi, ovirt-engine-sdk-python, ovirt-host-deploy,
>>>>> ovirt-release36, and python-ioprocess. I had expected to see some packages
>>>>> like vdsm and the likes updated - or was this not the case?
>>>>>
>>>>> Upgraded node:
>>>>>
>>>>> [compute[root@node02 yum.repos.d]$ rpm -qa | grep -i vdsm
>>>>> vdsm-4.16.30-0.el6.x86_64
>>>>> vdsm-python-zombiereaper-4.16.30-0.el6.noarch
>>>>> vdsm-cli-4.16.30-0.el6.noarch
>>>>> vdsm-yajsonrpc-4.16.30-0.el6.noarch
>>>>> vdsm-jsonrpc-4.16.30-0.el6.noarch
>>>>> vdsm-xmlrpc-4.16.30-0.el6.noarch
>>>>> vdsm-python-4.16.30-0.el6.noarch
>>>>>
>>>>> Nonupgraded node
>>>>>
>>>>> [compute[root@node01 ~]$ rpm -qa | grep -i vdsm
>>>>> vdsm-cli-4.16.30-0.el6.noarch
>>>>> vdsm-jsonrpc-4.16.30-0.el6.noarch
>>>>> vdsm-python-zombiereaper-4.16.30-0.el6.noarch
>>>>> vdsm-xmlrpc-4.16.30-0.el6.noarch
>>>>> vdsm-yajsonrpc-4.16.30-0.el6.noarch
>>>>> vdsm-4.16.30-0.el6.x86_64
>>>>> vdsm-python-4.16.30-0.el6.noarch
>>>>>
>>>>> Also, the docs stated that the engine VM would migrate to the freshly
>>>>> upgraded node since it would have a higher number but it did not
>>>>>
>>>>> So I cant really confirm whether or not my issue will be resolved? Or
>>>>> that if the node was actually updated properly?
>>>>>
>>>>> Please advise on how to confirm
>>>>>
>>>>> Thank you!
>>>>>
>>>>> On Sat, Jan 23, 2016 at 12:55 AM, Charles Kozler <char...@fixflyer.com
>>>>> > wrote:
>>>>>
>>>>>> Thanks Sandro. Should clarify my storage is external on a redundant
>>>>>> SAN. The steps I was

Re: [ovirt-users] memory leak in 3.5.6 - not vdsm

2016-02-01 Thread Charles Kozler
Sandro / Nir -

I followed your steps plus

http://www.ovirt.org/OVirt_3.6_Release_Notes#Fedora_.2F_CentOS_.2F_RHEL

Engine upgraded fine but then when I got to upgrading a node I did:

$ yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
$ yum update -y

And then rebooted the node. I noticed libvirt was updated by a .1 release
number, but vdsm (where I thought the memory leak was) was not upgraded. In
fact, most of the oVirt packages on the node were noticeably not updated.

Updated node received the following updated packages during the install:

http://pastebin.ca/3362714

Note specifically that the only packages updated via the ovirt-3.6
repository were ioprocess, otopi, ovirt-engine-sdk-python, ovirt-host-deploy,
ovirt-release36, and python-ioprocess. I had expected to see packages like
vdsm updated as well - or was this not the case?

Upgraded node:

[compute[root@node02 yum.repos.d]$ rpm -qa | grep -i vdsm
vdsm-4.16.30-0.el6.x86_64
vdsm-python-zombiereaper-4.16.30-0.el6.noarch
vdsm-cli-4.16.30-0.el6.noarch
vdsm-yajsonrpc-4.16.30-0.el6.noarch
vdsm-jsonrpc-4.16.30-0.el6.noarch
vdsm-xmlrpc-4.16.30-0.el6.noarch
vdsm-python-4.16.30-0.el6.noarch

Nonupgraded node

[compute[root@node01 ~]$ rpm -qa | grep -i vdsm
vdsm-cli-4.16.30-0.el6.noarch
vdsm-jsonrpc-4.16.30-0.el6.noarch
vdsm-python-zombiereaper-4.16.30-0.el6.noarch
vdsm-xmlrpc-4.16.30-0.el6.noarch
vdsm-yajsonrpc-4.16.30-0.el6.noarch
vdsm-4.16.30-0.el6.x86_64
vdsm-python-4.16.30-0.el6.noarch

Also, the docs stated that the engine VM would migrate to the freshly
upgraded node since it would have a higher number but it did not

So I can't really confirm whether my issue will be resolved, or whether the
node was actually updated properly.

Please advise on how to confirm

Thank you!
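A quick throwaway way to compare the two `rpm -qa` listings above (nothing oVirt-specific, just string parsing of the package names):

```python
import re

def vdsm_versions(rpm_qa):
    """Pull the version-release strings out of `rpm -qa | grep vdsm` output."""
    found = set()
    for line in rpm_qa.splitlines():
        # Match the "<version>-<release>" part, e.g. "4.16.30-0".
        m = re.search(r"-(\d+(?:\.\d+)+-\d+)\.", line)
        if m:
            found.add(m.group(1))
    return found

upgraded = """vdsm-4.16.30-0.el6.x86_64
vdsm-python-zombiereaper-4.16.30-0.el6.noarch
vdsm-cli-4.16.30-0.el6.noarch"""

not_upgraded = """vdsm-cli-4.16.30-0.el6.noarch
vdsm-4.16.30-0.el6.x86_64
vdsm-python-4.16.30-0.el6.noarch"""

# Identical version sets on both nodes => vdsm itself was never updated,
# consistent with 3.6 vdsm not being built for el6.
print(vdsm_versions(upgraded), vdsm_versions(not_upgraded))
```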

On Sat, Jan 23, 2016 at 12:55 AM, Charles Kozler <char...@fixflyer.com>
wrote:

> Thanks Sandro. Should clarify my storage is external on a redundant SAN.
> The steps I was concerned about was the actual upgrade. I tried to upgrade
> before and it brought my entire stack crumbling down so I'm hesitant. This
> bug seems like a huge bug that should at least somehow backported if at all
> possible because, to me, it renders the entire 3.5.6 branch unusable as no
> VMs can be deployed since OOM will eventually kill them. In any case that's
> just my opinion and I'm a new user to ovirt. The docs I followed originally
> got me going how I need and somehow didn't work for 3.6 in the same fashion
> so naturally I'm hesitant to upgrade but clearly have no option if I want
> to continue my infrastructure on ovirt. Thank you again for taking the time
> out to assist me, I truly appreciate it. I will try an upgrade next week
> and pray it all goes well :-)
> On Jan 23, 2016 12:40 AM, "Sandro Bonazzola" <sbona...@redhat.com> wrote:
>
>>
>>
>> On Fri, Jan 22, 2016 at 10:53 PM, Charles Kozler <char...@fixflyer.com>
>> wrote:
>>
>>> Sandro -
>>>
>>> Do you have available documentation that can support upgrading self
>>> hosted? I followed this
>>> http://community.redhat.com/blog/2014/10/up-and-running-with-ovirt-3-5/
>>>
>>> Would it be as easy as installing the RPM and then running yum upgrade?
>>>
>>>
>> Note that mentioned article describes an unsupported hyperconverged setup
>> running NFS over Gluster.
>> That said,
>> 1) put the hosted-engine storage domain into global maintenance mode
>> 2) upgrade the engine VM
>> 3) select the first host to upgrade and put it under maintenance from the
>> engine, wait for the engine vm to migrate if needed.
>> 4) yum upgrade the first host and wait until ovirt-ha-agent completes
>> 5) exit global and local maintenance mode
>> 6) repeat 3-5 on all the other hosts
>> 7) once all hosts are updated you can increase the cluster compatibility
>> level to 3.6. At this point the engine will trigger the auto-import of the
>> hosted-engine storage domain.
>>
>> Simone, Roy, can you confirm above steps? Maybe also you can update
>> http://www.ovirt.org/Hosted_Engine_Howto#Upgrade_Hosted_Engine
>>
>>
>>
>>> Thanks
>>>
>>> On Fri, Jan 22, 2016 at 4:42 PM, Sandro Bonazzola <sbona...@redhat.com>
>>> wrote:
>>>
>>>>
>>>> Il 22/Gen/2016 22:31, "Charles Kozler" <char...@fixflyer.com> ha
>>>> scritto:
>>>> >
>>>> > Hi Nir -
>>>> >
>>>> > do you have a release target date for 3.5.8? Any estimate would help.
>>>> >
>>>>
>>>> There won't be any supported release after 3.5.6. Please update to
>>>> 3.6.2 next week
>>>>
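For reference, the procedure Sandro quotes above maps roughly onto the following commands. This is a sketch only, assuming the hosted-engine CLI of the 3.5/3.6 era, and it is not runnable outside an oVirt host:

```shell
# 1) global maintenance, from any HA host
hosted-engine --set-maintenance --mode=global
# 2) inside the engine VM: add the 3.6 repo, then run engine-setup
# 3-4) per host: maintenance from the engine UI, then on the host itself:
yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
yum update
# 5) once ovirt-ha-agent has completed:
hosted-engine --set-maintenance --mode=none
```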

Re: [ovirt-users] memory leak in 3.5.6 - not vdsm

2016-02-01 Thread Charles Kozler
So what about the vdsm bug I hit, listed above by Nir? Will I get that
patch to avoid the memory leak or not? Upgrading an entire node to CentOS 7
is not actually feasible, and it was previously outlined above that I just
needed to upgrade to oVirt 3.6, with no mention of an OS change ...
On Feb 1, 2016 12:30 PM, "Simone Tiraboschi" <stira...@redhat.com> wrote:

>
>
> On Mon, Feb 1, 2016 at 5:40 PM, Charles Kozler <char...@fixflyer.com>
> wrote:
>
>> Sandro / Nir -
>>
>> I followed your steps plus
>>
>> http://www.ovirt.org/OVirt_3.6_Release_Notes#Fedora_.2F_CentOS_.2F_RHEL
>>
>> Engine upgraded fine but then when I got to upgrading a node I did:
>>
>> $ yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
>> $ yum update -y
>>
>> And then rebooted the node. I noticed libvirt was updated by a .1 release
>> number, but vdsm (where I thought the memory leak issue was) was not
>> upgraded. In fact, few of the oVirt packages on the node were noticeably
>> updated.
>>
>>
> We are not building vdsm for el6 in 3.6, you need also to upgrade to el7
> if you want that.
>
>
>> Updated node received the following updated packages during the install:
>>
>> http://pastebin.ca/3362714
>>
>> Note specifically the only packages updated via the ovirt3.6 repository
>> was ioprocess, otopi, ovirt-engine-sdk-python, ovirt-host-deploy,
>> ovirt-release36, and python-ioprocess. I had expected to see some packages
>> like vdsm and the likes updated - or was this not the case?
>>
>> Upgraded node:
>>
>> [compute[root@node02 yum.repos.d]$ rpm -qa | grep -i vdsm
>> vdsm-4.16.30-0.el6.x86_64
>> vdsm-python-zombiereaper-4.16.30-0.el6.noarch
>> vdsm-cli-4.16.30-0.el6.noarch
>> vdsm-yajsonrpc-4.16.30-0.el6.noarch
>> vdsm-jsonrpc-4.16.30-0.el6.noarch
>> vdsm-xmlrpc-4.16.30-0.el6.noarch
>> vdsm-python-4.16.30-0.el6.noarch
>>
>> Nonupgraded node
>>
>> [compute[root@node01 ~]$ rpm -qa | grep -i vdsm
>> vdsm-cli-4.16.30-0.el6.noarch
>> vdsm-jsonrpc-4.16.30-0.el6.noarch
>> vdsm-python-zombiereaper-4.16.30-0.el6.noarch
>> vdsm-xmlrpc-4.16.30-0.el6.noarch
>> vdsm-yajsonrpc-4.16.30-0.el6.noarch
>> vdsm-4.16.30-0.el6.x86_64
>> vdsm-python-4.16.30-0.el6.noarch
>>
>> Also, the docs stated that the engine VM would migrate to the freshly
>> upgraded node since it would have a higher number but it did not
>>
>> So I can't really confirm whether my issue will be resolved, or whether
>> the node was actually updated properly.
>>
>> Please advise on how to confirm
>>
>> Thank you!
>>
>> On Sat, Jan 23, 2016 at 12:55 AM, Charles Kozler <char...@fixflyer.com>
>> wrote:
>>
>>> Thanks Sandro. Should clarify my storage is external on a redundant SAN.
>>> The steps I was concerned about was the actual upgrade. I tried to upgrade
>>> before and it brought my entire stack crumbling down so I'm hesitant. This
>>> bug seems like a huge bug that should at least somehow be backported if at all
>>> possible because, to me, it renders the entire 3.5.6 branch unusable as no
>>> VMs can be deployed since OOM will eventually kill them. In any case that's
>>> just my opinion and I'm a new user to ovirt. The docs I followed originally
>>> got me going how I need and somehow didn't work for 3.6 in the same fashion
>>> so naturally I'm hesitant to upgrade but clearly have no option if I want
>>> to continue my infrastructure on ovirt. Thank you again for taking the time
>>> out to assist me, I truly appreciate it. I will try an upgrade next week
>>> and pray it all goes well :-)
>>> On Jan 23, 2016 12:40 AM, "Sandro Bonazzola" <sbona...@redhat.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Jan 22, 2016 at 10:53 PM, Charles Kozler <char...@fixflyer.com>
>>>> wrote:
>>>>
>>>>> Sandro -
>>>>>
>>>>> Do you have available documentation that can support upgrading self
>>>>> hosted? I followed this
>>>>> http://community.redhat.com/blog/2014/10/up-and-running-with-ovirt-3-5/
>>>>>
>>>>> Would it be as easy as installing the RPM and then running yum upgrade?
>>>>>
>>>>>
>>>> Note that mentioned article describes an unsupported hyperconverged
>>>> setup running NFS over Gluster.
>>>> That said,
>>>> 1) put the hosted-engine storage domain in

Re: [ovirt-users] memory leak in 3.5.6 - not vdsm

2016-01-22 Thread Charles Kozler
Hi Nir -

do you have a release target date for 3.5.8? Any estimate would help.

If its not VDSM, what is it exactly? Sorry, I understood from the ticket it
was something inside vdsm, was I mistaken?

The servers are CentOS 6; 6.7 to be exact.

I have done all the flushing I can (page cache, inodes, dentries, etc.) and
have also moved VMs around to other nodes, and nothing changes the memory.
How can I find the leak? Where is the leak? RES shows the following, and
the totals don't add up to 20GB:

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 19044 qemu      20   0 8876m 4.0g 5680 S  3.6 12.9   1571:44 qemu-kvm
 26143 qemu      20   0 5094m 1.1g 5624 S  9.2  3.7   6012:12 qemu-kvm
  5837 root       0 -20  964m 624m 3664 S  0.0  2.0  85:22.09 glusterfs
 14328 root       0 -20  635m 169m 3384 S  0.0  0.5  43:15.23 glusterfs
  5134 vdsm       0 -20 4368m 111m  10m S  5.9  0.3   3710:50 vdsm
  4095 root      15  -5  727m  43m  10m S  0.0  0.1   0:02.00 supervdsmServer

4.0G + 1.1G + 624M + 169M + 111M + 43M = ~6GB

This was top sorted by RES from highest to lowest.

At that point I wouldn't know where else to look except slab / kernel
structures, of which slab shows:

[compute[root@node1 ~]$ cat /proc/meminfo | grep -i slab
Slab:        2549748 kB

So roughly 2-3GB. Adding that to the other ~6GB of use, we still have about
11GB unaccounted for
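This accounting can be automated rather than eyeballed from top; a minimal sketch (generic Linux, nothing oVirt-specific) that sums per-process resident memory and pulls the kernel slab counters:

```shell
# Sum resident set size (RSS) across all processes, in MB, and show the
# kernel slab counters from /proc/meminfo; a large gap between "used" in
# free and this sum points at kernel-side memory rather than userspace.
ps -eo rss= | awk '{sum += $1} END {printf "total process RSS: %.1f MB\n", sum/1024}'
grep -E '^(Slab|SReclaimable|SUnreclaim):' /proc/meminfo
```

Run hourly alongside free, this would show whether the growth is in process RSS or in kernel structures.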

On Fri, Jan 22, 2016 at 4:24 PM, Nir Soffer <nsof...@redhat.com> wrote:

> On Fri, Jan 22, 2016 at 11:08 PM, Charles Kozler <char...@fixflyer.com>
> wrote:
> > Hi Nir -
> >
> > Thanks for getting back to me. Will the patch to 3.6 be backported to
> 3.5?
>
> We plan to include them in 3.5.8.
>
> > As you can tell from the images, it takes days and days for it to
> increase
> > over time. I also wasnt sure if that was the right bug because VDSM
> memory
> > shows normal from top ...
> >
> >    PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> >   5134 vdsm   0 -20 4368m 111m  10m S  2.0  0.3   3709:28 vdsm
>
> As you wrote, this issue is not related to vdsm.
>
> >
> > Res is only 111M. This is from node1 which is showing currently 20GB of
> 32GB
> > used with only 2 VMs running on it - 1 with 4G and another with ~1 GB of
> RAM
> > configured
> >
> > The images are from nagios and the value here is a direct correlation to
> > what you would see in the free command output. See below from an example
> of
> > node 1 and node 2
> >
> > [compute[root@node1 ~]$ free
> >              total       used       free     shared    buffers     cached
> > Mem:      32765316   20318156   12447160        252      30884     628948
> > -/+ buffers/cache:   19658324   13106992
> > Swap:     19247100          0   19247100
> > [compute[root@node1 ~]$ free -m
> >              total       used       free     shared    buffers     cached
> > Mem:         31997      19843      12153          0         30        614
> > -/+ buffers/cache:      19199      12798
> > Swap:        18795          0      18795
> >
> > And its correlated image http://i.imgur.com/PZLEgyx.png (~19GB used)
> >
> > And as a control, node 2 that I just restarted today
> >
> > [compute[root@node2 ~]$ free
> >              total       used       free     shared    buffers     cached
> > Mem:      32765316    1815324   30949992        212      35784     717320
> > -/+ buffers/cache:    1062220   31703096
> > Swap:     19247100          0   19247100
>
> Is this rhel/centos 6?
>
> > [compute[root@node2 ~]$ free -m
> >              total       used       free     shared    buffers     cached
> > Mem:         31997       1772      30225          0         34        700
> > -/+ buffers/cache:       1036      30960
> > Swap:        18795          0      18795
> >
> > And its correlated image http://i.imgur.com/8ldPVqY.png  (~2GB used).
> Note
> > how 1772 in the image is exactly what is registered under 'used' in free
> > command
>
> I guess you should start looking at the processes running on these nodes.
>
> Maybe try to collect memory usage per process using ps?
>
> >
> > On Fri, Jan 22, 2016 at 3:59 PM, Nir Soffer <nsof...@redhat.com> wrote:
> >>
> >> On Fri, Jan 22, 2016 at 9:25 PM, Charles Kozler <char...@fixflyer.com>
> >> wrote:
> >> > Here is a screenshot of my three nodes and their increased memory
> usage
> >> > over
> >> > 30 days. Note that node #2 had 1 single VM that had 4GB of RAM
> assigned
> >> > to
> >> > it. I had since shut it down and saw no memory reclamation occur.
> &g

Re: [ovirt-users] memory leak in 3.5.6 - not vdsm

2016-01-22 Thread Charles Kozler
Hi Nir -

Thanks for getting back to me. Will the patch to 3.6 be backported to 3.5?
As you can tell from the images, it takes days and days for it to increase
over time. I also wasn't sure if that was the right bug, because VDSM
memory looks normal in top ...

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  5134 vdsm       0 -20 4368m 111m  10m S  2.0  0.3   3709:28 vdsm

RES is only 111M. This is from node1, which currently shows 20GB of 32GB
used with only 2 VMs running on it: one with 4GB and another with ~1GB of
RAM configured.

The images are from nagios and the value here is a direct correlation to
what you would see in the free command output. See below from an example of
node 1 and node 2

[compute[root@node1 ~]$ free
             total       used       free     shared    buffers     cached
Mem:      32765316   20318156   12447160        252      30884     628948
-/+ buffers/cache:   19658324   13106992
Swap:     19247100          0   19247100
[compute[root@node1 ~]$ free -m
             total       used       free     shared    buffers     cached
Mem:         31997      19843      12153          0         30        614
-/+ buffers/cache:      19199      12798
Swap:        18795          0      18795

And its correlated image http://i.imgur.com/PZLEgyx.png (~19GB used)

And as a control, node 2 that I just restarted today

[compute[root@node2 ~]$ free
             total       used       free     shared    buffers     cached
Mem:      32765316    1815324   30949992        212      35784     717320
-/+ buffers/cache:    1062220   31703096
Swap:     19247100          0   19247100
[compute[root@node2 ~]$ free -m
             total       used       free     shared    buffers     cached
Mem:         31997       1772      30225          0         34        700
-/+ buffers/cache:       1036      30960
Swap:        18795          0      18795

And its correlated image http://i.imgur.com/8ldPVqY.png (~2GB used). Note
how the 1772 in the image is exactly what is registered under 'used' in the
free command output.
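As an aside, the hourly sampling of free that gets suggested below is easy to script; a minimal sketch reading /proc/meminfo directly (log path and cron schedule are illustrative):

```shell
# Append a timestamped snapshot of the key memory counters to a log;
# run hourly via cron, e.g.:  0 * * * * /usr/local/bin/memlog.sh
log=/tmp/mem-usage.log   # illustrative path
{ date; grep -E '^(MemTotal|MemFree|Buffers|Cached):' /proc/meminfo; echo; } >> "$log"
tail -n 6 "$log"
```

A day of these snapshots makes it obvious whether "used" is really climbing or whether the growth is just page cache.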

On Fri, Jan 22, 2016 at 3:59 PM, Nir Soffer <nsof...@redhat.com> wrote:

> On Fri, Jan 22, 2016 at 9:25 PM, Charles Kozler <char...@fixflyer.com>
> wrote:
> > Here is a screenshot of my three nodes and their increased memory usage
> over
> > 30 days. Note that node #2 had 1 single VM that had 4GB of RAM assigned
> to
> > it. I had since shut it down and saw no memory reclamation occur.
> Further, I
> > flushed page caches and inodes and ran 'sync'. I tried everything but
> > nothing brought the memory usage down. vdsm was low too (couple hundred
> MB)
>
> Note that there is an old leak in vdsm, will be fixed in next 3.6 build:
> https://bugzilla.redhat.com/1269424
>
> > and there was no qemu-kvm process running so I'm at a loss
> >
> > http://imgur.com/a/aFPcK
> >
> > Please advise on what I can do to debug this. Note I have restarted
> > node 2 (which is why you see the drop) to see if its memory use rises
> > over time even with no VMs running
>
> Not sure what the "memory" you show in the graphs is. Theoretically this
> may be normal memory usage: Linux using free memory for the buffer cache.
>
> Can you instead show the output of "free", during one day, maybe run once
> per hour?
>
> You may also like to install sysstat for collecting and monitoring
> resources usage.
>
> >
> > [compute[root@node2 log]$ rpm -qa | grep -i ovirt
> > libgovirt-0.3.2-1.el6.x86_64
> > ovirt-release35-006-1.noarch
> > ovirt-hosted-engine-ha-1.2.8-1.el6.noarch
> > ovirt-hosted-engine-setup-1.2.6.1-1.el6.noarch
> > ovirt-engine-sdk-python-3.5.6.0-1.el6.noarch
> > ovirt-host-deploy-1.3.2-1.el6.noarch
> >
> >
> > --
> >
> > Charles Kozler
> > Vice President, IT Operations
> >
> > FIX Flyer, LLC
> > 225 Broadway | Suite 1600 | New York, NY 10007
> > 1-888-349-3593
> > http://www.fixflyer.com
> >
> > NOTICE TO RECIPIENT: THIS E-MAIL IS MEANT ONLY FOR THE INTENDED
> RECIPIENT(S)
> > OF THE TRANSMISSION, AND CONTAINS CONFIDENTIAL INFORMATION WHICH IS
> > PROPRIETARY TO FIX FLYER LLC.  ANY UNAUTHORIZED USE, COPYING,
> DISTRIBUTION,
> > OR DISSEMINATION IS STRICTLY PROHIBITED.  ALL RIGHTS TO THIS INFORMATION
> IS
> > RESERVED BY FIX FLYER LLC.  IF YOU ARE NOT THE INTENDED RECIPIENT, PLEASE
> > CONTACT THE SENDER BY REPLY E-MAIL AND PLEASE DELETE THIS E-MAIL FROM
> YOUR
> > SYSTEM AND DESTROY ANY COPIES.
> >
> > ___
> > Users mailing list
> > Users@ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
> >
>



-- 

*Charles Kozler*
*Vice President, IT Operations*

FIX Flyer, 

Re: [ovirt-users] memory leak in 3.5.6 - not vdsm

2016-01-22 Thread Charles Kozler
Sandro -

Do you have any available documentation covering upgrades of a self-hosted
setup? I followed this
http://community.redhat.com/blog/2014/10/up-and-running-with-ovirt-3-5/

Would it be as easy as installing the RPM and then running yum upgrade?

Thanks

On Fri, Jan 22, 2016 at 4:42 PM, Sandro Bonazzola <sbona...@redhat.com>
wrote:

>
> Il 22/Gen/2016 22:31, "Charles Kozler" <char...@fixflyer.com> ha scritto:
> >
> > Hi Nir -
> >
> > do you have a release target date for 3.5.8? Any estimate would help.
> >
>
> There won't be any supported release after 3.5.6. Please update to 3.6.2
> next week
>
> > If its not VDSM, what is it exactly? Sorry, I understood from the ticket
> it was something inside vdsm, was I mistaken?
> >
> > CentOS 6 is the servers. 6.7 to be exact
> >
> > I have done all forms of flushing that I can (page cache, inodes,
> dentry's, etc) and as well moved VM's around to other nodes and nothing
> changes the memory. How can I find the leak? Where is the leak? RES shows
> the following of which, the totals dont add up to 20GB
> >
> >    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> >  19044 qemu      20   0 8876m 4.0g 5680 S  3.6 12.9   1571:44 qemu-kvm
> >  26143 qemu      20   0 5094m 1.1g 5624 S  9.2  3.7   6012:12 qemu-kvm
> >   5837 root       0 -20  964m 624m 3664 S  0.0  2.0  85:22.09 glusterfs
> >  14328 root       0 -20  635m 169m 3384 S  0.0  0.5  43:15.23 glusterfs
> >   5134 vdsm       0 -20 4368m 111m  10m S  5.9  0.3   3710:50 vdsm
> >   4095 root      15  -5  727m  43m  10m S  0.0  0.1   0:02.00 supervdsmServer
> >
> > 4.0G + 1.1G + 624M + 169M + 111M + 43M = ~6GB
> >
> > This was top sorted by RES from highest to lowest
> >
> > At that point I wouldnt know where else to look except slab / kernel
> structures. Of which slab shows:
> >
> > [compute[root@node1 ~]$ cat /proc/meminfo | grep -i slab
> > Slab:        2549748 kB
> >
> > So roughly 2-3GB. Adding that to the other ~6GB of use, we still have
> > about 11GB unaccounted for
> >
> > On Fri, Jan 22, 2016 at 4:24 PM, Nir Soffer <nsof...@redhat.com> wrote:
> >>
> >> On Fri, Jan 22, 2016 at 11:08 PM, Charles Kozler <char...@fixflyer.com>
> wrote:
> >> > Hi Nir -
> >> >
> >> > Thanks for getting back to me. Will the patch to 3.6 be backported to
> 3.5?
> >>
> >> We plan to include them in 3.5.8.
> >>
> >> > As you can tell from the images, it takes days and days for it to
> increase
> >> > over time. I also wasnt sure if that was the right bug because VDSM
> memory
> >> > shows normal from top ...
> >> >
> >> >    PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> >> >   5134 vdsm   0 -20 4368m 111m  10m S  2.0  0.3   3709:28 vdsm
> >>
> >> As you wrote, this issue is not related to vdsm.
> >>
> >> >
> >> > Res is only 111M. This is from node1 which is showing currently 20GB
> of 32GB
> >> > used with only 2 VMs running on it - 1 with 4G and another with ~1 GB
> of RAM
> >> > configured
> >> >
> >> > The images are from nagios and the value here is a direct correlation
> to
> >> > what you would see in the free command output. See below from an
> example of
> >> > node 1 and node 2
> >> >
> >> > [compute[root@node1 ~]$ free
> >> >              total       used       free     shared    buffers     cached
> >> > Mem:      32765316   20318156   12447160        252      30884     628948
> >> > -/+ buffers/cache:   19658324   13106992
> >> > Swap:     19247100          0   19247100
> >> > [compute[root@node1 ~]$ free -m
> >> >              total       used       free     shared    buffers     cached
> >> > Mem:         31997      19843      12153          0         30        614
> >> > -/+ buffers/cache:      19199      12798
> >> > Swap:        18795          0      18795
> >> >
> >> > And its correlated image http://i.imgur.com/PZLEgyx.png (~19GB used)
> >> >
> >> > And as a control, node 2 that I just restarted today
> >> >
> >> > [compute[root@node2 ~]$ free
> >> >              total       used       free     shared    buffers     cached
> >> > Mem:      32765316    1815324   30949992        212      35784     717320
> >> &
