[ovirt-users] Re: oVirt/Ceph iSCSI Issues

2022-12-05 Thread Matthew J Black
Hi All,

OK, so an update

TL;DR: I got it all working.

So it turns out that I had a typo in my /etc/multipath/conf.d/host.conf file, 
and because all three Ceph iSCSI Gateways are identical (except for IP, 
hostname, etc.) I had done a straight copy-paste of that one file to all three 
Gateways, so the typo was replicated everywhere. After discovering this (and 
fixing it), I have now attached my four LUNs to the oVirt Cluster and everything 
*seems* AOK, both from the oVirt side and the Ceph side, so I think we're all 
good and we can mark this thread as solved.
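For anyone hitting the same thing, a rough, generic way to double-check the 
multipath configuration on each gateway after that kind of copy-paste (treat 
this as a sketch - what multipathd actually complains about depends on the typo):

~~~
# Rough sketch, run on each Ceph iSCSI Gateway:
multipathd show config | less                    # the effective, merged configuration
journalctl -u multipathd --since today | grep -iE 'invalid|error'   # any parse complaints
~~~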

Thanks to everyone who had a hand in making suggestions and helping out - I 
really do appreciate it.

Cheers

Dulux-Oz


[ovirt-users] Re: oVirt/Ceph iSCSI Issues

2022-12-05 Thread Matthew J Black
Yes - as I said, I've got all three oVirt Hosts talking to all three Ceph iSCSI 
Gateways, and can see all four LUNs (via each of the three Gateways). Just to 
be 100% clear: yes, this means that all three oVirt Hosts are logged into the 
Ceph Cluster (via all three of the iSCSI Gateways).

Yes, I believe that the way it works (but *please* do not take this as gospel) 
is that each iSCSI Gateway - and it is recommended that a Ceph Cluster have 2-4 
of them - plays a part in serving the Target, ie there is one Target, with each 
of the 3 Gateways "subordinate" to it, as are the four LUNs and the three oVirt 
Initiators, which in turn show each LUN as subordinate to them as well. Very 
confusing on an initial look-see, but it sort of makes sense, I suppose.
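For what it's worth, the topology described above can be inspected from any of 
the gateways with gwcli; a rough sketch, assuming the standard ceph-iscsi 
tooling (the names in the comments are illustrative, not taken from this setup):

~~~
# On any Ceph iSCSI Gateway with gwcli installed:
gwcli ls
# The tree it prints has one target under iscsi-targets, with three branches:
#   gateways -> the 2-4 gateway nodes serving that target
#   disks    -> the RBD images exported as LUNs 0-3
#   hosts    -> the client (oVirt) initiator IQNs and their LUN mappings
~~~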

However, please see my next/other post in this thread - and thanks for the help.


[ovirt-users] Re: oVirt/Ceph iSCSI Issues

2022-12-04 Thread Strahil Nikolov via Users
Before taking that approach, have you tried to run iscsiadm manually from host3 
in order to identify whether it sees the targets and is able to log in?
I have experience only with HA iSCSI and I'm not sure I understand this 
3-gateway setup. Does this mean that all Ceph hosts play the role of the iSCSI 
target?
Also, run tcpdump on the 3rd (problematic) oVirt node and initiate a new iSCSI 
setup. Maybe it can give a clue as to what is going on.
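Roughly, that manual check on the problematic host would look like this (the 
portal IP and target IQN are placeholders; if Discovery CHAP is enabled, the 
discovery auth settings in /etc/iscsi/iscsid.conf have to match first):

~~~
# Rough sketch, on the problematic oVirt host; <portal-ip>/<target-iqn> are placeholders.
iscsiadm -m discovery -t sendtargets -p <portal-ip>:3260      # does host3 see the target at all?
iscsiadm -m node -T <target-iqn> -p <portal-ip>:3260 --login  # can it actually log in?
iscsiadm -m session -P 3                                      # per-portal session state
# In a second terminal, capture the iSCSI traffic while retrying:
tcpdump -i any -w /tmp/iscsi-gw3.pcap port 3260
~~~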
Best Regards,
Strahil Nikolov
 
 
  On Fri, Dec 2, 2022 at 9:04, Matthew J Black wrote:  
 Hi All,

So, on further investigation/experimentation, it seems that each individual 
iSCSI Storage Domain is being created (there is visual evidence from watching 
the Ceph Monitoring Portal, and LVM volumes are being created on the selected 
LUN/Host) but it *appears* that, for some reason, vdsm is "timing out" (or, at 
least, that's my interpretation of what is happening) and thus resetting the 
Storage Domains back to the pre-creation state.

So it appears that the fact that there is a previous LUN on the Ceph Pool/Disk 
is *not* my underlying issue.

So my question is: would it help if I adjusted the following settings (via 
engine-config), which I obtained from this thread: 
https://lists.ovirt.org/archives/list/users@ovirt.org/thread/LRZLORSREVEYXJHD6GX5NZGV6NM7EL2Z/

- TimeoutToResetVdsInSeconds
- VDSAttemptsToResetCount
- VdsRecoveryTimeoutInMintues
- VdsRefreshRate
- vdsTimeout

Anyone have any opinions (on this list and/or other settings)?

Cheers

Dulux-Oz


[ovirt-users] Re: oVirt/Ceph iSCSI Issues

2022-12-01 Thread Matthew J Black
Hi All,

So, on further investigation/experimentation, it seems that each individual 
iSCSI Storage Domain is being created (there is visual evidence from watching 
the Ceph Monitoring Portal, and LVM volumes are being created on the selected 
LUN/Host) but it *appears* that, for some reason, vdsm is "timing out" (or, at 
least, that's my interpretation of what is happening) and thus resetting the 
Storage Domains back to the pre-creation state.

So it appears that the fact that there is a previous LUN on the Ceph Pool/Disk 
is *not* my underlying issue.

So my question is: would it help if I adjusted the following settings (via 
engine-config - a rough usage sketch follows below), which I obtained from this 
thread: 
https://lists.ovirt.org/archives/list/users@ovirt.org/thread/LRZLORSREVEYXJHD6GX5NZGV6NM7EL2Z/

- TimeoutToResetVdsInSeconds
- VDSAttemptsToResetCount
- VdsRecoveryTimeoutInMintues
- VdsRefreshRate
- vdsTimeout

Anyone have any opinions (on this list and/or other settings)?
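For reference, a rough sketch of how one of these values would be inspected and 
changed with engine-config (the new value shown is purely illustrative, not a 
recommendation):

~~~
# On the engine (HostedEngine) VM; the new value below is illustrative only.
engine-config -g vdsTimeout              # show the current value
engine-config -s vdsTimeout=300          # set a new value
systemctl restart ovirt-engine           # engine-config changes take effect after a restart
~~~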

Cheers

Dulux-Oz


[ovirt-users] Re: oVirt/Ceph iSCSI Issues

2022-12-01 Thread Matthew J Black
Hi All,

So I just tried to create an iSCSI Storage Domain again, and got the same 
results.

I grabbed the relevant(?) part of the engine.log file and placed it into 
Dropbox: https://www.dropbox.com/s/0kt9z21zkewvn3j/extract_engine.log?dl=0

I can see the ERRORs, but I can't figure out what's causing them, because I 
can't see anything wrong with the system.

Would someone be so kind as to take a look and let me know what I'm missing, 
please?

Also, if there are other logs I should be looking at, could someone please let me 
know that as well - I had a root-around and couldn't see anything else 
relevant, but then, I'm not one of the devs who knows the system inside and out.
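For anyone following along, these are the standard places to look beyond 
engine.log (the interesting side of a VDSNetworkException is usually the host 
that was asked to create the Storage Domain):

~~~
# On the engine (HostedEngine) VM:
tail -f /var/log/ovirt-engine/engine.log
# On the oVirt host that handled the CreateStorageDomain call:
tail -f /var/log/vdsm/vdsm.log
tail -f /var/log/vdsm/supervdsm.log
# Plus the usual iSCSI/multipath suspects on that host:
journalctl -u iscsid -u multipathd --since "1 hour ago"
~~~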

Thanks in advance

Dulux-Oz


[ovirt-users] Re: oVirt/Ceph iSCSI Issues

2022-12-01 Thread Matthew J Black
Hi Murilo,

Thanks for that.

Yes, the iSCSI Target is configured with ACLs.

Yes, all the Gateways have the same number of sessions.

Yes, I followed your six-step suggestion.

So what happened is - and I won't go into all the boring details of the how and 
why - I discovered the remains of another iSCSI Target on the 2nd Gateway. 
Installing and using targetcli I removed this old target, which allowed me to 
remove and then re-attach the 2nd Gateway to the "real" target (using the 
Ceph GUI - resetting the gateways to be identical once again), and then I ran 
through the six steps. I now have all three oVirt Hosts logged into all three 
Ceph iSCSI Gateways, with each of the four LUNs reporting connectivity to all 
three Gateways, etc.
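For the record, cleaning up a stale target with targetcli looks roughly like 
this (the IQN is a placeholder; on a ceph-iscsi gateway the live configuration 
is normally managed through gwcli/the dashboard, so this is only for removing 
leftovers like the one above):

~~~
targetcli ls /iscsi                          # list targets in the local LIO config
targetcli /iscsi delete <stale-target-iqn>   # remove the leftover target (placeholder IQN)
targetcli saveconfig                         # persist the change
~~~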

So that's the Gateways-connectivity issue resolved.

However, I still have the other issue: I still get the "The following 
LUNs are already in use..." message and I still see the same results - ie log 
messages re: connectivity issues, with the new (iSCSI) Storage Domain briefly 
appearing and then being removed again (see the 1st post in this thread).

This happens with all four LUNs, no matter which of the three Gateways I point 
the oVirt Admin GUI at. (Yes, I tried this twelve times to cover the twelve 
possible combinations.)
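As an aside, the "already in use" warning is usually just oVirt noticing 
existing data or LVM metadata on the LUN (e.g. from the earlier creation 
attempts); a rough, read-only way to see what it is reacting to from one of the 
hosts (the device path is a placeholder):

~~~
# Read-only inspection on an oVirt host; /dev/mapper/<lun-wwid> is a placeholder.
lsblk /dev/mapper/<lun-wwid>      # any partitions/LVs on the LUN?
blkid /dev/mapper/<lun-wwid>      # any filesystem/LVM signature?
pvs | grep <lun-wwid>             # is it already an LVM physical volume?
~~~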

So, anyone got any further ideas?  :-)

Cheers

Dulux-Oz


[ovirt-users] Re: oVirt/Ceph iSCSI Issues

2022-11-29 Thread Murilo Morais
Matthew, good morning.

Is the iSCSI Target configured with ACLs?
Do all Gateways have the same number of active sessions? It could be that
one of the Gateways has hung sessions (specifically Gateway 3).

If you are not actually using the iSCSI Storage Domain yet, I recommend the
following (a rough command sketch follows the list):
1- Log out through oVirt
2- Check whether there is still an initiator in the multipath on each oVirt Host
3- Log out of all sessions and delete them through iscsiadm on each oVirt Host
4- Check whether there is still an active session in Ceph
5- Restart all Gateway Daemons in Ceph; it may take a while if there is a
stuck session
6- Try to perform Discovery again through oVirt
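A rough sketch of what steps 2-5 look like on the command line (assumes a 
cephadm-managed iSCSI service; the service name is a placeholder, and nothing 
here should be deleted without reading the output first):

~~~
# Steps 2-3, on each oVirt host:
multipath -ll                    # any leftover paths to the Ceph LUNs?
iscsiadm -m session              # any remaining iSCSI sessions?
iscsiadm -m node -u              # log out of all recorded nodes
iscsiadm -m node -o delete       # remove the stale node records
# Steps 4-5, on the Ceph side:
gwcli ls                                  # any hosts still showing as logged_in?
ceph orch restart iscsi.<service-name>    # restart the gateway daemons (cephadm)
~~~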

On Tue, Nov 29, 2022 at 03:28, Matthew J Black <matt...@peregrineit.net> wrote:

> Hi All,
>
> I've got some issues with connecting my oVirt Cluster to my Ceph Cluster
> via iSCSI. There are two issues, and I don't know if one is causing the
> other, if they are related at all, or if they are two separate, unrelated
> issues. Let me explain.
>
> The Situation
> -
> - I have a working three node Ceph Cluster (Ceph Quincy on Rocky Linux 8.6)
> - The Ceph Cluster has four Storage Pools of between 4 and 8 TB each
> - The Ceph Cluster has three iSCSI Gateways
> - There is a single iSCSI Target on the Ceph Cluster
> - The iSCSI Target has all three iSCSI Gateways attached
> - The iSCSI Target has all four Storage Pools attached
> - The four Storage Pools have been assigned LUNs 0-3
> - I have set up (Discovery) CHAP Authorisation on the iSCSI Target
> - I have a working three node self-hosted oVirt Cluster (oVirt v4.5.3 on
> Rocky Linux 8.6)
> - The oVirt Cluster has (in addition to the hosted_storage Storage Domain)
> three GlusterFS Storage Domains
> - I can ping all three Ceph Cluster Nodes to/from all three oVirt Hosts
> - The iSCSI Target on the Ceph Cluster has all three oVirt Hosts'
> Initiators attached
> - Each Initiator has all four Ceph Storage Pools attached
> - I have set up CHAP Authorisation on the iSCSI Target's Initiators
> - The Ceph Cluster Admin Portal reports that all three Initiators are
> "logged_in"
> - I have previously connected Ceph iSCSI LUNs to the oVirt Cluster
> successfully (as an experiment), but had to remove and re-instate them for
> the "final" version(?).
> - The oVirt Admin Portal (ie HostedEngine) reports that Initiators 1 & 2
> (ie oVirt Hosts 1 & 2) are "logged_in" to all three iSCSI Gateways
> - The oVirt Admin Portal reports that Initiator 3 (ie oVirt Host 3) is
> "logged_in" to iSCSI Gateways 1 & 2
> - I can "force" Initiator 3 to become "logged_in" to iSCSI Gateway 3, but
> when I do this it is *not* persistent
> - oVirt Hosts 1 & 2 can/have discovered all three iSCSI Gateways
> - oVirt Hosts 1 & 2 can/have discovered all four LUNs/Targets on all three
> iSCSI Gateways
> - oVirt Host 3 can only discover 2 of the iSCSI Gateways
> - For Target/LUN 0 oVirt Host 3 can only "see" the LUN provided by iSCSI
> Gateway 1
> - For Targets/LUNs 1-3 oVirt Host 3 can only "see" the LUNs provided by
> iSCSI Gateways 1 & 2
> - oVirt Host 3 can *not* "see" any of the Targets/LUNs provided by iSCSI
> Gateway 3
> - When I create a new oVirt Storage Domain for any of the four LUNs:
>   - I am presented with a message saying "The following LUNs are already
> in use..."
>   - I am asked to "Approve operation" via a checkbox, which I do
>   - As I watch the oVirt Admin Portal I can see the new iSCSI Storage
> Domain appear in the Storage Domain list, and then after a few minutes it
> is removed
>   - After those few minutes I am presented with this failure message:
> "Error while executing action New SAN Storage Domain: Network error during
> communication with the Host."
> - I have looked in the engine.log and all I could find that was relevant
> (as far as I know) was this:
> ~~~
> 2022-11-28 19:59:20,506+11 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStorageDomainVDSCommand]
> (default task-1) [77b0c12d] Command 'CreateStorageDomainVDSCommand(HostName
> = ovirt_node_1.mynet.local,
> CreateStorageDomainVDSCommandParameters:{hostId='967301de-be9f-472a-8e66-03c24f01fa71',
> storageDomain='StorageDomainStatic:{name='data',
> id='2a14e4bd-c273-40a0-9791-6d683d145558'}',
> args='s0OGKR-80PH-KVPX-Fi1q-M3e4-Jsh7-gv337P'})' execution failed:
> VDSGenericException: VDSNetworkException: Message timeout which can be
> caused by communication issues
>
> 2022-11-28 19:59:20,507+11 ERROR
> [org.ovirt.engine.core.bll.storage.domain.AddSANStorageDomainCommand]
> (default task-1) [77b0c12d] Command
> 'org.ovirt.engine.core.bll.storage.domain.AddSANStorageDomainCommand'
> failed: EngineException:
> org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
> VDSGenericException: VDSNetworkException: Message timeout which can be
> caused by communication issues (Failed with error VDS_NETWORK_ERROR and
> code 5022)
> ~~~
>
> I cannot see/detect any "communication issue" - but then again I'm not
> 100% sure what I should be looking