Hi Sahina,

Many thanks for your response.

I have now raised a bug against this issue. For your reference it is bug #1578257 - https://bugzilla.redhat.com/show_bug.cgi?id=1578257

I will enable debugging today as requested and attach the logs to the bug report.

Many thanks,

Mark Betham

> On 14 May 2018, at 12:34, Sahina Bose <[email protected]> wrote:
>
> On Mon, May 14, 2018 at 4:07 PM, Mark Betham <[email protected]> wrote:
> Hi Sahina,
>
> Many thanks for your response and apologies for my delay in getting back to
> you.
>
>> How was the schedule created - is this using the Remote Data Sync Setup
>> under Storage domain?
>
> Ovirt is configured in ‘Gluster’ mode, no VM support. When snapshotting we
> are taking a snapshot of the full Gluster volume.
>
> To configure the snapshot schedule I did the following;
> 1. Login to Ovirt WebUI
> 2. From left hand menu select ‘Storage’ and ‘Volumes’
> 3. I then selected the volume I wanted to snapshot by clicking on the link
>    within the ‘Name’ column
> 4. From here I selected the ‘Snapshots’ tab
> 5. From the top menu options I selected the drop down ‘Snapshot’
> 6. From the drop down options I selected ‘New’
> 7. A new window appeared titled ‘Create/Schedule Snapshot’
> 8. I entered a snapshot prefix and description into the available fields and
>    selected the ‘Schedule’ page
> 9. On the schedule page I selected ‘Minute’ from the ‘Recurrence’ drop down
> 10. Set ‘Interval’ to every ‘30’ minutes
> 11. Changed timezone to ‘Europe/London=(GMT+00:00) London Standard Time’
> 12. Left value in ‘Start Schedule by’ at default value
> 13. Set schedule to ‘No End Date’
> 14. Click 'OK'
>
> Interestingly I get the following message on the ‘Create/Schedule Snapshot’
> page before clicking on OK;
> Frequent creation of snapshots would overload the cluster
> Gluster CLI based snapshot scheduling is enabled. It would be disabled once
> volume snapshots scheduled from UI.
>
> What is interesting is that I have not enabled ‘Gluster CLI based snapshot
> scheduling’.
>
> After clicking OK I am returned to the Volume Snapshots tab.
>
> From this point I get no snapshots created according to the schedule set.
>
> At the time of clicking OK in the WebUI to enable the schedule I get the
> following in the engine log;
> 2018-05-14 09:24:11,068Z WARN [org.ovirt.engine.core.dal.job.ExecutionMessageDirector] (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] The message key 'ScheduleGlusterVolumeSnapshot' is missing from 'bundles/ExecutionMessages'
> 2018-05-14 09:24:11,090Z INFO [org.ovirt.engine.core.bll.gluster.ScheduleGlusterVolumeSnapshotCommand] (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] Before acquiring and wait lock 'EngineLock:{exclusiveLocks='[712da1df-4c11-405a-8fb6-f99aebc185c1=GLUSTER_SNAPSHOT]', sharedLocks=''}'
> 2018-05-14 09:24:11,090Z INFO [org.ovirt.engine.core.bll.gluster.ScheduleGlusterVolumeSnapshotCommand] (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] Lock-wait acquired to object 'EngineLock:{exclusiveLocks='[712da1df-4c11-405a-8fb6-f99aebc185c1=GLUSTER_SNAPSHOT]', sharedLocks=''}'
> 2018-05-14 09:24:11,111Z INFO [org.ovirt.engine.core.bll.gluster.ScheduleGlusterVolumeSnapshotCommand] (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] Running command: ScheduleGlusterVolumeSnapshotCommand internal: false. Entities affected : ID: 712da1df-4c11-405a-8fb6-f99aebc185c1 Type: GlusterVolumeAction group MANIPULATE_GLUSTER_VOLUME with role type ADMIN
> 2018-05-14 09:24:11,148Z INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] EVENT_ID: GLUSTER_VOLUME_SNAPSHOT_SCHEDULED(4,134), Snapshots scheduled on volume glustervol0 of cluster NOSS-LD5.
> 2018-05-14 09:24:11,156Z INFO [org.ovirt.engine.core.bll.gluster.ScheduleGlusterVolumeSnapshotCommand] (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] Lock freed to object 'EngineLock:{exclusiveLocks='[712da1df-4c11-405a-8fb6-f99aebc185c1=GLUSTER_SNAPSHOT]', sharedLocks=''}'
>
>> Could you please provide the engine.log from the time the schedule was setup
>> and including the time the schedule was supposed to run?
>
> The original log file is no longer present, so I removed the old schedule and
> created a new schedule, as per the instructions above, earlier today. I have
> therefore attached the engine log from today. The new schedule, which was set
> to run every 30 minutes, has not produced any snapshots after around 2 hours.
>
> Please let me know if you require any further information.
>
> I see the following messages in logs:
> 2018-05-14 04:30:00,018Z ERROR [org.ovirt.engine.core.utils.timer.JobWrapper] (QuartzOvirtDBScheduler9) [d0c31a9] Failed to invoke scheduled method onTimer: null
>
> Can you log a bug - and we will dig into this further.
>
> To speed things up, if you could enable debug logs (I think using
> https://www.ovirt.org/develop/developer-guide/engine/engine-development-environment/#enable-debug-log---restart-required),
> and attach the exception, that would help a lot.
>
> Many thanks,
>
> Mark Betham.
>
>> On Thu, May 3, 2018 at 4:37 PM, Mark Betham <[email protected]> wrote:
>> Hi Ovirt community,
>>
>> I am hoping you will be able to help with a problem I am experiencing when
>> trying to schedule a snapshot of my Gluster volumes using the Ovirt portal.
>>
>> Below is an overview of the environment;
>>
>> I have an Ovirt instance running which is managing our Gluster storage.
>> We are running Ovirt version "4.2.2.6-1.el7.centos", Gluster version
>> "glusterfs-3.13.2-2.el7" on a base OS of "CentOS Linux release 7.4.1708
>> (Core)", Kernel "3.10.0-693.21.1.el7.x86_64", VDSM version
>> "vdsm-4.20.23-1.el7.centos". All of the versions of software are the latest
>> release and have been fully patched where necessary.
>>
>> Ovirt has been installed and configured in "Gluster" mode only, no
>> virtualisation. The Ovirt platform runs from one of the Gluster storage
>> nodes.
>>
>> Gluster runs with 2 clusters, each located at a different physical site (UK
>> and DE). Each of the storage clusters contains 3 storage nodes. Each
>> storage cluster contains a single gluster volume. The Gluster volume is 3
>> * Replicated. The Gluster volume runs on top of a LVM thin vol which has
>> been provisioned with a XFS filesystem. The system is running a Geo-rep
>> between the 2 geo-diverse clusters.
>>
>> The host servers running at the primary site are of specification 1 *
>> Intel(R) Xeon(R) CPU E3-1270 v5 @ 3.60GHz (8 core with HT), 64GB Ram, LSI
>> MegaRAID SAS 9271 with bbu and cache, 8 * SAS 10K 2.5" 1.8TB enterprise
>> drives configured in a RAID 10 array to give 6.52TB of useable space. The
>> host servers running at the secondary site are of specification 1 * Intel(R)
>> Xeon(R) CPU E3-1271 v3 @ 3.60GHz (8 core with HT), 32GB Ram, LSI MegaRAID
>> SAS 9260 with bbu and cache, 8 * SAS 10K 2.5" 1.8TB enterprise drives
>> configured in a RAID 10 array to give 6.52TB of useable space. The
>> secondary site is for DR use only.
>>
>> When I first started experiencing the issue and was unable to resolve it, I
>> carried out a full rebuild from scratch across the two storage clusters. I
>> had spent some time troubleshooting the issue but felt it worthwhile to
>> ensure I had a clean platform, void of any potential issues which may be
>> there due to some of the previous work carried out. The platform was
>> rebuilt and data re-ingested.
>> It is probably worth mentioning that this
>> environment will become our new production platform, we will be migrating
>> data and services to this new platform from our existing Gluster storage
>> cluster. The date for the migration activity is getting closer so available
>> time has become an issue and will not permit another full rebuild of the
>> platform without impacting delivery date.
>>
>> After the rebuild with both storage clusters online, available and managed
>> within the Ovirt platform I conducted some basic commissioning checks and I
>> found no issues. The next step I took at this point was to setup the
>> Geo-replication. This was brought online with no issues and data was seen
>> to be synchronised without any problems. At this point the data
>> re-ingestion was started and the new data was synchronised by the
>> Geo-replication.
>>
>> The first step in bringing the snapshot schedule online was to validate that
>> snapshots could be taken outside of the scheduler. Taking a manual snapshot
>> via the OVirt portal worked without issue. Several were taken on both
>> primary and secondary clusters. At this point a schedule was created on the
>> primary site cluster via the Ovirt portal to create a snapshot of the
>> storage at hourly intervals. The schedule was created successfully however
>> no snapshots were ever created. Examining the logs did not show anything
>> which I believed was a direct result of the faulty schedule but it is quite
>> possible I missed something.
>>
>> How was the schedule created - is this using the Remote Data Sync Setup
>> under Storage domain?
>>
>> I reviewed many online articles, bug reports and application manuals in
>> relation to snapshotting. There were several loosely related support
>> articles around snapshotting but none of the recommendations seemed to work.
>> I did the same with manuals and again nothing that seemed to work.
>> What I did find were several references to running snapshots along with
>> geo-replication and that the geo-replication should be paused when creating.
>> So I removed all existing references to any snapshot schedule, paused the
>> Geo-repl and recreated the snapshot schedule. The schedule was never
>> actioned and no snapshots were created. Removed Geo-repl entirely, removed
>> all schedules and carried out a reboot of the entire platform. When the
>> system was fully back online and no pending heal operations the schedule was
>> re-added for the primary site only. No difference in the results and no
>> snapshots were created from the schedule.
>>
>> I have now reached the point where I feel I require assistance and hence
>> this email request.
>>
>> If you require any further data then please let me know and I will do my
>> best to get it for you.
>>
>> Could you please provide the engine.log from the time the schedule was setup
>> and including the time the schedule was supposed to run?
>>
>> Any help you can give would be greatly appreciated.
>>
>> Many thanks,
>>
>> Mark Betham
>>
>> _______________________________________________
>> Users mailing list
>> [email protected]
>> http://lists.ovirt.org/mailman/listinfo/users

