On Mon, May 14, 2018 at 4:07 PM, Mark Betham <mark.bet...@googlemail.com> wrote:
> Hi Sahina,
>
> Many thanks for your response and apologies for my delay in getting
> back to you.
>
> How was the schedule created - is this using the Remote Data Sync Setup
> under Storage domain?
>
> Ovirt is configured in ‘Gluster’ mode, no VM support. When snapshotting
> we are taking a snapshot of the full Gluster volume.
>
> To configure the snapshot schedule I did the following:
>
>  1. Logged in to the Ovirt WebUI.
>  2. From the left-hand menu, selected ‘Storage’ and then ‘Volumes’.
>  3. Selected the volume I wanted to snapshot by clicking on the link
>     within the ‘Name’ column.
>  4. Selected the ‘Snapshots’ tab.
>  5. From the top menu options, selected the ‘Snapshot’ drop-down.
>  6. From the drop-down options, selected ‘New’.
>  7. A new window appeared titled ‘Create/Schedule Snapshot’.
>  8. Entered a snapshot prefix and description into the available
>     fields and selected the ‘Schedule’ page.
>  9. On the schedule page, selected ‘Minute’ from the ‘Recurrence’
>     drop-down.
> 10. Set ‘Interval’ to every ‘30’ minutes.
> 11. Changed the timezone to ‘Europe/London=(GMT+00:00) London Standard
>     Time’.
> 12. Left ‘Start Schedule by’ at its default value.
> 13. Set the schedule to ‘No End Date’.
> 14. Clicked ‘OK’.
>
> Interestingly, I get the following message on the ‘Create/Schedule
> Snapshot’ page before clicking on OK:
>
>   Frequent creation of snapshots would overload the cluster
>   Gluster CLI based snapshot scheduling is enabled. It would be disabled
>   once volume snapshots scheduled from UI.
>
> What is interesting is that I have not enabled ‘Gluster CLI based
> snapshot scheduling’.
>
> After clicking OK I am returned to the Volume Snapshots tab.
>
> From this point I get no snapshots created according to the schedule set.
>
> At the time of clicking OK in the WebUI to enable the schedule I get the
> following in the engine log:
>
> 2018-05-14 09:24:11,068Z WARN [org.ovirt.engine.core.dal.job.ExecutionMessageDirector] (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] The message key 'ScheduleGlusterVolumeSnapshot' is missing from 'bundles/ExecutionMessages'
> 2018-05-14 09:24:11,090Z INFO [org.ovirt.engine.core.bll.gluster.ScheduleGlusterVolumeSnapshotCommand] (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] Before acquiring and wait lock 'EngineLock:{exclusiveLocks='[712da1df-4c11-405a-8fb6-f99aebc185c1=GLUSTER_SNAPSHOT]', sharedLocks=''}'
> 2018-05-14 09:24:11,090Z INFO [org.ovirt.engine.core.bll.gluster.ScheduleGlusterVolumeSnapshotCommand] (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] Lock-wait acquired to object 'EngineLock:{exclusiveLocks='[712da1df-4c11-405a-8fb6-f99aebc185c1=GLUSTER_SNAPSHOT]', sharedLocks=''}'
> 2018-05-14 09:24:11,111Z INFO [org.ovirt.engine.core.bll.gluster.ScheduleGlusterVolumeSnapshotCommand] (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] Running command: ScheduleGlusterVolumeSnapshotCommand internal: false. Entities affected : ID: 712da1df-4c11-405a-8fb6-f99aebc185c1 Type: GlusterVolumeAction group MANIPULATE_GLUSTER_VOLUME with role type ADMIN
> 2018-05-14 09:24:11,148Z INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] EVENT_ID: GLUSTER_VOLUME_SNAPSHOT_SCHEDULED(4,134), Snapshots scheduled on volume glustervol0 of cluster NOSS-LD5.
> 2018-05-14 09:24:11,156Z INFO [org.ovirt.engine.core.bll.gluster.ScheduleGlusterVolumeSnapshotCommand] (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] Lock freed to object 'EngineLock:{exclusiveLocks='[712da1df-4c11-405a-8fb6-f99aebc185c1=GLUSTER_SNAPSHOT]', sharedLocks=''}'
>
> Could you please provide the engine.log from the time the schedule was
> set up and including the time the schedule was supposed to run?
>
> The original log file is no longer present, so I removed the old schedule
> and created a new schedule, as per the instructions above, earlier today.
> I have therefore attached the engine log from today. The new schedule,
> which was set to run every 30 minutes, has not produced any snapshots
> after around 2 hours.
>
> Please let me know if you require any further information.

I see the following message in the logs:

2018-05-14 04:30:00,018Z ERROR [org.ovirt.engine.core.utils.timer.JobWrapper] (QuartzOvirtDBScheduler9) [d0c31a9] Failed to invoke scheduled method onTimer: null

Can you log a bug? We will dig into this further. To speed things up, if
you could enable debug logs (I think using
https://www.ovirt.org/develop/developer-guide/engine/engine-development-environment/#enable-debug-log---restart-required)
and attach the exception, that would help a lot.

> Many thanks,
>
> Mark Betham.
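
For anyone hitting the same JobWrapper error, a minimal sketch for pulling those entries (plus any continuation lines, such as a stack trace once debug logging is on) out of engine.log might look like the following. The record layout is an assumption inferred only from the log lines quoted in this thread, and `scheduler_errors` is a hypothetical helper name, not part of oVirt.

```python
import re
from typing import Iterable, Iterator, List

# Assumed record layout, inferred from the lines quoted in this thread:
# "<date> <time>Z <LEVEL> [<logger>] (<thread>) [<correlation id>] <message>"
RECORD_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}Z?\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]*)\)"
)


def scheduler_errors(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield each ERROR record logged by JobWrapper as a list of lines.

    Lines that do not start a new record (for example a stack trace) are
    treated as continuations of the record before them.
    """
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = RECORD_RE.match(line)
        if match:
            # A new record starts: emit the previous one if we were keeping it.
            if current is not None:
                yield current
            wanted = (
                match["level"] == "ERROR"
                and match["logger"].endswith("JobWrapper")
            )
            current = [line] if wanted else None
        elif current is not None:
            current.append(line)
    if current is not None:
        yield current


if __name__ == "__main__":
    # In-memory sample using the error line quoted above; to scan a real
    # file, pass an open file object to scheduler_errors() instead.
    sample = [
        "2018-05-14 04:30:00,018Z ERROR "
        "[org.ovirt.engine.core.utils.timer.JobWrapper] "
        "(QuartzOvirtDBScheduler9) [d0c31a9] "
        "Failed to invoke scheduled method onTimer: null",
    ]
    for record in scheduler_errors(sample):
        print("\n".join(record))
```

Grouping continuation lines with their record is a deliberate choice here, since the exception text is what is being asked for in the bug report.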
>
> On Thu, May 3, 2018 at 4:37 PM, Mark Betham <mark.bet...@googlemail.com>
> wrote:
>
> Hi Ovirt community,
>
> I am hoping you will be able to help with a problem I am experiencing
> when trying to schedule a snapshot of my Gluster volumes using the Ovirt
> portal.
>
> Below is an overview of the environment:
>
> I have an Ovirt instance running which is managing our Gluster storage.
> We are running Ovirt version "4.2.2.6-1.el7.centos", Gluster version
> "glusterfs-3.13.2-2.el7" on a base OS of "CentOS Linux release 7.4.1708
> (Core)", Kernel "3.10.0-693.21.1.el7.x86_64", VDSM version
> "vdsm-4.20.23-1.el7.centos". All of the versions of software are the
> latest release and have been fully patched where necessary.
>
> Ovirt has been installed and configured in "Gluster" mode only, no
> virtualisation. The Ovirt platform runs from one of the Gluster storage
> nodes.
>
> Gluster runs with 2 clusters, each located at a different physical site
> (UK and DE). Each of the storage clusters contains 3 storage nodes. Each
> storage cluster contains a single Gluster volume. The Gluster volume is
> 3 * replicated. The Gluster volume runs on top of an LVM thin vol which
> has been provisioned with an XFS filesystem. The system is running a
> Geo-rep between the 2 geo-diverse clusters.
>
> The host servers running at the primary site are of specification 1 *
> Intel(R) Xeon(R) CPU E3-1270 v5 @ 3.60GHz (8 core with HT), 64GB RAM, LSI
> MegaRAID SAS 9271 with BBU and cache, 8 * SAS 10K 2.5" 1.8TB enterprise
> drives configured in a RAID 10 array to give 6.52TB of useable space. The
> host servers running at the secondary site are of specification 1 *
> Intel(R) Xeon(R) CPU E3-1271 v3 @ 3.60GHz (8 core with HT), 32GB RAM, LSI
> MegaRAID SAS 9260 with BBU and cache, 8 * SAS 10K 2.5" 1.8TB enterprise
> drives configured in a RAID 10 array to give 6.52TB of useable space. The
> secondary site is for DR use only.
>
> When I first started experiencing the issue and was unable to resolve
> it, I carried out a full rebuild from scratch across the two storage
> clusters. I had spent some time troubleshooting the issue but felt it
> worthwhile to ensure I had a clean platform, void of any potential issues
> which may be there due to some of the previous work carried out. The
> platform was rebuilt and data re-ingested. It is probably worth
> mentioning that this environment will become our new production platform;
> we will be migrating data and services to this new platform from our
> existing Gluster storage cluster. The date for the migration activity is
> getting closer, so available time has become an issue and will not permit
> another full rebuild of the platform without impacting the delivery date.
>
> After the rebuild, with both storage clusters online, available and
> managed within the Ovirt platform, I conducted some basic commissioning
> checks and found no issues. The next step I took at this point was to set
> up the Geo-replication. This was brought online with no issues and data
> was seen to be synchronised without any problems. At this point the data
> re-ingestion was started and the new data was synchronised by the
> Geo-replication.
>
> The first step in bringing the snapshot schedule online was to validate
> that snapshots could be taken outside of the scheduler. Taking a manual
> snapshot via the Ovirt portal worked without issue. Several were taken on
> both primary and secondary clusters. At this point a schedule was created
> on the primary site cluster via the Ovirt portal to create a snapshot of
> the storage at hourly intervals. The schedule was created successfully;
> however, no snapshots were ever created. Examining the logs did not show
> anything which I believed was a direct result of the faulty schedule, but
> it is quite possible I missed something.
>
> How was the schedule created - is this using the Remote Data Sync Setup
> under Storage domain?
>
> I reviewed many online articles, bug reports and application manuals in
> relation to snapshotting. There were several loosely related support
> articles around snapshotting but none of the recommendations seemed to
> work. I did the same with manuals and again found nothing that seemed to
> work. What I did find were several references to running snapshots along
> with geo-replication, and that the geo-replication should be paused when
> creating a snapshot. So I removed all existing references to any snapshot
> schedule, paused the Geo-repl and recreated the snapshot schedule. The
> schedule was never actioned and no snapshots were created. I then removed
> Geo-repl entirely, removed all schedules and carried out a reboot of the
> entire platform. When the system was fully back online, with no pending
> heal operations, the schedule was re-added for the primary site only.
> There was no difference in the results and no snapshots were created from
> the schedule.
>
> I have now reached the point where I feel I require assistance and hence
> this email request.
>
> If you require any further data then please let me know and I will do my
> best to get it for you.
>
> Could you please provide the engine.log from the time the schedule was
> set up and including the time the schedule was supposed to run?
>
> Any help you can give would be greatly appreciated.
>
> Many thanks,
>
> Mark Betham
>
> _______________________________________________
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org