On Mon, May 14, 2018 at 4:07 PM, Mark Betham <mark.bet...@googlemail.com>
wrote:

> Hi Sahina,
>
> Many thanks for your response and apologies for my delay in getting back
> to you.
>
>
> How was the schedule created - is this using the Remote Data Sync Setup
> under Storage domain?
>
>
> Ovirt is configured in ‘Gluster’ mode, no VM support.  When snapshotting
> we are taking a snapshot of the full Gluster volume.
>
> To configure the snapshot schedule I did the following;
> Login to Ovirt WebUI
> From left hand menu select ‘Storage’ and ‘Volumes’
> I then selected the volume I wanted to snapshot by clicking on the link
> within the ‘Name’ column
> From here I selected the ‘Snapshots’ tab
> From the top menu options I selected the drop down ‘Snapshot’
> From the drop down options I selected ‘New’
> A new window appeared titled ‘Create/Schedule Snapshot’
> I entered a snapshot prefix and description into the available fields and
> selected the ‘Schedule’ page
> On the schedule page I selected ‘Minute’ from the ‘Recurrence’ drop down
> Set ‘Interval’ to every ‘30’ minutes
> Changed timezone to ‘Europe/London=(GMT+00:00) London Standard Time’
> Left value in ‘Start Schedule by’ at default value
> Set schedule to ‘No End Date’
> Click 'OK'
>
> Interestingly I get the following message on the ‘Create/Schedule
> Snapshot’ page before clicking on OK;
> *Frequent creation of snapshots would overload the cluster*
> *Gluster CLI based snapshot scheduling is enabled. It would be disabled
> once volume snapshots scheduled from UI.*
>
> What is interesting is that I have not enabled 'Gluster CLI based snapshot
> scheduling’.
>
> After clicking OK I am returned to the Volume Snapshots tab.
>
> From this point I get no snapshots created according to the schedule set.
>
> At the time of clicking OK in the WebUI to enable the schedule I get the
> following in the engine log;
> *2018-05-14 09:24:11,068Z WARN
>  [org.ovirt.engine.core.dal.job.ExecutionMessageDirector] (default
> task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] The message key
> 'ScheduleGlusterVolumeSnapshot' is missing from 'bundles/ExecutionMessages'*
> *2018-05-14 09:24:11,090Z INFO
>  [org.ovirt.engine.core.bll.gluster.ScheduleGlusterVolumeSnapshotCommand]
> (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] Before acquiring
> and wait lock
> 'EngineLock:{exclusiveLocks='[712da1df-4c11-405a-8fb6-f99aebc185c1=GLUSTER_SNAPSHOT]',
> sharedLocks=''}'*
> *2018-05-14 09:24:11,090Z INFO
>  [org.ovirt.engine.core.bll.gluster.ScheduleGlusterVolumeSnapshotCommand]
> (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] Lock-wait
> acquired to object
> 'EngineLock:{exclusiveLocks='[712da1df-4c11-405a-8fb6-f99aebc185c1=GLUSTER_SNAPSHOT]',
> sharedLocks=''}'*
> *2018-05-14 09:24:11,111Z INFO
>  [org.ovirt.engine.core.bll.gluster.ScheduleGlusterVolumeSnapshotCommand]
> (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] Running command:
> ScheduleGlusterVolumeSnapshotCommand internal: false. Entities affected :
>  ID: 712da1df-4c11-405a-8fb6-f99aebc185c1 Type: GlusterVolumeAction group
> MANIPULATE_GLUSTER_VOLUME with role type ADMIN*
> *2018-05-14 09:24:11,148Z INFO
>  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] EVENT_ID:
> GLUSTER_VOLUME_SNAPSHOT_SCHEDULED(4,134), Snapshots scheduled on volume
> glustervol0 of cluster NOSS-LD5.*
> *2018-05-14 09:24:11,156Z INFO
>  [org.ovirt.engine.core.bll.gluster.ScheduleGlusterVolumeSnapshotCommand]
> (default task-128) [85d0b16f-2c0c-464f-bbf1-682c062a4871] Lock freed to
> object
> 'EngineLock:{exclusiveLocks='[712da1df-4c11-405a-8fb6-f99aebc185c1=GLUSTER_SNAPSHOT]',
> sharedLocks=''}'*
>
> Could you please provide the engine.log from the time the schedule was
> setup and including the time the schedule was supposed to run?
>
>
> The original log file is no longer present, so I removed the old schedule
> and created a new schedule, as per the instructions above, earlier today.
> I have therefore attached the engine log from today.  The new schedule,
> which was set to run every 30 minutes, has not produced any snapshots after
> around 2 hours.
>
> Please let me know if you require any further information.
>
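As a rough sanity check on the report above (a 30-minute schedule with no snapshots after about 2 hours), the number of runs that should have happened can be worked out directly. This is only a sketch: the function name `expected_runs` is made up for illustration, and it assumes the first run fires one full interval after the schedule is created, which may not match how the engine's scheduler actually aligns its triggers.

```python
from datetime import datetime, timedelta
from typing import List


def expected_runs(start: datetime, interval_minutes: int, now: datetime) -> List[datetime]:
    """Return the times a fixed-interval schedule should have fired by `now`.

    Assumption (not taken from oVirt itself): the first run happens exactly
    one interval after `start`, and later runs follow at the same spacing.
    """
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    step = timedelta(minutes=interval_minutes)
    runs = []
    t = start + step
    while t <= now:
        runs.append(t)
        t += step
    return runs


if __name__ == "__main__":
    # Schedule creation time taken from the engine log quoted in this thread.
    created = datetime(2018, 5, 14, 9, 24, 11)
    checked = created + timedelta(hours=2)
    for run in expected_runs(created, 30, checked):
        print(run.isoformat())
```

Comparing these expected times with the entries on the volume's Snapshots tab gives a quick count of how many scheduled runs were missed.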


I see the following messages in logs:
2018-05-14 04:30:00,018Z ERROR
[org.ovirt.engine.core.utils.timer.JobWrapper] (QuartzOvirtDBScheduler9)
[d0c31a9] Failed to invoke scheduled method onTimer: null
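
Entries like this can be pulled out of a large engine.log with a small filter script. The sketch below is illustrative only: the regular expression is inferred from the log lines quoted in this thread (timestamp, level, logger, thread, correlation ID, message), not from any format specification, and the names `parse_line` and `find_timer_failures` are made up for this example. Entries followed by a multi-line stack trace would need extra handling.

```python
import re
from typing import Dict, Iterable, Iterator, Optional

# Pattern inferred from the engine.log lines quoted in this thread; this is an
# assumption about the layout, not an official format definition.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})Z\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<corr>[^\]]*)\]\s+"
    r"(?P<msg>.*)$"
)


def parse_line(line: str) -> Optional[Dict[str, str]]:
    """Split one log line into named fields, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None


def find_timer_failures(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield ERROR entries logged by the JobWrapper timer class."""
    for line in lines:
        entry = parse_line(line)
        if entry and entry["level"] == "ERROR" and entry["logger"].endswith("JobWrapper"):
            yield entry


if __name__ == "__main__":
    sample = (
        "2018-05-14 04:30:00,018Z ERROR "
        "[org.ovirt.engine.core.utils.timer.JobWrapper] (QuartzOvirtDBScheduler9) "
        "[d0c31a9] Failed to invoke scheduled method onTimer: null"
    )
    for hit in find_timer_failures([sample]):
        print(hit["ts"], hit["thread"], hit["msg"])
```

To run it against a real log, pass an open file object to `find_timer_failures` instead of the sample list.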

Can you log a bug? We will dig into this further.

To speed things up, if you could enable debug logs (I think using
https://www.ovirt.org/develop/developer-guide/engine/engine-development-environment/#enable-debug-log---restart-required)
and attach the exception, that would help a lot.


> Many thanks,
>
> Mark Betham.
>
> On Thu, May 3, 2018 at 4:37 PM, Mark Betham <mark.bet...@googlemail.com>
> wrote:
>
> Hi Ovirt community,
>
> I am hoping you will be able to help with a problem I am experiencing when
> trying to schedule a snapshot of my Gluster volumes using the Ovirt portal.
>
> Below is an overview of the environment;
>
> I have an Ovirt instance running which is managing our Gluster storage.
> We are running Ovirt version "4.2.2.6-1.el7.centos", Gluster version
> "glusterfs-3.13.2-2.el7" on a base OS of "CentOS Linux release 7.4.1708
> (Core)", Kernel "3.10.0 - 693.21.1.el7.x86_64", VDSM version
> "vdsm-4.20.23-1.el7.centos".  All of the versions of software are the
> latest release and have been fully patched where necessary.
>
> Ovirt has been installed and configured in "Gluster" mode only, no
> virtualisation.  The Ovirt platform runs from one of the Gluster storage
> nodes.
>
> Gluster runs with 2 clusters, each located at a different physical site
> (UK and DE).  Each of the storage clusters contains 3 storage nodes.  Each
> storage cluster contains a single Gluster volume.  The Gluster volume is 3
> * Replicated.  The Gluster volume runs on top of an LVM thin volume which has
> been provisioned with an XFS filesystem.  The system is running a Geo-rep
> between the 2 geo-diverse clusters.
>
> The host servers running at the primary site are of specification 1 *
> Intel(R) Xeon(R) CPU E3-1270 v5 @ 3.60GHz (8 core with HT), 64GB Ram, LSI
> MegaRAID SAS 9271 with bbu and cache, 8 * SAS 10K 2.5" 1.8TB enterprise
> drives configured in a RAID 10 array to give 6.52TB of useable space.  The
> host servers running at the secondary site are of specification 1 *
> Intel(R) Xeon(R) CPU E3-1271 v3 @ 3.60GHz (8 core with HT), 32GB Ram, LSI
> MegaRAID SAS 9260 with bbu and cache, 8 * SAS 10K 2.5" 1.8TB enterprise
> drives configured in a RAID 10 array to give 6.52TB of useable space.  The
> secondary site is for DR use only.
>
> When I first started experiencing the issue and was unable to resolve it,
> I carried out a full rebuild from scratch across the two storage clusters.
> I had spent some time troubleshooting the issue but felt it worthwhile to
> ensure I had a clean platform, void of any potential issues which may be
> there due to some of the previous work carried out.  The platform was
> rebuilt and data re-ingested.  It is probably worth mentioning that this
> environment will become our new production platform; we will be migrating
> data and services to this new platform from our existing Gluster storage
> cluster.  The date for the migration activity is getting closer so
> available time has become an issue and will not permit another full rebuild
> of the platform without impacting delivery date.
>
> After the rebuild with both storage clusters online, available and managed
> within the Ovirt platform I conducted some basic commissioning checks and I
> found no issues.  The next step I took at this point was to set up the
> Geo-replication.  This was brought online with no issues and data was seen
> to be synchronised without any problems.  At this point the data
> re-ingestion was started and the new data was synchronised by the
> Geo-replication.
>
> The first step in bringing the snapshot schedule online was to validate
> that snapshots could be taken outside of the scheduler.  Taking a manual
> snapshot via the OVirt portal worked without issue.  Several were taken on
> both primary and secondary clusters.  At this point a schedule was created
> on the primary site cluster via the Ovirt portal to create a snapshot of
> the storage at hourly intervals.  The schedule was created successfully;
> however, no snapshots were ever created.  Examining the logs did not show
> anything which I believed was a direct result of the faulty schedule but it
> is quite possible I missed something.
>
>
> How was the schedule created - is this using the Remote Data Sync Setup
> under Storage domain?
>
>
> I reviewed many online articles, bug reports and application manuals in
> relation to snapshotting.  There were several loosely related support
> articles around snapshotting but none of the recommendations seemed to
> work.  I did the same with manuals and again nothing that seemed to work.
> What I did find were several references to running snapshots along with
> geo-replication and that the geo-replication should be paused when
> creating a snapshot.  So I removed all existing references to any snapshot
> schedule, paused the Geo-repl and recreated the snapshot schedule.  The
> schedule was never actioned and no snapshots were created.  I then removed
> Geo-repl entirely, removed all schedules and rebooted the entire platform.
> When the system was fully back online, with no pending heal operations, the
> schedule was re-added for the primary site only.  No difference in the
> results and no snapshots were created from the schedule.
>
> I have now reached the point where I feel I require assistance and hence
> this email request.
>
> If you require any further data then please let me know and I will do my
> best to get it for you.
>
>
> Could you please provide the engine.log from the time the schedule was
> setup and including the time the schedule was supposed to run?
>
>
>
> Any help you can give would be greatly appreciated.
>
> Many thanks,
>
> Mark Betham
>
> _______________________________________________
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
