Ilya,

Point to be noted: my job didn't fail because of the timeout, but rather
because of some VDI issue at XenServer, with the exception below.

[SR_BACKEND_FAILURE_80, , Failed to mark VDI hidden [opterr=SR
96e879bf-93aa-47ca-e2d5-e595afbab294: error aborting existing process]]

I am still digging into this error in SMlog etc. on the XenServer. But in
reality the volume was migrated, and I think that's important.
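
Given that the volume is actually sitting on the new SR, I am planning
something like the below to fix the DB. This is a rough, untested sketch:
the path is the new vdi-uuid from my earlier mail, and 19 is just a
placeholder for primary storage B's pool id, which I still need to confirm
from the storage_pool table. I'll stop the management server and take a DB
backup before running it.

-- repoint the volume row at the SR it actually lives on now
UPDATE cloud.volumes
   SET path    = 'cc1f8e83-f224-44b7-9359-282a1c1e3db1',  -- new vdi-uuid on LUN14
       pool_id = 19   -- placeholder: id of primary storage B in cloud.storage_pool
 WHERE id = 1004;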


I did, of course, face the timeout error during initial testing, and after
some trial and error I realised that there is this "not so properly named"
parameter called *wait* (default value 1800) that needs to be modified in
the end to make the timeout error go away.

So, all in all, I modified the parameters as below:

migratewait: 36000
storage.pool.max.waitseconds: 36000
vm.op.cancel.interval: 36000
vm.op.cleanup.wait: 36000
wait: 18000
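
For reference, the same changes applied straight to the cloud DB would be
something like this sketch (I believe a management server restart is needed
either way for these to take effect):

UPDATE cloud.configuration SET value = '36000'
 WHERE name IN ('migratewait', 'storage.pool.max.waitseconds',
                'vm.op.cancel.interval', 'vm.op.cleanup.wait');
UPDATE cloud.configuration SET value = '18000' WHERE name = 'wait';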





--
Best,
Makrand


On Tue, Aug 9, 2016 at 6:07 AM, ilya <[email protected]> wrote:

> This happened to us on a non-XEN hypervisor as well.
>
> CloudStack has a timeout for long-running jobs, which I assume has been
> exceeded in your case.
>
> Changing the volumes table should be enough; reference the proper pool_id.
> Just make sure that the data size matches on both ends.
>
> Consider changing
> "copy.volume.wait" and, if that does not help, also "vm.job.timeout".
>
>
> Regards
> ilya
>
> On 8/8/16 3:54 AM, Makrand wrote:
> > Guys,
> >
> > My setup: ACS 4.4.2. Hypervisor: XenServer 6.2.
> >
> > I tried moving a volume on a running VM from primary storage A to primary
> > storage B (using the CloudStack GUI). Please note, the primary storage A
> > LUN (LUN7) is coming out of one storage box and the primary storage B LUN
> > (LUN14) is from another.
> >
> > For VM1, with a 250 GB data volume (51 GB used space), I was able to move
> > this volume without any glitch in about 26 mins.
> >
> > But for VM2, with a 250 GB data volume (182 GB used space), the migration
> > continued for about ~110 mins and then failed at the very end with the
> > following exception:
> >
> > 2016-08-06 14:30:57,481 WARN  [c.c.h.x.r.CitrixResourceBase]
> > (DirectAgent-192:ctx-5716ad6d) Task failed! Task record:
> > uuid: 308a8326-2622-e4c5-2019-3beb
> > 87b0d183
> >            nameLabel: Async.VDI.pool_migrate
> >      nameDescription:
> >    allowedOperations: []
> >    currentOperations: {}
> >              created: Sat Aug 06 12:36:27 UTC 2016
> >             finished: Sat Aug 06 14:30:32 UTC 2016
> >               status: failure
> >           residentOn: com.xensource.xenapi.Host@f242d3ca
> >             progress: 1.0
> >                 type: <none/>
> >               result:
> >            errorInfo: [SR_BACKEND_FAILURE_80, , Failed to mark VDI hidden
> > [opterr=SR 96e879bf-93aa-47ca-e2d5-e595afbab294: error aborting existing
> > process]]
> >          otherConfig: {}
> >            subtaskOf: com.xensource.xenapi.Task@aaf13f6f
> >             subtasks: []
> >
> >
> > So CloudStack just removed the job, marking it as failed, according to the
> > management server log.
> >
> > A) But when I check it at the hypervisor level, the volume is on the new
> > SR, i.e. on LUN14. Strange, huh? So now the new uuid for this volume from
> > the xe CLI looks like:
> >
> > [root@gcx-bom-compute1 ~]# xe vbd-list
> > vm-uuid=3fcb3070-e373-3cf9-d0aa-0a657142a38d
> > uuid ( RO)             : f15dc54a-3868-8de8-5427-314e341879c6
> >           vm-uuid ( RO): 3fcb3070-e373-3cf9-d0aa-0a657142a38d
> >     vm-name-label ( RO): i-22-803-VM
> >          vdi-uuid ( RO): cc1f8e83-f224-44b7-9359-282a1c1e3db1
> >             empty ( RO): false
> >            device ( RO): hdb
> >
> > B) But luckily I had captured this entry before the migration, and it
> > showed:
> >
> > uuid ( RO) : f15dc54a-3868-8de8-5427-314e341879c6
> > vm-uuid ( RO): 3fcb3070-e373-3cf9-d0aa-0a657142a38d
> > vm-name-label ( RO): i-22-803-VM
> > vdi-uuid ( RO): 7c073522-a077-41a0-b9a7-7b61847d413b
> > empty ( RO): false
> > device ( RO): hdb
> >
> > C) Since this failed at the CloudStack level, the DB is still holding the
> > old values. Here is the current volumes table entry in the DB:
> >
>>                         id: 1004
> >>                 account_id: 22
> >>                  domain_id: 15
> >>                    pool_id: 18
> >>               last_pool_id: NULL
> >>                instance_id: 803
> >>                  device_id: 1
> >>                       name:
> >> cloudx_globalcloudxchange_com_W2797T2808S3112_V1462960751
> >>                       uuid: a8f01042-d0de-4496-98fa-a0b13648bef7
> >>                       size: 268435456000
> >>                     folder: NULL
> >>                       path: 7c073522-a077-41a0-b9a7-7b61847d413b
> >>                     pod_id: NULL
> >>             data_center_id: 2
> >>                 iscsi_name: NULL
> >>                    host_ip: NULL
> >>                volume_type: DATADISK
> >>                  pool_type: NULL
> >>           disk_offering_id: 6
> >>                template_id: NULL
> >> first_snapshot_backup_uuid: NULL
> >>                recreatable: 0
> >>                    created: 2016-05-11 09:59:12
> >>                   attached: 2016-05-11 09:59:21
> >>                    updated: 2016-08-06 14:30:57
> >>                    removed: NULL
> >>                      state: Ready
> >>                 chain_info: NULL
> >>               update_count: 42
> >>                  disk_type: NULL
> >>     vm_snapshot_chain_size: NULL
> >>                     iso_id: NULL
> >>             display_volume: 1
> >>                     format: VHD
> >>                   min_iops: NULL
> >>                   max_iops: NULL
> >>              hv_ss_reserve: 0
> >> 1 row in set (0.00 sec)
> >>
> >
> >
> > So the path column shows the value 7c073522-a077-41a0-b9a7-7b61847d413b
> > and the pool_id as 18.
> >
> > The VM is running as of now, but I am sure the moment I reboot, this
> > volume will be gone or, worse, the VM won't boot. This is a production VM, BTW.
> >
> > D) So I think I need to edit the volumes table's path and pool_id columns,
> > put the new values in place, and then reboot the VM. Do I need to make any
> > more changes in the DB, in some other tables, for the same? Any comment/help
> > is much appreciated.
> >
> >
> >
> >
> > --
> > Best,
> > Makrand
> >
>
