[ovirt-users] Re: Weird problem starting VMs in oVirt-4.4

2020-06-17 Thread Krutika Dhananjay
Yes, the bug has been fixed upstream, and the backports to release-7 and
release-8 of gluster are pending merge. The fix should be available in the
next .x releases of gluster-7 and gluster-8. Until then, as Nir suggested,
please turn off performance.stat-prefetch on your volumes.
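For anyone applying that workaround, here is a minimal dry-run sketch. The volume names below are placeholders for your own volumes; on a node in the trusted pool, pipe the printed lines to `sh` to actually apply them.

```shell
# Dry run: print the workaround command for each affected volume.
# "engine" and "vmstore" are example names -- substitute your own
# (see `gluster volume list`). Pipe the output to `sh` to apply.
for vol in engine vmstore; do
    echo gluster volume set "$vol" performance.stat-prefetch off
done
```

Re-enable the option the same way (`... performance.stat-prefetch on`) once you are running a release that carries the fix.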

-Krutika

On Wed, Jun 17, 2020 at 5:59 AM Nir Soffer  wrote:

> On Mon, Jun 8, 2020 at 3:10 PM Joop  wrote:
> >
> > On 3-6-2020 14:58, Joop wrote:
> > > Hi All,
> > >
> > > Just had a rather new experience: starting a VM worked, but the
> > > kernel entered the grub2 rescue console because something was
> > > wrong with its virtio-scsi disk.
> > > The message is Booting from Hard Disk 
> > > error: ../../grub-core/kern/dl.c:266:invalid arch-independent ELF
> magic.
> > > entering rescue mode...
> > >
> > > Doing a Ctrl-Alt-Del through the SPICE console lets the VM boot
> > > correctly. Shutting it down and repeating the procedure, I get a disk
> > > problem every time. The weird thing is that if I activate the BootMenu
> > > and then straight away start the VM, all is OK.
> > > I don't see any ERROR messages in either vdsm.log or engine.log.
> > >
> > > If I had to guess, it looks like the disk image isn't connected
> > > yet when the VM boots, but that's weird, isn't it?
> > >
> > >
> > As an update to this:
> > Just had the same problem with a Windows VM but more importantly also
> > with HostedEngine itself.
> > On the host did:
> > hosted-engine --set-maintenance --mode=global
> > hosted-engine --vm-shutdown
> >
> > Stopped all oVirt related services, cleared all oVirt related logs from
> > /var/log/..., restarted the host, ran hosted-engine --set-maintenance
> > --mode=none
> > Watched /var/spool/mail/root to see the engine coming up. It went to
> > starting but never reached the Up status.
> > Set a password and used vncviewer to see the console, see attached
> > screenshot.
>
> The screenshot "engine.png" shows the gluster bug we discovered a few weeks ago:
> https://bugzilla.redhat.com/1823423
>
> Until you get a fixed version, this may work around the issue:
>
> # gluster volume set engine performance.stat-prefetch off
>
> See https://bugzilla.redhat.com/show_bug.cgi?id=1823423#c55.
>
> Krutika, can this bug affect upstream gluster?
>
> Joop, please share the gluster version in your setup.
>
>
>
>
> > hosted-engine --vm-poweroff, and tried again, same result
> > hosted-engine --vm-start, works
> > Let it start up and then shut it down after enabling maintenance mode.
> > Copied, hopefully, all relevant logs and attached them.
> >
> > A sosreport is also available, size 12 MB. I can provide a download link
> > if needed.
> >
> > Hopefully someone is able to spot what is going wrong.
> >
> > Regards,
> >
> > Joop
> >
> >
> > ___
> > Users mailing list -- users@ovirt.org
> > To unsubscribe send an email to users-le...@ovirt.org
> > Privacy Statement: https://www.ovirt.org/privacy-policy.html
> > oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> > List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/VJ7ZOXHCOKBNNUV4KF5OS7TKU2TXNN3I/
>
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/E3PUZLLKFDKKX7H5X7KTQMANW76ETO4I/


[ovirt-users] Re: Sometimes paused due to unknown storage error on gluster

2020-04-08 Thread Krutika Dhananjay
On Tue, Apr 7, 2020 at 7:36 PM Gianluca Cecchi 
wrote:

>
> OK. So I set log at least at INFO level on all subsystems and tried a
> redeploy of OpenShift with 3 master nodes and 7 worker nodes.
> One worker got the error and its VM went into paused mode:
>
> Apr 7, 2020, 3:27:28 PM VM worker-6 has been paused due to unknown storage
> error.
>
> The VM has only one 100 GB virtual disk, on the gluster volume named vmstore.
>
>
> Below are all the logs around that time, at the different layers.
> Let me know if you need another log file not yet considered.
>
> From what I see, the matching error is found in
>
> - rhev-data-center-mnt-glusterSD-ovirtst.mydomain.storage:_vmstore.log
>
> [2020-04-07 13:27:28.721262] E [MSGID: 133010]
> [shard.c:2327:shard_common_lookup_shards_cbk] 0-vmstore-shard: Lookup on
> shard 523 failed. Base file gfid = d22530cf-2e50-4059-8924-0aafe38497b1 [No
> such file or directory]
> [2020-04-07 13:27:28.721432] W [fuse-bridge.c:2918:fuse_writev_cbk]
> 0-glusterfs-fuse: 4435189: WRITE => -1
> gfid=d22530cf-2e50-4059-8924-0aafe38497b1 fd=0x7f3c4c07ab38 (No such file
> or directory)
>
>
This ^^, right here, is the reason the VM paused. Are you using a plain
distribute volume here?
Can you share some of the log messages that occur right above these errors?
Also, can you check if the file
$VMSTORE_BRICKPATH/.glusterfs/d2/25/d22530cf-2e50-4059-8924-0aafe38497b1
exists on the brick?
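The brick-side path being asked about follows gluster's standard backend layout: every file gets a hard link under `.glusterfs/`, with the first two and the next two hex characters of its gfid forming two directory levels. A small sketch of that mapping (the brick path here is just the one from the logs above, used as an example):

```python
import os

def glusterfs_backend_path(brick_path: str, gfid: str) -> str:
    """Return the .glusterfs hard-link path for a gfid on a brick.

    Gluster keeps a hard link to every file at
    <brick>/.glusterfs/<first 2 hex chars>/<next 2 hex chars>/<gfid>.
    """
    return os.path.join(brick_path, ".glusterfs", gfid[0:2], gfid[2:4], gfid)

# The gfid from the failed-lookup error, on the vmstore brick path:
path = glusterfs_backend_path(
    "/gluster_bricks/vmstore/vmstore",
    "d22530cf-2e50-4059-8924-0aafe38497b1",
)
print(path)
# -> /gluster_bricks/vmstore/vmstore/.glusterfs/d2/25/d22530cf-2e50-4059-8924-0aafe38497b1
```

Checking whether that path exists (and whether its link count is > 1) on each brick is a quick way to answer the question above.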

-Krutika

and
>
> - gluster_bricks-vmstore-vmstore.log
>
> [2020-04-07 13:27:28.719391] W [MSGID: 113020]
> [posix-helpers.c:1051:posix_gfid_set] 0-vmstore-posix: setting GFID on
> /gluster_bricks/vmstore
> /vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed  [File
> exists]
> [2020-04-07 13:27:28.719978] E [MSGID: 113020]
> [posix-entry-ops.c:517:posix_mknod] 0-vmstore-posix: setting gfid on
> /gluster_bricks/vmstore/v
> mstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists]
>
>
> Below are all the files I checked.
> Any hint?
>
> Gianluca
>
> - qemu.log of the vm
>
> 2020-04-07T12:30:29.954084Z qemu-kvm: -drive
> file=/rhev/data-center/mnt/glusterSD/ovirtst.mydomain.storage:_vmstore/81b97244-4b69-4d49-84c4-c822387adc6a/images/04ec5a8d-4ee6-4661-a832-0c094968aa5b/716fdc98-2d0e-44c3-b5fe-c0187cdad751,format=raw,if=none,id=drive-ua-04ec5a8d-4ee6-4661-a832-0c094968aa5b,serial=04ec5a8d-4ee6-4661-a832-0c094968aa5b,werror=stop,rerror=stop,cache=none,aio=threads:
> 'serial' is deprecated, please use the corresponding option of '-device'
> instead
> Spice-Message: 14:30:29.963: setting TLS option 'CipherString' to
> 'kECDHE+FIPS:kDHE+FIPS:kRSA+FIPS:!eNULL:!aNULL' from /etc/pki/tls/spice.cnf
> configuration file
> 2020-04-07T12:30:29.976216Z qemu-kvm: warning: CPU(s) not present in any
> NUMA nodes: CPU 8 [socket-id: 2, core-id: 0, thread-id: 0], CPU 9
> [socket-id: 2, core-id: 1, thread-id: 0], CPU 10 [socket-id: 2, core-id: 2,
> thread-id: 0], ... (identical [socket-id, core-id, thread-id] entries for
> CPUs 11 through 52 elided)

[ovirt-users] Re: HCI cluster single node error making template

2020-03-30 Thread Krutika Dhananjay
Agreed. Please share the bug report when you're done filing it.  In
addition to the logs Nir requested, include gluster version and the
`gluster volume info` output in your report.
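A quick way to gather everything requested above into one attachment; this is only a sketch, using the default client-log location (`/var/log/glusterfs`) — adjust the paths if your installation differs:

```shell
# Collect gluster version, volume info, and client logs for a bug report.
# $1 = gluster client log directory, $2 = output tarball.
collect_for_bug_report() {
    logdir=${1:-/var/log/glusterfs}
    out=${2:-gluster-report.tar.gz}
    # Redirects create the files even if the gluster CLI is unavailable,
    # so the tarball is still produced on a partially broken node.
    gluster --version   > version.txt     2>/dev/null || true
    gluster volume info > volume-info.txt 2>/dev/null || true
    tar czf "$out" version.txt volume-info.txt \
        -C "$(dirname "$logdir")" "$(basename "$logdir")"
}
```

Run `collect_for_bug_report` on each gluster node and attach the resulting tarballs to the bugzilla report.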

We'll take the discussion forward on the bz.

-Krutika

On Wed, Mar 25, 2020 at 11:39 PM Nir Soffer  wrote:

> On Wed, Mar 25, 2020 at 2:06 PM Gianluca Cecchi
>  wrote:
> >
> > Hello,
> > I'm on 4.3.9
> > I have created a VM with 4 vCPUs, 16 GB of memory, a NIC, and a thin-
> provisioned disk of 120 GB.
> > Installed nothing on it, only defined.
> > Now I'm trying to make template from it but I get error.
> > I left the prefilled value of raw for the format.
> > The target storage domain has 900 GB free and is almost empty.
> >
> > In events pane:
> >
> > Creation of Template ocp_node from VM ocp_node_template was initiated by
> admin@internal-authz.
> > 3/25/20 12:42:35 PM
> >
> > Then the error I get in the events pane is:
> >
> > VDSM ovirt.example.local command HSMGetAllTasksStatusesVDS failed: low
> level Image copy failed: ("Command ['/usr/bin/qemu-img', 'convert', '-p',
> '-t', 'none', '-T', 'none', '-f', 'raw',
> u'/rhev/data-center/mnt/glusterSD/ovirtst.example.storage:_vmstore/81b97244-4b69-4d49-84c4-c822387adc6a/images/61689cb2-fdce-41a5-a6d9-7d06aefeb636/30009efb-83ed-4b0d-b243-3160195ae46e',
> '-O', 'qcow2', '-o', 'compat=1.1',
> u'/rhev/data-center/mnt/glusterSD/ovirtst.example.storage:_vmstore/81b97244-4b69-4d49-84c4-c822387adc6a/images/5642b52f-d7e8-48a8-adf9-f79022ce4594/982dd5cc-5f8f-41cb-b2e7-3cbdf2a656cf']
> failed with rc=1 out='' err=bytearray(b'qemu-img: error while reading
> sector 18620412: No such file or directory\\n')",)
>
> This is the second time I have seen this error - it should be impossible;
> reading should never
> fail with ENOENT.
>
> > Error: Command ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none',
> '-T', 'none', '-f', 'raw',
> u'/rhev/data-center/mnt/glusterSD/ovirtst.example.storage:_vmstore/81b97244-4b69-4d49-84c4-c822387adc6a/images/61689cb2-fdce-41a5-a6d9-7d06aefeb636/30009efb-83ed-4b0d-b243-3160195ae46e',
> '-O', 'raw',
> u'/rhev/data-center/mnt/glusterSD/ovirtst.example.storage:_vmstore/81b97244-4b69-4d49-84c4-c822387adc6a/images/bb167b28-94fa-434c-8fb6-c4bedfc06c62/53d3ab96-e5d1-453a-9989-2f858e6a9e0a']
> failed with rc=1 out='' err=bytearray(b'qemu-img: error while reading
> sector 22028283: No data available\n')
>
> This ENODATA error is also strange; preadv() is not documented to
> return this error.
>
> This is the second report here about impossible errors with gluster
> storage. First report was here:
>
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/KBNSTWBZFN7PGWW74AGAGQVPNJ2DIZ6S/
>
> Please file gluster bug and attach gluster client logs from
> /var/log/glusterfs/.
>
>
> Nir
>
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/GHQ2NJMRMKDU374TWEAKNDRYDRM7K3VI/


[ovirt-users] Re: [ANN] oVirt 4.3.7 Third Release Candidate is now available for testing

2019-12-01 Thread Krutika Dhananjay
Sorry about the late response.

I looked at the logs. These errors originate from the posix-acl
translator:



[2019-11-17 07:55:47.090065] E [MSGID: 115050]
[server-rpc-fops_v2.c:158:server4_lookup_cbk] 0-data_fast-server: 162496:
LOOKUP /.shard/5985adcb-0f4d-4317-8a26-1652973a2350.6
(be318638-e8a0-4c6d-977d-7a937aa84806/5985adcb-0f4d-4317-8a26-1652973a2350.6),
client:
CTX_ID:8bff2d95-4629-45cb-a7bf-2412e48896bc-GRAPH_ID:0-PID:13394-HOST:ovirt1.localdomain-PC_NAME:data_fast-client-0-RECON_NO:-0,
error-xlator: data_fast-access-control [Permission denied]

[2019-11-17 07:55:47.090174] I [MSGID: 139001]
[posix-acl.c:263:posix_acl_log_permit_denied] 0-data_fast-access-control:
client:
CTX_ID:8bff2d95-4629-45cb-a7bf-2412e48896bc-GRAPH_ID:0-PID:13394-HOST:ovirt1.localdomain-PC_NAME:data_fast-client-0-RECON_NO:-0,
gfid: be318638-e8a0-4c6d-977d-7a937aa84806,
req(uid:36,gid:36,perm:1,ngrps:3),
ctx(uid:0,gid:0,in-groups:0,perm:000,updated-fop:INVALID, acl:-)
[Permission denied]

[2019-11-17 07:55:47.090209] E [MSGID: 115050]
[server-rpc-fops_v2.c:158:server4_lookup_cbk] 0-data_fast-server: 162497:
LOOKUP /.shard/5985adcb-0f4d-4317-8a26-1652973a2350.7
(be318638-e8a0-4c6d-977d-7a937aa84806/5985adcb-0f4d-4317-8a26-1652973a2350.7),
client:
CTX_ID:8bff2d95-4629-45cb-a7bf-2412e48896bc-GRAPH_ID:0-PID:13394-HOST:ovirt1.localdomain-PC_NAME:data_fast-client-0-RECON_NO:-0,
error-xlator: data_fast-access-control [Permission denied]

[2019-11-17 07:55:47.090299] I [MSGID: 139001]
[posix-acl.c:263:posix_acl_log_permit_denied] 0-data_fast-access-control:
client:
CTX_ID:8bff2d95-4629-45cb-a7bf-2412e48896bc-GRAPH_ID:0-PID:13394-HOST:ovirt1.localdomain-PC_NAME:data_fast-client-0-RECON_NO:-0,
gfid: be318638-e8a0-4c6d-977d-7a937aa84806,
req(uid:36,gid:36,perm:1,ngrps:3),
ctx(uid:0,gid:0,in-groups:0,perm:000,updated-fop:INVALID, acl:-)
[Permission denied]
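The denial can be read straight off the log fields: the request arrives as uid 36/gid 36 (vdsm:kvm) asking for permission bit 1 (search/execute), while the cached inode context shows uid 0/gid 0 with mode 000, so posix-acl refuses. A simplified sketch of that mode-bit cascade — not posix-acl's actual code, which also consults ACL entries and the full group list:

```python
def acl_permits(req_uid, req_gid, want, ctx_uid, ctx_gid, ctx_mode):
    """Simplified POSIX owner/group/other mode-bit check.

    `want` is the requested permission bits (4 = read, 2 = write,
    1 = execute/search), matching the `perm:` field in the log.
    """
    if req_uid == 0:                 # root bypasses mode bits
        return True
    if req_uid == ctx_uid:           # owner class
        granted = (ctx_mode >> 6) & 0o7
    elif req_gid == ctx_gid:         # group class
        granted = (ctx_mode >> 3) & 0o7
    else:                            # other class
        granted = ctx_mode & 0o7
    return (granted & want) == want

# Values from the log: req(uid:36,gid:36,perm:1) vs ctx(uid:0,gid:0,perm:000)
print(acl_permits(36, 36, 1, 0, 0, 0o000))   # -> False: EACCES, as logged
```

The interesting part is the ctx side: uid 0/gid 0 with perm 000 and `updated-fop:INVALID` suggests the server is judging against a stale or never-populated inode context rather than the real on-disk ownership (vdsm:kvm, 36:36).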

Jiffin/Raghavendra Talur,
Can you help?

-Krutika

On Wed, Nov 27, 2019 at 2:11 PM Strahil Nikolov 
wrote:

> Hi Nir,All,
>
> it seems that 4.3.7 RC3 (and even RC4) is not the problem here (attached
> is a screenshot of oVirt running on gluster v7).
> It seems strange that both of my serious issues with oVirt are related to
> gluster (first the gluster v3-to-v5 migration, and now this one).
>
> I have just updated to gluster v7.0 (CentOS 7 repos), and rebooted all
> nodes.
> Now both the Engine and all my VMs are back online - so if you hit issues
> with 6.6, you should give 7.0 a try (and 7.1 is coming soon, too) before
> deciding to wipe everything.
>
> @Krutika,
>
> I guess you will ask for the logs, so let's switch to gluster-users about
> this one?
>
> Best Regards,
> Strahil Nikolov
>
> On Monday, November 25, 2019, 4:45:48 PM GMT-5, Strahil Nikolov <
> hunter86...@yahoo.com> wrote:
>
>
> Hi Krutika,
>
> I have enabled TRACE log level for the volume data_fast,
>
> but the issue is still not clear:
> FUSE reports:
>
> [2019-11-25 21:31:53.478130] I [MSGID: 133022]
> [shard.c:3674:shard_delete_shards] 0-data_fast-shard: Deleted shards of
> gfid=6d9ed2e5-d4f2-4749-839b-2f1
> 3a68ed472 from backend
> [2019-11-25 21:32:43.564694] W [MSGID: 114031]
> [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-0:
> remote operation failed. Path:
> /.shard/b0af2b81-22cf-482e-9b2f-c431b6449dae.79
> (----) [Permission denied]
> [2019-11-25 21:32:43.565653] W [MSGID: 114031]
> [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-1:
> remote operation failed. Path:
> /.shard/b0af2b81-22cf-482e-9b2f-c431b6449dae.79
> (----) [Permission denied]
> [2019-11-25 21:32:43.565689] W [MSGID: 114031]
> [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-2:
> remote operation failed. Path:
> /.shard/b0af2b81-22cf-482e-9b2f-c431b6449dae.79
> (----) [Permission denied]
> [2019-11-25 21:32:43.565770] E [MSGID: 133010]
> [shard.c:2327:shard_common_lookup_shards_cbk] 0-data_fast-shard: Lookup on
> shard 79 failed. Base file gfid = b0af2b81-22cf-482e-9b2f-c431b6449dae
> [Permission denied]
> [2019-11-25 21:32:43.565858] W [fuse-bridge.c:2830:fuse_readv_cbk]
> 0-glusterfs-fuse: 279: READ => -1 gfid=b0af2b81-22cf-482e-9b2f-c431b6449dae
> fd=0x7fbf40005ea8 (Permission denied)
>
>
> While the BRICK logs on ovirt1/gluster1 report:
> 2019-11-25 21:32:43.564177] D [MSGID: 0] [io-threads.c:376:iot_schedule]
> 0-data_fast-io-threads: LOOKUP scheduled as fast priority fop
> [2019-11-25 21:32:43.564194] T [MSGID: 0]
> [defaults.c:2008:default_lookup_resume] 0-stack-trace: stack-address:
> 0x7fc02c00bbf8, winding from data_fast-io-threads to data_fast-upcall
> [2019-11-25 21:32:43.564206] T [MSGID: 0] [upcall.c:790:up_lookup]
> 0-stack-trace: stack-address: 0x7fc02c00bbf8, winding from data_fast-upcall
> to data_fast-leases
> [2019-11-25 21:32:43.564215] T [MSGID: 0] [defaults.c:2766:default_lookup]
> 0-stack-trace: stack-address: 0x7fc02c00bbf8, winding from data_fast-leases
> to 

[ovirt-users] Re: [ANN] oVirt 4.3.7 Third Release Candidate is now available for testing

2019-11-25 Thread Krutika Dhananjay
On Sat, Nov 23, 2019 at 3:14 AM Nir Soffer  wrote:

> On Fri, Nov 22, 2019 at 10:41 PM Strahil Nikolov 
> wrote:
>
>> On Thu, Nov 21, 2019 at 8:20 AM Sahina Bose  wrote:
>>
>>
>>
>> On Thu, Nov 21, 2019 at 6:03 AM Strahil Nikolov 
>> wrote:
>>
>> Hi All,
>>
>> another clue in the logs :
>> [2019-11-21 00:29:50.536631] W [MSGID: 114031]
>> [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-1:
>> remote operation failed. Path:
>> /.shard/b0af2b81-22cf-482e-9b2f-c431b6449dae.79
>> (----) [Permission denied]
>> [2019-11-21 00:29:50.536798] W [MSGID: 114031]
>> [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-0:
>> remote operation failed. Path:
>> /.shard/b0af2b81-22cf-482e-9b2f-c431b6449dae.79
>> (----) [Permission denied]
>> [2019-11-21 00:29:50.536959] W [MSGID: 114031]
>> [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-2:
>> remote operation failed. Path:
>> /.shard/b0af2b81-22cf-482e-9b2f-c431b6449dae.79
>> (----) [Permission denied]
>> [2019-11-21 00:29:50.537007] E [MSGID: 133010]
>> [shard.c:2327:shard_common_lookup_shards_cbk] 0-data_fast-shard: Lookup on
>> shard 79 failed. Base file gfid = b0af2b81-22cf-482e-9b2f-c431b6449dae
>> [Permission denied]
>> [2019-11-21 00:29:50.537066] W [fuse-bridge.c:2830:fuse_readv_cbk]
>> 0-glusterfs-fuse: 12458: READ => -1
>> gfid=b0af2b81-22cf-482e-9b2f-c431b6449dae fd=0x7fc63c00fe18 (Permission
>> denied)
>> [2019-11-21 00:30:01.177665] I [MSGID: 133022]
>> [shard.c:3674:shard_delete_shards] 0-data_fast-shard: Deleted shards of
>> gfid=eb103fbf-80dc-425d-882f-1e4efe510db5 from backend
>> [2019-11-21 00:30:13.132756] W [MSGID: 114031]
>> [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-0:
>> remote operation failed. Path:
>> /.shard/17c663c2-f582-455b-b806-3b9d01fb2c6c.79
>> (----) [Permission denied]
>> [2019-11-21 00:30:13.132824] W [MSGID: 114031]
>> [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-1:
>> remote operation failed. Path:
>> /.shard/17c663c2-f582-455b-b806-3b9d01fb2c6c.79
>> (----) [Permission denied]
>> [2019-11-21 00:30:13.133217] W [MSGID: 114031]
>> [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-2:
>> remote operation failed. Path:
>> /.shard/17c663c2-f582-455b-b806-3b9d01fb2c6c.79
>> (----) [Permission denied]
>> [2019-11-21 00:30:13.133238] E [MSGID: 133010]
>> [shard.c:2327:shard_common_lookup_shards_cbk] 0-data_fast-shard: Lookup on
>> shard 79 failed. Base file gfid = 17c663c2-f582-455b-b806-3b9d01fb2c6c
>> [Permission denied]
>> [2019-11-21 00:30:13.133264] W [fuse-bridge.c:2830:fuse_readv_cbk]
>> 0-glusterfs-fuse: 12660: READ => -1
>> gfid=17c663c2-f582-455b-b806-3b9d01fb2c6c fd=0x7fc63c007038 (Permission
>> denied)
>> [2019-11-21 00:30:38.489449] W [MSGID: 114031]
>> [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-0:
>> remote operation failed. Path:
>> /.shard/a10a5ae8-108b-4d78-9e65-cca188c27fc4.6
>> (----) [Permission denied]
>> [2019-11-21 00:30:38.489520] W [MSGID: 114031]
>> [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-1:
>> remote operation failed. Path:
>> /.shard/a10a5ae8-108b-4d78-9e65-cca188c27fc4.6
>> (----) [Permission denied]
>> [2019-11-21 00:30:38.489669] W [MSGID: 114031]
>> [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-2:
>> remote operation failed. Path:
>> /.shard/a10a5ae8-108b-4d78-9e65-cca188c27fc4.6
>> (----) [Permission denied]
>> [2019-11-21 00:30:38.489717] E [MSGID: 133010]
>> [shard.c:2327:shard_common_lookup_shards_cbk] 0-data_fast-shard: Lookup on
>> shard 6 failed. Base file gfid = a10a5ae8-108b-4d78-9e65-cca188c27fc4
>> [Permission denied]
>> [2019-11-21 00:30:38.489777] W [fuse-bridge.c:2830:fuse_readv_cbk]
>> 0-glusterfs-fuse: 12928: READ => -1
>> gfid=a10a5ae8-108b-4d78-9e65-cca188c27fc4 fd=0x7fc63c01a058 (Permission
>> denied)
>>
>>
>> Anyone got an idea why this is happening?
>> I checked user/group and SELinux permissions - all OK
>>
>>
>> >Can you share the commands (and output) used to check this?
>> I first thought that the file was cached in memory and that's why the vdsm
>> user could read the file, but the following shows the opposite:
>>
>> [root@ovirt1 94f763e9-fd96-4bee-a6b2-31af841a918b]# ll
>> total 562145
>> -rw-rw. 1 vdsm kvm 5368709120 Nov 12 23:29
>> 5b1d3113-5cca-4582-9029-634b16338a2f
>> -rw-rw. 1 vdsm kvm1048576 Nov 11 14:11
>> 5b1d3113-5cca-4582-9029-634b16338a2f.lease
>> -rw-r--r--. 1 vdsm kvm313 Nov 11 14:11
>> 5b1d3113-5cca-4582-9029-634b16338a2f.meta
>> [root@ovirt1 94f763e9-fd96-4bee-a6b2-31af841a918b]# pwd
>>
>> 

[ovirt-users] Re: [ovirt-announce] Re: [ANN] oVirt 4.3.4 First Release Candidate is now available

2019-05-21 Thread Krutika Dhananjay
On Tue, May 21, 2019 at 8:13 PM Strahil  wrote:

> Dear Krutika,
>
> Yes, I did, but I use 6 ports (1 Gbit/s each), and this is the reason that
> reads get slower.
> Do you know a way to force gluster to open more connections (client to
> server & server to server)?
>

The idea was explored sometime back here -
https://review.gluster.org/c/glusterfs/+/19133
But there were some issues that were identified with the approach, so it
had to be dropped.

-Krutika

Thanks for the detailed explanation.
>
> Best Regards,
> Strahil Nikolov
> On May 21, 2019 08:36, Krutika Dhananjay  wrote:
>
> So in our internal tests (with nvme ssd drives, 10g n/w), we found read
> performance to be better with choose-local
> disabled in hyperconverged setup.  See
> https://bugzilla.redhat.com/show_bug.cgi?id=1566386 for more information.
>
> With choose-local off, the read replica is chosen randomly (based on hash
> value of the gfid of that shard).
> And when it is enabled, the reads always go to the local replica.
> We attributed better performance with the option disabled to bottlenecks
> in gluster's rpc/socket layer. Imagine all read
> requests lined up to be sent over the same mount-to-brick connection as
> opposed to (nearly) randomly getting distributed
> over three (because replica count = 3) such connections.
>
> Did you run any tests that indicate "choose-local=on" is giving better
> read perf as opposed to when it's disabled?
>
> -Krutika
>
> On Sun, May 19, 2019 at 5:11 PM Strahil Nikolov 
> wrote:
>
> Ok,
>
> so it seems that Darell's case and mine are different as I use vdo.
>
> Now I have destroyed Storage Domains, gluster volumes and vdo and
> recreated again (4 gluster volumes on a single vdo).
> This time vdo has '--emulate512=true' and no issues have been observed.
>
> Gluster volume options before 'Optimize for virt':
>
> Volume Name: data_fast
> Type: Replicate
> Volume ID: 378804bf-2975-44d8-84c2-b541aa87f9ef
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: gluster1:/gluster_bricks/data_fast/data_fast
> Brick2: gluster2:/gluster_bricks/data_fast/data_fast
> Brick3: ovirt3:/gluster_bricks/data_fast/data_fast (arbiter)
> Options Reconfigured:
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
> cluster.enable-shared-storage: enable
>
> Gluster volume after 'Optimize for virt':
>
> Volume Name: data_fast
> Type: Replicate
> Volume ID: 378804bf-2975-44d8-84c2-b541aa87f9ef
> Status: Stopped
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: gluster1:/gluster_bricks/data_fast/data_fast
> Brick2: gluster2:/gluster_bricks/data_fast/data_fast
> Brick3: ovirt3:/gluster_bricks/data_fast/data_fast (arbiter)
> Options Reconfigured:
> network.ping-timeout: 30
> performance.strict-o-direct: on
> storage.owner-gid: 36
> storage.owner-uid: 36
> server.event-threads: 4
> client.event-threads: 4
> cluster.choose-local: off
> user.cifs: off
> features.shard: on
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 8
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> cluster.eager-lock: enable
> network.remote-dio: off
> performance.low-prio-threads: 32
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: on
> cluster.enable-shared-storage: enable
>
> After that adding the volumes as storage domains (via UI) worked without
> any issues.
>
> Can someone clarify why we now have 'cluster.choose-local: off', when in
> oVirt 4.2.7 (gluster v3.12.15) we didn't have that?
> I'm using storage that is faster than the network, and reading from the
> local brick gives very high read speed.
>
> Best Regards,
> Strahil Nikolov
>
>
>
> On Sunday, May 19, 2019, 9:47:27 AM GMT+3, Strahil <hunter86...@yahoo.com> wrote:
>
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OHWZ7Y3T7QKP6CVCC34KDOFSXVILJ332/


[ovirt-users] Re: [ovirt-announce] Re: [ANN] oVirt 4.3.4 First Release Candidate is now available

2019-05-20 Thread Krutika Dhananjay
So in our internal tests (with nvme ssd drives, 10g n/w), we found read
performance to be better with choose-local
disabled in hyperconverged setup.  See
https://bugzilla.redhat.com/show_bug.cgi?id=1566386 for more information.

With choose-local off, the read replica is chosen randomly (based on hash
value of the gfid of that shard).
And when it is enabled, the reads always go to the local replica.
We attributed better performance with the option disabled to bottlenecks in
gluster's rpc/socket layer. Imagine all read
requests lined up to be sent over the same mount-to-brick connection as
opposed to (nearly) randomly getting distributed
over three (because replica count = 3) such connections.
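The fan-out described above can be pictured as simple hash placement. This is only a toy illustration — it is not gluster's actual hash (AFR's read policy is configurable, e.g. via cluster.read-hash-mode) — but it shows how per-gfid hashing spreads shard reads across all replica connections instead of funneling them through one:

```python
import hashlib

def pick_read_replica(gfid: str, replica_count: int = 3) -> int:
    """Toy model of hash-based read-replica selection: each shard's reads
    consistently land on one replica, so load spreads across all
    mount-to-brick connections (vs. choose-local pinning every read to
    the local brick)."""
    h = int(hashlib.md5(gfid.encode()).hexdigest(), 16)
    return h % replica_count

# Shard gfids are distinct, so different shards hash to different replicas.
shards = [f"b0af2b81-22cf-482e-9b2f-c431b6449dae.{i}" for i in range(12)]
print([pick_read_replica(g) for g in shards])  # mix of 0, 1, and 2
```

The selection is deterministic per gfid, so repeated reads of the same shard keep hitting the same replica and stay cache-friendly, while the workload as a whole uses all three connections.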

Did you run any tests that indicate "choose-local=on" is giving better read
perf as opposed to when it's disabled?

-Krutika

On Sun, May 19, 2019 at 5:11 PM Strahil Nikolov 
wrote:

> Ok,
>
> so it seems that Darell's case and mine are different as I use vdo.
>
> Now I have destroyed Storage Domains, gluster volumes and vdo and
> recreated again (4 gluster volumes on a single vdo).
> This time vdo has '--emulate512=true' and no issues have been observed.
>
> Gluster volume options before 'Optimize for virt':
>
> Volume Name: data_fast
> Type: Replicate
> Volume ID: 378804bf-2975-44d8-84c2-b541aa87f9ef
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: gluster1:/gluster_bricks/data_fast/data_fast
> Brick2: gluster2:/gluster_bricks/data_fast/data_fast
> Brick3: ovirt3:/gluster_bricks/data_fast/data_fast (arbiter)
> Options Reconfigured:
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
> cluster.enable-shared-storage: enable
>
> Gluster volume after 'Optimize for virt':
>
> Volume Name: data_fast
> Type: Replicate
> Volume ID: 378804bf-2975-44d8-84c2-b541aa87f9ef
> Status: Stopped
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: gluster1:/gluster_bricks/data_fast/data_fast
> Brick2: gluster2:/gluster_bricks/data_fast/data_fast
> Brick3: ovirt3:/gluster_bricks/data_fast/data_fast (arbiter)
> Options Reconfigured:
> network.ping-timeout: 30
> performance.strict-o-direct: on
> storage.owner-gid: 36
> storage.owner-uid: 36
> server.event-threads: 4
> client.event-threads: 4
> cluster.choose-local: off
> user.cifs: off
> features.shard: on
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 8
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> cluster.eager-lock: enable
> network.remote-dio: off
> performance.low-prio-threads: 32
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: on
> cluster.enable-shared-storage: enable
>
> After that adding the volumes as storage domains (via UI) worked without
> any issues.
>
> Can someone clarify why we now have 'cluster.choose-local: off', when in
> oVirt 4.2.7 (gluster v3.12.15) we didn't have that?
> I'm using storage that is faster than the network, and reading from the
> local brick gives very high read speed.
>
> Best Regards,
> Strahil Nikolov
>
>
>
> On Sunday, May 19, 2019, 9:47:27 AM GMT+3, Strahil <
> hunter86...@yahoo.com> wrote:
>
>
> On this one
> https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html-single/configuring_red_hat_virtualization_with_red_hat_gluster_storage/index#proc-To_Configure_Volumes_Using_the_Command_Line_Interface
> We should have the following options:
>
> performance.quick-read=off performance.read-ahead=off performance.io-cache=off
> performance.stat-prefetch=off performance.low-prio-threads=32
> network.remote-dio=enable cluster.eager-lock=enable
> cluster.quorum-type=auto cluster.server-quorum-type=server
> cluster.data-self-heal-algorithm=full cluster.locking-scheme=granular
> cluster.shd-max-threads=8 cluster.shd-wait-qlength=10000 features.shard=on
> user.cifs=off
>
> By the way, the 'virt' gluster group disables 'cluster.choose-local', and I
> think it wasn't like that before.
> Any reasons behind that? I use it to speed up my reads, as local
> storage is faster than the network.
>
> Best Regards,
> Strahil Nikolov
> On May 19, 2019 09:36, Strahil  wrote:
>
> OK,
>
> Can we summarize it:
> 1. VDO must 'emulate512=true'
> 2. 'network.remote-dio' should be off ?
>
> As per this:
> https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/configuring_red_hat_openstack_with_red_hat_storage/sect-setting_up_red_hat_storage_trusted_storage_pool
>
> We should have these:
>
> quick-read=off
> read-ahead=off
> io-cache=off
> stat-prefetch=off
> eager-lock=enable
> remote-dio=on
> quorum-type=auto
> server-quorum-type=server
>
> I'm a little bit confused here.
>
> Best Regards,
> Strahil Nikolov
> On May 19, 2019 07:44, Sahina Bose  wrote:
>
>
>
> On 

[ovirt-users] Re: [Gluster-users] Announcing Gluster release 5.5

2019-03-31 Thread Krutika Dhananjay
Adding back gluster-users
Comments inline ...

On Fri, Mar 29, 2019 at 8:11 PM Olaf Buitelaar 
wrote:

> Dear Krutika,
>
>
>
> 1. I’ve made 2 profile runs of around 10 minutes (see files
> profile_data.txt and profile_data2.txt). Looking at it, most time seems to
> be spent in the fops fsync and readdirp.
>
> Unfortunately I don’t have the profile info for the 3.12.15 version, so it’s
> a bit hard to compare.
>
> One additional thing I do notice: on one machine (10.32.9.5) the iowait
> time increased a lot, from an average below 1% to around 12% after the
> upgrade.
>
> So the first suspicion would be lightning striking twice and me now also
> having a bad disk, but that doesn’t appear to be the case, since all SMART
> statuses report OK.
>
> Also dd shows performance I would more or less expect;
>
> dd if=/dev/zero of=/data/test_file  bs=100M count=1  oflag=dsync
>
> 1+0 records in
>
> 1+0 records out
>
> 104857600 bytes (105 MB) copied, 0.686088 s, 153 MB/s
>
> dd if=/dev/zero of=/data/test_file  bs=1G count=1  oflag=dsync
>
> 1+0 records in
>
> 1+0 records out
>
> 1073741824 bytes (1.1 GB) copied, 7.61138 s, 141 MB/s
>
> dd if=/dev/urandom of=/data/test_file  bs=1024 count=100
>
> 100+0 records in
>
> 100+0 records out
>
> 102400 bytes (1.0 GB) copied, 6.35051 s, 161 MB/s
>
> dd if=/dev/zero of=/data/test_file  bs=1024 count=100
>
> 100+0 records in
>
> 100+0 records out
>
> 102400 bytes (1.0 GB) copied, 1.6899 s, 606 MB/s
>
> When I disable this brick (service glusterd stop; pkill glusterfsd)
> performance in gluster is better, but not on par with what it was. Also the
> cpu usage on the “neighbor” nodes which host the other bricks in the same
> subvolume increases quite a lot in this case, which I wouldn’t expect,
> since they shouldn't handle much more work except flagging shards
> to heal. Iowait also drops to idle once gluster is stopped, so it’s for
> sure gluster that is waiting for io.
>
>
>
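A note for readers comparing these numbers later: dd with oflag=dsync mostly measures streaming synchronous throughput, while VM workloads issue small random synchronous writes. A fio job along the following lines (the filename, size, and block size are illustrative assumptions, not taken from this thread) gets closer to the pattern where such regressions usually show up:

```ini
; illustrative fio job -- filename, size and block size are assumptions
[vm-like-sync-writes]
filename=/data/fio_test_file
ioengine=libaio
rw=randwrite
bs=4k
size=1g
direct=1
fsync=1
iodepth=1
numjobs=1
```

Running it with `fio jobfile.ini` against both the old and the new volume mounts would give directly comparable numbers.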

So I see that FSYNC %-latency is on the higher side. And I also noticed you
don't have direct-io options enabled on the volume.
Could you set the following options on the volume -
# gluster volume set  network.remote-dio off
# gluster volume set  performance.strict-o-direct on
and also disable choose-local
# gluster volume set  cluster.choose-local off

let me know if this helps.
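For anyone following along with several volumes, the three commands above can be wrapped in a small helper. This is a minimal sketch, not part of the original reply; the function name and the DRY_RUN guard are illustrative, and with the guard left at its default the commands are only printed, not executed:

```shell
#!/bin/sh
# Sketch: apply the three direct-io-related settings suggested above.
# With DRY_RUN=1 (the default here) the commands are only echoed, so
# nothing changes until you explicitly set DRY_RUN=0 on a real node.
apply_direct_io_settings() {
    vol="$1"
    for opt in "network.remote-dio off" \
               "performance.strict-o-direct on" \
               "cluster.choose-local off"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "gluster volume set $vol $opt"
        else
            # $opt is intentionally unquoted so it splits into key + value
            gluster volume set "$vol" $opt
        fi
    done
}

apply_direct_io_settings myvol   # echoes the three gluster commands
```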

2. I’ve attached the mnt log and volume info, but I couldn’t find anything
> relevant in those logs. I think this is because we run the VM’s with
> libgfapi;
>
> [root@ovirt-host-01 ~]# engine-config  -g LibgfApiSupported
>
> LibgfApiSupported: true version: 4.2
>
> LibgfApiSupported: true version: 4.1
>
> LibgfApiSupported: true version: 4.3
>
> And I can confirm the qemu process is invoked with the gluster:// address
> for the images.
>
> The message is logged in the /var/lib/libvirt/qemu/  file, which
> I’ve also included. For a sample case see around; 2019-03-28 20:20:07
>
> Which has the error; E [MSGID: 133010]
> [shard.c:2294:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on
> shard 109886 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c
> [Stale file handle]
>

Could you also attach the brick logs for this volume?


>
> 3. yes I see multiple instances for the same brick directory, like;
>
> /usr/sbin/glusterfsd -s 10.32.9.6 --volfile-id
> ovirt-core.10.32.9.6.data-gfs-bricks-brick1-ovirt-core -p
> /var/run/gluster/vols/ovirt-core/10.32.9.6-data-gfs-bricks-brick1-ovirt-core.pid
> -S /var/run/gluster/452591c9165945d9.socket --brick-name
> /data/gfs/bricks/brick1/ovirt-core -l
> /var/log/glusterfs/bricks/data-gfs-bricks-brick1-ovirt-core.log
> --xlator-option *-posix.glusterd-uuid=fb513da6-f3bd-4571-b8a2-db5efaf60cc1
> --process-name brick --brick-port 49154 --xlator-option
> ovirt-core-server.listen-port=49154
>
>
>
> I’ve made an export of the output of ps from the time I observed these
> multiple processes.
>
> In addition to the brick_mux bug as noted by Atin, I might also have another
> possible cause: as ovirt moves nodes from non-operational state or
> maintenance state to active/activating, it also seems to restart gluster,
> though I don’t have direct proof for this theory.
>
>
>

+Atin Mukherjee  ^^
+Mohit Agrawal   ^^

-Krutika

Thanks Olaf
>
> Op vr 29 mrt. 2019 om 10:03 schreef Sandro Bonazzola  >:
>
>>
>>
>> Il giorno gio 28 mar 2019 alle ore 17:48  ha
>> scritto:
>>
>>> Dear All,
>>>
>>> I wanted to share my experience upgrading from 4.2.8 to 4.3.1. While
>>> previous upgrades from 4.1 to 4.2 etc. went rather smooth, this one was a
>>> different experience. After first trying a test upgrade on a 3 node setup,
>>> which went fine, I headed to upgrade the 9 node production platform,
>>> unaware of the backward compatibility issues between gluster 3.12.15 ->
>>> 5.3. After upgrading 2 nodes, the HA engine stopped and wouldn't start.
>>> Vdsm wasn't able to mount the engine storage domain, since /dom_md/metadata
>>> was missing or couldn't be accessed. 

[ovirt-users] Re: [Gluster-users] Announcing Gluster release 5.5

2019-03-29 Thread Krutika Dhananjay
Questions/comments inline ...

On Thu, Mar 28, 2019 at 10:18 PM  wrote:

> Dear All,
>
> I wanted to share my experience upgrading from 4.2.8 to 4.3.1. While
> previous upgrades from 4.1 to 4.2 etc. went rather smooth, this one was a
> different experience. After first trying a test upgrade on a 3 node setup,
> which went fine, I headed to upgrade the 9 node production platform,
> unaware of the backward compatibility issues between gluster 3.12.15 ->
> 5.3. After upgrading 2 nodes, the HA engine stopped and wouldn't start.
> Vdsm wasn't able to mount the engine storage domain, since /dom_md/metadata
> was missing or couldn't be accessed. I restored this file by getting a good
> copy from the underlying bricks: removing the file (and its corresponding
> gfid's) from the underlying bricks where the file was 0 bytes and marked
> with the sticky bit, removing the file from the mount point, and copying
> the good copy back onto the mount point. After manually mounting the engine
> domain and manually creating the corresponding symbolic links in
> /rhev/data-center and /var/run/vdsm/storage, and fixing the ownership back
> to vdsm.kvm (which was root.root), I was able to start the HA engine again.
> Since the engine was
> up again, and things seemed rather unstable, I decided to continue the
> upgrade on the other nodes. Suspecting an incompatibility in gluster
> versions, I thought it would be best to have them all on the same version
> rather soonish. However things went from bad to worse: the engine stopped
> again, and all vm’s stopped working as well. So on a machine outside the
> setup I restored a backup of the engine taken from version 4.2.8 just
> before the upgrade. With this engine I was at least able to start some vm’s
> again, and finalize the upgrade. Once upgraded, things didn’t stabilize
> and we also lost 2 vm’s during the process due to image corruption. After
> figuring out gluster 5.3 had quite some issues, I was lucky to see
> gluster 5.5 was about to be released; the moment the RPM’s were
> available I installed those. This helped a lot in terms of stability,
> for which I’m very grateful! However the performance is unfortunately
> terrible: it’s about 15% of what the performance was running gluster
> 3.12.15. It’s strange, since a simple dd shows ok performance, but our
> actual workload doesn’t. I would expect the performance to be better,
> given all the improvements made since gluster version 3.12. Does anybody
> share the same experience?
> I really hope gluster 6 will soon be tested with ovirt and released, and
> things start to perform and stabilize again, like the good old days. Of
> course if I can do anything, I’m happy to help.
>
> I think this is the short list of issues we have after the migration to
> Gluster 5.5:
> -   Poor performance for our workload (mostly write dependent)
>

For this, could you share the volume-profile output specifically for the
affected volume(s)? Here's what you need to do -

1. # gluster volume profile $VOLNAME stop
2. # gluster volume profile $VOLNAME start
3. Run the test inside the vm wherein you see bad performance
4. # gluster volume profile $VOLNAME info # save the output of this command
into a file
5. # gluster volume profile $VOLNAME stop
6. and attach the output file gotten in step 4
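The six steps above can be strung together in a small script. This is only a sketch of the procedure described here; $VOLNAME, the output filename, and the GLUSTER dry-run default are assumptions — leave GLUSTER unset to merely print the commands, or set GLUSTER=gluster on a real node:

```shell
#!/bin/sh
# Sketch of the profiling procedure above. By default the gluster commands
# are only echoed (dry run); set GLUSTER=gluster to really execute them.
GLUSTER="${GLUSTER:-echo gluster}"
VOLNAME="${1:-myvol}"
OUT="${2:-profile_data.txt}"

$GLUSTER volume profile "$VOLNAME" stop            # step 1: reset any old session
$GLUSTER volume profile "$VOLNAME" start           # step 2: start collecting stats
# step 3: run the slow workload inside the VM now, then continue
$GLUSTER volume profile "$VOLNAME" info > "$OUT"   # step 4: save the counters
$GLUSTER volume profile "$VOLNAME" stop            # step 5: stop profiling
echo "attach $OUT to your reply"                   # step 6
```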

-   VM’s randomly pause on unknown storage errors, which are “stale file
handles”. Corresponding log: Lookup
> on shard 797 failed. Base file gfid = 8a27b91a-ff02-42dc-bd4c-caa019424de8
> [Stale file handle]
>

Could you share the complete gluster client log file (it would be a
filename matching the pattern rhev-data-center-mnt-glusterSD-*)
Also the output of `gluster volume info $VOLNAME`



> -   Some files are listed twice in a directory (probably related to the
> stale file issue?)
> Example;
> ls -la
> /rhev/data-center/59cd53a9-0003-02d7-00eb-01e3/313f5d25-76af-4ecd-9a20-82a2fe815a3c/images/4add6751-3731-4bbd-ae94-aaeed12ea450/
> total 3081
> drwxr-x---.  2 vdsm kvm    4096 Mar 18 11:34 .
> drwxr-xr-x. 13 vdsm kvm    4096 Mar 19 09:42 ..
> -rw-rw.  1 vdsm kvm 1048576 Mar 28 12:55
> 1a7cf259-6b29-421d-9688-b25dfaafb13c
> -rw-rw.  1 vdsm kvm 1048576 Mar 28 12:55
> 1a7cf259-6b29-421d-9688-b25dfaafb13c
> -rw-rw.  1 vdsm kvm 1048576 Jan 27  2018
> 1a7cf259-6b29-421d-9688-b25dfaafb13c.lease
> -rw-r--r--.  1 vdsm kvm 290 Jan 27  2018
> 1a7cf259-6b29-421d-9688-b25dfaafb13c.meta
> -rw-r--r--.  1 vdsm kvm 290 Jan 27  2018
> 1a7cf259-6b29-421d-9688-b25dfaafb13c.meta
>

Adding DHT and readdir-ahead maintainers regarding entries getting listed
twice.
@Nithya Balachandran  ^^
@Gowdappa, Raghavendra  ^^
@Poornima Gurusiddaiah  ^^


>
> - brick processes sometimes start multiple times. Sometimes I’ve 5 brick
> processes for a single volume. Killing all glusterfsd’s for the volume on
> the machine and running gluster v start  force usually just starts one
> after the event, from then on things look all right.
>

Did you mean 5 brick processes for 

[ovirt-users] Re: Gluster VM image Resync Time

2019-03-28 Thread Krutika Dhananjay
On Thu, Mar 28, 2019 at 2:28 PM Krutika Dhananjay 
wrote:

> Gluster 5.x does have two important performance-related fixes that are not
> part of 3.12.x -
> i.  in shard-replicate interaction -
> https://bugzilla.redhat.com/show_bug.cgi?id=1635972
>

Sorry, wrong bug-id. This should be
https://bugzilla.redhat.com/show_bug.cgi?id=1549606.

-Krutika

ii. in qemu-gluster-fuse interaction -
> https://bugzilla.redhat.com/show_bug.cgi?id=1635980
>
> The two fixes do improve write performance in vm-storage workload. Do let
> us know your experience if you happen to move to gluster-5.x.
>
> -Krutika
>
>
>
> On Thu, Mar 28, 2019 at 1:13 PM Strahil  wrote:
>
>> Hi Krutika,
>>
>> I have noticed some performance penalties (10%-15%) when using sharding
>> in v3.12.
>> What is the situation now with 5.5 ?
>> Best Regards,
>> Strahil Nikolov
>> On Mar 28, 2019 08:56, Krutika Dhananjay  wrote:
>>
>> Right. So Gluster stores what are called "indices" for each modified file
>> (or shard)
>> under a special hidden directory of the "good" bricks at
>> $BRICK_PATH/.glusterfs/indices/xattrop.
>> When the offline brick comes back up, the file corresponding to each
>> index is healed, and then the index deleted
>> to mark the fact that the file has been healed.
>>
>> You can try this and see it for yourself. Just create a 1x3 plain
>> replicate volume, and enable shard on it.
>> Create a big file (big enough to have multiple shards). Check that the
>> shards are created under $BRICK_PATH/.shard.
>> Now kill a brick. Modify a small portion of the file. Hit `ls` on
>> $BRICK_PATH/.glusterfs/indices/xattrop of the online bricks.
>> You'll notice there will be entries named after the gfid (unique
>> identifier in gluster for each file) of the shards.
>> And only for those shards that the write modified, and not ALL shards of
>> this really big file.
>> And then when you bring the brick back up using `gluster volume start
>> $VOL force`, the
>> shards get healed and the directory eventually becomes empty.
>>
>> -Krutika
>>
>>
>> On Thu, Mar 28, 2019 at 12:14 PM Indivar Nair 
>> wrote:
>>
>> Hi Krutika,
>>
>> So how does the Gluster node know which shards were modified after it
>> went down?
>> Do the other Gluster nodes keep track of it?
>>
>> Regards,
>>
>>
>> Indivar Nair
>>
>>
>> On Thu, Mar 28, 2019 at 9:45 AM Krutika Dhananjay 
>> wrote:
>>
>> Each shard is a separate file of size equal to value of
>> "features.shard-block-size".
>> So when a brick/node was down, only those shards belonging to the VM that
>> were modified will be sync'd later when the brick's back up.
>> Does that answer your question?
>>
>> -Krutika
>>
>> On Wed, Mar 27, 2019 at 7:48 PM Sahina Bose  wrote:
>>
>> On Wed, Mar 27, 2019 at 7:40 PM Indivar Nair 
>> wrote:
>> >
>> > Hi Strahil,
>> >
>> > Ok. Looks like sharding should make the resyncs faster.
>> >
>> > I searched for more info on it, but couldn't find much.
>> > I believe it will still have to compare each shard to determine whether
>> there are any changes that need to be replicated.
>> > Am I right?
>>
>> +Krutika Dhananjay
>> >
>> > Regards,
>> >
>> > Indivar Nair
>> >
>> >
>> >
>> > On Wed, Mar 27, 2019 at 4:34 PM Strahil  wrote:
>> >>
>> >> By default ovirt uses 'sharding' which splits the files into logical
>> chunks. This greatly reduces healing time, as VM's disk is not always
>> completely overwritten and only the shards that are different will be
>> healed.
>> >>
>> >> Maybe you should change the default shard size.
>> >>
>> >> Best Regards,
>> >> Strahil Nikolov
>> >>
>> >> On Mar 27, 2019 08:24, Indivar Nair  wrote:
>> >>
>> >> Hi All,
>> >>
>> >> We are planning a 2 + 1 arbitrated mirrored Gluster setup.
>> >> We would have around 50 - 60 VMs, with an average 500GB disk size.
>> >>
>> >> Now in case one of the Gluster Nodes go completely out of sync,
>> roughly, how long would it take to resync? (as per your experience)
>> >> Will it impact the working of VMs in any way?
>> >> Is there anything to be taken care of, in advance, to prepare for such
>> a situation?
>> >>
>> >> Regards,
>> >>
>> >>
>> >> Indivar Nair
>> >>
>> > __
>>
>>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/H46WHUIHFJ5EZPR4MX27WEGDM264VTGQ/


[ovirt-users] Re: Gluster VM image Resync Time

2019-03-28 Thread Krutika Dhananjay
Gluster 5.x does have two important performance-related fixes that are not
part of 3.12.x -
i.  in shard-replicate interaction -
https://bugzilla.redhat.com/show_bug.cgi?id=1635972
ii. in qemu-gluster-fuse interaction -
https://bugzilla.redhat.com/show_bug.cgi?id=1635980

The two fixes do improve write performance in vm-storage workload. Do let
us know your experience if you happen to move to gluster-5.x.

-Krutika



On Thu, Mar 28, 2019 at 1:13 PM Strahil  wrote:

> Hi Krutika,
>
> I have noticed some performance penalties (10%-15%) when using sharding in
> v3.12.
> What is the situation now with 5.5 ?
> Best Regards,
> Strahil Nikolov
> On Mar 28, 2019 08:56, Krutika Dhananjay  wrote:
>
> Right. So Gluster stores what are called "indices" for each modified file
> (or shard)
> under a special hidden directory of the "good" bricks at
> $BRICK_PATH/.glusterfs/indices/xattrop.
> When the offline brick comes back up, the file corresponding to each index
> is healed, and then the index deleted
> to mark the fact that the file has been healed.
>
> You can try this and see it for yourself. Just create a 1x3 plain
> replicate volume, and enable shard on it.
> Create a big file (big enough to have multiple shards). Check that the
> shards are created under $BRICK_PATH/.shard.
> Now kill a brick. Modify a small portion of the file. Hit `ls` on
> $BRICK_PATH/.glusterfs/indices/xattrop of the online bricks.
> You'll notice there will be entries named after the gfid (unique
> identifier in gluster for each file) of the shards.
> And only for those shards that the write modified, and not ALL shards of
> this really big file.
> And then when you bring the brick back up using `gluster volume start $VOL
> force`, the
> shards get healed and the directory eventually becomes empty.
>
> -Krutika
>
>
> On Thu, Mar 28, 2019 at 12:14 PM Indivar Nair 
> wrote:
>
> Hi Krutika,
>
> So how does the Gluster node know which shards were modified after it went
> down?
> Do the other Gluster nodes keep track of it?
>
> Regards,
>
>
> Indivar Nair
>
>
> On Thu, Mar 28, 2019 at 9:45 AM Krutika Dhananjay 
> wrote:
>
> Each shard is a separate file of size equal to value of
> "features.shard-block-size".
> So when a brick/node was down, only those shards belonging to the VM that
> were modified will be sync'd later when the brick's back up.
> Does that answer your question?
>
> -Krutika
>
> On Wed, Mar 27, 2019 at 7:48 PM Sahina Bose  wrote:
>
> On Wed, Mar 27, 2019 at 7:40 PM Indivar Nair 
> wrote:
> >
> > Hi Strahil,
> >
> > Ok. Looks like sharding should make the resyncs faster.
> >
> > I searched for more info on it, but couldn't find much.
> > I believe it will still have to compare each shard to determine whether
> there are any changes that need to be replicated.
> > Am I right?
>
> +Krutika Dhananjay
> >
> > Regards,
> >
> > Indivar Nair
> >
> >
> >
> > On Wed, Mar 27, 2019 at 4:34 PM Strahil  wrote:
> >>
> >> By default ovirt uses 'sharding' which splits the files into logical
> chunks. This greatly reduces healing time, as VM's disk is not always
> completely overwritten and only the shards that are different will be
> healed.
> >>
> >> Maybe you should change the default shard size.
> >>
> >> Best Regards,
> >> Strahil Nikolov
> >>
> >> On Mar 27, 2019 08:24, Indivar Nair  wrote:
> >>
> >> Hi All,
> >>
> >> We are planning a 2 + 1 arbitrated mirrored Gluster setup.
> >> We would have around 50 - 60 VMs, with an average 500GB disk size.
> >>
> >> Now in case one of the Gluster Nodes go completely out of sync,
> roughly, how long would it take to resync? (as per your experience)
> >> Will it impact the working of VMs in any way?
> >> Is there anything to be taken care of, in advance, to prepare for such
> a situation?
> >>
> >> Regards,
> >>
> >>
> >> Indivar Nair
> >>
> > __
>
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/B4ZZLTQ6BORAC2YETOQLBJSKF6JVVDRT/


[ovirt-users] Re: Gluster VM image Resync Time

2019-03-28 Thread Krutika Dhananjay
Right. So Gluster stores what are called "indices" for each modified file
(or shard)
under a special hidden directory of the "good" bricks at
$BRICK_PATH/.glusterfs/indices/xattrop.
When the offline brick comes back up, the file corresponding to each index
is healed, and then the index deleted
to mark the fact that the file has been healed.

You can try this and see it for yourself. Just create a 1x3 plain replicate
volume, and enable shard on it.
Create a big file (big enough to have multiple shards). Check that the
shards are created under $BRICK_PATH/.shard.
Now kill a brick. Modify a small portion of the file. Hit `ls` on
$BRICK_PATH/.glusterfs/indices/xattrop of the online bricks.
You'll notice there will be entries named after the gfid (unique identifier
in gluster for each file) of the shards.
And only for those shards that the write modified, and not ALL shards of
this really big file.
And then when you bring the brick back up using `gluster volume start $VOL
force`, the
shards get healed and the directory eventually becomes empty.

-Krutika
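The index lifecycle described above can be illustrated without a cluster. The sketch below simulates the brick layout in a temp directory — the gfid names are borrowed from earlier in these threads and everything is created locally purely for demonstration; on a real brick the entries appear and disappear under $BRICK_PATH/.glusterfs/indices/xattrop as writes land and heals complete:

```shell
#!/bin/sh
# Simulated brick layout; a real one lives under an actual brick path.
BRICK="$(mktemp -d)/brick1"
XATTROP="$BRICK/.glusterfs/indices/xattrop"
mkdir -p "$XATTROP"

# While a brick is down, gluster leaves one gfid-named index entry per
# modified shard on the "good" bricks:
touch "$XATTROP/8a27b91a-ff02-42dc-bd4c-caa019424de8"
touch "$XATTROP/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c"
echo "pending heals: $(ls "$XATTROP" | wc -l)"

# Once self-heal has synced a shard, its index entry is deleted:
rm "$XATTROP/8a27b91a-ff02-42dc-bd4c-caa019424de8"
echo "pending heals: $(ls "$XATTROP" | wc -l)"
```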


On Thu, Mar 28, 2019 at 12:14 PM Indivar Nair 
wrote:

> Hi Krutika,
>
> So how does the Gluster node know which shards were modified after it went
> down?
> Do the other Gluster nodes keep track of it?
>
> Regards,
>
>
> Indivar Nair
>
>
> On Thu, Mar 28, 2019 at 9:45 AM Krutika Dhananjay 
> wrote:
>
>> Each shard is a separate file of size equal to value of
>> "features.shard-block-size".
>> So when a brick/node was down, only those shards belonging to the VM that
>> were modified will be sync'd later when the brick's back up.
>> Does that answer your question?
>>
>> -Krutika
>>
>> On Wed, Mar 27, 2019 at 7:48 PM Sahina Bose  wrote:
>>
>>> On Wed, Mar 27, 2019 at 7:40 PM Indivar Nair 
>>> wrote:
>>> >
>>> > Hi Strahil,
>>> >
>>> > Ok. Looks like sharding should make the resyncs faster.
>>> >
>>> > I searched for more info on it, but couldn't find much.
>>> > I believe it will still have to compare each shard to determine
>>> whether there are any changes that need to be replicated.
>>> > Am I right?
>>>
>>> +Krutika Dhananjay
>>> >
>>> > Regards,
>>> >
>>> > Indivar Nair
>>> >
>>> >
>>> >
>>> > On Wed, Mar 27, 2019 at 4:34 PM Strahil  wrote:
>>> >>
>>> >> By default ovirt uses 'sharding' which splits the files into logical
>>> chunks. This greatly reduces healing time, as VM's disk is not always
>>> completely overwritten and only the shards that are different will be
>>> healed.
>>> >>
>>> >> Maybe you should change the default shard size.
>>> >>
>>> >> Best Regards,
>>> >> Strahil Nikolov
>>> >>
>>> >> On Mar 27, 2019 08:24, Indivar Nair 
>>> wrote:
>>> >>
>>> >> Hi All,
>>> >>
>>> >> We are planning a 2 + 1 arbitrated mirrored Gluster setup.
>>> >> We would have around 50 - 60 VMs, with an average 500GB disk size.
>>> >>
>>> >> Now in case one of the Gluster Nodes go completely out of sync,
>>> roughly, how long would it take to resync? (as per your experience)
>>> >> Will it impact the working of VMs in any way?
>>> >> Is there anything to be taken care of, in advance, to prepare for
>>> such a situation?
>>> >>
>>> >> Regards,
>>> >>
>>> >>
>>> >> Indivar Nair
>>> >>
>>> > ___
>>> > Users mailing list -- users@ovirt.org
>>> > To unsubscribe send an email to users-le...@ovirt.org
>>> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> > oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> > List Archives:
>>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/WZW5RRVHFRMAIBUZDUSTXTIF4Z4WW5Y5/
>>>
>>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/NTCTLHSFEQF5ZVXZHHXP7QUPAEUEEZF6/


[ovirt-users] Re: Gluster VM image Resync Time

2019-03-27 Thread Krutika Dhananjay
Each shard is a separate file of size equal to value of
"features.shard-block-size".
So when a brick/node was down, only those shards belonging to the VM that
were modified will be sync'd later when the brick's back up.
Does that answer your question?

-Krutika
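To put rough numbers on the resync question: since each shard is its own file of shard-block-size, the amount to heal scales with the number of modified shards, not the disk size. A back-of-envelope sketch, using the 500GB disks from the question and the 64MB shard size seen elsewhere in these threads (the 1% dirty fraction is purely an assumption):

```shell
#!/bin/sh
# Back-of-envelope: shard count per disk and shards needing heal.
disk_gb=500                              # from the question above
shard_mb=64                              # features.shard-block-size assumed
shards=$(( disk_gb * 1024 / shard_mb ))  # 8000 shard files per disk
dirty=$(( shards / 100 ))                # assume ~1% modified while down
echo "shards per disk: $shards"
echo "shards to heal:  $dirty"
```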

On Wed, Mar 27, 2019 at 7:48 PM Sahina Bose  wrote:

> On Wed, Mar 27, 2019 at 7:40 PM Indivar Nair 
> wrote:
> >
> > Hi Strahil,
> >
> > Ok. Looks like sharding should make the resyncs faster.
> >
> > I searched for more info on it, but couldn't find much.
> > I believe it will still have to compare each shard to determine whether
> there are any changes that need to be replicated.
> > Am I right?
>
> +Krutika Dhananjay
> >
> > Regards,
> >
> > Indivar Nair
> >
> >
> >
> > On Wed, Mar 27, 2019 at 4:34 PM Strahil  wrote:
> >>
> >> By default ovirt uses 'sharding' which splits the files into logical
> chunks. This greatly reduces healing time, as VM's disk is not always
> completely overwritten and only the shards that are different will be
> healed.
> >>
> >> Maybe you should change the default shard size.
> >>
> >> Best Regards,
> >> Strahil Nikolov
> >>
> >> On Mar 27, 2019 08:24, Indivar Nair  wrote:
> >>
> >> Hi All,
> >>
> >> We are planning a 2 + 1 arbitrated mirrored Gluster setup.
> >> We would have around 50 - 60 VMs, with an average 500GB disk size.
> >>
> >> Now in case one of the Gluster Nodes go completely out of sync,
> roughly, how long would it take to resync? (as per your experience)
> >> Will it impact the working of VMs in any way?
> >> Is there anything to be taken care of, in advance, to prepare for such
> a situation?
> >>
> >> Regards,
> >>
> >>
> >> Indivar Nair
> >>
> > ___
> > Users mailing list -- users@ovirt.org
> > To unsubscribe send an email to users-le...@ovirt.org
> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> > oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> > List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/WZW5RRVHFRMAIBUZDUSTXTIF4Z4WW5Y5/
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/UV7TOE7AYY7433T4UWHGIEEYRUOT5DXN/


[ovirt-users] Re: VM disk corruption with LSM on Gluster

2019-03-27 Thread Krutika Dhananjay
This is needed to prevent any inconsistencies stemming from buffered
writes/caching file data during live VM migration.
Besides, for Gluster to truly honor direct-io behavior in qemu's
'cache=none' mode (which is what oVirt uses),
one needs to turn on performance.strict-o-direct and disable remote-dio.

-Krutika
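Before attempting another live storage migration it is worth verifying that both volumes actually carry these settings. Below is a hedged sketch of such a check — SAMPLE stands in for real `gluster volume get <volname> all` output, so the parsing can be shown without a live cluster:

```shell
#!/bin/sh
# SAMPLE mimics a fragment of `gluster volume get <vol> all` output.
SAMPLE='performance.strict-o-direct     on
network.remote-dio                      off
performance.stat-prefetch               on'

check() {
    opt="$1"; want="$2"
    # Pick the value column for the requested option name.
    got=$(printf '%s\n' "$SAMPLE" | awk -v o="$opt" '$1 == o { print $2 }')
    if [ "$got" = "$want" ]; then
        echo "$opt=$got ok"
    else
        echo "$opt=$got (want $want)" >&2
    fi
}

check performance.strict-o-direct on   # required for honoring cache=none
check network.remote-dio off
```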

On Wed, Mar 27, 2019 at 12:24 PM Leo David  wrote:

> Hi,
> I can confirm that after setting these two options, I haven't encountered
> disk corruptions anymore.
> The downside, is that at least for me it had a pretty big impact on
> performance.
> The iops really went down when performing fio tests inside the vm.
>
> On Wed, Mar 27, 2019, 07:03 Krutika Dhananjay  wrote:
>
>> Could you enable strict-o-direct and disable remote-dio on the src volume
>> as well, restart the vms on "old" and retry migration?
>>
>> # gluster volume set  performance.strict-o-direct on
>> # gluster volume set  network.remote-dio off
>>
>> -Krutika
>>
>> On Tue, Mar 26, 2019 at 10:32 PM Sander Hoentjen 
>> wrote:
>>
>>> On 26-03-19 14:23, Sahina Bose wrote:
>>> > +Krutika Dhananjay and gluster ml
>>> >
>>> > On Tue, Mar 26, 2019 at 6:16 PM Sander Hoentjen 
>>> wrote:
>>> >> Hello,
>>> >>
>>> >> tl;dr We have disk corruption when doing live storage migration on
>>> oVirt
>>> >> 4.2 with gluster 3.12.15. Any idea why?
>>> >>
>>> >> We have a 3-node oVirt cluster that is both compute and
>>> gluster-storage.
>>> >> The manager runs on separate hardware. We are running out of space on
>>> >> this volume, so we added another Gluster volume that is bigger, put a
>>> >> storage domain on it and then we migrated VM's to it with LSM. After
>>> >> some time, we noticed that (some of) the migrated VM's had corrupted
>>> >> filesystems. After moving everything back with export-import to the
>>> old
>>> >> domain where possible, and recovering from backups where needed we set
>>> >> off to investigate this issue.
>>> >>
>>> >> We are now at the point where we can reproduce this issue within a
>>> day.
>>> >> What we have found so far:
>>> >> 1) The corruption occurs at the very end of the replication step, most
>>> >> probably between START and FINISH of diskReplicateFinish, before the
>>> >> START merge step
>>> >> 2) In the corrupted VM, at some place where data should be, this data
>>> is
>>> >> replaced by zero's. This can be file-contents or a directory-structure
>>> >> or whatever.
>>> >> 3) The source gluster volume has different settings than the
>>> destination
>>> >> (Mostly because the defaults were different at creation time):
>>> >>
>>> >> Setting                       old(src)  new(dst)
>>> >> cluster.op-version            30800     30800 (the same)
>>> >> cluster.max-op-version        31202     31202 (the same)
>>> >> cluster.metadata-self-heal    off       on
>>> >> cluster.data-self-heal        off       on
>>> >> cluster.entry-self-heal       off       on
>>> >> performance.low-prio-threads  16        32
>>> >> performance.strict-o-direct   off       on
>>> >> network.ping-timeout          42        30
>>> >> network.remote-dio            enable    off
>>> >> transport.address-family      -         inet
>>> >> performance.stat-prefetch     off       on
>>> >> features.shard-block-size     512MB     64MB
>>> >> cluster.shd-max-threads       1         8
>>> >> cluster.shd-wait-qlength      1024      1
>>> >> cluster.locking-scheme        full      granular
>>> >> cluster.granular-entry-heal   no        enable
>>> >>
>>> >> 4) To test, we migrate some VM's back and forth. The corruption does
>>> not
>>> >> occur every time. To this point it only occurs from old to new, but we
>>> >> don't have enough data-points to be sure about that.
>>> >>
>>> >> Anybody an idea what is causing the corruption? Is this the best list
>>> to
>>> >> ask, or should I ask on a Gluster list? I am not sure if this is oVirt
>

[ovirt-users] Re: VM disk corruption with LSM on Gluster

2019-03-26 Thread Krutika Dhananjay
Could you enable strict-o-direct and disable remote-dio on the src volume
as well, restart the vms on "old" and retry migration?

# gluster volume set  performance.strict-o-direct on
# gluster volume set  network.remote-dio off

-Krutika

On Tue, Mar 26, 2019 at 10:32 PM Sander Hoentjen  wrote:

> On 26-03-19 14:23, Sahina Bose wrote:
> > +Krutika Dhananjay and gluster ml
> >
> > On Tue, Mar 26, 2019 at 6:16 PM Sander Hoentjen 
> wrote:
> >> Hello,
> >>
> >> tl;dr We have disk corruption when doing live storage migration on oVirt
> >> 4.2 with gluster 3.12.15. Any idea why?
> >>
> >> We have a 3-node oVirt cluster that is both compute and gluster-storage.
> >> The manager runs on separate hardware. We are running out of space on
> >> this volume, so we added another Gluster volume that is bigger, put a
> >> storage domain on it and then we migrated VM's to it with LSM. After
> >> some time, we noticed that (some of) the migrated VM's had corrupted
> >> filesystems. After moving everything back with export-import to the old
> >> domain where possible, and recovering from backups where needed we set
> >> off to investigate this issue.
> >>
> >> We are now at the point where we can reproduce this issue within a day.
> >> What we have found so far:
> >> 1) The corruption occurs at the very end of the replication step, most
> >> probably between START and FINISH of diskReplicateFinish, before the
> >> START merge step
> >> 2) In the corrupted VM, at some place where data should be, this data is
> >> replaced by zero's. This can be file-contents or a directory-structure
> >> or whatever.
> >> 3) The source gluster volume has different settings than the destination
> >> (Mostly because the defaults were different at creation time):
> >>
> >> Setting                       old(src)  new(dst)
> >> cluster.op-version            30800     30800 (the same)
> >> cluster.max-op-version        31202     31202 (the same)
> >> cluster.metadata-self-heal    off       on
> >> cluster.data-self-heal        off       on
> >> cluster.entry-self-heal       off       on
> >> performance.low-prio-threads  16        32
> >> performance.strict-o-direct   off       on
> >> network.ping-timeout          42        30
> >> network.remote-dio            enable    off
> >> transport.address-family      -         inet
> >> performance.stat-prefetch     off       on
> >> features.shard-block-size     512MB     64MB
> >> cluster.shd-max-threads       1         8
> >> cluster.shd-wait-qlength      1024      1
> >> cluster.locking-scheme        full      granular
> >> cluster.granular-entry-heal   no        enable
> >>
> >> 4) To test, we migrate some VM's back and forth. The corruption does not
> >> occur every time. To this point it only occurs from old to new, but we
> >> don't have enough data-points to be sure about that.
> >>
> >> Anybody an idea what is causing the corruption? Is this the best list to
> >> ask, or should I ask on a Gluster list? I am not sure if this is oVirt
> >> specific or Gluster specific though.
> > Do you have logs from old and new gluster volumes? Any errors in the
> > new volume's fuse mount logs?
>
> Around the time of corruption I see the message:
> The message "I [MSGID: 133017] [shard.c:4941:shard_seek]
> 0-ZoneA_Gluster1-shard: seek called on
> 7fabc273-3d8a-4a49-8906-b8ccbea4a49f. [Operation not supported]" repeated
> 231 times between [2019-03-26 13:14:22.297333] and [2019-03-26
> 13:15:42.912170]
>
> I also see this message at other times, when I don't see the corruption
> occur, though.
>
> --
> Sander
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/M3T2VGGGV6DE643ZKKJUAF274VSWTJFH/
>


[ovirt-users] Re: oVirt Performance (Horrific)

2019-03-13 Thread Krutika Dhananjay
Hi,

OK, thanks. I'd also asked for the gluster version you're running. Could you
share that information as well?

-Krutika
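For reference, a quick way to collect the version information being asked for (a sketch; assumes the gluster CLI is installed and, for the last command, an RPM-based host such as oVirt Node/CentOS):

```shell
# Run on each gluster host:
gluster --version | head -n1    # CLI/client version
glusterd --version | head -n1   # management daemon version
rpm -q glusterfs-server         # exact package build (RPM-based hosts)
```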

On Thu, Mar 7, 2019 at 9:38 PM Drew Rash  wrote:

> Here is the output for our ssd gluster which exhibits the same issue as
> the hdd glusters.
> However, I can replicate the issue on an 8TB WD Gold disk mounted via NFS as
> well (removed the gluster part), which is the reason I'm on the oVirt
> site.  I can start a file copy that writes at max speed, then after a GB or
> 2 it drops down to 3-10 MBps, maxing at about 13.3 overall.
> Testing outside of oVirt using dd doesn't show the same behavior: directly
> on the oVirt node, writing to the gluster or 8TB NFS mount yields max
> drive speeds consistently for large file copies.
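As a concrete sketch of that kind of dd comparison (the mount path is hypothetical; adjust to your setup): buffered writes can look fast until the page cache fills, while oflag=direct bypasses the cache and behaves more like qemu with cache=none.

```shell
# Hypothetical target file on the gluster or NFS mount:
TARGET=/mnt/gv1/ddtest.img

# Buffered write -- the first GB or two may simply land in the page cache:
dd if=/dev/zero of="$TARGET" bs=1M count=4096 conv=fdatasync

# Direct I/O write -- bypasses the page cache, closer to qemu's cache=none:
dd if=/dev/zero of="$TARGET" bs=1M count=4096 oflag=direct

rm -f "$TARGET"
```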
>
> I enabled writeback (as someone suggested) on the virtio-scsi windows disk
> and one of our Windows 10 installs sped up. It still suffers from the
> sustained-write issue, which cripples the whole box. Opening Chrome, for
> example, cripples the box, as does SQL Server Management Studio.
>
> Volume Name: gv1
> Type: Replicate
> Volume ID: 7340a436-d971-4d69-84f9-12a23cd76ec8
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: 10.30.30.121:/gluster_bricks/gv1/brick
> Brick2: 10.30.30.122:/gluster_bricks/gv1/brick
> Brick3: 10.30.30.123:/gluster_bricks/gv1/brick (arbiter)
> Options Reconfigured:
> network.ping-timeout: 30
> performance.strict-o-direct: on
> storage.owner-gid: 36
> storage.owner-uid: 36
> user.cifs: off
> cluster.shd-wait-qlength: 1
> cluster.shd-max-threads: 8
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> cluster.eager-lock: enable
> network.remote-dio: off
> performance.low-prio-threads: 32
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> features.shard: off
> cluster.granular-entry-heal: enable
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
>
>
> On Thu, Mar 7, 2019 at 1:00 AM Krutika Dhananjay 
> wrote:
>
>> So from the profile, it appears the XATTROPs and FINODELKs are way higher
>> than the number of WRITEs:
>>
>> 
>> ...
>> ...
> >> %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls        Fop
> >> ---------   -----------   -----------   -----------   ------------        ---
> >>      0.43     384.83 us      51.00 us    65375.00 us          13632   FXATTROP
> >>      7.54   13535.70 us     225.00 us   210298.00 us           6816      WRITE
> >>     45.99   28508.86 us       7.00 us  2591280.00 us          19751   FINODELK
>>
>> ...
>> ...
>> 
>>
>> We'd noticed something similar in our internal tests and found
>> inefficiencies in gluster's eager-lock implementation. This was fixed at
>> https://review.gluster.org/c/glusterfs/+/19503.
>> I need the two things I asked for in the prev mail to confirm if you're
>> hitting the same issue.
>>
>> -Krutika
>>
>> On Thu, Mar 7, 2019 at 12:24 PM Krutika Dhananjay 
>> wrote:
>>
>>> Hi,
>>>
>>> Could you share the following pieces of information to begin with -
>>>
>>> 1. output of `gluster volume info $AFFECTED_VOLUME_NAME`
>>> 2. glusterfs version you're running
>>>
>>> -Krutika
>>>
>>>
>>> On Sat, Mar 2, 2019 at 3:38 AM Drew R  wrote:
>>>
>>>> Saw some people asking for profile info.  So I had started a migration
>>>> from a 6TB WDGold 2+1arb replicated gluster to a 1TB samsung ssd 2+1 rep
>>>> gluster and it's been running a while for a 100GB file thin provisioned
>>>> with like 28GB actually used.  Here is the profile info.  I started the
>>>> profiler like 5 minutes ago. The migration had been running for like
>>>> 30minutes:
>>>>
>>>> gluster volume profile gv2 info
>>>> Brick: 10.30.30.122:/gluster_bricks/gv2/brick
>>>> -
>>>> Cumulative Stats:
>>>>Block Size:256b+ 512b+
>>>>   1024b+
>>>>  No. of Reads: 1189 8
>>>>   12
>>>> No. of Writes:4  3245
>>>>  883
>>>>
>>>>Block Size:   2048b+4096b+
>>>>   8192b+
>>>>  No. of Reads:

[ovirt-users] Re: oVirt Performance (Horrific)

2019-03-06 Thread Krutika Dhananjay
So from the profile, it appears the XATTROPs and FINODELKs are way higher
than the number of WRITEs:


...
...
%-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls        Fop
---------   -----------   -----------   -----------   ------------        ---
     0.43     384.83 us      51.00 us    65375.00 us          13632   FXATTROP
     7.54   13535.70 us     225.00 us   210298.00 us           6816      WRITE
    45.99   28508.86 us       7.00 us  2591280.00 us          19751   FINODELK

...
...


We'd noticed something similar in our internal tests and found
inefficiencies in gluster's eager-lock implementation. This was fixed at
https://review.gluster.org/c/glusterfs/+/19503.
I need the two things I asked for in the prev mail to confirm if you're
hitting the same issue.

-Krutika

On Thu, Mar 7, 2019 at 12:24 PM Krutika Dhananjay 
wrote:

> Hi,
>
> Could you share the following pieces of information to begin with -
>
> 1. output of `gluster volume info $AFFECTED_VOLUME_NAME`
> 2. glusterfs version you're running
>
> -Krutika
>
>
> On Sat, Mar 2, 2019 at 3:38 AM Drew R  wrote:
>
>> Saw some people asking for profile info.  So I had started a migration
>> from a 6TB WDGold 2+1arb replicated gluster to a 1TB samsung ssd 2+1 rep
>> gluster and it's been running a while for a 100GB file thin provisioned
>> with like 28GB actually used.  Here is the profile info.  I started the
>> profiler like 5 minutes ago. The migration had been running for like
>> 30minutes:
>>
>> gluster volume profile gv2 info
>> Brick: 10.30.30.122:/gluster_bricks/gv2/brick
>> -
>> Cumulative Stats:
>>Block Size:256b+ 512b+
>> 1024b+
>>  No. of Reads: 1189 8
>> 12
>> No. of Writes:4  3245
>>883
>>
>>Block Size:   2048b+4096b+
>> 8192b+
>>  No. of Reads:   1020
>>  2
>> No. of Writes: 1087312228
>> 124080
>>
>>Block Size:  16384b+   32768b+
>>  65536b+
>>  No. of Reads:0 1
>> 52
>> No. of Writes: 5188  3617
>>   5532
>>
>>Block Size: 131072b+
>>  No. of Reads:70191
>> No. of Writes:   634192
> >> %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls          Fop
> >> ---------   -----------   -----------   -----------   ------------          ---
> >>      0.00       0.00 us       0.00 us        0.00 us              2       FORGET
> >>      0.00       0.00 us       0.00 us        0.00 us            202      RELEASE
> >>      0.00       0.00 us       0.00 us        0.00 us           1297   RELEASEDIR
> >>      0.00      14.50 us       9.00 us       20.00 us              4      READDIR
> >>      0.00      38.00 us       9.00 us      120.00 us              7     GETXATTR
> >>      0.00      66.00 us      34.00 us      128.00 us              6         OPEN
> >>      0.00     137.25 us      52.00 us      195.00 us              4      SETATTR
> >>      0.00      23.19 us      11.00 us       46.00 us             26      INODELK
> >>      0.00      41.58 us      18.00 us       79.00 us             24      OPENDIR
> >>      0.00     166.70 us      15.00 us      775.00 us             27     READDIRP
> >>      0.01     135.29 us      12.00 us    11695.00 us            221       STATFS
> >>      0.01     176.54 us      22.00 us    22944.00 us            364        FSTAT
> >>      0.02     626.21 us      13.00 us    17308.00 us            168         STAT
> >>      0.09     834.84 us       9.00 us    34337.00 us            607       LOOKUP
> >>      0.73     146.18 us       6.00 us    52255.00 us          29329     FINODELK
> >>      1.03     298.20 us      42.00 us    43711.00 us          20204     FXATTROP
> >>     15.38    8903.40 us     213.00 us   213832.00 us          10102        WRITE
> >>     39.14   26796.37 us     222.00 us   122696.00 us           8538         READ
> >>     43.59   39536.79 us     259.00 us   183630.00 us           6446        FSYNC
>>
>> Duration: 15078 seconds
>>Data Read: 9207377205 bytes
>> Data Written: 86214017762 bytes
>>
>> Interval 2 Stats:
>>Block Size:256b+ 512b+
>> 1024b+
>>  No. of Reads:   17 0
>>  0
>> No. of Writes:0

[ovirt-users] Re: oVirt Performance (Horrific)

2019-03-06 Thread Krutika Dhananjay
Hi,

Could you share the following pieces of information to begin with -

1. output of `gluster volume info $AFFECTED_VOLUME_NAME`
2. glusterfs version you're running

-Krutika


On Sat, Mar 2, 2019 at 3:38 AM Drew R  wrote:

> Saw some people asking for profile info.  So I had started a migration
> from a 6TB WDGold 2+1arb replicated gluster to a 1TB samsung ssd 2+1 rep
> gluster and it's been running a while for a 100GB file thin provisioned
> with like 28GB actually used.  Here is the profile info.  I started the
> profiler like 5 minutes ago. The migration had been running for like
> 30minutes:
>
> gluster volume profile gv2 info
> Brick: 10.30.30.122:/gluster_bricks/gv2/brick
> -
> Cumulative Stats:
>Block Size:256b+ 512b+
> 1024b+
>  No. of Reads: 1189 8
>   12
> No. of Writes:4  3245
>  883
>
>Block Size:   2048b+4096b+
> 8192b+
>  No. of Reads:   1020
>2
> No. of Writes: 1087312228
> 124080
>
>Block Size:  16384b+   32768b+
>  65536b+
>  No. of Reads:0 1
>   52
> No. of Writes: 5188  3617
> 5532
>
>Block Size: 131072b+
>  No. of Reads:70191
> No. of Writes:   634192
> %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls          Fop
> ---------   -----------   -----------   -----------   ------------          ---
>      0.00       0.00 us       0.00 us        0.00 us              2       FORGET
>      0.00       0.00 us       0.00 us        0.00 us            202      RELEASE
>      0.00       0.00 us       0.00 us        0.00 us           1297   RELEASEDIR
>      0.00      14.50 us       9.00 us       20.00 us              4      READDIR
>      0.00      38.00 us       9.00 us      120.00 us              7     GETXATTR
>      0.00      66.00 us      34.00 us      128.00 us              6         OPEN
>      0.00     137.25 us      52.00 us      195.00 us              4      SETATTR
>      0.00      23.19 us      11.00 us       46.00 us             26      INODELK
>      0.00      41.58 us      18.00 us       79.00 us             24      OPENDIR
>      0.00     166.70 us      15.00 us      775.00 us             27     READDIRP
>      0.01     135.29 us      12.00 us    11695.00 us            221       STATFS
>      0.01     176.54 us      22.00 us    22944.00 us            364        FSTAT
>      0.02     626.21 us      13.00 us    17308.00 us            168         STAT
>      0.09     834.84 us       9.00 us    34337.00 us            607       LOOKUP
>      0.73     146.18 us       6.00 us    52255.00 us          29329     FINODELK
>      1.03     298.20 us      42.00 us    43711.00 us          20204     FXATTROP
>     15.38    8903.40 us     213.00 us   213832.00 us          10102        WRITE
>     39.14   26796.37 us     222.00 us   122696.00 us           8538         READ
>     43.59   39536.79 us     259.00 us   183630.00 us           6446        FSYNC
>
> Duration: 15078 seconds
>Data Read: 9207377205 bytes
> Data Written: 86214017762 bytes
>
> Interval 2 Stats:
>Block Size:256b+ 512b+
> 1024b+
>  No. of Reads:   17 0
>0
> No. of Writes:043
>7
>
>Block Size:   2048b+4096b+
> 8192b+
>  No. of Reads:0 7
>0
> No. of Writes:   16  1881
> 1010
>
>Block Size:  16384b+   32768b+
>  65536b+
>  No. of Reads:0 0
>6
> No. of Writes:  305   586
> 2359
>
>Block Size: 131072b+
>  No. of Reads: 7162
> No. of Writes:  610
> %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls          Fop
> ---------   -----------   -----------   -----------   ------------          ---
>      0.00       0.00 us       0.00 us        0.00 us              6      RELEASE
>      0.00       0.00 us       0.00 us        0.00 us             20   RELEASEDIR
>      0.00      14.50 us       9.00 us       20.00 us              4      READDIR
>      0.00      38.00 us       9.00 us      120.00 us              7     GETXATTR
>      0.00      66.00 us      34.00 us      128.00 us              6         OPEN
>      0.00     137.25 us      52.00 us      195.00 us              4      SETATTR
>      0.00      23.19 us      11.00 us       46.00 us             26      INODELK
>      0.00      40.05 us      18.00 us       79.00 us             20      OPENDIR
>      0.00     180.33 us      16.00 us      775.00 us             21     READDIRP
>      0.01     181.77 us      12.00 us    11695.00 us            149       STATFS
>      0.01     511.23 

[ovirt-users] Re: Tracking down high writes in GlusterFS volume

2019-02-25 Thread Krutika Dhananjay
On Fri, Feb 15, 2019 at 12:30 AM Jayme  wrote:

> Running an oVirt 4.3 HCI 3-way replica cluster with SSD backed storage.
> I've noticed that my SSD writes (smart Total_LBAs_Written) are quite high
> on one particular drive.  Specifically I've noticed one volume is much much
> higher total bytes written than others (despite using less overall space).
>

Writes are higher on one particular volume? Or did one brick witness more
writes than its two replicas within the same volume? Could you share the
volume info output of the affected volume plus the name of the affected
brick if at all the issue is with one single brick?

Also, did you check if the volume was undergoing any heals (`gluster volume
heal  info`)?

-Krutika

My volume is writing over 1TB of data per day (by my manual calculation,
> and with glusterfs profiling) and wearing my SSDs quickly, how can I best
> determine which VM or process is at fault here?
>
> There are 5 low use VMs using the volume in question.  I'm attempting to
> track iostats on each of the vm's individually but so far I'm not seeing
> anything obvious that would account for 1TB of writes per day that the
> gluster volume is reporting.
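For what it's worth, the per-day write figure can be cross-checked from two SMART readings; a minimal sketch, assuming Total_LBAs_Written counts 512-byte sectors (this is vendor-specific, so verify against the drive's documentation):

```shell
# Cross-check the daily write volume from two SMART Total_LBAs_Written
# readings (e.g. from `smartctl -A /dev/sdX`), taken some hours apart.
# The example readings below are made-up numbers for illustration.
lbas_start=8000000000      # first reading
lbas_end=9073741824        # second reading, taken $hours later
hours=12
bytes_per_day=$(( (lbas_end - lbas_start) * 512 * 24 / hours ))
echo "$bytes_per_day bytes/day"   # -> 1099511627776 bytes/day (= 1 TiB/day)
```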
>


[ovirt-users] Re: Gluster - performance.strict-o-direct and other performance tuning in different storage backends

2019-02-25 Thread Krutika Dhananjay
Gluster's write-behind translator by default buffers writes for flushing to
disk later, *even* when the file is opened with O_DIRECT flag. Not honoring
O_DIRECT could mean a reader from another client could be READing stale
data from bricks because some WRITEs may not yet be flushed to disk.
performance.strict-o-direct=on is one of the options needed to truly honor
O_DIRECT behavior, which is what qemu uses by virtue of the cache=none option
being set on the vm(s); the other option being network.remote-dio=off.
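As a concrete sketch (VOLNAME is a placeholder for your volume name), both options are applied with volume-set commands and take effect dynamically, without a remount:

```shell
# Make gluster honor O_DIRECT end to end for qemu's cache=none:
gluster volume set VOLNAME performance.strict-o-direct on
gluster volume set VOLNAME network.remote-dio off
```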

-Krutika


On Mon, Feb 25, 2019 at 2:50 PM Leo David  wrote:

> Hello Everyone,
> As per some previous posts,  this "performance.strict-o-direct=on"
> setting caused trouble or poor vm iops.  I've noticed that this option is
> still part of default setup or automatically configured with
> "Optimize for virt. store" button.
> In the end... is this setting a good or a bad practice for setting the vm
> storage volume ?
> Does it depends ( like maybe other gluster performance options ) on the
> storage backend:
> - raid type /  jbod
> - raid controller cache size
> I am usually using jbod disks attached to lsi hba card ( no cache ). Any
> gluster recommendations regarding this setup ?
> Is there any documentation for best practices on configurating ovirt's
> gluster for different types of storage backends ?
> Thank you very much !
>
> Have a great week,
>
> Leo
>
> --
> Best regards, Leo David
>


[ovirt-users] Re: HE + Gluster : Engine corrupted?

2018-07-02 Thread Krutika Dhananjay
Hi,

So it seems some of the files in the volume have mismatching gfids. I see
the following logs from 15th June, ~8pm EDT:


...
...
[2018-06-16 04:00:10.264690] E [MSGID: 108008]
[afr-self-heal-common.c:335:afr_gfid_split_brain_source]
0-engine-replicate-0: Gfid mismatch detected for
/hosted-engine.lockspace>,
6bbe6097-8520-4a61-971e-6e30c2ee0abe on engine-client-2 and
ef21a706-41cf-4519-8659-87ecde4bbfbf on engine-client-0.
[2018-06-16 04:00:10.265861] W [fuse-bridge.c:540:fuse_entry_cbk]
0-glusterfs-fuse: 4411: LOOKUP()
/c65e03f0-d553-4d5d-ba4f-9d378c153b9b/ha_agent/hosted-engine.lockspace =>
-1 (Input/output error)
[2018-06-16 04:00:11.522600] E [MSGID: 108008]
[afr-self-heal-common.c:212:afr_gfid_split_brain_source]
0-engine-replicate-0: All the bricks should be up to resolve the gfid split
barin
[2018-06-16 04:00:11.522632] E [MSGID: 108008]
[afr-self-heal-common.c:335:afr_gfid_split_brain_source]
0-engine-replicate-0: Gfid mismatch detected for
/hosted-engine.lockspace>,
6bbe6097-8520-4a61-971e-6e30c2ee0abe on engine-client-2 and
ef21a706-41cf-4519-8659-87ecde4bbfbf on engine-client-0.
[2018-06-16 04:00:11.523750] W [fuse-bridge.c:540:fuse_entry_cbk]
0-glusterfs-fuse: 4493: LOOKUP()
/c65e03f0-d553-4d5d-ba4f-9d378c153b9b/ha_agent/hosted-engine.lockspace =>
-1 (Input/output error)
[2018-06-16 04:00:12.864393] E [MSGID: 108008]
[afr-self-heal-common.c:212:afr_gfid_split_brain_source]
0-engine-replicate-0: All the bricks should be up to resolve the gfid split
barin
[2018-06-16 04:00:12.864426] E [MSGID: 108008]
[afr-self-heal-common.c:335:afr_gfid_split_brain_source]
0-engine-replicate-0: Gfid mismatch detected for
/hosted-engine.lockspace>,
6bbe6097-8520-4a61-971e-6e30c2ee0abe on engine-client-2 and
ef21a706-41cf-4519-8659-87ecde4bbfbf on engine-client-0.
[2018-06-16 04:00:12.865392] W [fuse-bridge.c:540:fuse_entry_cbk]
0-glusterfs-fuse: 4575: LOOKUP()
/c65e03f0-d553-4d5d-ba4f-9d378c153b9b/ha_agent/hosted-engine.lockspace =>
-1 (Input/output error)
[2018-06-16 04:00:18.716007] W [fuse-bridge.c:540:fuse_entry_cbk]
0-glusterfs-fuse: 4657: LOOKUP()
/c65e03f0-d553-4d5d-ba4f-9d378c153b9b/ha_agent/hosted-engine.lockspace =>
-1 (Input/output error)
[2018-06-16 04:00:20.553365] W [fuse-bridge.c:540:fuse_entry_cbk]
0-glusterfs-fuse: 4739: LOOKUP()
/c65e03f0-d553-4d5d-ba4f-9d378c153b9b/ha_agent/hosted-engine.lockspace =>
-1 (Input/output error)
[2018-06-16 04:00:21.771698] W [fuse-bridge.c:540:fuse_entry_cbk]
0-glusterfs-fuse: 4821: LOOKUP()
/c65e03f0-d553-4d5d-ba4f-9d378c153b9b/ha_agent/hosted-engine.lockspace =>
-1 (Input/output error)
[2018-06-16 04:00:23.871647] W [fuse-bridge.c:540:fuse_entry_cbk]
0-glusterfs-fuse: 4906: LOOKUP()
/c65e03f0-d553-4d5d-ba4f-9d378c153b9b/ha_agent/hosted-engine.lockspace =>
-1 (Input/output error)
[2018-06-16 04:00:25.034780] W [fuse-bridge.c:540:fuse_entry_cbk]
0-glusterfs-fuse: 4987: LOOKUP()
/c65e03f0-d553-4d5d-ba4f-9d378c153b9b/ha_agent/hosted-engine.lockspace =>
-1 (Input/output error)
...
...


Adding Ravi, who works on the replicate component, to help resolve the mismatches.

-Krutika


On Mon, Jul 2, 2018 at 12:27 PM, Krutika Dhananjay 
wrote:

> Hi,
>
> Sorry, I was out sick on Friday. I am looking into the logs. Will get back
> to you in some time.
>
> -Krutika
>
> On Fri, Jun 29, 2018 at 7:47 PM, Hanson Turner  > wrote:
>
>> Hi Krutika,
>>
>> Did you need any other logs?
>>
>>
>> Thanks,
>>
>> Hanson
>>
>> On 06/27/2018 02:04 PM, Hanson Turner wrote:
>>
>> Hi Krutika,
>>
>> Looking at the email spams, it looks like it started at 8:04PM EDT on Jun
>> 15 2018.
>>
>> From my memory, I think the cluster was working fine until sometime that
>> night. Somewhere between midnight and the next (Saturday) morning, the
>> engine crashed and all vm's stopped.
>>
>> I do have nightly backups that ran every night, using the engine-backup
>> command. Looks like my last valid backup was 2018-06-15.
>>
>> I've included all logs I think might be of use. Please forgive the use of
>> 7zip, as the raw logs took 50mb which is greater than my attachment limit.
>>
> >> I think the gist of what happened is we had a downed node for a period
>> of time. Earlier that day, the node was brought back into service. Later
>> that night or early the next morning, the engine was gone and hopping from
>> node to node.
>>
>> I have tried to mount the engine's hdd file to see if I could fix it.
>> There are a few corrupted partitions, and those are xfs formatted. Trying
> >> to mount gives me issues about needing repair; trying to repair gives me
>> issues about needing something cleaned first. I cannot remember exactly
>> what it was, but it wanted me to run a command that ended -L to clear out
>> the logs. I said no way and h

[ovirt-users] Re: HE + Gluster : Engine corrupted?

2018-06-25 Thread Krutika Dhananjay
Could you share the gluster mount and brick logs? You'll find them under
/var/log/glusterfs.
Also, what's the version of gluster you're using?
Also, output of `gluster volume info `?

-Krutika

On Thu, Jun 21, 2018 at 9:50 AM, Sahina Bose  wrote:

>
>
> On Wed, Jun 20, 2018 at 11:33 PM, Hanson Turner <
> han...@andrewswireless.net> wrote:
>
>> Hi Benny,
>>
>> Who should I be reaching out to for help with a gluster based hosted
>> engine corruption?
>>
>
>
> Krutika, could you help?
>
>
>>
>> --== Host 1 status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date  : True
>> Hostname   : ovirtnode1.abcxyzdomains.net
>> Host ID: 1
>> Engine status  : {"reason": "failed liveliness
>> check", "health": "bad", "vm": "up", "detail": "Up"}
>> Score  : 3400
>> stopped: False
>> Local maintenance  : False
>> crc32  : 92254a68
>> local_conf_timestamp   : 115910
>> Host timestamp : 115910
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=115910 (Mon Jun 18 09:43:20 2018)
>> host-id=1
>> score=3400
>> vm_conf_refresh_time=115910 (Mon Jun 18 09:43:20 2018)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=GlobalMaintenance
>> stopped=False
>>
>>
> >> When I VNC into my HE, all I get is:
>> Probing EDD (edd=off to disable)... ok
>>
>>
>> So, that's why it's failing the liveliness check... I cannot get the
>> screen on HE to change short of ctl-alt-del which will reboot the HE.
>> I do have backups for the HE that are/were run on a nightly basis.
>>
>> If the cluster was left alone, the HE vm would bounce from machine to
>> machine trying to boot. This is why the cluster is in maintenance mode.
>> One of the nodes was down for a period of time and brought back, sometime
>> through the night, which is when the automated backup kicks, the HE started
>> bouncing around. Got nearly 1000 emails.
>>
>> This seems to be the same error (but may not be the same cause) as listed
>> here:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1569827
>>
>> Thanks,
>>
>> Hanson
>>
>>
>>
>>
>


[ovirt-users] Re: Gluster problems, cluster performance issues

2018-05-29 Thread Krutika Dhananjay
Adding Ravi to look into the heal issue.

As for the fsync hang and subsequent IO errors, it seems a lot like
https://bugzilla.redhat.com/show_bug.cgi?id=1497156 and Paolo Bonzini from
qemu had pointed out that this would be fixed by the following commit:

  commit e72c9a2a67a6400c8ef3d01d4c461dbbbfa0e1f0
Author: Paolo Bonzini 
Date:   Wed Jun 21 16:35:46 2017 +0200

scsi: virtio_scsi: let host do exception handling

virtio_scsi tries to do exception handling after the default 30 seconds
timeout expires.  However, it's better to let the host control the
timeout, otherwise with a heavy I/O load it is likely that an abort will
also timeout.  This leads to fatal errors like filesystems going
offline.

Disable the 'sd' timeout and allow the host to do exception handling,
following the precedent of the storvsc driver.

Hannes has a proposal to introduce timeouts in virtio, but this provides
an immediate solution for stable kernels too.

[mkp: fixed typo]

Reported-by: Douglas Miller 
Cc: "James E.J. Bottomley" 
Cc: "Martin K. Petersen" 
Cc: Hannes Reinecke 
Cc: linux-s...@vger.kernel.org
Cc: sta...@vger.kernel.org
Signed-off-by: Paolo Bonzini 
Signed-off-by: Martin K. Petersen 


Adding Paolo/Kevin to comment.

As for the poor gluster performance, could you disable cluster.eager-lock
and see if that makes any difference:

# gluster volume set  cluster.eager-lock off

Do also capture the volume profile again if you still see performance
issues after disabling eager-lock.
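A sketch of one way to capture that profile (VOLNAME is a placeholder; commands assume a running gluster cluster):

```shell
# Disable eager-lock as suggested above, then profile the volume:
gluster volume set VOLNAME cluster.eager-lock off
gluster volume profile VOLNAME start
# ... reproduce the slow workload for a few minutes ...
gluster volume profile VOLNAME info > profile-eager-lock-off.txt
gluster volume profile VOLNAME stop
```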

-Krutika


On Wed, May 30, 2018 at 6:55 AM, Jim Kusznir  wrote:

> I also finally found the following in my system log on one server:
>
> [10679.524491] INFO: task glusterclogro:14933 blocked for more than 120
> seconds.
> [10679.525826] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [10679.527144] glusterclogro   D 97209832bf40 0 14933  1
> 0x0080
> [10679.527150] Call Trace:
> [10679.527161]  [] schedule+0x29/0x70
> [10679.527218]  [] _xfs_log_force_lsn+0x2e8/0x340 [xfs]
> [10679.527225]  [] ? wake_up_state+0x20/0x20
> [10679.527254]  [] xfs_file_fsync+0x107/0x1e0 [xfs]
> [10679.527260]  [] do_fsync+0x67/0xb0
> [10679.527268]  [] ? system_call_after_swapgs+0xbc/0x160
> [10679.527271]  [] SyS_fsync+0x10/0x20
> [10679.527275]  [] system_call_fastpath+0x1c/0x21
> [10679.527279]  [] ? system_call_after_swapgs+0xc8/0x160
> [10679.527283] INFO: task glusterposixfsy:14941 blocked for more than 120
> seconds.
> [10679.528608] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [10679.529956] glusterposixfsy D 972495f84f10 0 14941  1
> 0x0080
> [10679.529961] Call Trace:
> [10679.529966]  [] schedule+0x29/0x70
> [10679.530003]  [] _xfs_log_force_lsn+0x2e8/0x340 [xfs]
> [10679.530008]  [] ? wake_up_state+0x20/0x20
> [10679.530038]  [] xfs_file_fsync+0x107/0x1e0 [xfs]
> [10679.530042]  [] do_fsync+0x67/0xb0
> [10679.530046]  [] ? system_call_after_swapgs+0xbc/0x160
> [10679.530050]  [] SyS_fdatasync+0x13/0x20
> [10679.530054]  [] system_call_fastpath+0x1c/0x21
> [10679.530058]  [] ? system_call_after_swapgs+0xc8/0x160
> [10679.530062] INFO: task glusteriotwr13:15486 blocked for more than 120
> seconds.
> [10679.531805] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [10679.533732] glusteriotwr13  D 9720a83f 0 15486  1
> 0x0080
> [10679.533738] Call Trace:
> [10679.533747]  [] schedule+0x29/0x70
> [10679.533799]  [] _xfs_log_force_lsn+0x2e8/0x340 [xfs]
> [10679.533806]  [] ? wake_up_state+0x20/0x20
> [10679.533846]  [] xfs_file_fsync+0x107/0x1e0 [xfs]
> [10679.533852]  [] do_fsync+0x67/0xb0
> [10679.533858]  [] ? system_call_after_swapgs+0xbc/0x160
> [10679.533863]  [] SyS_fdatasync+0x13/0x20
> [10679.533868]  [] system_call_fastpath+0x1c/0x21
> [10679.533873]  [] ? system_call_after_swapgs+0xc8/0x160
> [10919.512757] INFO: task glusterclogro:14933 blocked for more than 120
> seconds.
> [10919.514714] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [10919.516663] glusterclogro   D 97209832bf40 0 14933  1
> 0x0080
> [10919.516677] Call Trace:
> [10919.516690]  [] schedule+0x29/0x70
> [10919.516696]  [] schedule_timeout+0x239/0x2c0
> [10919.516703]  [] ? blk_finish_plug+0x14/0x40
> [10919.516768]  [] ? _xfs_buf_ioapply+0x334/0x460 [xfs]
> [10919.516774]  [] wait_for_completion+0xfd/0x140
> [10919.516782]  [] ? wake_up_state+0x20/0x20
> [10919.516821]  [] ? _xfs_buf_read+0x23/0x40 [xfs]
> [10919.516859]  [] xfs_buf_submit_wait+0xf9/0x1d0 [xfs]
> [10919.516902]  [] ? xfs_trans_read_buf_map+0x199/0x400
> [xfs]
> [10919.516940]  [] _xfs_buf_read+0x23/0x40 [xfs]
> [10919.516977]  [] xfs_buf_read_map+0xf9/0x160 [xfs]
> [10919.517022]  [] xfs_trans_read_buf_map+0x199/0x400
> [xfs]
> [10919.517057]  [] xfs_da_read_buf+0xd4/0x100 [xfs]
> [10919.517091]  [] xfs_da3_node_read+0x23/0xd0 [xfs]
> 

Re: [ovirt-users] [Gluster-users] Very poor GlusterFS performance

2017-06-21 Thread Krutika Dhananjay
No, you don't need to do any of that. Just executing volume-set commands is
sufficient for the changes to take effect.
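To confirm a dynamically applied option is active, it can be queried back (a sketch; VOLNAME is a placeholder for the volume name):

```shell
# Each command prints the option's current effective value:
gluster volume get VOLNAME performance.stat-prefetch
gluster volume get VOLNAME client.event-threads
gluster volume get VOLNAME server.event-threads
```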


-Krutika

On Wed, Jun 21, 2017 at 3:48 PM, Chris Boot <bo...@bootc.net> wrote:

> [replying to lists this time]
>
> On 20/06/17 11:23, Krutika Dhananjay wrote:
> > Couple of things:
> >
> > 1. Like Darrell suggested, you should enable stat-prefetch and increase
> > client and server event threads to 4.
> > # gluster volume set  performance.stat-prefetch on
> > # gluster volume set  client.event-threads 4
> > # gluster volume set  server.event-threads 4
> >
> > 2. Also glusterfs-3.10.1 and above has a shard performance bug fix -
> > https://review.gluster.org/#/c/16966/
> >
> > With these two changes, we saw great improvement in performance in our
> > internal testing.
>
> Hi Krutika,
>
> Thanks for your input. I have yet to run any benchmarks, but I'll do
> that once I have a bit more time to work on this.
>
> I've tweaked the options as you suggest, but that doesn't seem to have
> made an appreciable difference. I admit that without benchmarks it's a
> bit like sticking your finger in the air, though. Do I need to restart
> my bricks and/or remount the volumes for these to take effect?
>
> I'm actually running GlusterFS 3.10.2-1. This is all coming from the
> CentOS Storage SIG's centos-release-gluster310 repository.
>
> Thanks again.
>
> Chris
>
> --
> Chris Boot
> bo...@bootc.net
>


Re: [ovirt-users] [Gluster-users] Very poor GlusterFS performance

2017-06-21 Thread Krutika Dhananjay
No. It's just that in the internal testing that was done here, increasing
the thread count beyond 4 did not improve the performance any further.

-Krutika

On Tue, Jun 20, 2017 at 11:30 PM, mabi  wrote:

> Dear Krutika,
>
> Sorry for asking so naively but can you tell me on what factor do you base
> that the client and server event-threads parameters for a volume should be
> set to 4?
>
> Is this metric for example based on the number of cores a GlusterFS server
> has?
>
> I am asking because I saw my GlusterFS volumes are set to 2 and would like
> to set these parameters to something meaningful for performance tuning. My
> setup is a two node replica with GlusterFS 3.8.11.
>
> Best regards,
> M.
>
>
>
>  Original Message 
> Subject: Re: [Gluster-users] [ovirt-users] Very poor GlusterFS performance
> Local Time: June 20, 2017 12:23 PM
> UTC Time: June 20, 2017 10:23 AM
> From: kdhan...@redhat.com
> To: Lindsay Mathieson 
> gluster-users , oVirt users 
>
> Couple of things:
> 1. Like Darrell suggested, you should enable stat-prefetch and increase
> client and server event threads to 4.
> # gluster volume set  performance.stat-prefetch on
> # gluster volume set  client.event-threads 4
> # gluster volume set  server.event-threads 4
>
> 2. Also glusterfs-3.10.1 and above has a shard performance bug fix -
> https://review.gluster.org/#/c/16966/
>
> With these two changes, we saw great improvement in performance in our
> internal testing.
>
> Do you mind trying these two options above?
> -Krutika
>
> On Tue, Jun 20, 2017 at 1:00 PM, Lindsay Mathieson <
> lindsay.mathie...@gmail.com> wrote:
>
>> Have you tried with:
>>
>> performance.strict-o-direct : off
>> performance.strict-write-ordering : off
>> They can be changed dynamically.
>>
>>
>> On 20 June 2017 at 17:21, Sahina Bose  wrote:
>>
>>> [Adding gluster-users]
>>>
>>> On Mon, Jun 19, 2017 at 8:16 PM, Chris Boot  wrote:
>>>
 Hi folks,

 I have 3x servers in a "hyper-converged" oVirt 4.1.2 + GlusterFS 3.10
 configuration. My VMs run off a replica 3 arbiter 1 volume comprised of
 6 bricks, which themselves live on two SSDs in each of the servers (one
 brick per SSD). The bricks are XFS on LVM thin volumes straight onto the
 SSDs. Connectivity is 10G Ethernet.

 Performance within the VMs is pretty terrible. I experience very low
 throughput and random IO is really bad: it feels like a latency issue.
 On my oVirt nodes the SSDs are not generally very busy. The 10G network
 seems to run without errors (iperf3 gives bandwidth measurements of >=
 9.20 Gbits/sec between the three servers).

 To put this into perspective: I was getting better behaviour from NFS4
 on a gigabit connection than I am with GlusterFS on 10G: that doesn't
 feel right at all.

 My volume configuration looks like this:

 Volume Name: vmssd
 Type: Distributed-Replicate
 Volume ID: d5a5ddd1-a140-4e0d-b514-701cfe464853
 Status: Started
 Snapshot Count: 0
 Number of Bricks: 2 x (2 + 1) = 6
 Transport-type: tcp
 Bricks:
 Brick1: ovirt3:/gluster/ssd0_vmssd/brick
 Brick2: ovirt1:/gluster/ssd0_vmssd/brick
 Brick3: ovirt2:/gluster/ssd0_vmssd/brick (arbiter)
 Brick4: ovirt3:/gluster/ssd1_vmssd/brick
 Brick5: ovirt1:/gluster/ssd1_vmssd/brick
 Brick6: ovirt2:/gluster/ssd1_vmssd/brick (arbiter)
 Options Reconfigured:
 nfs.disable: on
 transport.address-family: inet6
 performance.quick-read: off
 performance.read-ahead: off
 performance.io-cache: off
 performance.stat-prefetch: off
 performance.low-prio-threads: 32
 network.remote-dio: off
 cluster.eager-lock: enable
 cluster.quorum-type: auto
 cluster.server-quorum-type: server
 cluster.data-self-heal-algorithm: full
 cluster.locking-scheme: granular
 cluster.shd-max-threads: 8
 cluster.shd-wait-qlength: 1
 features.shard: on
 user.cifs: off
 storage.owner-uid: 36
 storage.owner-gid: 36
 features.shard-block-size: 128MB
 performance.strict-o-direct: on
 network.ping-timeout: 30
 cluster.granular-entry-heal: enable

 I would really appreciate some guidance on this to try to improve things
 because at this rate I will need to reconsider using GlusterFS
 altogether.

>>>
>>> Could you provide the gluster volume profile output while you're running
>>> your I/O tests.
>>> # gluster volume profile  start
>>> to start profiling
>>> # gluster volume profile  info
>>> for the profile output.
>>>
>>>

 Cheers,
 Chris

 --
 Chris Boot
 bo...@bootc.net
 ___
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users

>>>
>>>
>>> 

Re: [ovirt-users] [Gluster-users] Very poor GlusterFS performance

2017-06-20 Thread Krutika Dhananjay
Couple of things:

1. Like Darrell suggested, you should enable stat-prefetch and increase
client and server event threads to 4.
# gluster volume set  performance.stat-prefetch on
# gluster volume set  client.event-threads 4
# gluster volume set  server.event-threads 4

2. Also glusterfs-3.10.1 and above has a shard performance bug fix -
https://review.gluster.org/#/c/16966/

With these two changes, we saw great improvement in performance in our
internal testing.

Do you mind trying these two options above?

-Krutika
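[Editor's note: the three settings above can be applied and read back in one pass. This is a sketch, not from the thread: `myvol` is a placeholder volume name, and the `gluster volume get` verification step is an addition (available in the gluster releases discussed here).]

```shell
VOL=myvol   # placeholder volume name - substitute your own
gluster volume set "$VOL" performance.stat-prefetch on
gluster volume set "$VOL" client.event-threads 4
gluster volume set "$VOL" server.event-threads 4
# read the values back to confirm they took effect
gluster volume get "$VOL" client.event-threads
gluster volume get "$VOL" server.event-threads
```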

On Tue, Jun 20, 2017 at 1:00 PM, Lindsay Mathieson <
lindsay.mathie...@gmail.com> wrote:

> Have you tried with:
>
> performance.strict-o-direct : off
> performance.strict-write-ordering : off
>
> They can be changed dynamically.
>
>
> On 20 June 2017 at 17:21, Sahina Bose  wrote:
>
>> [Adding gluster-users]
>>
>> On Mon, Jun 19, 2017 at 8:16 PM, Chris Boot  wrote:
>>
>>> Hi folks,
>>>
>>> I have 3x servers in a "hyper-converged" oVirt 4.1.2 + GlusterFS 3.10
>>> configuration. My VMs run off a replica 3 arbiter 1 volume comprised of
>>> 6 bricks, which themselves live on two SSDs in each of the servers (one
>>> brick per SSD). The bricks are XFS on LVM thin volumes straight onto the
>>> SSDs. Connectivity is 10G Ethernet.
>>>
>>> Performance within the VMs is pretty terrible. I experience very low
>>> throughput and random IO is really bad: it feels like a latency issue.
>>> On my oVirt nodes the SSDs are not generally very busy. The 10G network
>>> seems to run without errors (iperf3 gives bandwidth measurements of >=
>>> 9.20 Gbits/sec between the three servers).
>>>
>>> To put this into perspective: I was getting better behaviour from NFS4
>>> on a gigabit connection than I am with GlusterFS on 10G: that doesn't
>>> feel right at all.
>>>
>>> My volume configuration looks like this:
>>>
>>> Volume Name: vmssd
>>> Type: Distributed-Replicate
>>> Volume ID: d5a5ddd1-a140-4e0d-b514-701cfe464853
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 2 x (2 + 1) = 6
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: ovirt3:/gluster/ssd0_vmssd/brick
>>> Brick2: ovirt1:/gluster/ssd0_vmssd/brick
>>> Brick3: ovirt2:/gluster/ssd0_vmssd/brick (arbiter)
>>> Brick4: ovirt3:/gluster/ssd1_vmssd/brick
>>> Brick5: ovirt1:/gluster/ssd1_vmssd/brick
>>> Brick6: ovirt2:/gluster/ssd1_vmssd/brick (arbiter)
>>> Options Reconfigured:
>>> nfs.disable: on
>>> transport.address-family: inet6
>>> performance.quick-read: off
>>> performance.read-ahead: off
>>> performance.io-cache: off
>>> performance.stat-prefetch: off
>>> performance.low-prio-threads: 32
>>> network.remote-dio: off
>>> cluster.eager-lock: enable
>>> cluster.quorum-type: auto
>>> cluster.server-quorum-type: server
>>> cluster.data-self-heal-algorithm: full
>>> cluster.locking-scheme: granular
>>> cluster.shd-max-threads: 8
>>> cluster.shd-wait-qlength: 1
>>> features.shard: on
>>> user.cifs: off
>>> storage.owner-uid: 36
>>> storage.owner-gid: 36
>>> features.shard-block-size: 128MB
>>> performance.strict-o-direct: on
>>> network.ping-timeout: 30
>>> cluster.granular-entry-heal: enable
>>>
>>> I would really appreciate some guidance on this to try to improve things
>>> because at this rate I will need to reconsider using GlusterFS
>>> altogether.
>>>
>>
>>
>> Could you provide the gluster volume profile output while you're running
>> your I/O tests.
>>
>> # gluster volume profile  start
>> to start profiling
>>
>> # gluster volume profile  info
>>
>> for the profile output.
>>
>>
>>>
>>> Cheers,
>>> Chris
>>>
>>> --
>>> Chris Boot
>>> bo...@bootc.net
>>> ___
>>> Users mailing list
>>> Users@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>
>>
>> ___
>> Gluster-users mailing list
>> gluster-us...@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>
>
> --
> Lindsay
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVirt gluster sanlock issue

2017-06-06 Thread Krutika Dhananjay
I stand corrected.

Just realised the strace command I gave was wrong.

Here's what you would actually need to execute:

strace -y -ff -o  

-Krutika

On Tue, Jun 6, 2017 at 3:20 PM, Krutika Dhananjay <kdhan...@redhat.com>
wrote:

> OK.
>
> So for the 'Transport endpoint is not connected' issue, could you share
> the mount and brick logs?
>
> Hmmm.. 'Invalid argument' error even on the root partition. What if you
> change bs to 4096 and run?
>
> The logs I showed in my earlier mail show that gluster is merely
> returning the error it got from the disk file system where the
> brick is hosted. But you're right about the fact that the offset 127488 is
> not 4K-aligned.
>
> If the dd on /root worked for you with bs=4096, could you try the same
> directly on gluster mount point on a dummy file and capture the strace
> output of dd?
> You can perhaps reuse your existing gluster volume by mounting it at
> another location and doing the dd.
> Here's what you need to execute:
>
> strace -ff -T -p  -o 
>
> FWIW, here's something I found in man(2) open:
>
>
>
>
> *Under  Linux  2.4,  transfer  sizes,  and  the alignment of the user
> buffer and the file offset must all be multiples of the logical block size
> of the filesystem.  Since Linux 2.6.0, alignment to the logical block size
> of the   underlying storage (typically 512 bytes) suffices.  The
> logical block size can be determined using the ioctl(2) BLKSSZGET operation
> or from the shell using the command:   blockdev --getss*
>
>
> -Krutika
>
>
> On Tue, Jun 6, 2017 at 1:18 AM, Abi Askushi <rightkickt...@gmail.com>
> wrote:
>
>> Also when testing with dd i get the following:
>>
>> *Testing on the gluster mount: *
>> dd if=/dev/zero 
>> of=/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/test2.img
>> oflag=direct bs=512 count=1
>> dd: error writing '/rhev/data-center/mnt/glusterSD/10.100.100.1:
>> _engine/test2.img': *Transport endpoint is not connected*
>> 1+0 records in
>> 0+0 records out
>> 0 bytes (0 B) copied, 0.00336755 s, 0.0 kB/s
>>
>> *Testing on the /root directory (XFS): *
>> dd if=/dev/zero of=/test2.img oflag=direct bs=512 count=1
>> dd: error writing '/test2.img': *Invalid argument*
>> 1+0 records in
>> 0+0 records out
>> 0 bytes (0 B) copied, 0.000321239 s, 0.0 kB/s
>>
>> Seems that gluster is trying to do the same and fails.
>>
>>
>>
>> On Mon, Jun 5, 2017 at 10:10 PM, Abi Askushi <rightkickt...@gmail.com>
>> wrote:
>>
>>> The question that arises is what is needed to make gluster aware of the
>>> 4K physical sectors presented to it (the logical sector is also 4K). The
>>> offset (127488) at the log does not seem aligned at 4K.
>>>
>>> Alex
>>>
>>> On Mon, Jun 5, 2017 at 2:47 PM, Abi Askushi <rightkickt...@gmail.com>
>>> wrote:
>>>
>>>> Hi Krutika,
>>>>
>>>> I am saying that I am facing this issue with 4k drives. I never
>>>> encountered this issue with 512 drives.
>>>>
>>>> Alex
>>>>
>>>> On Jun 5, 2017 14:26, "Krutika Dhananjay" <kdhan...@redhat.com> wrote:
>>>>
>>>>> This seems like a case of O_DIRECT reads and writes gone wrong,
>>>>> judging by the 'Invalid argument' errors.
>>>>>
>>>>> The two operations that have failed on gluster bricks are:
>>>>>
>>>>> [2017-06-05 09:40:39.428979] E [MSGID: 113072]
>>>>> [posix.c:3453:posix_writev] 0-engine-posix: write failed: offset 0,
>>>>> [Invalid argument]
>>>>> [2017-06-05 09:41:00.865760] E [MSGID: 113040]
>>>>> [posix.c:3178:posix_readv] 0-engine-posix: read failed on
>>>>> gfid=8c94f658-ac3c-4e3a-b368-8c038513a914, fd=0x7f408584c06c,
>>>>> offset=127488 size=512, buf=0x7f4083c0b000 [Invalid argument]
>>>>>
>>>>> But then, both the write and the read have 512byte-aligned offset,
>>>>> size and buf address (which is correct).
>>>>>
>>>>> Are you saying you don't see this issue with 4K block-size?
>>>>>
>>>>> -Krutika
>>>>>
>>>>> On Mon, Jun 5, 2017 at 3:21 PM, Abi Askushi <rightkickt...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Sahina,
>>>>>>
>>>>>> Attached are the logs. Let me know if sth else is needed.
>>>>>>
>>>>>> I have 5 

Re: [ovirt-users] oVirt gluster sanlock issue

2017-06-06 Thread Krutika Dhananjay
OK.

So for the 'Transport endpoint is not connected' issue, could you share the
mount and brick logs?

Hmmm.. 'Invalid argument' error even on the root partition. What if you
change bs to 4096 and run?

The logs I showed in my earlier mail show that gluster is merely returning
the error it got from the disk file system where the
brick is hosted. But you're right about the fact that the offset 127488 is
not 4K-aligned.

If the dd on /root worked for you with bs=4096, could you try the same
directly on gluster mount point on a dummy file and capture the strace
output of dd?
You can perhaps reuse your existing gluster volume by mounting it at
another location and doing the dd.
Here's what you need to execute:

strace -ff -T -p  -o

FWIW, here's something I found in man(2) open:




*Under  Linux  2.4,  transfer  sizes,  and  the alignment of the user
buffer and the file offset must all be multiples of the logical block size
of the filesystem.  Since Linux 2.6.0, alignment to the logical block size
of the   underlying storage (typically 512 bytes) suffices.  The
logical block size can be determined using the ioctl(2) BLKSSZGET operation
or from the shell using the command:   blockdev --getss*
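[Editor's note: the offset from the failing read above (127488) illustrates the point: it is a multiple of 512 but not of 4096, so a device with a 4K logical sector rejects the O_DIRECT access. A quick arithmetic check, plain shell and nothing gluster-specific:]

```shell
offset=127488
for sector in 512 4096; do
  if [ $(( offset % sector )) -eq 0 ]; then
    echo "offset $offset is aligned to $sector"
  else
    echo "offset $offset is NOT aligned to $sector (remainder $(( offset % sector )))"
  fi
done
# prints: aligned to 512, NOT aligned to 4096 (remainder 512)
```

The `blockdev --getss` command from the man-page excerpt above reports the device's logical sector size to compare against.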


-Krutika


On Tue, Jun 6, 2017 at 1:18 AM, Abi Askushi <rightkickt...@gmail.com> wrote:

> Also when testing with dd i get the following:
>
> *Testing on the gluster mount: *
> dd if=/dev/zero 
> of=/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/test2.img
> oflag=direct bs=512 count=1
> dd: error writing 
> '/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/test2.img':
> *Transport endpoint is not connected*
> 1+0 records in
> 0+0 records out
> 0 bytes (0 B) copied, 0.00336755 s, 0.0 kB/s
>
> *Testing on the /root directory (XFS): *
> dd if=/dev/zero of=/test2.img oflag=direct bs=512 count=1
> dd: error writing '/test2.img': *Invalid argument*
> 1+0 records in
> 0+0 records out
> 0 bytes (0 B) copied, 0.000321239 s, 0.0 kB/s
>
> Seems that gluster is trying to do the same and fails.
>
>
>
> On Mon, Jun 5, 2017 at 10:10 PM, Abi Askushi <rightkickt...@gmail.com>
> wrote:
>
>> The question that arises is what is needed to make gluster aware of the 4K
>> physical sectors presented to it (the logical sector is also 4K). The
>> offset (127488) at the log does not seem aligned at 4K.
>>
>> Alex
>>
>> On Mon, Jun 5, 2017 at 2:47 PM, Abi Askushi <rightkickt...@gmail.com>
>> wrote:
>>
>>> Hi Krutika,
>>>
>>> I am saying that I am facing this issue with 4k drives. I never
>>> encountered this issue with 512 drives.
>>>
>>> Alex
>>>
>>> On Jun 5, 2017 14:26, "Krutika Dhananjay" <kdhan...@redhat.com> wrote:
>>>
>>>> This seems like a case of O_DIRECT reads and writes gone wrong, judging
>>>> by the 'Invalid argument' errors.
>>>>
>>>> The two operations that have failed on gluster bricks are:
>>>>
>>>> [2017-06-05 09:40:39.428979] E [MSGID: 113072]
>>>> [posix.c:3453:posix_writev] 0-engine-posix: write failed: offset 0,
>>>> [Invalid argument]
>>>> [2017-06-05 09:41:00.865760] E [MSGID: 113040]
>>>> [posix.c:3178:posix_readv] 0-engine-posix: read failed on
>>>> gfid=8c94f658-ac3c-4e3a-b368-8c038513a914, fd=0x7f408584c06c,
>>>> offset=127488 size=512, buf=0x7f4083c0b000 [Invalid argument]
>>>>
>>>> But then, both the write and the read have 512byte-aligned offset, size
>>>> and buf address (which is correct).
>>>>
>>>> Are you saying you don't see this issue with 4K block-size?
>>>>
>>>> -Krutika
>>>>
>>>> On Mon, Jun 5, 2017 at 3:21 PM, Abi Askushi <rightkickt...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Sahina,
>>>>>
>>>>> Attached are the logs. Let me know if sth else is needed.
>>>>>
>>>>> I have 5 disks (with 4K physical sector) in RAID5. The RAID has 64K
>>>>> stripe size at the moment.
>>>>> I have prepared the storage as below:
>>>>>
>>>>> pvcreate --dataalignment 256K /dev/sda4
>>>>> vgcreate --physicalextentsize 256K gluster /dev/sda4
>>>>>
>>>>> lvcreate -n engine --size 120G gluster
>>>>> mkfs.xfs -f -i size=512 /dev/gluster/engine
>>>>>
>>>>> Thanx,
>>>>> Alex
>>>>>
>>>>> On Mon, Jun 5, 2017 at 12:14 PM, Sahina Bose <sab...@redhat.com>
>>>>> wrote:
>>>>>
>>>>>

Re: [ovirt-users] oVirt gluster sanlock issue

2017-06-05 Thread Krutika Dhananjay
This seems like a case of O_DIRECT reads and writes gone wrong, judging by
the 'Invalid argument' errors.

The two operations that have failed on gluster bricks are:

[2017-06-05 09:40:39.428979] E [MSGID: 113072] [posix.c:3453:posix_writev]
0-engine-posix: write failed: offset 0, [Invalid argument]
[2017-06-05 09:41:00.865760] E [MSGID: 113040] [posix.c:3178:posix_readv]
0-engine-posix: read failed on gfid=8c94f658-ac3c-4e3a-b368-8c038513a914,
fd=0x7f408584c06c, offset=127488 size=512, buf=0x7f4083c0b000 [Invalid
argument]

But then, both the write and the read have 512byte-aligned offset, size and
buf address (which is correct).

Are you saying you don't see this issue with 4K block-size?

-Krutika

On Mon, Jun 5, 2017 at 3:21 PM, Abi Askushi  wrote:

> Hi Sahina,
>
> Attached are the logs. Let me know if sth else is needed.
>
> I have 5 disks (with 4K physical sector) in RAID5. The RAID has 64K stripe
> size at the moment.
> I have prepared the storage as below:
>
> pvcreate --dataalignment 256K /dev/sda4
> vgcreate --physicalextentsize 256K gluster /dev/sda4
>
> lvcreate -n engine --size 120G gluster
> mkfs.xfs -f -i size=512 /dev/gluster/engine
>
> Thanx,
> Alex
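[Editor's note: the alignment values quoted above are at least internally consistent: a 5-disk RAID5 has 4 data disks, so with a 64K stripe the full stripe width is 4 x 64K = 256K, exactly the `--dataalignment` and `--physicalextentsize` used. A sanity check of that arithmetic:]

```shell
stripe_kb=64     # RAID stripe size from the thread
data_disks=4     # 5-disk RAID5 = 4 data disks + 1 parity
echo "full stripe width: $(( stripe_kb * data_disks ))K"   # prints: full stripe width: 256K
```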
>
> On Mon, Jun 5, 2017 at 12:14 PM, Sahina Bose  wrote:
>
>> Can we have the gluster mount logs and brick logs to check if it's the
>> same issue?
>>
>> On Sun, Jun 4, 2017 at 11:21 PM, Abi Askushi 
>> wrote:
>>
>>> I clean installed everything and ran into the same.
>>> I then ran gdeploy and encountered the same issue when deploying engine.
>>> Seems that gluster (?) doesn't like 4K sector drives. I am not sure if
>>> it has to do with alignment. The weird thing is that gluster volumes are
>>> all ok, replicating normally and no split brain is reported.
>>>
>>> The solution to the mentioned bug (1386443
>>> ) was to format
>>> with 512 sector size, which for my case is not an option:
>>>
>>> mkfs.xfs -f -i size=512 -s size=512 /dev/gluster/engine
>>> illegal sector size 512; hw sector is 4096
>>>
>>> Is there any workaround to address this?
>>>
>>> Thanx,
>>> Alex
>>>
>>>
>>> On Sun, Jun 4, 2017 at 5:48 PM, Abi Askushi 
>>> wrote:
>>>
 Hi Maor,

 My disks are of 4K block size, and from this bug it seems that gluster
 replica needs a 512B block size.
 Is there a way to make gluster function with 4K drives?

 Thank you!

 On Sun, Jun 4, 2017 at 2:34 PM, Maor Lipchuk 
 wrote:

> Hi Alex,
>
> I saw a bug that might be related to the issue you encountered at
> https://bugzilla.redhat.com/show_bug.cgi?id=1386443
>
> Sahina, maybe you have some advice? Do you think that BZ 1386443 is
> related?
>
> Regards,
> Maor
>
> On Sat, Jun 3, 2017 at 8:45 PM, Abi Askushi 
> wrote:
> > Hi All,
> >
> > I have installed successfully several times oVirt (version 4.1) with
> 3 nodes
> > on top glusterfs.
> >
> > This time, when trying to configure the same setup, I am facing the
> > following issue which doesn't seem to go away. During installation I
> get the
> > error:
> >
> > Failed to execute stage 'Misc configuration': Cannot acquire host id:
> > (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22,
> 'Sanlock
> > lockspace add failure', 'Invalid argument'))
> >
> > The only difference in this setup is that instead of standard
> partitioning I
> > have GPT partitioning and the disks have 4K block size instead of
> 512.
> >
> > The /var/log/sanlock.log has the following lines:
> >
> > 2017-06-03 19:21:15+0200 23450 [943]: s9 lockspace
> > ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:250:/rhev/data-center/m
> nt/_var_lib_ovirt-hosted-engin-setup_tmptjkIDI/ba6bd862-c2b8
> -46e7-b2c8-91e4a5bb2047/dom_md/ids:0
> > 2017-06-03 19:21:36+0200 23471 [944]: s9:r5 resource
> > ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:SDM:/rhev/data-center/m
> nt/_var_lib_ovirt-hosted-engine-setup_tmptjkIDI/ba6bd862-c2b
> 8-46e7-b2c8-91e4a5bb2047/dom_md/leases:1048576
> > for 2,9,23040
> > 2017-06-03 19:21:36+0200 23471 [943]: s10 lockspace
> > a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922:250:/rhev/data-center/m
> nt/glusterSD/10.100.100.1:_engine/a5a6b0e7-fc3f-4838-8e26-c8
> b4d5e5e922/dom_md/ids:0
> > 2017-06-03 19:21:36+0200 23471 [23522]: a5a6b0e7 aio collect RD
> > 0x7f59b8c0:0x7f59b8d0:0x7f59b0101000 result -22:0 match res
> > 2017-06-03 19:21:36+0200 23471 [23522]: read_sectors delta_leader
> offset
> > 127488 rv -22
> > /rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/a5a6b0e
> 7-fc3f-4838-8e26-c8b4d5e5e922/dom_md/ids
> > 2017-06-03 19:21:37+0200 23472 [930]: s9 host 250 1 23450
> > 

Re: [ovirt-users] vm has been paused due to unknown storage

2017-05-29 Thread Krutika Dhananjay
Hi,

Could you share your volume-info output and also the brick logs?

-Krutika

On Fri, May 26, 2017 at 4:57 PM, <supo...@logicworks.pt> wrote:

> Hi,
>
> I updated glusterfs:
> glusterfs-client-xlators-3.8.12-1.el7.x86_64
> glusterfs-cli-3.8.12-1.el7.x86_64
> glusterfs-api-3.8.12-1.el7.x86_64
> glusterfs-fuse-3.8.12-1.el7.x86_64
> glusterfs-server-3.8.12-1.el7.x86_64
> glusterfs-libs-3.8.12-1.el7.x86_64
> glusterfs-3.8.12-1.el7.x86_64
>
> Now I cannot add a preallocated volume disk; after a while it breaks.
>
> message log:
> May 26 11:18:16 node journal: vdsm root ERROR VM metrics collection
> failed#012Traceback (most recent call last):#012  File
> "/usr/lib/python2.7/site-packages/vdsm/virt/vmstats.py", line 221, in
> send_metrics#012diskinfo['readOps']#012KeyError: 'readOps'
>
> vdsm.log
> 2017-05-26 11:18:16,715+0100 ERROR (periodic/3) [root] VM metrics
> collection failed (vmstats:264)
> 2017-05-26 11:19:39,369+0100 ERROR (tasks/5) [storage.Volume] Unexpected
> error (fileVolume:456)
> 2017-05-26 11:19:39,373+0100 ERROR (tasks/5) [storage.Volume] Unexpected
> error (volume:1107)
> 2017-05-26 11:19:39,374+0100 ERROR (tasks/5) [storage.TaskManager.Task]
> (Task='5b2adb9a-e24e-48fa-9f01-f21c23588aef') Unexpected error (task:870)
>
> glusterfs
> [2017-05-26 10:53:08.247219] W [MSGID: 114031] 
> [client-rpc-fops.c:2933:client3_3_lookup_cbk]
> 0-gv2-client-0: remote operation failed. Path: 
> /.shard/55b94942-dee5-4f69-8b0f-52e251ac6f5e.164
> (----) [No data available]
> [2017-05-26 10:53:14.899499] W [MSGID: 114031] 
> [client-rpc-fops.c:2933:client3_3_lookup_cbk]
> 0-gv2-client-0: remote operation failed. Path: 
> /.shard/55b94942-dee5-4f69-8b0f-52e251ac6f5e.167
> (----) [No data available]
> [2017-05-26 10:53:14.899526] E [MSGID: 133010] 
> [shard.c:1725:shard_common_lookup_shards_cbk]
> 0-gv2-shard: Lookup on shard 167 failed. Base file gfid =
> 55b94942-dee5-4f69-8b0f-52e251ac6f5e [No data available]
> [2017-05-26 10:53:19.712567] W [MSGID: 114031] 
> [client-rpc-fops.c:2933:client3_3_lookup_cbk]
> 0-gv2-client-0: remote operation failed. Path: 
> /.shard/55b94942-dee5-4f69-8b0f-52e251ac6f5e.169
> (----) [No data available]
> [2017-05-26 10:53:19.712614] E [MSGID: 133010] 
> [shard.c:1725:shard_common_lookup_shards_cbk]
> 0-gv2-shard: Lookup on shard 169 failed. Base file gfid =
> 55b94942-dee5-4f69-8b0f-52e251ac6f5e [No data available]
> [2017-05-26 10:53:29.419317] W [MSGID: 114031] 
> [client-rpc-fops.c:2933:client3_3_lookup_cbk]
> 0-gv2-client-0: remote operation failed. Path: 
> /.shard/55b94942-dee5-4f69-8b0f-52e251ac6f5e.173
> (----) [No data available]
> [2017-05-26 10:53:29.419369] E [MSGID: 133010] 
> [shard.c:1725:shard_common_lookup_shards_cbk]
> 0-gv2-shard: Lookup on shard 173 failed. Base file gfid =
> 55b94942-dee5-4f69-8b0f-52e251ac6f5e [No data available]
>
>
> thanks
>
> --
> *De: *"Sahina Bose" <sab...@redhat.com>
> *Para: *supo...@logicworks.pt, "Krutika Dhananjay" <kdhan...@redhat.com>
> *Cc: *"ovirt users" <users@ovirt.org>
> *Enviadas: *Quinta-feira, 25 De Maio de 2017 7:12:40
> *Assunto: *Re: [ovirt-users] vm has been paused due to unknown storage
>
> The glusterfs logs contain below errors:
> [2017-05-22 18:12:50.941883] E [MSGID: 133010] 
> [shard.c:1725:shard_common_lookup_shards_cbk]
> 0-gv2-shard: Lookup on shard 50 failed. Base file gfid =
> 33f1fe3e-c626-49f2-861e-2259c972931d [No data available]
> [2017-05-22 18:12:50.945085] W [fuse-bridge.c:1291:fuse_err_cbk]
> 0-glusterfs-fuse: 61306713: FSYNC() ERR => -1 (No data available)
>
> Krutika, could you take a look?
>
> On Thu, May 25, 2017 at 1:02 AM, <supo...@logicworks.pt> wrote:
>
>> Hi,
>>
>> I set up an oVirt hosted engine, on only one server with local gluster
>> bricks.
>>
>> When running an MS SQL 2012 process to rebuild a database, which takes
>> around 4 hours, after a while the VM is paused with the error:
>>
>> vm has been paused due to unknown storage
>>
>> The VM disk is in Thin provision
>>
>> Ovirt and gluter versions:
>>
>> Version 4.1.1.8-1.el7.centos
>>
>> glusterfs-cli-3.8.11-1.el7.x86_64
>> glusterfs-libs-3.8.11-1.el7.x86_64
>> glusterfs-3.8.11-1.el7.x86_64
>> glusterfs-client-xlators-3.8.11-1.el7.x86_64
>> glusterfs-fuse-3.8.11-1.el7.x86_64
>> glusterfs-api-3.8.11-1.el7.x86_64
>> glusterfs-server-3.8.11-1.el7.x86_64
>>
>>
>> I can't find the reason why.
>> The logs are attached.
>>
>> Any idea?
>>
>> Thanks
>>
>> --
>> --
>> Jose Ferradeira
>> http://www.logicworks.pt
>>
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] VDSM hang

2017-03-12 Thread Krutika Dhananjay
Hi,

Could you please share your volume info output?

-Krutika

On Fri, Mar 10, 2017 at 6:41 PM, p...@email.cz  wrote:

> freeze / freezing:
> IO operations are paused for some reason.
> The available possibilities are:
> 1) net - some TCP framework collapse
> 2) gluster interconnect due to the gluster daemon - process hang ??
> 3) VDSM - paused managed services
> 4) XFS - RW issues
> 5) swap overfilled - processes get killed - but why is swap full if at
> most 30% of mem (196 GB) is used by VMs? ( unmanaged process forking )
>
> regs
>
>
> On 03/10/2017 01:56 PM, Nir Soffer wrote:
>
> On Fri, Mar 10, 2017 at 1:07 PM, p...@email.cz  
>  wrote:
>
> Hello everybody,
>
> for production usage I'm testing oVirt with gluster.
> All components seem to be running fine, but whenever I'm testing a huge
> workload, the node freezes. Not the main OS, but VDSM mgmt and attached
> services, VMs etc.
>
> What do you mean by freeze?
>
>
> mgmt
> oVirt - 4.1.0.4
> centos 7.3-1611
>
>
> nodes ( installed from ovirt image
> "ovirt-node-ng-installer-ovirt-4.1-2017030804.iso"  )
>
> OS Version: == RHEL - 7 - 3.1611.el7.centos
> OS Description:== oVirt Node 4.1.0
> Kernel Version:== 3.10.0 - 514.10.2.el7.x86_64
> KVM Version:== 2.6.0 - 28.el7_3.3.1
> LIBVIRT Version:== libvirt-2.0.0-10.el7_3.5
> VDSM Version:== vdsm-4.19.4-1.el7.centos
> SPICE Version:== 0.12.4 - 20.el7_3
> GlusterFS Version:== glusterfs-3.8.9-1.el7  ( LVM thinprovisioning in
> replica 2 - created from ovirt GUI )
>
> concurently running
> - huge import from export domain( net workload )
> - sequential write to VMs local disk ( gluster replica sequential workload )
> - VMs database huge select  (  random IOps )
> - huge old snapshot delete  ( random IOps )
>
> In this configuration / workload it runs for about an hour with no
> exceptions, with 70-80% disk load, but at some point VDSM freezes all jobs
> until a timeout and the VMs are in "unknown" status.
> The whole system then recovers automatically within roughly a 20 min time
> frame (except the import and the snapshot deletion (rollback)).
>
> engine.log  - focus 10:39:07 time  ( Failed in 'HSMGetAllTasksStatusesVDS'
> method )
> 
>
> n child command id: 'a8a3a4d5-cf7d-4423-8243-022911232508'
> type:'RemoveSnapshotSingleDiskLive' to complete
> 2017-03-10 10:39:01,727+01 INFO
> [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommandCallback]
> (DefaultQuartzScheduler2) [759c8e1f] Command 'RemoveSnapshotSingleDiskLive'
> (id: 'a8a3a4d5-cf7d-4423-8243-022911232508') waiting on child command id:
> '33df2c1e-6ce3-44fd-a39b-d111883b4c4e' type:'DestroyImage' to complete
> 2017-03-10 10:39:03,929+01 INFO
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
> (DefaultQuartzScheduler5) [fde51205-3e8b-4b84-a478-352dc444ccc4] START,
> GlusterServersListVDSCommand(HostName = 2kvm1,
> VdsIdVDSCommandParametersBase:{runAsync='true',
> hostId='86876b79-71d8-4ae1-883b-ba010cd270e7'}), log id: 446d0cd3
> 2017-03-10 10:39:04,343+01 INFO
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
> (DefaultQuartzScheduler5) [fde51205-3e8b-4b84-a478-352dc444ccc4] FINISH,
> GlusterServersListVDSCommand, return: [172.16.5.163/24:CONNECTED,
> 16.0.0.164:CONNECTED], log id: 446d0cd3
> 2017-03-10 10:39:04,353+01 INFO
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
> (DefaultQuartzScheduler5) [fde51205-3e8b-4b84-a478-352dc444ccc4] START,
> GlusterVolumesListVDSCommand(HostName = 2kvm1,
> GlusterVolumesListVDSParameters:{runAsync='true',
> hostId='86876b79-71d8-4ae1-883b-ba010cd270e7'}), log id: 69ea1fda
> 2017-03-10 10:39:05,128+01 INFO
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
> (DefaultQuartzScheduler5) [fde51205-3e8b-4b84-a478-352dc444ccc4] FINISH,
> GlusterVolumesListVDSCommand, return:
> {8ded4083-2f31-489e-a60d-a315a5eb9b22=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@7765e4ad},
> log id: 69ea1fda
> 2017-03-10 10:39:07,163+01 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand]
> (DefaultQuartzScheduler2) [759c8e1f] Failed in 'HSMGetAllTasksStatusesVDS'
> method
> 2017-03-10 10:39:07,178+01 ERROR
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (DefaultQuartzScheduler2) [759c8e1f] EVENT_ID:
> VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack: null,
> Custom Event ID: -1, Message: VDSM 2kvm2 command HSMGetAllTasksStatusesVDS
> failed: Connection timed out
> 2017-03-10 10:39:07,182+01 INFO
> [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler2)
> [759c8e1f] BaseAsyncTask::onTaskEndSuccess: Task
> 'f594bf69-619b-4d1b-8f6d-a9826997e478' (Parent Command 'ImportVm',
> Parameters Type
> 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended
> successfully.
> 2017-03-10 10:39:07,182+01 INFO
> [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (DefaultQuartzScheduler2)
> 

Re: [ovirt-users] ovirt 3.6.6 and gluster 3.7.13

2016-07-26 Thread Krutika Dhananjay
Hi,

1.  Could you please attach the glustershd logs from all three nodes?

2. Also, so far what we know is that the 'Operation not permitted' errors
are on the main vm image itself and not its individual shards (ex
deb61291-5176-4b81-8315-3f1cf8e3534d). Could you do the following:
Get the inode number of
.glusterfs/de/b6/deb61291-5176-4b81-8315-3f1cf8e3534d (ls -li) from the
first brick. I'll call this number INODE_NUMBER.
Execute `find . -inum INODE_NUMBER` from the brick root on first brick to
print the hard links against the file in the prev step and share the output.

3. Did you delete any vms at any point before or after the upgrade?
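[Editor's note: the inode/hard-link enumeration requested in step 2 can be sketched generically. The brick path from the thread won't exist on an arbitrary machine, so this demo runs on a scratch directory, but the `ls -li` + `find -inum` technique is the same.]

```shell
# demo of enumerating hard links via inode number on a scratch file
dir=$(mktemp -d)
touch "$dir/image-file"
ln "$dir/image-file" "$dir/hardlink-to-image"     # create a second hard link
inode=$(ls -i "$dir/image-file" | awk '{print $1}')
find "$dir" -inum "$inode"                        # prints both paths sharing the inode
rm -r "$dir"
```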

-Krutika

On Mon, Jul 25, 2016 at 11:30 PM, David Gossage <dgoss...@carouselchecks.com
> wrote:

>
> On Mon, Jul 25, 2016 at 9:58 AM, Krutika Dhananjay <kdhan...@redhat.com>
> wrote:
>
>> OK, could you try the following:
>>
>> i. Set network.remote-dio to off
>> # gluster volume set  network.remote-dio off
>>
>> ii. Set performance.strict-o-direct to on
>> # gluster volume set  performance.strict-o-direct on
>>
>> iii. Stop the affected vm(s) and start again
>>
>> and tell me if you notice any improvement?
>>
>>
> The previous install I had the issue with is still on gluster 3.7.11
>
> My test install of ovirt 3.6.7 and gluster 3.7.13 with 3 bricks on a local
> disk right now isn't allowing me to add the gluster storage at all.
>
> Keep getting some type of UI error
>
> 2016-07-25 12:49:09,277 ERROR
> [org.ovirt.engine.ui.frontend.server.gwt.OvirtRemoteLoggingService]
> (default task-33) [] Permutation name: 430985F23DFC1C8BE1C7FDD91EDAA785
> 2016-07-25 12:49:09,277 ERROR
> [org.ovirt.engine.ui.frontend.server.gwt.OvirtRemoteLoggingService]
> (default task-33) [] Uncaught exception: : java.lang.ClassCastException
> at Unknown.ps(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@3837)
> at Unknown.ts(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@20)
> at Unknown.vs(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@18)
> at Unknown.iJf(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@19)
> at Unknown.Xab(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@48)
> at Unknown.P8o(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@4447)
> at Unknown.jQr(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@21)
> at Unknown.A8o(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@51)
> at Unknown.u8o(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@101)
> at Unknown.Eap(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@10718)
> at Unknown.p8n(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@161)
> at Unknown.Cao(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@31)
> at Unknown.Bap(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@10469)
> at Unknown.kRn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@49)
> at Unknown.nRn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@438)
> at Unknown.eVn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@40)
> at Unknown.hVn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@25827)
> at Unknown.MTn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@25)
> at Unknown.PTn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@24052)
> at Unknown.KJe(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@21125)
> at Unknown.Izk(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@10384)
> at Unknown.P3(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@137)

Re: [ovirt-users] ovirt 3.6.6 and gluster 3.7.13

2016-07-25 Thread Krutika Dhananjay
OK, could you try the following:

i. Set network.remote-dio to off
# gluster volume set <volname> network.remote-dio off

ii. Set performance.strict-o-direct to on
# gluster volume set <volname> performance.strict-o-direct on

iii. Stop the affected vm(s) and start again

and tell me if you notice any improvement?
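Taken together, steps i–iii can be sketched as the following commands; this is only a sketch, and `VOLNAME` is a placeholder you must replace with the name of your affected volume (it needs a live gluster cluster to run):

```shell
#!/bin/sh
# Placeholder: substitute the affected gluster volume's real name.
VOLNAME=data

# i. Disable remote-dio so O_DIRECT is not silently dropped on the client side.
gluster volume set "$VOLNAME" network.remote-dio off

# ii. Honour O_DIRECT strictly through the performance translators.
gluster volume set "$VOLNAME" performance.strict-o-direct on

# iii. Stop and start the affected VM(s) from the oVirt engine so they
#      reopen their disk images with the new volume options in effect.
```

Note that a plain reboot inside the guest is not enough for step iii; the qemu process must be restarted so the image files are reopened.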

-Krutika

On Mon, Jul 25, 2016 at 4:57 PM, Samuli Heinonen <samp...@neutraali.net>
wrote:

> Hi,
>
> > On 25 Jul 2016, at 12:34, David Gossage <dgoss...@carouselchecks.com>
> wrote:
> >
> > On Mon, Jul 25, 2016 at 1:01 AM, Krutika Dhananjay <kdhan...@redhat.com>
> wrote:
> > Hi,
> >
> > Thanks for the logs. So I have identified one issue from the logs for
> which the fix is this: http://review.gluster.org/#/c/14669/. Because of a
> bug in the code, ENOENT was getting converted to EPERM and being propagated
> up the stack causing the reads to bail out early with 'Operation not
> permitted' errors.
> > I still need to find out two things:
> > i) why there was a readv() sent on a non-existent (ENOENT) file (this is
> important since some of the other users have not faced or reported this
> issue on gluster-users with 3.7.13)
> > ii) need to see if there's a way to work around this issue.
> >
> > Do you mind sharing the steps needed to be executed to run into this
> issue? This is so that we can apply our patches, test and ensure they fix
> the problem.
>
>
> Unfortunately I can’t test this right away nor give exact steps how to
> test this. This is just a theory but please correct me if you see some
> mistakes.
>
> oVirt uses cache=none settings for VMs by default, which requires direct
> I/O. oVirt also uses dd with iflag=direct to check that storage has direct
> I/O enabled. Problems exist with GlusterFS with sharding enabled and bricks
> running on ZFS on Linux. Everything seems to be fine with GlusterFS 3.7.11,
> and problems exist at least with versions .12 and .13. There have been some
> posts saying that GlusterFS 3.8.x is also affected.
>
> Steps to reproduce:
> 1. Sharded file is created with GlusterFS 3.7.11. Everything works ok.
> 2. GlusterFS is upgraded to 3.7.12+
> 3. Sharded file cannot be read or written with direct I/O enabled. (i.e.
> the command oVirt uses to check the storage connection: "dd
> if=/rhev/data-center/0001-0001-0001-0001-02b6/mastersd/dom_md/inbox
> iflag=direct,fullblock count=1 bs=1024000”)
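The check in step 3 can be reproduced by hand. A minimal sketch of the same direct-I/O read probe follows; the `INBOX` path is a placeholder (the path in the message above is truncated), so point it at the inbox file of your own master storage domain on the gluster mount:

```shell
#!/bin/sh
# Placeholder: the dom_md/inbox file of your own master storage domain,
# reached through the gluster mount under /rhev/data-center/.
INBOX=/rhev/data-center/YOUR-SP-UUID/mastersd/dom_md/inbox

# Same style of probe oVirt performs: a single ~1000 KiB read with O_DIRECT.
# On an affected volume (sharding + 3.7.12/.13 + ZFS bricks) this fails with
# "Operation not permitted"; dd's exit status reflects that.
if dd if="$INBOX" of=/dev/null iflag=direct,fullblock count=1 bs=1024000; then
    echo "direct I/O read OK"
else
    echo "direct I/O read FAILED" >&2
    exit 1
fi
```

Running this on a volume upgraded from 3.7.11 should distinguish the good and bad cases without involving oVirt at all.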
>
> Please let me know if you need more information.
>
> -samuli
>
> > Well, after the upgrade of gluster all I did was start the ovirt hosts, which
> launched and started their ha-agent and broker processes.  I don't believe
> I started getting any errors until it mounted GLUSTER1.  I had enabled
> sharding but had no sharded disk images yet.  Not sure if the check for
> shards would have caused that.  Unfortunately I can't just update this
> cluster and try to see what caused it, as it has some VMs users expect to
> be available in a few hours.
> >
> > I can see if I can get my test setup to recreate it.  I think I'll need
> to de-activate the data center so I can detach the storage that's on xfs and
> attach the one that's on zfs with sharding enabled.  My test is 3 bricks
> on the same local machine, with 3 different volumes, but I think I'm running into
> a sanlock issue or something, as it won't mount more than one volume that was
> created locally.
> >
> >
> > -Krutika
> >
> > On Fri, Jul 22, 2016 at 7:17 PM, David Gossage <
> dgoss...@carouselchecks.com> wrote:
> > Trimmed out the logs to just about when I was shutting down ovirt
> servers for updates which was 14:30 UTC 2016-07-09
> >
> > Pre-update settings were
> >
> > Volume Name: GLUSTER1
> > Type: Replicate
> > Volume ID: 167b8e57-28c3-447a-95cc-8410cbdf3f7f
> > Status: Started
> > Number of Bricks: 1 x 3 = 3
> > Transport-type: tcp
> > Bricks:
> > Brick1: ccgl1.gl.local:/gluster1/BRICK1/1
> > Brick2: ccgl2.gl.local:/gluster1/BRICK1/1
> > Brick3: ccgl3.gl.local:/gluster1/BRICK1/1
> > Options Reconfigured:
> > performance.readdir-ahead: on
> > storage.owner-uid: 36
> > storage.owner-gid: 36
> > performance.quick-read: off
> > performance.read-ahead: off
> > performance.io-cache: off
> > performance.stat-prefetch: off
> > cluster.eager-lock: enable
> > network.remote-dio: enable
> > cluster.quorum-type: auto
> > cluster.server-quorum-type: server
> > server.allow-insecure: on
> > cluster.self-heal-window-size: 1024
> > cluster.background-self-heal-count: 16
> > performance.strict-write-ordering: off
> > nfs.disable: on
> > nfs.addr-namelookup: off
> >

Re: [ovirt-users] ovirt 3.6.6 and gluster 3.7.13

2016-07-22 Thread Krutika Dhananjay
Hi David,

Could you also share the brick logs from the affected volume? They're
located at
/var/log/glusterfs/bricks/<brick-path>.log.

Also, could you share the volume configuration (output of `gluster volume
info <volname>`) for the affected volume(s), as it was at the time you actually
saw this issue?
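In concrete terms, something like the following gathers the requested pieces on one host; it is only a sketch (run on the machine hosting the bricks, with `VOLNAME` replaced by the affected volume's real name, and brick log file names will vary with your brick paths):

```shell
#!/bin/sh
# Placeholder: substitute the affected gluster volume's real name.
VOLNAME=data

# Volume configuration as it stands now (capture this before changing options).
gluster volume info "$VOLNAME" > "volinfo-$VOLNAME.txt"

# Brick logs: glusterfsd writes one log per brick, named after the
# hyphenated brick path, e.g. gluster1-BRICK1-1.log.
tar czf "brick-logs-$(hostname).tar.gz" /var/log/glusterfs/bricks/*.log \
    "volinfo-$VOLNAME.txt"
```

The resulting tarball is what would get attached to the thread or the bug report.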

-Krutika




On Thu, Jul 21, 2016 at 11:23 PM, David Gossage  wrote:

> On Thu, Jul 21, 2016 at 11:47 AM, Scott  wrote:
>
>> Hi David,
>>
>> My backend storage is ZFS.
>>
>> I thought about moving from FUSE to NFS mounts for my Gluster volumes to
>> help test.  But since I use hosted engine this would be a real pain.  It's
>> difficult to modify the storage domain type/path in
>> hosted-engine.conf, and I don't want to go through the process of
>> re-deploying hosted engine.
>>
>>
> I found this
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1347553
>
> Not sure if related.
>
> But I also have a zfs backend. Another user on the gluster mailing list had
> issues with a zfs backend too; she used proxmox and got it working by
> changing the disk to writeback cache, I think it was.
>
> I also use hosted engine, but I run my gluster volume for HE on an
> LVM volume (xfs, separate from zfs), and if I recall it did not have the issues my
> gluster on zfs did.  I'm wondering now if the issue was zfs settings.
>
> Hopefully I should have a test machine up soon I can play around with more.
>
> Scott
>>
>> On Thu, Jul 21, 2016 at 11:36 AM David Gossage <
>> dgoss...@carouselchecks.com> wrote:
>>
>>> What back end storage do you run gluster on?  xfs/zfs/ext4 etc?
>>>
>>> *David Gossage*
>>> *Carousel Checks Inc. | System Administrator*
>>> *Office* 708.613.2284
>>>
>>> On Thu, Jul 21, 2016 at 8:18 AM, Scott  wrote:
>>>
 I get similar problems with oVirt 4.0.1 and hosted engine.  After
 upgrading all my hosts to Gluster 3.7.13 (client and server), I get the
 following:

 $ sudo hosted-engine --set-maintenance --mode=none
 Traceback (most recent call last):
   File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
 "__main__", fname, loader, pkg_name)
   File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
 exec code in run_globals
   File
 "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py",
 line 73, in <module>
 if not maintenance.set_mode(sys.argv[1]):
   File
 "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py",
 line 61, in set_mode
 value=m_global,
   File
 "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
 line 259, in set_maintenance_mode
 str(value))
   File
 "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
 line 204, in set_global_md_flag
 all_stats = broker.get_stats_from_storage(service)
   File
 "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
 line 232, in get_stats_from_storage
 result = self._checked_communicate(request)
   File
 "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
 line 260, in _checked_communicate
 .format(message or response))
 ovirt_hosted_engine_ha.lib.exceptions.RequestError: Request failed:
 failed to read metadata: [Errno 1] Operation not permitted

 If I only upgrade one host, then things will continue to work but my
 nodes are constantly healing shards.  My logs are also flooded with:

 [2016-07-21 13:15:14.137734] W [fuse-bridge.c:2227:fuse_readv_cbk]
 0-glusterfs-fuse: 274714: READ => -1 gfid=441f2789-f6b1-4918-a280-1b9905a11429
 fd=0x7f19bc0041d0 (Operation not permitted)
 The message "W [MSGID: 114031]
 [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-0: remote
 operation failed [Operation not permitted]" repeated 6 times between
 [2016-07-21 13:13:24.134985] and [2016-07-21 13:15:04.132226]
 The message "W [MSGID: 114031]
 [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-1: remote
 operation failed [Operation not permitted]" repeated 8 times between
 [2016-07-21 13:13:34.133116] and [2016-07-21 13:15:14.137178]
 The message "W [MSGID: 114031]
 [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-2: remote
 operation failed [Operation not permitted]" repeated 7 times between
 [2016-07-21 13:13:24.135071] and [2016-07-21 13:15:14.137666]
 [2016-07-21 13:15:24.134647] W [MSGID: 114031]
 [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-0: remote
 operation failed [Operation not permitted]
 [2016-07-21 13:15:24.134764] W [MSGID: 114031]
 [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-2: remote
 operation failed [Operation not permitted]
 [2016-07-21 13:15:24.134793] W [fuse-bridge.c:2227:fuse_readv_cbk]
 0-glusterfs-fuse: 274741: 

Re: [ovirt-users] vm pauses with "vm has paused due to unknown storage error

2016-06-26 Thread Krutika Dhananjay
Hi Bill,

After glusterfs 3.7.11, around 4-5 bugs were found in sharding and
replicate modules and fixed, some of them causing the VM(s) to pause. Could
you share the glusterfs client logs from around the time the issue was
seen? This will help me confirm it's the same issue, or even debug further
if this is a new issue.

-Krutika

On Fri, Jun 24, 2016 at 10:02 AM, Sahina Bose  wrote:

> Can you post the gluster mount logs from the node where the paused VM was
> running (under
> /var/log/glusterfs/rhev-data-center-mnt-glusterSD-<server>:<volname>.log)
> ?
> Which version of glusterfs are you running?
>
>
> On 06/24/2016 07:49 AM, Bill Bill wrote:
>
> Hello,
>
>
>
> Have 3 nodes running both oVirt and Gluster on 4 SSDs each. At the
> moment there are two physical NICs: one has public internet access and the
> other is a non-routable network used for ovirtmgmt & gluster.
>
>
>
> In the logical networks, I have selected gluster for the non-routable
> network running ovirtmgmt and gluster. However, two VMs randomly pause for
> what seems like no reason. They can both be resumed without issue.
>
>
>
> One test VM has 4GB of memory and a small disk – no problems with this
> one. Two others have 800GB disks and 32GB of RAM – both VMs exhibit the
> same issue.
>
>
>
> I also see these in the oVirt dashboard:
>
>
>
>
>
> Failed to update OVF disks 9e60328d-29af-4533-84f9-633d87f548a7, OVF data
> isn't updated on those OVF stores (Data Center x, Storage Domain
> sr-volume01).
>
>
>
> Jun 23, 2016 9:54:03 PM
>
>
>
> VDSM command failed: Could not acquire resource. Probably resource factory
> threw an exception.: ()
>
>
>
> ///
>
>
>
> VM x has been paused due to unknown storage error.
>
>
>
> ///
>
>
>
> In the error log on the engine, I see these:
>
>
>
> ERROR
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (ForkJoinPool-1-worker-7) [10caf93e] Correlation ID: null, Call Stack:
> null, Custom Event ID: -1, Message: VM xx has been paused due to
> unknown storage error.
>
>
>
> INFO
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (ForkJoinPool-1-worker-11) [10caf93e] Correlation ID: null, Call Stack:
> null, Custom Event ID: -1, Message: VM xx has recovered from paused
> back to up.
>
>
>
> ///
>
>
>
> Hostnames are all local to /etc/hosts on all servers – they also resolve
> without issue from each host.
>
>
>
> //
>
>
>
> 2016-06-23 22:08:59,611 WARN
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
> (DefaultQuartzScheduler_Worker-76) [1c1cf4f] Could not associate brick
> 'ovirt3435:/mnt/data/sr-volume01' of volume
> '93e36cdc-ab1b-41ec-ac7f-966cf3856b59' with correct network as no gluster
> network found in cluster '75bd64de-04b2-4a99-9cd0-b63e919b9aca'
>
> 2016-06-23 22:08:59,614 WARN
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
> (DefaultQuartzScheduler_Worker-76) [1c1cf4f] Could not associate brick
> 'ovirt3637:/mnt/data/sr-volume01' of volume
> '93e36cdc-ab1b-41ec-ac7f-966cf3856b59' with correct network as no gluster
> network found in cluster '75bd64de-04b2-4a99-9cd0-b63e919b9aca'
>
> 2016-06-23 22:08:59,616 WARN
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
> (DefaultQuartzScheduler_Worker-76) [1c1cf4f] Could not associate brick
> 'ovirt3839:/mnt/data/sr-volume01' of volume
> '93e36cdc-ab1b-41ec-ac7f-966cf3856b59' with correct network as no gluster
> network found in cluster '75bd64de-04b2-4a99-9cd0-b63e919b9aca'
>
> 2016-06-23 22:08:59,618 WARN
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
> (DefaultQuartzScheduler_Worker-76) [1c1cf4f] Could not associate brick
> 'ovirt3435:/mnt/data/distributed' of volume
> 'b887b05e-2ea6-496e-9552-155d658eeaa6' with correct network as no gluster
> network found in cluster '75bd64de-04b2-4a99-9cd0-b63e919b9aca'
>
> 2016-06-23 22:08:59,620 WARN
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
> (DefaultQuartzScheduler_Worker-76) [1c1cf4f] Could not associate brick
> 'ovirt3637:/mnt/data/distributed' of volume
> 'b887b05e-2ea6-496e-9552-155d658eeaa6' with correct network as no gluster
> network found in cluster '75bd64de-04b2-4a99-9cd0-b63e919b9aca'
>
> 2016-06-23 22:08:59,622 WARN
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
> (DefaultQuartzScheduler_Worker-76) [1c1cf4f] Could not associate brick
> 'ovirt3839:/mnt/data/distributed' of volume
> 'b887b05e-2ea6-496e-9552-155d658eeaa6' with correct network as no gluster
> network found in cluster '75bd64de-04b2-4a99-9cd0-b63e919b9aca'
>
> 2016-06-23 22:08:59,624 WARN
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
> (DefaultQuartzScheduler_Worker-76) [1c1cf4f] Could not associate brick
> 'ovirt3435:/mnt/data/iso' of volume '89f32457-c8c3-490e-b491-16dd27de0073'
> with correct network as no