On 24 May 2017 at 22:45, Nithya Balachandran <[email protected]> wrote:
>
> On 24 May 2017 at 21:55, Mahdi Adnan <[email protected]> wrote:
>
>> Hi,
>>
>> Thank you for your response.
>>
>> I have around 15 files, each a 2TB qcow.
>>
>> One brick reached 96%, so I removed it with "remove-brick" and waited
>> until it dropped to around 40%, then stopped the removal process with
>> "remove-brick stop".
>>
>> The issue is that brick1 drained its data to brick6 only, and when brick6
>> reached around 90% I did the same thing as before and it drained the data
>> to brick1 only.
>>
>> Now brick6 has reached 99%, and I have only a few gigabytes left, which
>> will fill up in the next half hour or so.
>>
>> Attached are the logs for all 6 bricks.
>>
> Hi,
>
> Just to clarify, did you run a rebalance (gluster volume rebalance <vol>
> start) or did you only run remove-brick?
>

On re-reading your original email, I see you did run a rebalance. Did it
complete? Also, which bricks are full at the moment?

>> --
>>
>> Respectfully
>> Mahdi A. Mahdi
>>
>> ------------------------------
>> From: Nithya Balachandran <[email protected]>
>> Sent: Wednesday, May 24, 2017 6:45:10 PM
>> To: Mohammed Rafi K C
>> Cc: Mahdi Adnan; [email protected]
>> Subject: Re: [Gluster-users] Distributed re-balance issue
>>
>> On 24 May 2017 at 20:02, Mohammed Rafi K C <[email protected]> wrote:
>>
>>> On 05/23/2017 08:53 PM, Mahdi Adnan wrote:
>>>
>>> Hi,
>>>
>>> I have a distributed volume with 6 bricks, each with 5TB, and it is
>>> hosting large qcow2 VM disks (I know it's not reliable, but it's not
>>> important data).
>>>
>>> I started with 5 bricks, then added another one and started the
>>> rebalance process, and everything went well. But now I'm looking at the
>>> bricks' free space and I found that one brick is around 82% full while
>>> the others range from 20% to 60%.
>>>
>>> The brick with the highest utilization is hosting more qcow2 disks than
>>> the other bricks, and whenever I start a rebalance it just completes in
>>> 0 seconds without moving any data.
>>>
>>> What is your average file size in the cluster? And the number of files
>>> (roughly)?
>>>
>>> What will happen when the brick becomes full?
>>>
>>> Once a brick's usage goes beyond 90%, new files won't be created on
>>> that brick, but existing files can still grow.
>>>
>>> Can I move data manually from one brick to the other?
>>>
>>> Nope. It is not recommended; even though gluster will try to find the
>>> file, it may break.
>>>
>>> Why is rebalance not distributing data evenly across all bricks?
>>>
>>> Rebalance works based on the layout, so we need to see how the layouts
>>> are distributed. If one of your bricks has a higher capacity, it will
>>> have a larger layout range.
>>>
>>> That is correct. As Rafi said, the layout matters here. Can you please
>>> send across all the rebalance logs from all the 6 nodes?
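
While you collect those, here is a minimal sketch of how to check the current
state from any node, assuming the volume name ctvvols and brick paths from the
output below (substitute whichever brick you ran remove-brick on; the directory
path is taken from the rebalance log):

    # Per-node progress of the last rebalance (files scanned, migrated, skipped)
    gluster volume rebalance ctvvols status

    # Progress of a remove-brick drain for a specific brick
    gluster volume remove-brick ctvvols ctv01:/vols/ctvvols status

    # Disk usage of the local brick; run on each node
    df -h /vols/ctvvols

    # DHT layout range assigned to a directory on this brick; run it against
    # the same directory on every brick (as root) and compare the hex ranges
    getfattr -n trusted.glusterfs.dht -e hex /vols/ctvvols/31e0b341-4eeb-4b71-b280-840eba7d6940/images

Keep in mind that remove-brick only drains the bricks you list onto the
remaining ones; it is a plain "gluster volume rebalance ctvvols start" that
redistributes data across all six bricks according to the layout.
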
>>>
>>> Nodes running CentOS 7.3
>>> Gluster 3.8.11
>>>
>>> Volume info:
>>>
>>> Volume Name: ctvvols
>>> Type: Distribute
>>> Volume ID: 1ecea912-510f-4079-b437-7398e9caa0eb
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 6
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: ctv01:/vols/ctvvols
>>> Brick2: ctv02:/vols/ctvvols
>>> Brick3: ctv03:/vols/ctvvols
>>> Brick4: ctv04:/vols/ctvvols
>>> Brick5: ctv05:/vols/ctvvols
>>> Brick6: ctv06:/vols/ctvvols
>>> Options Reconfigured:
>>> nfs.disable: on
>>> performance.readdir-ahead: on
>>> transport.address-family: inet
>>> performance.quick-read: off
>>> performance.read-ahead: off
>>> performance.io-cache: off
>>> performance.stat-prefetch: off
>>> performance.low-prio-threads: 32
>>> network.remote-dio: enable
>>> cluster.eager-lock: enable
>>> cluster.quorum-type: none
>>> cluster.server-quorum-type: server
>>> cluster.data-self-heal-algorithm: full
>>> cluster.locking-scheme: granular
>>> cluster.shd-max-threads: 8
>>> cluster.shd-wait-qlength: 10000
>>> features.shard: off
>>> user.cifs: off
>>> network.ping-timeout: 10
>>> storage.owner-uid: 36
>>> storage.owner-gid: 36
>>>
>>> Rebalance log:
>>>
>>> [2017-05-23 14:45:12.637671] I [dht-rebalance.c:2866:gf_defrag_process_dir] 0-ctvvols-dht: Migration operation on dir /31e0b341-4eeb-4b71-b280-840eba7d6940/images/690c728d-a83e-4c79-ac7d-1f3f17edf7f0 took 0.00 secs
>>> [2017-05-23 14:45:12.640043] I [MSGID: 109081] [dht-common.c:4202:dht_setxattr] 0-ctvvols-dht: fixing the layout of /31e0b341-4eeb-4b71-b280-840eba7d6940/images/091402ba-dc90-4206-848a-d73e85a1cc35
>>> [2017-05-23 14:45:12.641516] I [dht-rebalance.c:2652:gf_defrag_process_dir] 0-ctvvols-dht: migrate data called on /31e0b341-4eeb-4b71-b280-840eba7d6940/images/091402ba-dc90-4206-848a-d73e85a1cc35
>>> [2017-05-23 14:45:12.642421] I [dht-rebalance.c:2866:gf_defrag_process_dir] 0-ctvvols-dht: Migration operation on dir /31e0b341-4eeb-4b71-b280-840eba7d6940/images/091402ba-dc90-4206-848a-d73e85a1cc35 took 0.00 secs
>>> [2017-05-23 14:45:12.645610] I [MSGID: 109081] [dht-common.c:4202:dht_setxattr] 0-ctvvols-dht: fixing the layout of /31e0b341-4eeb-4b71-b280-840eba7d6940/images/be1e2276-d38f-4d90-abf5-de757dd04078
>>> [2017-05-23 14:45:12.647034] I [dht-rebalance.c:2652:gf_defrag_process_dir] 0-ctvvols-dht: migrate data called on /31e0b341-4eeb-4b71-b280-840eba7d6940/images/be1e2276-d38f-4d90-abf5-de757dd04078
>>> [2017-05-23 14:45:12.647589] I [dht-rebalance.c:2866:gf_defrag_process_dir] 0-ctvvols-dht: Migration operation on dir /31e0b341-4eeb-4b71-b280-840eba7d6940/images/be1e2276-d38f-4d90-abf5-de757dd04078 took 0.00 secs
>>> [2017-05-23 14:45:12.653291] I [dht-rebalance.c:3838:gf_defrag_start_crawl] 0-DHT: crawling file-system completed
>>> [2017-05-23 14:45:12.653323] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 23
>>> [2017-05-23 14:45:12.653508] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 24
>>> [2017-05-23 14:45:12.653536] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 25
>>> [2017-05-23 14:45:12.653556] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 26
>>> [2017-05-23 14:45:12.653580] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 27
>>> [2017-05-23 14:45:12.653603] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 28
>>> [2017-05-23 14:45:12.653623] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 29
>>> [2017-05-23 14:45:12.653638] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 30
>>> [2017-05-23 14:45:12.653659] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 31
>>> [2017-05-23 14:45:12.653677] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 32
>>> [2017-05-23 14:45:12.653692] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 33
>>> [2017-05-23 14:45:12.653711] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 34
>>> [2017-05-23 14:45:12.653723] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 35
>>> [2017-05-23 14:45:12.653739] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 36
>>> [2017-05-23 14:45:12.653759] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 37
>>> [2017-05-23 14:45:12.653772] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 38
>>> [2017-05-23 14:45:12.653789] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 39
>>> [2017-05-23 14:45:12.653800] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 40
>>> [2017-05-23 14:45:12.653811] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 41
>>> [2017-05-23 14:45:12.653822] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 42
>>> [2017-05-23 14:45:12.653836] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 43
>>> [2017-05-23 14:45:12.653870] I [dht-rebalance.c:2246:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 44
>>> [2017-05-23 14:45:12.654413] I [MSGID: 109028] [dht-rebalance.c:4079:gf_defrag_status_get] 0-ctvvols-dht: Rebalance is completed. Time taken is 0.00 secs
>>> [2017-05-23 14:45:12.654428] I [MSGID: 109028] [dht-rebalance.c:4083:gf_defrag_status_get] 0-ctvvols-dht: Files migrated: 0, size: 0, lookups: 15, failures: 0, skipped: 0
>>> [2017-05-23 14:45:12.654552] W [glusterfsd.c:1327:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dc5) [0x7ff40ff88dc5] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7ff41161acd5] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x7ff41161ab4b] ) 0-: received signum (15), shutting down
>>>
>>> Appreciate your help.
>>>
>>> --
>>>
>>> Respectfully
>>> Mahdi A. Mahdi
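
For what it's worth, the log above shows the crawl completing with "Files
migrated: 0, size: 0, lookups: 15, failures: 0, skipped: 0", i.e. the rebalance
ran but found nothing it needed to move under the current layout; with only
around 15 very large files, even a correct layout can leave the bricks quite
unevenly filled. Once the logs from all nodes have been reviewed, a possible
next step would be (same volume name as above; this is only a sketch, not a
confirmed fix):

    # Re-run the rebalance and watch per-node progress while it runs
    gluster volume rebalance ctvvols start
    gluster volume rebalance ctvvols status

    # "start force" also migrates files whose destination brick has less free
    # space than the source brick -- use with care while bricks are nearly full
    gluster volume rebalance ctvvols start force
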
_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users
