Hi Rusty,

Sorry I took so long to get back to you.
Which is the newly added brick? I see datanode02 has not picked up any files for migration, which is odd.

How full are the individual bricks? Please send the df -h output from each node. Is each of your bricks on a separate partition?

Can you send me the rebalance logs from all 3 nodes (offline if you prefer)?

We can try using scripts to speed up the rebalance if you prefer.

Regards,
Nithya

On 16 July 2018 at 22:06, Rusty Bower <[email protected]> wrote:

> Thanks for the reply Nithya.
>
> 1. glusterfs 4.1.1
>
> 2. Volume Name: data
> Type: Distribute
> Volume ID: 294d95ce-0ff3-4df9-bd8c-a52fc50442ba
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 3
> Transport-type: tcp
> Bricks:
> Brick1: datanode01:/mnt/data/bricks/data
> Brick2: datanode02:/mnt/data/bricks/data
> Brick3: datanode03:/mnt/data/bricks/data
> Options Reconfigured:
> performance.readdir-ahead: on
>
> 3.
> Node        Rebalanced-files  size     scanned  failures  skipped  status       run time in h:m:s
> ----------  ----------------  -------  -------  --------  -------  -----------  -----------------
> localhost   36822             11.3GB   50715    0         0        in progress  26:46:17
> datanode02  0                 0Bytes   2852     0         0        in progress  26:46:16
> datanode03  3128              513.7MB  11442    0         3128     in progress  26:46:17
> Estimated time left for rebalance to complete : > 2 months. Please try again later.
> volume rebalance: data: success
>
> 4. Directory structure is basically an rsync backup of some old systems as
> well as all of my personal media. I can elaborate more, but it's a pretty
> standard filesystem.
>
> 5. In some folders there might be up to like 12-15 levels of directories
> (especially the backups)
>
> 6. I'm honestly not sure, I can try to scrounge this number up
>
> 7. My guess would be > 100k
>
> 8. Most files are pretty large (media files), but there's a lot of small
> files (metadata and configuration files) as well
>
> I've also appended a (moderately sanitized) snippet of the rebalance log
> (let me know if you need more)
>
> [2018-07-16 17:37:59.979003] I [MSGID: 0] [dht-rebalance.c:1799:dht_migrate_file] 0-data-dht: destination for file - /this/is/a/file/path/that/exists/wz/wz/Npc.wz/2040036.img.xml is changed to - data-client-2
> [2018-07-16 17:38:00.004262] I [MSGID: 109022] [dht-rebalance.c:2274:dht_migrate_file] 0-data-dht: completed migration of /this/is/a/file/path/that/exists/wz/wz/Npc.wz/2112002.img.xml from subvolume data-client-0 to data-client-2
> [2018-07-16 17:38:00.725582] I [dht-rebalance.c:4982:gf_defrag_get_estimates_based_on_size] 0-glusterfs: TIME: (size) total_processed=43108305980 tmp_cnt = 55419279917056,rate_processed=446597.869797, elapsed = 96526.000000
> [2018-07-16 17:38:00.725641] I [dht-rebalance.c:5130:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete (size)= 124092127 seconds, seconds left = 123995601
> [2018-07-16 17:38:00.725709] I [MSGID: 109028] [dht-rebalance.c:5210:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 96526.00 secs
> [2018-07-16 17:38:00.725738] I [MSGID: 109028] [dht-rebalance.c:5214:gf_defrag_status_get] 0-glusterfs: Files migrated: 36876, size: 12270259289, lookups: 50715, failures: 0, skipped: 0
> [2018-07-16 17:38:02.769121] I [dht-rebalance.c:4982:gf_defrag_get_estimates_based_on_size] 0-glusterfs: TIME: (size) total_processed=43108305980 tmp_cnt = 55419279917056,rate_processed=446588.616567, elapsed = 96528.000000
> [2018-07-16 17:38:02.769207] I [dht-rebalance.c:5130:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete (size)= 124094698 seconds, seconds left = 123998170
> [2018-07-16 17:38:02.769263] I [MSGID: 109028] [dht-rebalance.c:5210:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 96528.00 secs
> [2018-07-16 17:38:02.769286] I [MSGID: 109028] [dht-rebalance.c:5214:gf_defrag_status_get] 0-glusterfs: Files migrated: 36876, size: 12270259289, lookups: 50715, failures: 0, skipped: 0
> [2018-07-16 17:38:03.410469] I [dht-rebalance.c:1645:dht_migrate_file] 0-data-dht: /this/is/a/file/path/that/exists/wz/wz/Npc.wz/9201002.img.xml: attempting to move from data-client-0 to data-client-2
> [2018-07-16 17:38:03.416127] I [MSGID: 109022] [dht-rebalance.c:2274:dht_migrate_file] 0-data-dht: completed migration of /this/is/a/file/path/that/exists/wz/wz/Npc.wz/2040036.img.xml from subvolume data-client-0 to data-client-2
> [2018-07-16 17:38:04.738885] I [dht-rebalance.c:1645:dht_migrate_file] 0-data-dht: /this/is/a/file/path/that/exists/wz/wz/Npc.wz/9110012.img.xml: attempting to move from data-client-0 to data-client-2
> [2018-07-16 17:38:04.745722] I [MSGID: 109022] [dht-rebalance.c:2274:dht_migrate_file] 0-data-dht: completed migration of /this/is/a/file/path/that/exists/wz/wz/Npc.wz/9201002.img.xml from subvolume data-client-0 to data-client-2
> [2018-07-16 17:38:04.812368] I [dht-rebalance.c:4982:gf_defrag_get_estimates_based_on_size] 0-glusterfs: TIME: (size) total_processed=43108308134 tmp_cnt = 55419279917056,rate_processed=446579.386035, elapsed = 96530.000000
> [2018-07-16 17:38:04.812417] I [dht-rebalance.c:5130:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete (size)= 124097263 seconds, seconds left = 124000733
> [2018-07-16 17:38:04.812465] I [MSGID: 109028] [dht-rebalance.c:5210:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 96530.00 secs
> [2018-07-16 17:38:04.812489] I [MSGID: 109028] [dht-rebalance.c:5214:gf_defrag_status_get] 0-glusterfs: Files migrated: 36877, size: 12270261443, lookups: 50715, failures: 0, skipped: 0
> [2018-07-16 17:38:04.992413] I [dht-rebalance.c:1645:dht_migrate_file] 0-data-dht: /this/is/a/file/path/that/exists/wz/wz/Npc.wz/2050000.img.xml: attempting to move from data-client-0 to data-client-2
> [2018-07-16 17:38:04.994122] I [MSGID: 109022] [dht-rebalance.c:2274:dht_migrate_file] 0-data-dht: completed migration of /this/is/a/file/path/that/exists/wz/wz/Npc.wz/9110012.img.xml from subvolume data-client-0 to data-client-2
> [2018-07-16 17:38:06.855618] I [dht-rebalance.c:4982:gf_defrag_get_estimates_based_on_size] 0-glusterfs: TIME: (size) total_processed=43108318798 tmp_cnt = 55419279917056,rate_processed=446570.244043, elapsed = 96532.000000
> [2018-07-16 17:38:06.855719] I [dht-rebalance.c:5130:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete (size)= 124099804 seconds, seconds left = 124003272
> [2018-07-16 17:38:06.855770] I [MSGID: 109028] [dht-rebalance.c:5210:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 96532.00 secs
> [2018-07-16 17:38:06.855793] I [MSGID: 109028] [dht-rebalance.c:5214:gf_defrag_status_get] 0-glusterfs: Files migrated: 36879, size: 12270266602, lookups: 50715, failures: 0, skipped: 0
> [2018-07-16 17:38:08.511064] I [dht-rebalance.c:1645:dht_migrate_file] 0-data-dht: /this/is/a/file/path/that/exists/wz/wz/Npc.wz/9201055.img.xml: attempting to move from data-client-0 to data-client-2
> [2018-07-16 17:38:08.533029] I [MSGID: 109022] [dht-rebalance.c:2274:dht_migrate_file] 0-data-dht: completed migration of /this/is/a/file/path/that/exists/wz/wz/Npc.wz/2050000.img.xml from subvolume data-client-0 to data-client-2
> [2018-07-16 17:38:08.899708] I [dht-rebalance.c:4982:gf_defrag_get_estimates_based_on_size] 0-glusterfs: TIME: (size) total_processed=43108318798 tmp_cnt = 55419279917056,rate_processed=446560.991961, elapsed = 96534.000000
> [2018-07-16 17:38:08.899791] I [dht-rebalance.c:5130:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete (size)= 124102375 seconds, seconds left = 124005841
> [2018-07-16 17:38:08.899842] I [MSGID: 109028] [dht-rebalance.c:5210:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 96534.00 secs
> [2018-07-16 17:38:08.899865] I [MSGID: 109028] [dht-rebalance.c:5214:gf_defrag_status_get] 0-glusterfs: Files migrated: 36879, size: 12270266602, lookups: 50715, failures: 0, skipped: 0
>
> On Mon, Jul 16, 2018 at 7:37 AM, Nithya Balachandran <[email protected]> wrote:
>
>> If possible, please send the rebalance logs as well.
>>
>> On 16 July 2018 at 10:14, Nithya Balachandran <[email protected]> wrote:
>>
>>> Hi Rusty,
>>>
>>> We need the following information:
>>>
>>> 1. The exact gluster version you are running
>>> 2. gluster volume info <volname>
>>> 3. gluster rebalance status
>>> 4. Information on the directory structure and file locations on your
>>> volume.
>>> 5. How many levels of directories
>>> 6. How many files and directories in each level
>>> 7. How many directories and files in total (a rough estimate)
>>> 8. Average file size
>>>
>>> Please note that having a rebalance running in the background should not
>>> affect your volume access in any way. However I would like to know why only
>>> 6000 files have been scanned in 6 hours.
>>>
>>> Regards,
>>> Nithya
>>>
>>> On 16 July 2018 at 06:13, Rusty Bower <[email protected]> wrote:
>>>
>>>> Hey folks,
>>>>
>>>> I just added a new brick to my existing gluster volume, but *gluster
>>>> volume rebalance data status* is telling me the following: Estimated
>>>> time left for rebalance to complete : > 2 months. Please try again later.
>>>>
>>>> I already did a fix-mapping, but this thing is absolutely crawling
>>>> trying to rebalance everything (last estimate was ~40 years)
>>>>
>>>> Any thoughts on if this is a bug, or ways to speed this up? It's taking
>>>> ~6 hours to scan 6000 files, which seems unreasonably slow.
>>>>
>>>> Thanks
>>>> Rusty
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> [email protected]
>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
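
For anyone reading the quoted rebalance log, the time estimate can be reproduced by hand from the first pair of TIME lines. The sketch below is Python and rests on an assumption: the relationships (rate = total_processed / elapsed, total time = tmp_cnt / rate) are inferred from the logged numbers only, not taken from the dht-rebalance.c source, and the variable names simply mirror the log fields.

```python
# Values copied from the first TIME lines in the quoted rebalance log.
total_processed = 43108305980   # presumably bytes processed so far
tmp_cnt = 55419279917056        # size figure the estimate is measured against
elapsed = 96526.0               # seconds since the rebalance started

# Assumed relationships, inferred from the logged values (not from the source code).
rate_processed = total_processed / elapsed       # bytes per second
estimated_total = int(tmp_cnt / rate_processed)  # seconds, truncated as in the log
seconds_left = estimated_total - int(elapsed)

print(f"rate_processed  = {rate_processed:.6f} bytes/s")
print(f"estimated total = {estimated_total} s")
print(f"seconds left    = {seconds_left} s (about {seconds_left / 86400:.0f} days)")
```

With these inputs the sketch gives the same 124092127 and 123995601 seconds that the log reports, so the long estimate appears to follow directly from a processing rate of roughly 0.45 MB/s measured against the tmp_cnt figure.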
