+Raghavendra/Nithya
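
It may also be worth confirming, directly on the storage nodes, that the SGDP directory exists on every brick and carries the same GFID everywhere. Something along these lines (only a rough sketch; the paths are taken from the brick list in the volume info quoted below, and the getfattr line would need repeating for each brick that should hold the directory):

# ls -ld /data/brick*/scratch/dw/SGDP
# getfattr -m . -d -e hex /data/brick1/scratch/dw/SGDP

If the directory, its trusted.gfid, or its trusted.glusterfs.dht layout xattr turns out to be missing or mismatched on one of the re-added bricks, that might tie in with the "gfid not present" errors in the rebalance log quoted below, but that is only a guess from the information given here.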

On Tue, Jun 6, 2017 at 7:41 PM, Jarsulic, Michael [CRI] <[email protected]> wrote:

> Hello,
>
> I am still working on recovering from a few failed OS hard drives on my
> Gluster storage and have been removing and re-adding bricks quite a bit. I
> noticed last night that some of the directories are not visible when I
> access them through the client, but are still on the brick. For example:
>
> Client:
>
> # ls /scratch/dw
> Ethiopian_imputation HGDP Rolwaling Tibetan_Alignment
>
> Brick:
>
> # ls /data/brick1/scratch/dw
> 1000GP_Phase3 Ethiopian_imputation HGDP Rolwaling SGDP
> Siberian_imputation Tibetan_Alignment mapata
>
> However, the directory is accessible on the client side (just not visible):
>
> # stat /scratch/dw/SGDP
>   File: `/scratch/dw/SGDP'
>   Size: 212992     Blocks: 416        IO Block: 131072 directory
> Device: 21h/33d    Inode: 11986142482805280401  Links: 2
> Access: (0775/drwxrwxr-x)  Uid: (339748621/dw)   Gid: (339748621/dw)
> Access: 2017-06-02 16:00:02.398109000 -0500
> Modify: 2017-06-06 06:59:13.004947703 -0500
> Change: 2017-06-06 06:59:13.004947703 -0500
>
> The only place I see the directory mentioned in the log files is in the
> rebalance logs. The following piece may provide a clue as to what is
> going on:
>
> [2017-06-05 20:46:51.752726] E [MSGID: 109010] [dht-rebalance.c:2259:gf_defrag_get_entry] 0-hpcscratch-dht: /dw/SGDP/HGDP00476_chr6.tped gfid not present
> [2017-06-05 20:46:51.752742] E [MSGID: 109010] [dht-rebalance.c:2259:gf_defrag_get_entry] 0-hpcscratch-dht: /dw/SGDP/LP6005441-DNA_B08_chr4.tmp gfid not present
> [2017-06-05 20:46:51.752773] E [MSGID: 109010] [dht-rebalance.c:2259:gf_defrag_get_entry] 0-hpcscratch-dht: /dw/SGDP/LP6005441-DNA_B08.geno.tmp gfid not present
> [2017-06-05 20:46:51.752789] E [MSGID: 109010] [dht-rebalance.c:2259:gf_defrag_get_entry] 0-hpcscratch-dht: /dw/SGDP/LP6005443-DNA_D02_chr4.out gfid not present
>
> These errors appeared yesterday during a rebalance that failed. However,
> running a rebalance fix-layout allowed me to clean up the errors and
> successfully complete a migration to a re-added brick.
>
> Here is the information for my storage cluster:
>
> # gluster volume info
>
> Volume Name: hpcscratch
> Type: Distribute
> Volume ID: 80b8eeed-1e72-45b9-8402-e01ae0130105
> Status: Started
> Number of Bricks: 6
> Transport-type: tcp
> Bricks:
> Brick1: fs001-ib:/data/brick2/scratch
> Brick2: fs003-ib:/data/brick5/scratch
> Brick3: fs003-ib:/data/brick6/scratch
> Brick4: fs004-ib:/data/brick7/scratch
> Brick5: fs001-ib:/data/brick1/scratch
> Brick6: fs004-ib:/data/brick8/scratch
> Options Reconfigured:
> server.event-threads: 8
> performance.client-io-threads: on
> client.event-threads: 8
> performance.cache-size: 32MB
> performance.readdir-ahead: on
> diagnostics.client-log-level: INFO
> diagnostics.brick-log-level: INFO
>
> Mount points for the bricks:
>
> /dev/sdb on /data/brick2 type xfs (rw,noatime,nobarrier)
> /dev/sda on /data/brick1 type xfs (rw,noatime,nobarrier)
>
> Mount point on the client:
>
> 10.xx.xx.xx:/hpcscratch on /scratch type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
>
> My question is: what are some possible root causes of this issue, and
> what is the recommended way to recover from it? Let me know if you need
> any more information.
>
> --
> Mike Jarsulic
> Sr. HPC Administrator
> Center for Research Informatics | University of Chicago
> 773.702.2066

--
Pranith
_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users
