----- Original Message -----
> Hi,
>
> I have a CentOS 6.5 cluster that is connected to a Fibre Channel SAN in a
> star topology. All nodes/SAN_storages have a single-pair fibre connection
> and no multipathing. The possibility of a hardware issue has been
> eliminated, because read/write between all other node/SAN_storage pairs
> works perfectly.
>
> Problem:
> Everything was running perfectly for years. Then node3 suddenly has
> very slow writes to SAN_storage1, ~10KB/sec. Read speed seems to remain
> normal.
>
> Can anyone give me some pointers to debug the problem? Thank you.
>
> Dil

Hi Dil,

The first thing I would suspect is that the file system is running low
on free blocks. GFS2 starts to struggle when a file system has too few
blocks for new allocations. If the file system has a small resource group
size, it may still look like you've got a lot of free blocks when this
happens. The solution, of course, is to use a bigger file system with more
free space. You can use lvresize and then gfs2_grow to make the file system
bigger, but you may want to consider copying the data to a new, bigger
device instead, simply to reduce file system fragmentation (as I'm about
to explain).

The second thing I would suspect is file system fragmentation. When GFS2
file systems get too fragmented over time, the GFS2 block allocator runs
into the same problem: it can find free blocks, but not a long enough
contiguous run of them to satisfy its "ideal" conditions. Unfortunately,
there's no defrag tool for GFS2, so you'd just have to copy the data to a
new file system (with a single process only), which ought to minimize the
fragmentation in the new copy.

There may be lots of other causes for GFS2 slowing down (such as faulty
routers), and each needs its own diagnosis and debugging, but these are
probably the top two.

Regards,

Bob Peterson
Red Hat File Systems

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
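
As a rough first check of the "running low on free blocks" suspicion above, something like the following sketch could be run on node3. It is only an illustration, not a GFS2-specific tool: it uses Python's standard os.statvfs call, the mount point shown is a placeholder, and the 10% warning threshold is an arbitrary assumption. Note that it reports aggregate free space only, so, as described above, a file system can still look like it has plenty of free blocks while the allocator is struggling.

```python
import os


def free_space_report(mount_point):
    """Return (total_bytes, free_bytes, percent_free) for a mounted file system."""
    st = os.statvfs(mount_point)
    total = st.f_blocks * st.f_frsize   # size of the file system in bytes
    free = st.f_bfree * st.f_frsize     # free bytes, including reserved blocks
    percent_free = 100.0 * free / total if total else 0.0
    return total, free, percent_free


def is_low_on_space(percent_free, threshold=10.0):
    """True if free space is below the threshold (10% is an assumed value)."""
    return percent_free < threshold


if __name__ == "__main__":
    # Placeholder path: replace with the mount point of the GFS2 file system
    # that lives on SAN_storage1.
    mount_point = "/"
    total, free, pct = free_space_report(mount_point)
    print("%s: %d of %d bytes free (%.1f%%)" % (mount_point, free, total, pct))
    if is_low_on_space(pct):
        print("Free space is low; consider growing the file system.")
```

If the percentage looks healthy but writes are still slow, that points more toward the fragmentation explanation than toward a simple shortage of blocks.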