Re: [Gluster-users] File operation failure on simple distributed volume
Hi Yonex,

Recently Poornima fixed a corruption issue in upcall. It seems unlikely to be the cause of your issue, given that you are running FUSE clients. Even so, I would like to give you a debug build that includes the fix [1] and adds additional logs.

Will you be able to run the debug build?

[1] : https://review.gluster.org/#/c/16613/

Regards
Rafi KC

On 02/16/2017 09:13 PM, yonex wrote:
> Hi Rafi,
>
> I'm still on this issue. But reproduction has not yet been achieved
> outside of production. In production environment, I have made
> applications stop writing data to glusterfs volume. Only read
> operations are going.
>
> P.S. It seems that I have corrupted the email thread..;-(
> http://lists.gluster.org/pipermail/gluster-users/2017-January/029679.html
[Gluster-users] Machine becomes its own peer
Dear all,

Last week I posted a query about a problem I had with a machine that had failed but whose underlying hard disk with the gluster brick was good. I've made some progress in restoring it. I now have a problem where my newly restored machine becomes its own peer, which then breaks everything.

1. Gluster daemons are off on all peers; the content of /var/lib/glusterd/peers looks good.
2. I start the gluster daemons on all peers. All looks good.
3. For about 2 minutes there's no obvious problem: if I do a gluster peer status on any machine it looks good, and if I do a gluster volume status A01 on any machine it looks good.
4. Then at some point, the /var/lib/glusterd/peers directory of the new, restored machine gets an entry for itself and things start breaking. A typical error message is the understandable:

   Unable to get lock for uuid: 4fb930f7-554e-462a-9204-4592591feeb8, lock held by: 4fb930f7-554e-462a-9204-4592591feeb8

5. This is repeatable: if I stop the daemons, remove the offending entry in /var/lib/glusterd/peers, and restart, the same behavior occurs. All is good for a minute or two, and then something magically puts an entry back in /var/lib/glusterd/peers.

In a previous step in restoring my machine, I had a different error about mismatching cksums, and what I did then may be the cause of the problem. Searching the list archives, I found someone with a similar cksum problem, and the proposed solution was to copy /var/lib/glusterd/vols/ from another of the peers to the new machine. This may not be the issue, but it is the only thing I think I did that was unconventional.

I am running version 3.7.5-19 on Scientific Linux 6.8.

If anyone can suggest a way forward I would be grateful.

Many thanks
Scott

http://lists.gluster.org/mailman/listinfo/gluster-users
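The symptom described in this message (a node listing itself as a peer) can be checked mechanically by comparing the local node's UUID with the UUIDs stored in the peers directory. The following is a minimal Python sketch, not a supported tool. It assumes that /var/lib/glusterd/glusterd.info contains a `UUID=` line and that each file under /var/lib/glusterd/peers contains a `uuid=` line; the function names `read_key` and `find_self_peer_entries` are made up for illustration, and both paths are parameters so the check can be tried on a copy of the directory first.

```python
from pathlib import Path


def read_key(path, key):
    """Return the value of the first 'key=value' line in path, or None."""
    for line in Path(path).read_text().splitlines():
        name, sep, value = line.partition("=")
        # Key comparison is case-insensitive (assumption: UUID= vs uuid=).
        if sep and name.strip().lower() == key.lower():
            return value.strip()
    return None


def find_self_peer_entries(info_file, peers_dir):
    """List peer files whose uuid equals the local node's UUID."""
    local_uuid = read_key(info_file, "UUID")
    if local_uuid is None:
        raise ValueError(f"no UUID= line found in {info_file}")
    return sorted(
        str(entry)
        for entry in Path(peers_dir).iterdir()
        if entry.is_file() and read_key(entry, "uuid") == local_uuid
    )
```

On the restored machine this would be called as `find_self_peer_entries("/var/lib/glusterd/glusterd.info", "/var/lib/glusterd/peers")`; a non-empty result would correspond to the "lock held by" error quoted above, where both UUIDs are identical.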
Re: [Gluster-users] 90 Brick/Server suggestions?
>We have 12 on order. Actually the DSS7000 has two nodes in the chassis,
>and each accesses 45 bricks. We will be using an erasure code scheme
>probably 24:3 or 24:4, we have not sat down and really thought about the
>exact scheme we will use.

If we cannot get the 1 node/90 disk configuration, we can also get it as 2 nodes with 45 disks each.
Be careful about EC. I am using 16+4 in production; the only drawback is slow rebuild times. It takes 10 days to rebuild an 8TB disk. Although parallel heal for EC improves this in 3.9, don't forget to test rebuild times for different EC configurations.

>90 disks per server is a lot. In particular, it might be out of balance with
>other characteristics of the machine - number of cores, amount of memory,
>network or even bus bandwidth

The nodes will be pretty powerful: 2x18-core CPUs with 256GB RAM and 2x10Gb bonded Ethernet. They will be used for archive purposes, so I don't need more than 1GB/s per node.
RAID is not an option; JBOD with EC will be used.

>gluster volume set all cluster.brick-multiplex on

I just read the 3.10 release notes and saw this. I think this is a good solution. I plan to use 3.10.x and will probably test multiplexing and get in touch for help.

Thanks for the suggestions,
Serkan

On Fri, Feb 17, 2017 at 1:39 AM, Jeff Darcy wrote:
>> We are evaluating dell DSS7000 chassis with 90 disks.
>> Has anyone used that much brick per server?
>> Any suggestions, advices?
>
> 90 disks per server is a lot. In particular, it might be out of balance with
> other characteristics of the machine - number of cores, amount of memory,
> network or even bus bandwidth. Most people who put that many disks in a
> server use some sort of RAID (HW or SW) to combine them into a smaller number
> of physical volumes on top of which filesystems and such can be built. If
> you can't do that, or don't want to, you're in poorly explored territory. My
> suggestion would be to try running as 90 bricks. It might work fine, or you
> might run into various kinds of contention:
>
> (1) Excessive context switching would indicate not enough CPU.
>
> (2) Excessive page faults would indicate not enough memory.
>
> (3) Maxed-out network ports . . . well, you can figure that one out. ;)
>
> If (2) applies, you might want to try brick multiplexing. This is a new
> feature in 3.10, which can reduce memory consumption by more than 2x in many
> cases by putting multiple bricks into a single process (instead of one per
> brick). This also drastically reduces the number of ports you'll need, since
> the single process only needs one port total instead of one per brick. In
> terms of CPU usage or performance, gains are far more modest. Work in that
> area is still ongoing, as is work on multiplexing in general. If you want to
> help us get it all right, you can enable multiplexing like this:
>
> gluster volume set all cluster.brick-multiplex on
>
> If multiplexing doesn't help for you, speak up and maybe we can make it
> better, or perhaps come up with other things to try. Good luck!
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
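The figures quoted in this reply can be turned into rough planning numbers. This is a back-of-the-envelope Python sketch, not a measurement. It assumes that "16+4" means 16 data bricks plus 4 redundancy bricks per set and that 8TB means 8 x 10^12 bytes; the function names are invented for illustration.

```python
def usable_fraction(data_bricks, redundancy_bricks):
    """Fraction of raw capacity left for data in a data+redundancy set."""
    return data_bricks / (data_bricks + redundancy_bricks)


def rebuild_rate_mb_per_s(disk_bytes, days):
    """Average throughput (MB/s) implied by rebuilding disk_bytes in `days`."""
    return disk_bytes / (days * 24 * 3600) / 1e6


# Assumption: 16 data + 4 redundancy bricks, as read from "16+4" above.
print(f"16+4 usable fraction: {usable_fraction(16, 4):.2f}")
# Assumption: 8TB = 8e12 bytes, rebuilt in 10 days as reported above.
print(f"8TB in 10 days: {rebuild_rate_mb_per_s(8e12, 10):.1f} MB/s")
```

Under those assumptions the reported rebuild works out to an average of a little over 9 MB/s per disk, which gives a baseline to compare against when testing other EC configurations. The 24:3 and 24:4 schemes mentioned in the thread can be plugged in the same way once it is settled whether 24 counts only the data bricks or the whole set.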
Re: [Gluster-users] 90 Brick/Server suggestions?
We have 12 on order. Actually the DSS7000 has two nodes in the chassis, and each accesses 45 bricks. We will be using an erasure code scheme, probably 24:3 or 24:4; we have not sat down and really thought about the exact scheme we will use.

On 15 February 2017 at 14:04, Serkan Çoban wrote:
> Hi,
>
> We are evaluating dell DSS7000 chassis with 90 disks.
> Has anyone used that much brick per server?
> Any suggestions, advices?
>
> Thanks,
> Serkan
Re: [Gluster-users] File operation failure on simple distributed volume
Hi Rafi,

I'm still on this issue, but I have not yet managed to reproduce it outside of production. In the production environment, I have made the applications stop writing data to the glusterfs volume. Only read operations are running.

P.S. It seems that I have corrupted the email thread..;-(
http://lists.gluster.org/pipermail/gluster-users/2017-January/029679.html

2017-02-14 17:19 GMT+09:00 Mohammed Rafi K C:
> Hi Yonex,
>
> Are you still hitting this issue ?
>
> Regards
> Rafi KC
>
> On 01/16/2017 10:36 AM, yonex wrote:
>
> Hi
>
> I noticed that there is a high throughput degradation while attaching the
> gdb script to a glusterfs client process. Write speed becomes 2% or less.
> It cannot be kept running in production.
>
> Could you provide the custom build that you mentioned before? I am going to
> keep trying to reproduce the problem outside of the production environment.
>
> Regards
>
> 2017-01-08 21:54, Mohammed Rafi K C :
>
> Is there any update on this ?
>
> Regards
> Rafi KC
>
> On 12/24/2016 03:53 PM, yonex wrote:
>
> > Rafi,
> >
> > Thanks again. I will try that and get back to you.
> >
> > Regards.
> >
> > 2016-12-23 18:03 GMT+09:00 Mohammed Rafi K C :
> >
> > Hi Yonex,
> >
> > As we discussed in irc #gluster-devel , I have attached the gdb script
> > along with this mail.
> >
> > Procedure to run the gdb script.
> >
> > 1) Install gdb,
> > 2) Download and install gluster debuginfo for your machine . packages
> > location --- > https://cbs.centos.org/koji/buildinfo?buildID=12757
> > 3) find the process id and attach gdb to the process using the command
> > gdb attach -x
> > 4) Continue running the script till you hit the problem
> > 5) Stop the gdb
> > 6) You will see a file called mylog.txt in the location where you ran
> > the gdb
> >
> > Please keep an eye on the attached process. If you have any doubt please
> > feel free to revert me.
> >
> > Regards
> > Rafi KC
> >
> > On 12/19/2016 05:33 PM, Mohammed Rafi K C wrote:
> >
> > On 12/19/2016 05:32 PM, Mohammed Rafi K C wrote:
> >
> > Client 0-glusterfs01-client-2 has disconnected from bricks around
> > 2016-12-15 11:21:17.854249 . Can you look and/or paste the brick logs
> > around the time.
> >
> > You can find the brick name and hostname for 0-glusterfs01-client-2 from
> > client graph.
> >
> > Rafi
> >
> > Are you there in any of gluster irc channel, if so Have you got a
> > nickname that I can search.
> >
> > Regards
> > Rafi KC
> >
> > On 12/19/2016 04:28 PM, yonex wrote:
> >
> > Rafi,
> >
> > OK. Thanks for your guide. I found the debug log and pasted lines around
> > that.
> >
> > http://pastebin.com/vhHR6PQN
> >
> > Regards
> >
> > 2016-12-19 14:58 GMT+09:00 Mohammed Rafi K C :
> >
> > On 12/16/2016 09:10 PM, yonex wrote:
> >
> > Rafi,
> >
> > Thanks, the .meta feature I didn't know is very nice. I finally have
> > captured debug logs from a client and bricks.
> >
> > A mount log:
> > - http://pastebin.com/Tjy7wGGj
> >
> > FYI rickdom126 is my client's hostname.
> >
> > Brick logs around that time:
> > - Brick1: http://pastebin.com/qzbVRSF3
> > - Brick2: http://pastebin.com/j3yMNhP3
> > - Brick3: http://pastebin.com/m81mVj6L
> > - Brick4: http://pastebin.com/JDAbChf6
> > - Brick5: http://pastebin.com/7saP6rsm
> >
> > However I could not find any message like "EOF on socket". I hope
> > there is any helpful information in the logs above.
> >
> > Indeed. I understand that the connections are in disconnected state. But
> > what particularly I'm looking for is the cause of the disconnect, Can
> > you paste the debug logs when it start disconnects, and around that. You
> > may see a debug logs that says "disconnecting now".
> >
> > Regards
> > Rafi KC
> >
> > Regards.
> >
> > 2016-12-14 15:20 GMT+09:00 Mohammed Rafi K C :
> >
> > On 12/13/2016 09:56 PM, yonex wrote:
> >
> > Hi Rafi,
> >
> > Thanks for your response. OK, I think it is possible to capture debug
> > logs, since the error seems to be reproduced a few times per day. I
> > will try that. However, so I want to avoid redundant debug outputs if
> > possible, is there a way to enable debug log only on specific client
> > nodes?
> >
> > if you are using fuse mount, there is proc kind of feature called .meta
> > . You can set log level through that for a particular client [1] . But I
> > also want log from bricks because I suspect bricks process for
> > initiating the disconnects.
> >
> > [1] eg : echo 8 > /mnt/glusterfs/.meta/logging/loglevel
> >
> > Regards
> >
> > Yonex
> >
> > 2016-12-13 23:33 GMT+09:00 Mohammed Rafi K C :
> >
> > Hi Yonex,
> >
> > Is this consistently reproducible ? if so, Can you enable debug log [1]
> > and check for any message similar to [2]. Basically you can even search
> > for "EOF on socket".
> >
> > You can set your log level back to default (INFO) after capturing for
> > some time.
> >
> > [1] : gluster volume set <VOLNAME> diagnostics.brick-log-level DEBUG and
> > gluster volume set <VOLNAME> diagnostics.client-log-level
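Once debug logging is enabled, the strings mentioned in this thread can be searched for mechanically instead of by eye. A minimal Python sketch follows, assuming the logs are plain-text files. The function name `find_disconnect_lines` is hypothetical, and the two default search strings are simply the ones quoted in the thread ("EOF on socket" and "disconnecting now"), not an exhaustive list of disconnect messages.

```python
def find_disconnect_lines(log_path, needles=("EOF on socket", "disconnecting now")):
    """Return (line_number, text) for every log line containing a needle."""
    hits = []
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(log_path, errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if any(needle in line for needle in needles):
                hits.append((number, line.rstrip("\n")))
    return hits
```

The same function can be run over the client mount log and over each brick log, so that the timestamps of any hits can be lined up against the disconnect time seen on the client.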
[Gluster-users] Failed snapshot clone leaving undeletable orphaned volume on a single peer
Hey guys,

I tried to create a new volume from a cloned snapshot yesterday, however something went wrong during the process and I'm now stuck with the new volume being created on the server I ran the commands on (s0), but not on the rest of the peers. I'm unable to delete this new volume from the server, as it doesn't exist on the peers.

What do I do?
Any insights into what may have gone wrong?

CentOS 7.3.1611
Gluster 3.8.8

The command history and an extract from etc-glusterfs-glusterd.vol.log are included below.

gluster volume list
gluster snapshot list
gluster snapshot clone data-teste data-bck_GMT-2017.02.09-14.15.43
gluster volume status data-teste
gluster volume delete data-teste
gluster snapshot create teste data
gluster snapshot clone data-teste teste_GMT-2017.02.15-12.44.04
gluster snapshot status
gluster snapshot activate teste_GMT-2017.02.15-12.44.04
gluster snapshot clone data-teste teste_GMT-2017.02.15-12.44.04

[2017-02-15 12:43:21.667403] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data-teste
[2017-02-15 12:43:21.682530] E [MSGID: 106301] [glusterd-syncop.c:1297:gd_stage_op_phase] 0-management: Staging of operation 'Volume Status' failed on localhost : Volume data-teste is not started
[2017-02-15 12:43:43.633031] I [MSGID: 106495] [glusterd-handler.c:3128:__glusterd_handle_getwd] 0-glusterd: Received getwd req
[2017-02-15 12:43:43.640597] I [run.c:191:runner_log] (-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0xcc4b2) [0x7ffb396a14b2] -->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0xcbf65) [0x7ffb396a0f65] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7ffb44ec31c5] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/delete/post/S57glusterfind-delete-post --volname=data-teste
[2017-02-15 13:05:20.103423] E [MSGID: 106122] [glusterd-snapshot.c:2397:glusterd_snapshot_clone_prevalidate] 0-management: Failed to pre validate
[2017-02-15 13:05:20.103464] E [MSGID: 106443] [glusterd-snapshot.c:2413:glusterd_snapshot_clone_prevalidate] 0-management: One or more bricks are not running. Please run snapshot status command to see brick status. Please start the stopped brick and then issue snapshot clone command
[2017-02-15 13:05:20.103481] W [MSGID: 106443] [glusterd-snapshot.c:8563:glusterd_snapshot_prevalidate] 0-management: Snapshot clone pre-validation failed
[2017-02-15 13:05:20.103492] W [MSGID: 106122] [glusterd-mgmt.c:167:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot Prevalidate Failed
[2017-02-15 13:05:20.103503] E [MSGID: 106122] [glusterd-mgmt.c:884:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Snapshot on local node
[2017-02-15 13:05:20.103514] E [MSGID: 106122] [glusterd-mgmt.c:2243:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Pre Validation Failed
[2017-02-15 13:05:20.103531] E [MSGID: 106027] [glusterd-snapshot.c:8118:glusterd_snapshot_clone_postvalidate] 0-management: unable to find clone data-teste volinfo
[2017-02-15 13:05:20.103542] W [MSGID: 106444] [glusterd-snapshot.c:9063:glusterd_snapshot_postvalidate] 0-management: Snapshot create post-validation failed
[2017-02-15 13:05:20.103561] W [MSGID: 106121] [glusterd-mgmt.c:351:gd_mgmt_v3_post_validate_fn] 0-management: postvalidate operation failed
[2017-02-15 13:05:20.103572] E [MSGID: 106121] [glusterd-mgmt.c:1660:glusterd_mgmt_v3_post_validate] 0-management: Post Validation failed for operation Snapshot on local node
[2017-02-15 13:05:20.103582] E [MSGID: 106122] [glusterd-mgmt.c:2363:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Post Validation Failed
[2017-02-15 13:11:15.862858] W [MSGID: 106057] [glusterd-snapshot-utils.c:410:glusterd_snap_volinfo_find] 0-management: Snap volume c3ceae3889484e96ab8bed69593cf6d3.s0.run-gluster-snaps-c3ceae3889484e96ab8bed69593cf6d3-brick1-data-brick not found [Argumento inválido]
[2017-02-15 13:11:16.314759] I [MSGID: 106143] [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick /run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick1/data/brick on port 49452
[2017-02-15 13:11:16.316090] I [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-02-15 13:11:16.348867] W [MSGID: 106057] [glusterd-snapshot-utils.c:410:glusterd_snap_volinfo_find] 0-management: Snap volume c3ceae3889484e96ab8bed69593cf6d3.s0.run-gluster-snaps-c3ceae3889484e96ab8bed69593cf6d3-brick6-data-arbiter not found [Argumento inválido]
[2017-02-15 13:11:16.558878] I [MSGID: 106143] [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick /run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick6/data/arbiter on port 49453
[2017-02-15 13:11:16.559883] I [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-02-15 13:11:23.279721] E [MSGID: 106030] [glusterd-snapshot.c:4736:glusterd_take_lvm_snapshot] 0-management: taking snapshot of the brick
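A log extract like this one is easier to read once it is filtered by severity. Below is a small Python sketch that pulls the timestamp, level, optional MSGID, and remaining text out of lines shaped like the ones in this message. The regular expression is inferred from this extract only, so treat it as an assumption rather than a description of the glusterd log format; the names `LOG_LINE`, `parse_line`, and `errors_only` are invented for illustration.

```python
import re

# Shape inferred from the extract: [timestamp] LEVEL [MSGID: n] rest...
# The MSGID part is optional because some lines above do not carry one.
LOG_LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"(?:\[MSGID: (?P<msgid>\d+)\]\s+)?"
    r"(?P<rest>.*)$"
)


def parse_line(line):
    """Return a dict with ts, level, msgid, rest, or None if the line differs."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def errors_only(lines):
    """Keep only the parsed entries whose level is 'E'."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["level"] == "E"]
```

Feeding the extract through `errors_only` would leave only the `E` lines, which makes it easier to see which failure came first.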
Re: [Gluster-users] Optimal shard size & self-heal algorithm for VM hosting?
On Wed, Feb 15, 2017 at 9:38 PM, Gambit15 wrote:
> Hey guys,
> I keep seeing different recommendations for the best shard sizes for VM
> images, from 64MB to 512MB.
>
> What's the benefit of smaller v larger shards?
> I'm guessing smaller shards are quicker to heal, but larger shards will
> provide better sequential I/O for single clients? Anything else?

That's the main difference. Smaller shards also provide better brick utilization and distribution of IO in distributed-replicated volumes than larger shards do.

> I also usually see "cluster.data-self-heal-algorithm: full" is generally
> recommended in these cases. Why not "diff"? Is it simply to reduce CPU load
> when there's plenty of excess network capacity?

That's correct. diff heal requires a rolling checksum to be computed for every 128KB chunk of the file on both source and sink bricks, which is CPU intensive and can affect IO traffic.

-Krutika

> Thanks in advance,
> Doug
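To put rough numbers on the trade-offs discussed in this reply, here is a small Python sketch. The 100 GiB image size is an arbitrary example, the function names are invented for illustration, and binary units (MiB, KiB) are assumed for the 64MB, 512MB, and 128KB sizes quoted in the thread.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def shard_count(image_bytes, shard_bytes):
    """Number of shards needed to hold an image (ceiling division)."""
    return -(-image_bytes // shard_bytes)


def checksum_chunks(file_bytes, chunk_bytes=128 * KIB):
    """Number of 128KB chunks a diff heal would checksum for one file."""
    return -(-file_bytes // chunk_bytes)


image = 100 * GIB  # hypothetical VM image size
for shard_mib in (64, 512):
    shard = shard_mib * MIB
    print(
        f"{shard_mib} MiB shards: {shard_count(image, shard)} shards per image, "
        f"{checksum_chunks(shard)} checksum chunks per shard"
    )
```

With these assumptions a smaller shard size means many more, smaller units to spread across bricks and to heal individually, while the per-shard checksum work under diff heal grows in proportion to the shard size, on both the source and the sink brick.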
Re: [Gluster-users] Error while setting cluster.granular-entry-heal
Could you please attach the "glfsheal-<VOLNAME>.log" logfile?

-Krutika

On Thu, Feb 16, 2017 at 12:05 AM, Andrea Fogazzi wrote:
> Hello,
>
> I have a gluster volume on 3.8.8 which has multiple volumes, each on
> distributed/replicated on 5 servers (2 replicas+1 quorum); each volume is
> both accessed as gluster client (RW, clients have 3.8.8 or 3.8.5) or
> Ganesha FS (RO).
>
> On one of the volumes we have:
>
> - cluster.data-self-heal-algorithm: full
> - cluster.locking-scheme: granular
>
> When we try to set
>
> - cluster.granular-entry-heal on
>
> (command I use is "gluster volume set vol-cor-homes
> cluster.granular-entry-heal on")
>
> I receive
>
> *volume set: failed: 'gluster volume set <VOLNAME>
> cluster.granular-entry-heal {enable, disable}' is not supported. Use
> 'gluster volume heal <VOLNAME> granular-entry-heal {enable, disable}'
> instead.*
>
> Answer is not clear to me; I also tried command suggested in the command
> response, but it does not work (I get "Enable granular entry heal on
> volume vol-cor-homes has been unsuccessful on bricks that are down. Please
> check if all brick processes are running." while I am sure all bricks are
> online).
>
> Do you have any suggestion on what I am doing wrong, or how to debug the
> issue?
>
> Thanks in advance.
>
> Best regards
>
> andrea
>
> --
> Andrea Fogazzi
> fo...@fogazzi.com