[Gluster-users] REMINDER: Gluster Community Bug Triage meeting today at 12:00 UTC
Hi all,

Later today we will have another Gluster Community Bug Triage meeting.

Meeting details:
- location: #gluster-meeting on Freenode IRC
- date: every Tuesday
- time: 12:00 UTC, 13:00 CET (in your terminal, run: date -d '12:00 UTC')
- agenda: https://public.pad.fsfe.org/p/gluster-bug-triage

Currently the following items are listed:
* Roll Call
* Status of last week's action items
* Group Triage
* Open Floor

The last two topics have space for additions. If you have a suitable bug or topic to discuss, please add it to the agenda.

Your host today is LalatenduM. I'm unfortunately not available this/my afternoon.

Thanks, Niels
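For anyone converting the meeting time, the date hint above works as follows (a minimal example; the output shown assumes a machine whose local timezone happens to be CET):

$ date -d '12:00 UTC'
Tue Nov 25 13:00:00 CET 2014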
Re: [Gluster-users] Disable rsync compression for georeplication
Hi John,

There seems to be a bug in the CLI command-line parsing which does not allow options prefixed with a double hyphen, apart from a few. As a workaround for now, do the following steps:

1. Stop the geo-replication session.
2. Add the following at the end of the file /var/lib/glusterd/geo-replication/gsyncd.conf on all master nodes:
   rsync_options = --compress-level=0
3. Start the geo-replication session.

Thanks and Regards,
Kotresh H R

- Original Message -
From: John Gardeniers jgardeni...@objectmastery.com
To: gluster-users@gluster.org
Sent: Tuesday, November 25, 2014 5:24:18 AM
Subject: [Gluster-users] Disable rsync compression for georeplication

Using Gluster 3.4.2, we have a situation where geo-replication is causing the CPU on the master to be hammered, while the bandwidth is having a holiday. I'd therefore like to disable the rsync compression for geo-replication. In a response to David F. Robinson asking the same question back in August, Vishwanath Bhat wrote:

You can use rsync-options to specify any rsync options that you want rsync to use. Just make sure that it does not conflict with the default rsync options used by geo-rep. For example:

# gluster volume geo-replication MASTER SLAVE config rsync-options '--bwlimit=value'

Finding no way to negate the --compress option used in geo-replication, I thought that perhaps I could get part way there by simply winding down the compression level using

gluster volume geo-replication MASTER SLAVE config rsync-options '--compress-level=0'

(Yes, I did use real entries for MASTER and SLAVE.) The return from that command was "unrecognized option --compress-level=0". I tried a few other numbers and got the same negative results. Can someone please advise how to disable rsync compression for geo-replication? Thanks.

regards,
John
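Putting the whole workaround together as a shell sketch (MASTER and SLAVE are placeholders for the real volume and slave names; the config edit must be applied on every master node, with the gsyncd.conf path as given above):

# 1. stop the geo-replication session
gluster volume geo-replication MASTER SLAVE stop

# 2. append the rsync option on all master nodes
echo 'rsync_options = --compress-level=0' >> /var/lib/glusterd/geo-replication/gsyncd.conf

# 3. start the session again
gluster volume geo-replication MASTER SLAVE start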
Re: [Gluster-users] locking in 3.6.1
On 11/24/14, 11:56 PM, Atin Mukherjee wrote: Can you please find/point out the first instance of the command and its associated glusterd log which failed to acquire the cluster wide lock. Can you help me identify what I should be looking for in the logs? I restarted the glusterd service and see the following on server gluster2: [2014-11-25 13:34:50.552695] W [glusterfsd.c:1194:cleanup_and_exit] (-- 0-: received signum (15), shutting down [2014-11-25 13:34:50.569445] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.6.1 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid) [2014-11-25 13:34:50.576951] I [glusterd.c:1214:init] 0-management: Maximum allowed open file descriptors set to 65536 [2014-11-25 13:34:50.577008] I [glusterd.c:1259:init] 0-management: Using /var/lib/glusterd as working directory [2014-11-25 13:34:50.581436] E [rpc-transport.c:266:rpc_transport_load] 0-rpc-transport: /usr/lib64/glusterfs/3.6.1/rpc-transport/rdma.so: cannot open shared object file: No such file or directory [2014-11-25 13:34:50.581469] W [rpc-transport.c:270:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine [2014-11-25 13:34:50.581486] W [rpcsvc.c:1524:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed [2014-11-25 13:34:50.583680] I [glusterd.c:413:glusterd_check_gsync_present] 0-glusterd: geo-replication module not installed in the system [2014-11-25 13:34:50.584503] I [glusterd-store.c:2043:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30501 [2014-11-25 13:34:51.074904] I [glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0 [2014-11-25 13:34:51.075024] I [glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0 [2014-11-25 13:34:51.075107] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2014-11-25 13:34:51.082647] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2014-11-25 13:34:51.089136] I [glusterd-store.c:3501:glusterd_store_retrieve_missed_snaps_list] 0-management: No missed snaps list. 
[2014-11-25 13:34:51.094388] I [glusterd.c:146:glusterd_uuid_init] 0-management: retrieved UUID: 23989211-4f0d-4087-b9c5-bc82295b2c38 Final graph: +--+ 1: volume management 2: type mgmt/glusterd 3: option rpc-auth.auth-glusterfs on 4: option rpc-auth.auth-unix on 5: option rpc-auth.auth-null on 6: option transport.socket.listen-backlog 128 7: option ping-timeout 30 8: option transport.socket.read-fail-log off 9: option transport.socket.keepalive-interval 2 10: option transport.socket.keepalive-time 10 11: option transport-type rdma 12: option working-directory /var/lib/glusterd 13: end-volume 14: +--+ [2014-11-25 13:34:57.330091] W [socket.c:611:__socket_rwv] 0-management: readv on 192.168.30.107:24007 failed (No data available) [2014-11-25 13:34:57.330583] E [rpc-clnt.c:362:saved_frames_unwind] (-- /lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f7065b04396] (-- /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f70658d6fce] (-- /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f70658d70de] (-- /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x82)[0x7f70658d8a42] (-- /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f70658d91f8] ) 0-management: forced unwinding frame type(Peer mgmt) op(--(2)) called at 2014-11-25 13:34:51.150255 (xid=0x5) [2014-11-25 13:34:57.330641] I [MSGID: 106004] [glusterd-handler.c:4365:__glusterd_peer_rpc_notify] 0-management: Peer bb6b-b048-4dc9-b54d-12a0cc2dd8a9, in Peer in Cluster state, has disconnected from glusterd. [2014-11-25 13:34:57.338507] I [glusterd-rpc-ops.c:436:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 60b2251f-2d69-41a6-91da-fd3f14c5a1e6, host: gluster3.innova.local, port: 0 [2014-11-25 13:34:57.361000] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2014-11-25 13:34:57.361306] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2014-11-25 13:34:57.361535] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2014-11-25 13:34:57.361781] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2014-11-25 13:34:57.362014] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2014-11-25 13:34:57.362241] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2014-11-25 13:34:57.362458] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2014-11-25
[Gluster-users] Todays minutes of the Gluster Community Bug Triage meeting
On 11/25/2014 01:41 PM, Niels de Vos wrote:
Hi all, Later today we will have another Gluster Community Bug Triage meeting.
Meeting details:
- location: #gluster-meeting on Freenode IRC
- date: every Tuesday
- time: 12:00 UTC, 13:00 CET (in your terminal, run: date -d '12:00 UTC')
- agenda: https://public.pad.fsfe.org/p/gluster-bug-triage
Currently the following items are listed:
* Roll Call
* Status of last week's action items
* Group Triage
* Open Floor
The last two topics have space for additions. If you have a suitable bug or topic to discuss, please add it to the agenda.
Your host today is LalatenduM. I'm unfortunately not available this/my afternoon.
Thanks, Niels

zodbot: Meeting ended Tue Nov 25 12:55:40 2014 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot .
zodbot: Minutes: http://meetbot.fedoraproject.org/gluster-meeting/2014-11-25/gluster-meeting.2014-11-25-12.02.html
zodbot: Minutes (text): http://meetbot.fedoraproject.org/gluster-meeting/2014-11-25/gluster-meeting.2014-11-25-12.02.txt
zodbot: Log: http://meetbot.fedoraproject.org/gluster-meeting/2014-11-25/gluster-meeting.2014-11-25-12.02.log.html

Meeting summary
---
* Roll Call (lalatenduM, 12:02:45)
* Status of last week's action items (lalatenduM, 12:06:16)
  * hagarth will look for somebody that can act like a bug assigner/manager kind of person (lalatenduM, 12:06:30)
  * pranithk to report how his team is assigning triaged bugs (lalatenduM, 12:07:17)
  * Humble will request the replacement of old versions by a new unsupported version (lalatenduM, 12:08:14)
  * ACTION: Humble to remove pre-2.0 from versions (lalatenduM, 12:13:21)
  * ndevos will update the unsupported bugs with a message about being unsupported, and request for testing (lalatenduM, 12:09:31)
  * hagarth should update the MAINTAINERS file, add current maintainers and new components like Snapshot (lalatenduM, 12:14:48)
* Group Triage (lalatenduM, 12:16:22)
  * New bugs to triage (filed since last week): (lalatenduM, 12:17:23)
* open floor (lalatenduM, 12:46:42)

Meeting ended at 12:55:40 UTC

Thanks, Lala
[Gluster-users] Need some clarifications about the disperse feature
Hello Gluster experts,

I have been using gluster for a small cluster for a few years now and I have a question regarding the new disperse feature, which is for me a much anticipated addition.

*Suppose* I create a volume with a disperse set of 3, redundancy 1 (let's call them A1, A2, A3) and then I add 3 more bricks to that volume (we'll call them B1, B2, B3).

*First question* - which of the bricks will be the one carrying the redundancy data?

*Second question* - If I have machines with faster disks - should I assign them to the data or the redundancy bricks? What should I expect the load to be on the redundancy machine in heavy read scenarios and in heavy write scenarios?

*Third question* - *does this require reading the entire data* of A1, A2 and A3 by initiating a heal or another operation?

*4th question* (and most important for me) - I saw in the list that it is now a Distributed-Dispersed volume. I understand I can now lose, for example, bricks A1 and B1 and still have my entire data intact. Is this also correct for bricks from the same set, for example A1 and A2? Or to put it in a more generic way - *does this create the exact same dispersed volume as if I created it originally with A1, A2, A3, B1, B2, B3 and a redundancy of 2?*

Many thanks for your work and for your help on this list,
Ayelet
Re: [Gluster-users] Need some clarifications about the disperse feature
Xavi will be the better person to clear all your doubts on this feature; however, as per my understanding, please see the responses inline.

~Atin

On 11/25/2014 07:11 PM, Ayelet Shemesh wrote:
Hello Gluster experts, I have been using gluster for a small cluster for a few years now and I have a question regarding the new disperse feature, which is for me a much anticipated addition. *Suppose* I create a volume with a disperse set of 3, redundancy 1 (let's call them A1, A2, A3) and then I add 3 more bricks to that volume (we'll call them B1, B2, B3).

*First question* - which of the bricks will be the one carrying the redundancy data?

The current implementation is *non-systematic*, which means we don't have any dedicated parity/redundancy brick.

*Second question* - If I have machines with faster disks - should I assign them to the data or the redundancy bricks? What should I expect the load to be on the redundancy machine in heavy read scenarios and in heavy write scenarios?

As mentioned above, this configuration is not possible with a non-systematic implementation.

*Third question* - *does this require reading the entire data* of A1, A2 and A3 by initiating a heal or another operation?

If the configuration is 2+1 as you mentioned, you can recover the whole set of data from any two of the three bricks; the algorithm provides the intelligence to reconstruct the chunk of data residing on a brick which might be down in this configuration.

*4th question* (and most important for me) - I saw in the list that it is now a Distributed-Dispersed volume. I understand I can now lose, for example, bricks A1 and B1 and still have my entire data intact. Is this also correct for bricks from the same set, for example A1 and A2? Or to put it in a more generic way - *does this create the exact same dispersed volume as if I created it originally with A1, A2, A3, B1, B2, B3 and a redundancy of 2?*

No. If you see the volume info with this configuration it will show you 2 x (2+1), which means on every set the quorum is two, i.e. you need to have at least two bricks running.

Many thanks for your work and for your help on this list,
Ayelet
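For context, the 2 x (2+1) layout discussed above is what you would typically get from a single create command along these lines (host and brick names are placeholders; this is a sketch, so verify the exact syntax against your version's CLI help):

gluster volume create dispvol disperse 3 redundancy 1 \
    hostA1:/bricks/b1 hostA2:/bricks/b1 hostA3:/bricks/b1 \
    hostB1:/bricks/b1 hostB2:/bricks/b1 hostB3:/bricks/b1

With six bricks and a disperse count of 3, gluster distributes the data across two (2+1) dispersed subvolumes, matching the "2 x (2+1)" volume info output described in the reply.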
[Gluster-users] Log file timestamp
Is there a way to have the gluster log files' date/time stamps use the local timezone? This system is in EST and the times recorded in the logs are off by 5 hours. I'm assuming logs are represented as UTC/GMT. This is using Gluster 3.5.2. Thanks
[Gluster-users] ping_timer_expired
We have three Gluster clusters, all three of which are exhibiting the same symptom: FUSE clients report network ping timeouts from bricks, disconnect from the volume, and then very quickly re-connect to all bricks. An example from the client logs: [2014-11-20 01:19:09.079725] C [client-handshake.c:127:rpc_client_ping_timer_expired] 0-wp_uploads-client-2: server 192.168.135.37:49152 has not responded in the last 5 seconds, disconnecting. [2014-11-20 01:19:09.278701] E [rpc-clnt.c:369:saved_frames_unwind] (--/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15d) [0x38ca00fced] (--/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3) [0x38ca00f833] (--/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x38ca00f74e]))) 0-wp_uploads-client-2: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2014-11-20 01:19:03.771414 (xid=0x7bd8a6) [2014-11-20 01:19:09.278734] W [client-rpc-fops.c:2774:client3_3_lookup_cbk] 0-wp_uploads-client-2: remote operation failed: Transport endpoint is not connected. Path: / (----0001) [2014-11-20 01:19:09.278893] E [rpc-clnt.c:369:saved_frames_unwind] (--/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15d) [0x38ca00fced] (--/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3) [0x38ca00f833] (--/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x38ca00f74e]))) 0-wp_uploads-client-2: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2014-11-20 01:19:03.771917 (xid=0x7bd8a7) [2014-11-20 01:19:09.278901] W [client-rpc-fops.c:2774:client3_3_lookup_cbk] 0-wp_uploads-client-2: remote operation failed: Transport endpoint is not connected. Path: / (----0001) [2014-11-20 01:19:09.279008] E [rpc-clnt.c:369:saved_frames_unwind] (--/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15d) [0x38ca00fced] (--/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3) [0x38ca00f833] (--/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x38ca00f74e]))) 0-wp_uploads-client-2: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2014-11-20 01:19:04.072860 (xid=0x7bd8a8) [2014-11-20 01:19:09.279028] W [client-handshake.c:276:client_ping_cbk] 0-wp_uploads-client-2: timer must have expired [2014-11-20 01:19:09.279090] E [rpc-clnt.c:369:saved_frames_unwind] (--/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15d) [0x38ca00fced] (--/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3) [0x38ca00f833] (--/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x38ca00f74e]))) 0-wp_uploads-client-2: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2014-11-20 01:19:07.070544 (xid=0x7bd8a9) [2014-11-20 01:19:09.279099] W [client-rpc-fops.c:2774:client3_3_lookup_cbk] 0-wp_uploads-client-2: remote operation failed: Transport endpoint is not connected. Path: / (----0001) [2014-11-20 01:19:09.287885] I [client.c:2229:client_rpc_notify] 0-wp_uploads-client-2: disconnected from 192.168.135.37:49152. Client process will keep trying to connect to glusterd until brick's port is available [2014-11-20 01:19:09.377628] I [socket.c:3060:socket_submit_request] 0-wp_uploads-client-2: not connected (priv-connected = 0) [2014-11-20 01:19:09.377669] W [rpc-clnt.c:1542:rpc_clnt_submit] 0-wp_uploads-client-2: failed to submit rpc-request (XID: 0x7bd8aa Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (wp_uploads-client-2) [2014-11-20 01:19:09.377692] W [client-rpc-fops.c:2774:client3_3_lookup_cbk] 0-wp_uploads-client-2: remote operation failed: Transport endpoint is not connected. 
Path: /2014 (10537923-c903-4a34-af42-f74b9eb6cf11) [2014-11-20 01:19:09.498741] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-wp_uploads-client-2: changing port to 49152 (from 0) [2014-11-20 01:19:09.501832] I [client-handshake.c:1677:select_server_supported_programs] 0-wp_uploads-client-2: Using Program GlusterFS 3.3, Num (1298437), Version (330) [2014-11-20 01:19:09.538700] I [client-handshake.c:1462:client_setvolume_cbk] 0-wp_uploads-client-2: Connected to 192.168.135.37:49152, attached to remote volume '/bricks/brick1/brick'. [2014-11-20 01:19:09.538718] I [client-handshake.c:1474:client_setvolume_cbk] 0-wp_uploads-client-2: Server and Client lk-version numbers are not same, reopening the fds [2014-11-20 01:19:09.548683] I [client-handshake.c:450:client_set_lk_version_cbk] 0-wp_uploads-client-2: Server lk version = 1 As you can see, the error and the resolution occur within the same second. The servers on cluster 1 are physical servers with 10G network devices. The clients for this cluster are VMs in a different subnet, so all traffic passes through a firewall. These servers and clients run Gluster 3.6.1. The servers on cluster 2 are VMs. The clients for this cluster are VMs in a different subnet, so all traffic passes through a firewall. These servers and clients run Gluster 3.6.1. The servers on cluster 3 are VMs. The clients for this cluster are the other servers in the cluster, and all
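One tunable worth noting for threads like this: the 5-second window in the first log line is the volume's network.ping-timeout, which has evidently been lowered here from the usual upstream default (42 seconds, to the best of my knowledge). A hedged sketch of checking and raising it, using the volume name from the logs above:

# show the value currently set on the volume (only listed if it was reconfigured)
gluster volume info wp_uploads | grep ping-timeout

# raise it back toward the default
gluster volume set wp_uploads network.ping-timeout 42

A longer timeout makes clients ride out brief network or firewall hiccups instead of disconnecting, at the cost of slower failure detection.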
Re: [Gluster-users] locking in 3.6.1
On 11/25/14, 10:06 AM, Atin Mukherjee wrote:
On 11/25/2014 07:08 PM, Scott Merrill wrote:
On 11/24/14, 11:56 PM, Atin Mukherjee wrote:
Can you please find/point out the first instance of the command and its associated glusterd log which failed to acquire the cluster-wide lock?
Can you help me identify what I should be looking for in the logs?
grep for the first instance of "locking failed" in the glusterd log on the server where the command failed.

smerrill@gluster2:PRODUCTION:~$ sudo grep "locking failed" /var/log/glusterfs/etc-glusterfs-glusterd.vol.log*
/var/log/glusterfs/etc-glusterfs-glusterd.vol.log-20141124:[2014-11-19 19:05:19.312168] E [glusterd-syncop.c:105:gd_collate_errors] 0-: Unlocking failed on gluster1.innova.local. Please check log file for details.

smerrill@gluster3:PRODUCTION:~$ sudo grep "locking failed" /var/log/glusterfs/etc-glusterfs-glusterd.vol.log*
/var/log/glusterfs/etc-glusterfs-glusterd.vol.log-20141123:[2014-11-20 16:40:08.368442] E [glusterd-syncop.c:105:gd_collate_errors] 0-: Unlocking failed on gluster2.innova.local. Please check log file for details.

smerrill@gluster1:PRODUCTION:~$ sudo grep "locking failed" /var/log/glusterfs/etc-glusterfs-glusterd.vol.log*
/var/log/glusterfs/etc-glusterfs-glusterd.vol.log-20141124:[2014-11-22 02:10:09.795695] E [glusterd-syncop.c:105:gd_collate_errors] 0-: Unlocking failed on gluster2.innova.local. Please check log file for details.
Re: [Gluster-users] Log file timestamp
I would like to add that all other log files (/var/log/messages, /var/log/syslog, /var/log/secure, etc.) have the correct entries using the local timezone. It is only the gluster log files that are off by 5 hours. Could someone point me in the right direction on configuring this? This Gluster app resides on Red Hat Enterprise Linux Server release 6.2.

Thanks!

On Tue, Nov 25, 2014 at 10:07 AM, Atin Mukherjee amukh...@redhat.com wrote:
This has to be set up in the server based on what time format you want. I don't see any application dependency here.
~Atin

On 11/25/2014 08:34 PM, Koby, Bradley wrote:
Is there a way to have the gluster log files' date/time stamps use the local timezone? This system is in EST and the times recorded in the logs are off by 5 hours. I'm assuming logs are represented as UTC/GMT. This is using Gluster 3.5.2. Thanks
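As a stopgap while reading the logs, a UTC timestamp can be converted to local time on the fly with GNU date (example output assumes a machine whose local timezone is EST):

$ date -d '2014-11-25 13:34:50 UTC'
Tue Nov 25 08:34:50 EST 2014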
Re: [Gluster-users] Gluster volume not automounted when peer is down
A much simpler answer is to assign a hostname to multiple IP addresses (round robin DNS). When gethostbyname() returns multiple entries, the client will try them all until it's successful.

On 11/24/2014 06:23 PM, Paul Robert Marino wrote:
This is simple and can be handled in many ways. Some background first. The mount point is a single IP or host name. The only thing the client uses it for is to download a volume definition (volfile) describing all the bricks in the cluster. The next thing is it opens connections to all the nodes containing bricks for that volume. So the answer is to tell the client to connect to a virtual IP address. I personally use keepalived for this, but you can use any one of the many IPVS or other tools that manage IPs for this. I assign the VIP to a primary node, then have each node monitor the cluster processes; if they die on a node, it goes into a faulted state and cannot own the VIP. As long as the clients are connecting to a running host in the cluster you are fine, even if that host doesn't own bricks in the volume but is aware of them as part of the cluster.
-- Sent from my HP Pre3

On Nov 24, 2014 8:07 PM, Eric Ewanco eric.ewa...@genband.com wrote:
Hi all, We’re trying to use gluster as a replicated volume. It works OK when both peers are up, but when one peer is down and the other reboots, the “surviving” peer does not automount glusterfs. Furthermore, after the boot sequence is complete, it can be mounted without issue. It automounts fine when the peer is up during startup. I tried to google this and while I found some similar issues, I haven’t found any solutions to my problem. Any insight would be appreciated. Thanks.

gluster volume info output (after startup):
Volume Name: rel-vol
Type: Replicate
Volume ID: 90cbe313-e9f9-42d9-a947-802315ab72b0
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.250.1.1:/export/brick1
Brick2: 10.250.1.2:/export/brick1

gluster peer status output (after startup):
Number of Peers: 1
Hostname: 10.250.1.2
Uuid: 8d49b929-4660-4b1e-821b-bfcd6291f516
State: Peer in Cluster (Disconnected)

Original volume create command:
gluster volume create rel-vol rep 2 transport tcp 10.250.1.1:/export/brick1 10.250.1.2:/export/brick1

I am running Gluster 3.4.5 on OpenSuSE 12.2. gluster --version:
glusterfs 3.4.5 built on Jul 25 2014 08:31:19
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. http://www.gluster.com
GlusterFS comes with ABSOLUTELY NO WARRANTY. You may redistribute copies of GlusterFS under the terms of the GNU General Public License.

The fstab line is:
localhost:/rel-vol /home glusterfs defaults,_netdev 0 0

lsof -i :24007-24100:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
glusterd 4073 root 6u IPv4 82170 0t0 TCP s1:24007-s1:1023 (ESTABLISHED)
glusterd 4073 root 9u IPv4 13816 0t0 TCP *:24007 (LISTEN)
glusterd 4073 root 10u IPv4 88106 0t0 TCP s1:exp2-s2:24007 (SYN_SENT)
glusterfs 4097 root 8u IPv4 16751 0t0 TCP s1:1023-s1:24007 (ESTABLISHED)

This is shorter than it is when it works, but maybe that's because the mount spawns some more processes. Some ports are down:

root@q50-s1:/root telnet localhost 24007
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
telnet close
Connection closed.
root@q50-s1:/root telnet localhost 24009
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused ps axww | fgrep glu: 4073 ?Ssl0:10 /usr/sbin/glusterd -p /run/glusterd.pid 4097 ?Ssl0:00 /usr/sbin/glusterfsd -s 10.250.1.1 --volfile-id rel-vol.10.250.1.1.export-brick1 -p /var/lib/glusterd/vols/rel-vol/run/10.250.1.1-export-brick1.pid -S /var/run/89ba432ed09e07e107723b4b266e18f9.socket --brick-name /export/brick1 -l /var/log/glusterfs/bricks/export-brick1.log --xlator-option *-posix.glusterd-uuid=3b02a581-8fb9-4c6a-8323-9463262f23bc --brick-port 49152 --xlator-option rel-vol-server.listen-port=49152 5949 ttyS0S+ 0:00 fgrep glu These are the error messages I see in /var/log/gluster/home.log (/home is the mountpoint): +--+ [2014-11-24 13:51:27.932285] E [client-handshake.c:1742:client_query_portmap_cbk] 0-rel-vol-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running. [2014-11-24 13:51:27.932373] W [socket.c:514:__socket_rwv] 0-rel-vol-client-0: readv failed (No data available) [2014-11-24 13:51:27.932405] I [client.c:2098:client_rpc_notify] 0-rel-vol-client-0: disconnected [2014-11-24 13:51:30.818281] E
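One mitigation worth testing for the external-client case discussed in this thread (it would not help the localhost-mount case above): the FUSE client accepts a fallback volfile server in fstab, so fetching the volume definition does not depend on a single host. A sketch, using the option spelling I believe applies around 3.4/3.5 (newer releases use backup-volfile-servers; verify against your version's mount.glusterfs man page):

10.250.1.1:/rel-vol /home glusterfs defaults,_netdev,backupvolfile-server=10.250.1.2 0 0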
Re: [Gluster-users] snapshots fail on 3.6.1
I have a fresh install of gluster v3.6.1 on debian:

# dpkg -l|grep gluster
ii glusterfs-client 3.6.1-1
ii glusterfs-common 3.6.1-1
ii glusterfs-server 3.6.1-1

When I issue this command:

# gluster snapshot create 20141125 testing

I get this error:

snapshot create: failed: Cluster operating version is lesser than the supported version for a snapshot
Snapshot command failed

In fact, all gluster snapshot commands fail with the same error. Here's the info on the volume:

# gluster volume info testing
Volume Name: testing
Type: Distributed-Replicate
Volume ID: 3ee751af-2160-432b-998c-e1b4742c0f20
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: IP:/data/testing
Brick2: IP:/data/testing
Brick3: IP:/data/testing
Brick4: IP:/data/testing

Can anyone give me some guidance? Thanks much
[Gluster-users] Unable to add brick from new fresh installed server into upgraded cluster
I have a 2-server CentOS 6 replicated cluster which started out as version 3.3 and has been progressively updated and is now on version 3.6.1. Yesterday I added a new freshly installed CentOS 6.6 host and wanted to convert one of my volumes to replica 3; however, I was unable to add the brick, as it reported that all the brick hosts had to be at version 03060. Presumably some minimum compatibility is set on the volume, but I am struggling to find where. The info file under /var/lib/glusterd/vols/volname has op-version=2, client-op-version=2. Any suggestions?

Thanks, Alastair
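For anyone else hunting for where this is stored: besides the per-volume op-version fields quoted above, there is a cluster-wide operating version kept in glusterd's own info file. A quick way to check it on each node (paths assume a standard install):

grep operating-version /var/lib/glusterd/glusterd.info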
Re: [Gluster-users] Disable rsync compression for georeplication
Hi Kotresh,

Thank you. Unfortunately, it doesn't seem to have made much difference, with rsync still hammering the CPU.

regards, John

On 25/11/14 20:05, Kotresh Hiremath Ravishankar wrote:
Hi John, There seems to be a bug in the CLI command-line parsing which does not allow options prefixed with a double hyphen, apart from a few. As a workaround for now, do the following steps:
1. Stop the geo-replication session.
2. Add the following at the end of the file /var/lib/glusterd/geo-replication/gsyncd.conf on all master nodes: rsync_options = --compress-level=0
3. Start the geo-replication session.
Thanks and Regards, Kotresh H R

- Original Message -
From: John Gardeniers jgardeni...@objectmastery.com
To: gluster-users@gluster.org
Sent: Tuesday, November 25, 2014 5:24:18 AM
Subject: [Gluster-users] Disable rsync compression for georeplication

Using Gluster 3.4.2, we have a situation where geo-replication is causing the CPU on the master to be hammered, while the bandwidth is having a holiday. I'd therefore like to disable the rsync compression for geo-replication. In a response to David F. Robinson asking the same question back in August, Vishwanath Bhat wrote:

You can use rsync-options to specify any rsync options that you want rsync to use. Just make sure that it does not conflict with the default rsync options used by geo-rep. For example:

# gluster volume geo-replication MASTER SLAVE config rsync-options '--bwlimit=value'

Finding no way to negate the --compress option used in geo-replication, I thought that perhaps I could get part way there by simply winding down the compression level using

gluster volume geo-replication MASTER SLAVE config rsync-options '--compress-level=0'

(Yes, I did use real entries for MASTER and SLAVE.) The return from that command was "unrecognized option --compress-level=0". I tried a few other numbers and got the same negative results. Can someone please advise how to disable rsync compression for geo-replication? Thanks.

regards, John
Re: [Gluster-users] Gluster volume not automounted when peer is down
Ah, now I see your problem. Have you considered setting it not to mount automatically on boot, then adding a simple init script which runs after the gluster services have started to mount it? That would effectively solve your problem.

However, generally you don't want to reboot the second node in a 2 node cluster while the other host is down. Furthermore, you want to be very careful about rebooting the second node after the first node comes back. Specifically, you need to be sure that a full self heal has run successfully, or you risk a split brain scenario. Resolving split brain scenarios is a pain!

-- Sent from my HP Pre3

On Nov 25, 2014 2:14 PM, Eric Ewanco eric.ewa...@genband.com wrote:
Hmmm. I can see how that would work if you had an external client that wanted to connect to a cluster, and one of the nodes was down; it would need to pick another node in the cluster, yes. But that's not my scenario. My scenario is I have two nodes in a cluster, one node goes down, and if the surviving node reboots, it cannot mount the gluster volume *on itself* at startup, but mounts it fine after startup. In other words, the "mount -t glusterfs -a" fails (and fstab contains "localhost:/rel-vol /home glusterfs defaults,_netdev 0 0") on startup, but not afterwards. I am not successfully mapping these answers to my situation -- if I change "localhost" to a round-robin DNS address with two addresses, it will either return the down node (which does us no good) or the current node, which is equivalent to localhost, and presumably it will do the same thing, won't it?

Confused, Eric

From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Joe Julian
Sent: Tuesday, November 25, 2014 1:04 PM
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster volume not automounted when peer is down

A much simpler answer is to assign a hostname to multiple IP addresses (round robin DNS). When gethostbyname() returns multiple entries, the client will try them all until it's successful.

On 11/24/2014 06:23 PM, Paul Robert Marino wrote:
This is simple and can be handled in many ways. Some background first. The mount point is a single IP or host name. The only thing the client uses it for is to download a volume definition (volfile) describing all the bricks in the cluster. The next thing is it opens connections to all the nodes containing bricks for that volume. So the answer is to tell the client to connect to a virtual IP address. I personally use keepalived for this, but you can use any one of the many IPVS or other tools that manage IPs for this. I assign the VIP to a primary node, then have each node monitor the cluster processes; if they die on a node, it goes into a faulted state and cannot own the VIP. As long as the clients are connecting to a running host in the cluster you are fine, even if that host doesn't own bricks in the volume but is aware of them as part of the cluster.

-- Sent from my HP Pre3

On Nov 24, 2014 8:07 PM, Eric Ewanco eric.ewa...@genband.com wrote:
Hi all, We're trying to use gluster as a replicated volume. It works OK when both peers are up, but when one peer is down and the other reboots, the "surviving" peer does not automount glusterfs. Furthermore, after the boot sequence is complete, it can be mounted without issue. It automounts fine when the peer is up during startup. I tried to google this and while I found some similar issues, I haven't found any solutions to my problem. Any insight would be appreciated. Thanks.
gluster volume info output (after startup): Volume Name: rel-vol Type: Replicate Volume ID: 90cbe313-e9f9-42d9-a947-802315ab72b0 Status: Started Number of Bricks: 1 x 2 = 2 Transport-type: tcp Bricks: Brick1: 10.250.1.1:/export/brick1 Brick2: 10.250.1.2:/export/brick1 gluster peer status output (after startup): Number of Peers: 1 Hostname: 10.250.1.2 Uuid: 8d49b929-4660-4b1e-821b-bfcd6291f516 State: Peer in Cluster (Disconnected) Original volume create command: gluster volume create rel-vol rep 2 transport tcp 10.250.1.1:/export/brick1 10.250.1.2:/export/brick1 I am running Gluster 3.4.5 on OpenSuSE 12.2. gluster --version: glusterfs 3.4.5 built on Jul 25 2014 08:31:19 Repository revision: git://git.gluster.com/glusterfs.git Copyright (c) 2006-2011 Gluster Inc. http://www.gluster.com GlusterFS comes with ABSOLUTELY NO WARRANTY. You may redistribute copies of GlusterFS under the terms of the GNU General Public License. The fstab line is: localhost:/rel-vol /home glusterfs defaults,_netdev 0 0 lsof -i :24007-24100: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME glusterd 4073 root 6u IPv4 82170 0t0 TCP s1:24007-s1:1023 (ESTABLISHED) glusterd 4073 root 9u IPv4 13816 0t0 TCP *:24007 (LISTEN) glusterd 4073 root 10u IPv4 88106 0t0 TCP s1:exp2-s2:24007 (SYN_SENT) glusterfs 4097 root 8u IPv4 16751 0t0 TCP s1:1023-s1:24007 (ESTABLISHED) This is shorter than it is when it works, but maybe thats because the mount spawns some more processes. Some ports are down:
Re: [Gluster-users] [ovirt-users] Gluster command [UNKNOWN] failed on server...
Hi, My Glusterfs version is: glusterfs-3.6.1-1.el7

On Wed, Nov 26, 2014 at 1:59 AM, Kanagaraj Mayilsamy kmayi...@redhat.com wrote:
[+Gluster-users@gluster.org]

glusterd throws the error "Initialization of volume 'management' failed, review your volfile again" when the service is started automatically after the reboot. But the service is successfully started later manually by the user. Can somebody from gluster-users please help on this? glusterfs version: 3.5.1

Thanks, Kanagaraj

- Original Message -
From: Punit Dambiwal hypu...@gmail.com
To: Kanagaraj kmayi...@redhat.com
Cc: us...@ovirt.org
Sent: Tuesday, November 25, 2014 7:24:45 PM
Subject: Re: [ovirt-users] Gluster command [UNKNOWN] failed on server...

Hi Kanagaraj, Please check the attached log files. I didn't find anything special.

On Tue, Nov 25, 2014 at 12:12 PM, Kanagaraj kmayi...@redhat.com wrote:
Do you see any errors in /var/log/glusterfs/etc-glusterfs-glusterd.vol.log or vdsm.log when the service is trying to start automatically after the reboot?
Thanks, Kanagaraj

On 11/24/2014 08:13 PM, Punit Dambiwal wrote:
Hi Kanagaraj, Yes...once I start the gluster service and then vdsmd...the host can connect to the cluster...but the question is why it's not started even though it has chkconfig enabled... I have tested it in a two-host cluster environment (CentOS 6.6 and CentOS 7.0); on both hypervisor clusters it failed to reconnect to the cluster after reboot. In both environments glusterd is enabled for the next boot...but it failed with the same error...seems it's a bug in either gluster or oVirt?? Please help me to find a workaround here if it cannot be resolved...as without this the host machine cannot connect after reboot...that means the engine will consider it as down, and every time we need to manually start the gluster service and vdsmd... ??

Thanks, Punit

On Mon, Nov 24, 2014 at 10:20 PM, Kanagaraj kmayi...@redhat.com wrote:
From vdsm.log: "error: Connection failed. Please check if gluster daemon is operational." Starting the glusterd service should fix this issue: 'service glusterd start'. But I am wondering why the glusterd was not started automatically after the reboot.
Thanks, Kanagaraj On 11/24/2014 07:18 PM, Punit Dambiwal wrote: Hi Kanagaraj, Please find the attached VDSM logs :- Thread-13::DEBUG::2014-11-24 21:41:17,182::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-13::DEBUG::2014-11-24 21:41:17,182::task::993::Storage.TaskManager.Task::(_decref) Task=`1691d409-9b27-4585-8281-5ec26154367a`::ref 0 aborting False Thread-13::DEBUG::2014-11-24 21:41:32,393::task::595::Storage.TaskManager.Task::(_updateState) Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::moving from state init - state preparing Thread-13::INFO::2014-11-24 21:41:32,393::logUtils::44::dispatcher::(wrapper) Run and protect: repoStats(options=None) Thread-13::INFO::2014-11-24 21:41:32,393::logUtils::47::dispatcher::(wrapper) Run and protect: repoStats, Return response: {} Thread-13::DEBUG::2014-11-24 21:41:32,393::task::1191::Storage.TaskManager.Task::(prepare) Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::finished: {} Thread-13::DEBUG::2014-11-24 21:41:32,394::task::595::Storage.TaskManager.Task::(_updateState) Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::moving from state preparing - state finished Thread-13::DEBUG::2014-11-24 21:41:32,394::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {} Thread-13::DEBUG::2014-11-24 21:41:32,394::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-13::DEBUG::2014-11-24 21:41:32,394::task::993::Storage.TaskManager.Task::(_decref) Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::ref 0 aborting False Thread-13::DEBUG::2014-11-24 21:41:41,550::BindingXMLRPC::1132::vds::(wrapper) client [10.10.10.2]::call getCapabilities with () {} Thread-13::DEBUG::2014-11-24 21:41:41,553::utils::738::root::(execCmd) /sbin/ip route show to 0.0.0.0/0 table all (cwd None) Thread-13::DEBUG::2014-11-24 21:41:41,560::utils::758::root::(execCmd) SUCCESS: err = ''; rc = 0 Thread-13::DEBUG::2014-11-24 21:41:41,588::caps::728::root::(_getKeyPackages) rpm package ('gluster-swift',) not found Thread-13::DEBUG::2014-11-24 21:41:41,592::caps::728::root::(_getKeyPackages) rpm package ('gluster-swift-object',) not found Thread-13::DEBUG::2014-11-24 21:41:41,593::caps::728::root::(_getKeyPackages) rpm package ('gluster-swift-plugin',) not found Thread-13::DEBUG::2014-11-24 21:41:41,598::caps::728::root::(_getKeyPackages) rpm package
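For the boot-ordering question raised in this thread, a quick sanity check that glusterd is actually enabled at boot on both OS generations mentioned (CentOS 6 uses SysV init, CentOS 7 uses systemd; standard commands, adjust to your setup):

# CentOS 6
chkconfig --list glusterd
chkconfig glusterd on

# CentOS 7
systemctl is-enabled glusterd
systemctl enable glusterd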
Re: [Gluster-users] No such file or directory in logs
On 11/25/2014 05:59 AM, Derick Turner wrote:
Gluster version is the standard Ubuntu 14.04 LTS repo version -
glusterfs 3.4.2 built on Jan 14 2014 18:05:37
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. http://www.gluster.com
GlusterFS comes with ABSOLUTELY NO WARRANTY. You may redistribute copies of GlusterFS under the terms of the GNU General Public License.

The "gluster volume heal volume info" command produces a lot of output. There are a number of gfid:hashnumber entries for both nodes and a few directories in the list as well. Checking the directories on both nodes, the files appear to be the same on each, so I resolved those issues. There are, however, still a large number of gfid files listed from the "gluster volume heal eukleia info" command. There are also a large number of gfid files, plus one file, listed from the "gluster volume heal eukleia info split-brain" command. This file no longer exists on either of the bricks or the mounted filesystems. Is there any way to clear these down or resolve this?

Could you check how many files are reported for the following command's output? This command needs to be executed on the brick, inside .glusterfs:

find /your/brick/directory/.glusterfs -links 1 -type f

All such files need to be deleted/renamed to some other place, I guess.

Pranith

Thanks, Derick

On 24/11/14 05:32, Pranith Kumar Karampuri wrote:
On 11/21/2014 05:33 AM, Derick Turner wrote:
I have a new set up which has been running for a few weeks. Due to a configuration issue the self heal wasn't working properly and I ended up with the system in a bit of a state. I've been chasing down issues and it should (fingers crossed) be back and stable again. One issue which seems to be re-occurring is that on one of the client bricks I get a load of gfids that don't exist anywhere else. The inodes of these files only point to the gfid file, and it appears that they keep coming back.

The volume is set up as such:
root@vader:/gluster/eukleiahome/intertrust/moodledata# gluster volume info eukleiaweb
Volume Name: eukleiaweb
Type: Replicate
Volume ID: d8a29f07-7f3e-46a3-9ec4-4281038267ce
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: lando:/gluster/eukleiahome
Brick2: vader:/gluster/eukleiahome

and the file systems are mounted via NFS. In the logs of the host for Brick 1 I get the following (e.g.):

[2014-11-20 23:53:55.910705] W [client-rpc-fops.c:471:client3_3_open_cbk] 0-eukleiaweb-client-1: remote operation failed: No such file or directory. Path: gfid:e5d25375-ecb8-47d2-833f-0586b659f98a (----)
[2014-11-20 23:53:55.910721] E [afr-self-heal-data.c:1270:afr_sh_data_open_cbk] 0-eukleiaweb-replicate-0: open of gfid:e5d25375-ecb8-47d2-833f-0586b659f98a failed on child eukleiaweb-client-1 (No such file or directory)
[2014-11-20 23:53:55.921425] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-eukleiaweb-client-1: remote operation failed: No such file or directory

When I check this gfid out it exists on Brick 1 but not on Brick 2 (which I am assuming is due to the error above). Additionally, when I check for the file that this GFID references it doesn't go anywhere. I.e. -

Which version of gluster are you using? Could you check if there are any directories that need to be healed, using gluster volume heal volname info?
Pranith

root@lando:/gluster/eukleiahome# find .
-samefile .glusterfs/e5/d2/e5d25375-ecb8-47d2-833f-0586b659f98a
./.glusterfs/e5/d2/e5d25375-ecb8-47d2-833f-0586b659f98a
root@lando:/gluster/eukleiahome# file .glusterfs/e5/d2/e5d25375-ecb8-47d2-833f-0586b659f98a
.glusterfs/e5/d2/e5d25375-ecb8-47d2-833f-0586b659f98a: JPEG image data, EXIF standard

I have tried removing these files using rm .glusterfs/e5/d2/e5d25375-ecb8-47d2-833f-0586b659f98a, but either all of the occurrences haven't been logged in /var/log/glusterfs/glusterfsd.log (as I am clearing out all that I can find) or they are re-appearing. Firstly, is this something to worry about? Secondly, should I be able to simply get rid of them (and I'm being mistaken about them re-appearing) and if so, is simply removing them the best method?

Thanks, Derick
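Pulling the thread's suggestions together, a cautious sketch for inspecting orphaned gfid files on a brick (run at the brick root; the specific gfid and quarantine path are just examples, and moving files aside is safer than deleting outright, per Pranith's delete/rename suggestion):

cd /gluster/eukleiahome

# gfid files with no remaining hard link into the named brick tree
find .glusterfs -type f -links 1

# confirm a specific gfid file really has no named counterpart
find . -samefile .glusterfs/e5/d2/e5d25375-ecb8-47d2-833f-0586b659f98a

# if only the .glusterfs path comes back, move it aside rather than rm it
mkdir -p /root/gfid-quarantine
mv .glusterfs/e5/d2/e5d25375-ecb8-47d2-833f-0586b659f98a /root/gfid-quarantine/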
Re: [Gluster-users] Log file timestamp
+Kaleb, who implemented this feature, to see if there is a way to change it.

Pranith

On 11/25/2014 09:04 PM, Koby, Bradley wrote:
I would like to add that all other log files (/var/log/messages, /var/log/syslog, /var/log/secure, etc.) have the correct entries using the local timezone. It is only the gluster log files that are off by 5 hours. Could someone point me in the right direction on configuring this? This Gluster app resides on Red Hat Enterprise Linux Server release 6.2. Thanks!

On Tue, Nov 25, 2014 at 10:07 AM, Atin Mukherjee amukh...@redhat.com wrote:
This has to be set up in the server based on what time format you want. I don't see any application dependency here.
~Atin

On 11/25/2014 08:34 PM, Koby, Bradley wrote:
Is there a way to have the gluster log files' date/time stamps use the local timezone? This system is in EST and the times recorded in the logs are off by 5 hours. I'm assuming logs are represented as UTC/GMT. This is using Gluster 3.5.2. Thanks
Re: [Gluster-users] ZFS and Snapshots
+Kiran to check if he knows anything about this.

Pranith

On 11/26/2014 02:17 AM, Kiebzak, Jason M. wrote:
I'm running ZFS. It appears that Gluster Snapshots require LVM. I've spent the last hour googling this, and it doesn't seem like the two can be mixed -- that is, Gluster Snapshots and ZFS. Has anyone attempted to write a wrapper for ZFS to mimic LVM, and thus fool gluster into thinking that LVM is installed?

Thanks
Re: [Gluster-users] Gluster volume not automounted when peer is down
Could be this bug, https://bugzilla.redhat.com/show_bug.cgi?id=1168080 Regards, Poornima - Original Message - From: Eric Ewanco eric.ewa...@genband.com To: gluster-users@gluster.org Sent: Wednesday, November 26, 2014 12:43:56 AM Subject: Re: [Gluster-users] Gluster volume not automounted when peer is down Hmmm. I can see how that would work if you had an external client that wanted to connect to a cluster, and one of the nodes was down; it would need to pick another node in the cluster, yes. But that’s not my scenario. My scenario is I have two nodes in a cluster, one node goes down, and if the surviving node reboots, it cannot mount the gluster volume * on itself * at startup, but mounts it fine after startup. In other words, the mount -t glusterfs -a fails (and fstab contains localhost:/rel-vol /home glusterfs defaults,_netdev 0 0) on startup, but not afterwards. I am not successfully “mapping” these answers to my situation -- if I change “localhost” to a round robin dns address with two addresses, it will either return the down node (which does us no good) or the current node, which is equivalent to localhost, and presumably it will do the same thing, won’t it? Confused, Eric From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Joe Julian Sent: Tuesday, November 25, 2014 1:04 PM To: gluster-users@gluster.org Subject: Re: [Gluster-users] Gluster volume not automounted when peer is down A much simpler answer is to assign a hostname to multiple IP addresses (round robin dns). When gethostbyname() returns multiple entries, the client will try them all until it's successful. On 11/24/2014 06:23 PM, Paul Robert Marino wrote: This is simple and can be handled in many ways. Some background first. The mount point is a single IP or host name. The only thing the client uses it for is to download a describing all the bricks in the cluster. The next thing is it opens connections to all the nodes containing bricks for that volume. So the answer is tell the client to connect to a virtual IP address. I personally use keepalived for this but you can use any one of the many IPVS Or other tools that manage IPS for this. I assign the VIP to a primary node then have each node monitor the cluster processes if they die on a node it goes into a faulted state and can not own the VIP. As long as the client are connecting to a running host in the cluster you are fine even if that host doesn't own bricks in the volume but is aware of them as part of the cluster. -- Sent from my HP Pre3 On Nov 24, 2014 8:07 PM, Eric Ewanco eric.ewa...@genband.com wrote: Hi all, We’re trying to use gluster as a replicated volume. It works OK when both peers are up but when one peer is down and the other reboots, the “surviving” peer does not automount glusterfs. Furthermore, after the boot sequence is complete, it can be mounted without issue. It automounts fine when the peer is up during startup. I tried to google this and while I found some similar issues, I haven’t found any solutions to my problem. Any insight would be appreciated. Thanks. 
gluster volume info output (after startup): Volume Name: rel-vol Type: Replicate Volume ID: 90cbe313-e9f9-42d9-a947-802315ab72b0 Status: Started Number of Bricks: 1 x 2 = 2 Transport-type: tcp Bricks: Brick1: 10.250.1.1:/export/brick1 Brick2: 10.250.1.2:/export/brick1 gluster peer status output (after startup): Number of Peers: 1 Hostname: 10.250.1.2 Uuid: 8d49b929-4660-4b1e-821b-bfcd6291f516 State: Peer in Cluster (Disconnected) Original volume create command: gluster volume create rel-vol rep 2 transport tcp 10.250.1.1:/export/brick1 10.250.1.2:/export/brick1 I am running Gluster 3.4.5 on OpenSuSE 12.2. gluster --version: glusterfs 3.4.5 built on Jul 25 2014 08:31:19 Repository revision: git://git.gluster.com/glusterfs.git Copyright (c) 2006-2011 Gluster Inc. http://www.gluster.com GlusterFS comes with ABSOLUTELY NO WARRANTY. You may redistribute copies of GlusterFS under the terms of the GNU General Public License. The fstab line is: localhost:/rel-vol /home glusterfs defaults,_netdev 0 0 lsof -i :24007-24100: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME glusterd 4073 root 6u IPv4 82170 0t0 TCP s1:24007-s1:1023 (ESTABLISHED) glusterd 4073 root 9u IPv4 13816 0t0 TCP *:24007 (LISTEN) glusterd 4073 root 10u IPv4 88106 0t0 TCP s1:exp2-s2:24007 (SYN_SENT) glusterfs 4097 root 8u IPv4 16751 0t0 TCP s1:1023-s1:24007 (ESTABLISHED) This is shorter than it is when it works, but maybe that’s because the mount spawns some more processes. Some ports are down: root@q50-s1:/root telnet localhost 24007 Trying ::1... telnet: connect to address ::1: Connection refused Trying 127.0.0.1... Connected to localhost. Escape character is '^]'. telnet close Connection
Re: [Gluster-users] ZFS and Snapshots
- Original Message -
From: Jason M. Kiebzak jk3...@cumc.columbia.edu
To: gluster-users@gluster.org
Sent: Wednesday, November 26, 2014 2:17:28 AM
Subject: [Gluster-users] ZFS and Snapshots

I’m running ZFS. It appears that Gluster Snapshots require LVM. I’ve spent the last hour googling this, and it doesn’t seem like the two can be mixed – that is, Gluster Snapshots and ZFS.

Yes, the current Gluster snapshot implementation is based on LVM.

Has anyone attempted to write a wrapper for ZFS to mimic LVM, and thus fool gluster into thinking that LVM is installed?

One of our community members, Prakash (CCed), is currently working on providing snapshot support for ZFS. He can provide you more information about the same.

Best Regards,
Rajesh
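In the meantime, nothing stops you from snapshotting the brick datasets manually with ZFS itself; gluster simply won't know about or manage those snapshots (the pool/dataset names below are made up for illustration):

zfs snapshot tank/gluster/brick1@nightly-20141126
zfs list -t snapshot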
Re: [Gluster-users] snapshots fail on 3.6.1
On 11/26/2014 01:07 AM, Kiebzak, Jason M. wrote:
I have a fresh install of gluster v3.6.1 on debian:
# dpkg -l|grep gluster
ii glusterfs-client 3.6.1-1
ii glusterfs-common 3.6.1-1
ii glusterfs-server 3.6.1-1
When I issue this command:
# gluster snapshot create 20141125 testing
I get this error:
snapshot create: failed: Cluster operating version is lesser than the supported version for a snapshot
Snapshot command failed
In fact, all gluster snapshot commands fail with the same error. Here's the info on the volume:
# gluster volume info testing
Volume Name: testing
Type: Distributed-Replicate
Volume ID: 3ee751af-2160-432b-998c-e1b4742c0f20
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: IP:/data/testing
Brick2: IP:/data/testing
Brick3: IP:/data/testing
Brick4: IP:/data/testing
Can anyone give me some guidance? Thanks much

Is it a fresh install, or did you have your cluster running with older bits? If you have upgraded the cluster from an older version, then you need to manually bump up the op-version with the following command:

gluster volume set all cluster.op-version 30600

~Atin
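A small sketch of checking the op-version before and after the bump Atin describes (the info file path assumes a default install):

# before: shows e.g. operating-version=30501 on an upgraded cluster
grep operating-version /var/lib/glusterd/glusterd.info

# bump, then re-check
gluster volume set all cluster.op-version 30600
grep operating-version /var/lib/glusterd/glusterd.info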
Re: [Gluster-users] Gluster volume not automounted when peer is down
Just wanted to clarify one thing here. If you have a 2 node cluster and one of the nodes is down and the other is rebooted, the daemons won't get started unless either there are no peers in the cluster or at least a friend update (which happens during the handshake) is received. This is to ensure the node doesn't spawn daemons with stale data, since it was down. In this case the node will never receive a friend update, as the only other node in the cluster is down. 'gluster volume start force' can bypass this check and start the daemons, but consistency is not guaranteed.

~Atin

On 11/26/2014 09:52 AM, Poornima Gurusiddaiah wrote:
Could be this bug, https://bugzilla.redhat.com/show_bug.cgi?id=1168080
Regards, Poornima

- Original Message -
From: Eric Ewanco eric.ewa...@genband.com
To: gluster-users@gluster.org
Sent: Wednesday, November 26, 2014 12:43:56 AM
Subject: Re: [Gluster-users] Gluster volume not automounted when peer is down

Hmmm. I can see how that would work if you had an external client that wanted to connect to a cluster, and one of the nodes was down; it would need to pick another node in the cluster, yes. But that’s not my scenario. My scenario is I have two nodes in a cluster, one node goes down, and if the surviving node reboots, it cannot mount the gluster volume * on itself * at startup, but mounts it fine after startup. In other words, the "mount -t glusterfs -a" fails (and fstab contains "localhost:/rel-vol /home glusterfs defaults,_netdev 0 0") on startup, but not afterwards. I am not successfully "mapping" these answers to my situation -- if I change "localhost" to a round-robin DNS address with two addresses, it will either return the down node (which does us no good) or the current node, which is equivalent to localhost, and presumably it will do the same thing, won't it?

Confused, Eric

From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Joe Julian
Sent: Tuesday, November 25, 2014 1:04 PM
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster volume not automounted when peer is down

A much simpler answer is to assign a hostname to multiple IP addresses (round robin DNS). When gethostbyname() returns multiple entries, the client will try them all until it's successful.

On 11/24/2014 06:23 PM, Paul Robert Marino wrote:
This is simple and can be handled in many ways. Some background first. The mount point is a single IP or host name. The only thing the client uses it for is to download a volume definition (volfile) describing all the bricks in the cluster. The next thing is it opens connections to all the nodes containing bricks for that volume. So the answer is to tell the client to connect to a virtual IP address. I personally use keepalived for this, but you can use any one of the many IPVS or other tools that manage IPs for this. I assign the VIP to a primary node, then have each node monitor the cluster processes; if they die on a node, it goes into a faulted state and cannot own the VIP. As long as the clients are connecting to a running host in the cluster you are fine, even if that host doesn't own bricks in the volume but is aware of them as part of the cluster.

-- Sent from my HP Pre3

On Nov 24, 2014 8:07 PM, Eric Ewanco eric.ewa...@genband.com wrote:
Hi all, We’re trying to use gluster as a replicated volume. It works OK when both peers are up, but when one peer is down and the other reboots, the “surviving” peer does not automount glusterfs. Furthermore, after the boot sequence is complete, it can be mounted without issue.
It automounts fine when the peer is up during startup. I tried to google this and while I found some similar issues, I haven’t found any solutions to my problem. Any insight would be appreciated. Thanks. gluster volume info output (after startup): Volume Name: rel-vol Type: Replicate Volume ID: 90cbe313-e9f9-42d9-a947-802315ab72b0 Status: Started Number of Bricks: 1 x 2 = 2 Transport-type: tcp Bricks: Brick1: 10.250.1.1:/export/brick1 Brick2: 10.250.1.2:/export/brick1 gluster peer status output (after startup): Number of Peers: 1 Hostname: 10.250.1.2 Uuid: 8d49b929-4660-4b1e-821b-bfcd6291f516 State: Peer in Cluster (Disconnected) Original volume create command: gluster volume create rel-vol rep 2 transport tcp 10.250.1.1:/export/brick1 10.250.1.2:/export/brick1 I am running Gluster 3.4.5 on OpenSuSE 12.2. gluster --version: glusterfs 3.4.5 built on Jul 25 2014 08:31:19 Repository revision: git://git.gluster.com/glusterfs.git Copyright (c) 2006-2011 Gluster Inc. http://www.gluster.com GlusterFS comes with ABSOLUTELY NO WARRANTY. You may redistribute copies of GlusterFS under the terms of the GNU General Public License.