Can you post the client logs also? There should be a log file whose name corresponds to the mountpoint of the gluster volume on the client.

Since you are running a replicate volume, you could try shutting down gluster on each of the servers in turn and seeing if the write block only occurs on one of them.
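One way to approach that test is to script it as a dry run first. A minimal sketch, assuming the two server IPs from the volfile later in this thread; "invoke-rc.d glusterd" is a Debian-style guess for however glusterd is managed on these servers, and the script only prints the plan rather than executing anything:

```shell
# Print (don't execute) the per-server isolation test. The IPs come from the
# volfile in this thread; "invoke-rc.d glusterd" is a Debian-style guess for
# however glusterd is started on these servers.
plan_for() {
    echo "ssh root@$1 invoke-rc.d glusterd stop    # take this replica down"
    echo "  (retry the hanging write from the client; note the result)"
    echo "ssh root@$1 invoke-rc.d glusterd start   # bring it back up"
}

for server in 10.10.100.40 10.10.100.41; do
    plan_for "$server"
done
```

If the write only hangs while one particular server is in service, the problem is on that server's path.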

Did fuse get updated as part of your Debian upgrade? Perhaps you are hitting a fuse bug? Do you have another system running the previous version of Debian that you could use to validate that it can connect to and write to the gluster volume properly? Did you recompile your gluster fuse libraries following the upgrade, so they are built against the version of fuse you are running?

On 2/5/12 3:49 PM, Stefan Becker wrote:
- no iptables involved
- the server is running 3.2.0 as well; of course I could upgrade, but that would
probably mean some downtime, which I cannot afford right now
- did not find anything in the logs, but there are a lot of files, so I might
have missed something; should I look at logs on the client or server side?
- are there debug flags or verbose logging options I could enable?

The brick logs on the server side just say "client connected", so there is not 
a lot of value there. On the client side I have the following:

[2012-02-05 19:38:53.324172] I [fuse-bridge.c:3214:fuse_thread_proc] 0-fuse: 
unmounting /home/XXXstorage
[2012-02-05 19:38:53.324221] I [glusterfsd.c:712:cleanup_and_exit] 
0-glusterfsd: shutting down
[2012-02-05 19:38:58.709783] W [write-behind.c:3023:init] 
0-XXXstorage-write-behind: disabling write-behind for first 0 bytes
[2012-02-05 19:38:58.711289] I [client.c:1935:notify] 0-XXXstorage-client-0: 
parent translators are ready, attempting connect on transport
[2012-02-05 19:38:58.711489] I [client.c:1935:notify] 0-XXXstorage-client-1: 
parent translators are ready, attempting connect on transport
Given volfile:
+------------------------------------------------------------------------------+
   1: volume XXXstorage-client-0
   2:     type protocol/client
   3:     option remote-host 10.10.100.40
   4:     option remote-subvolume /brick1
   5:     option transport-type tcp
   6: end-volume
   7:
   8: volume XXXstorage-client-1
   9:     type protocol/client
  10:     option remote-host 10.10.100.41
  11:     option remote-subvolume /brick1
  12:     option transport-type tcp
  13: end-volume
  14:
  15: volume XXXstorage-replicate-0
  16:     type cluster/replicate
  17:     subvolumes XXXstorage-client-0 XXXstorage-client-1
  18: end-volume
  19:
  20: volume XXXstorage-write-behind
  21:     type performance/write-behind
  22:     subvolumes XXXstorage-replicate-0
  23: end-volume
  24:
  25: volume XXXstorage-read-ahead
  26:     type performance/read-ahead
  27:     subvolumes XXXstorage-write-behind
  28: end-volume
  29:
  30: volume XXXstorage-io-cache
  31:     type performance/io-cache
  32:     subvolumes XXXstorage-read-ahead
  33: end-volume
  34:
  35: volume XXXstorage-stat-prefetch
  36:     type performance/stat-prefetch
  37:     subvolumes XXXstorage-io-cache
  38: end-volume
  39:
  40: volume XXXstorage
  41:     type debug/io-stats
  42:     option latency-measurement off
  43:     option count-fop-hits off
  44:     subvolumes XXXstorage-stat-prefetch
  45: end-volume

+------------------------------------------------------------------------------+
[2012-02-05 19:38:58.712460] I [rpc-clnt.c:1531:rpc_clnt_reconfig] 
0-XXXstorage-client-1: changing port to 24015 (from 0)
[2012-02-05 19:38:58.712527] I [rpc-clnt.c:1531:rpc_clnt_reconfig] 
0-XXXstorage-client-0: changing port to 24012 (from 0)
[2012-02-05 19:39:02.709882] I 
[client-handshake.c:1080:select_server_supported_programs] 
0-XXXstorage-client-1: Using Program GlusterFS-3.1.0, Num (1298437), Version 
(310)
[2012-02-05 19:39:02.710112] I 
[client-handshake.c:1080:select_server_supported_programs] 
0-XXXstorage-client-0: Using Program GlusterFS-3.1.0, Num (1298437), Version 
(310)
[2012-02-05 19:39:02.710355] I [client-handshake.c:913:client_setvolume_cbk] 
0-XXXstorage-client-1: Connected to 10.10.100.41:24015, attached to remote 
volume '/brick1'.
[2012-02-05 19:39:02.710395] I [afr-common.c:2514:afr_notify] 
0-XXXstorage-replicate-0: Subvolume 'XXXstorage-client-1' came back up; going 
online.
[2012-02-05 19:39:02.712314] I [fuse-bridge.c:3316:fuse_graph_setup] 0-fuse: 
switched to graph 0
[2012-02-05 19:39:02.712387] I [client-handshake.c:913:client_setvolume_cbk] 
0-XXXstorage-client-0: Connected to 10.10.100.40:24012, attached to remote 
volume '/brick1'.
[2012-02-05 19:39:02.712436] I [fuse-bridge.c:2897:fuse_init] 0-glusterfs-fuse: 
FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
[2012-02-05 19:39:02.713253] I [afr-common.c:836:afr_fresh_lookup_cbk] 
0-XXXstorage-replicate-0: added root inode

I cannot see any problems. I was tailing a few logs while I issued a write 
that hangs. Nothing got logged.

-----Original Message-----
From: Whit Blauvelt [mailto:[email protected]]
Sent: Sunday, 5 February 2012 21:25
To: Brian Candler
Cc: Stefan Becker; [email protected]
Subject: Re: [Gluster-users] Hanging writes after upgrading "clients" to debian 
squeeze

On Sun, Feb 05, 2012 at 07:36:55PM +0000, Brian Candler wrote:
> On Sun, Feb 05, 2012 at 08:02:08PM +0100, Stefan Becker wrote:
> >     After the debian upgrade I can
> >     still mount my volumes. Reading is fine as well but it hangs on writes.
>
> Could it be that on the post-upgrade machines one brick is reachable but not
> the other?  Compare iptables rules between the pre-upgrade and post-upgrade
> machines?  Compare tcpdump or ntop between them?
If you can, try dropping iptables out of the picture entirely. If you are
running it, and have it logging what it drops, the docs say "Ensure that TCP
ports 111, 24007,24008, 24009-(24009 + number of bricks across all volumes)
are open on all Gluster servers. If you will be using NFS, open additional
ports 38465 to 38467." So I'd check your logs to see if iptables is dropping
any traffic to/from the IPs in question on those ports.
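The port arithmetic in that quote can be made concrete; a sketch for the two-brick replica volume in this thread:

```shell
# Port range per the quoted docs: bricks use 24009 .. (24009 + N), where N is
# the number of bricks across all volumes. N=2 here (one brick on each of the
# two replica servers).
N=2
first=24009
last=$((first + N))
echo "open TCP 111, 24007, 24008 and ${first}-${last}"
```

Note that the client log earlier in this thread actually shows the bricks on 24012 and 24015, above what the formula gives for two bricks (presumably because brick ports increment as bricks are created over time), so trust the ports you see in the client log over the formula when writing firewall rules.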

Or use "netstat -tc" while doing some file operations, and you should see the
traffic on the IPs/ports. Another utility to see the same thing is "iptraf."
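To illustrate what a healthy client would look like: the client log earlier in the thread connects to ports 24012 and 24015, so you would expect one ESTABLISHED TCP session to each brick. A sketch against a fabricated sample (the client IP 10.10.100.50 and the abbreviated columns are made up for illustration; in practice pipe real `netstat -tn` output instead):

```shell
# Fabricated, abbreviated netstat-style sample; real `netstat -tn` output has
# more columns, and the client IP 10.10.100.50 is hypothetical.
sample='tcp 0 0 10.10.100.50:1023 10.10.100.40:24012 ESTABLISHED
tcp 0 0 10.10.100.50:1022 10.10.100.41:24015 ESTABLISHED
tcp 0 0 10.10.100.50:22 10.10.100.9:51515 ESTABLISHED'

# Keep only the connections to the brick ports seen in the client log:
echo "$sample" | grep -E ':(24012|24015) '
```

If one of the two brick connections is missing while writes hang, that brick's path is the suspect.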

Whit    
_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
