[Gluster-users] How to auto-sync the new brick, after it has been added to a replica volume?

2014-11-07 Thread 董鑫
Hi Everyone,

I have a replica volume, with two bricks:

Volume Name: c1
Type: Replicate
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.0.1:/data
Brick2: 192.168.0.2:/data

I haven't set any options on this volume. There is one mount point on
192.168.0.4 at /mnt/c1.

on 192.168.0.4:
ls /mnt/c1
file1 file2 file3

Now I want to add a brick (192.168.0.3:/data) to volume c1, raising the replica
count from 2 to 3:

gluster volume add-brick c1 replica 3 192.168.0.3:/data

Then the volume info shows that it worked:

Volume Name: c1
Type: Replicate
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.0.1:/data
Brick2: 192.168.0.2:/data
Brick3: 192.168.0.3:/data

Everything looks fine, but /data on 192.168.0.3 stays empty for a long time...

Only when I list the mount point /mnt/c1 on 192.168.0.4 do the files start
being copied over to /data on 192.168.0.3.

Although this does sync the new brick, it is not elegant. How can the new brick
be synced automatically?
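
For reference, a minimal sketch of forcing that sync explicitly instead of waiting for a client to touch the files (assuming GlusterFS 3.3 or later with the self-heal daemon enabled on the volume, which is the default):

gluster volume heal c1 full
gluster volume heal c1 info

The "heal ... full" command asks the self-heal daemon to walk the volume and copy everything missing onto the new brick; "heal ... info" lists the entries still pending.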



Thanks in advance for any info or tips!

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Gluster 3.3 replace-brick

2014-11-07 Thread Juan José Pavlik Salles
Hi guys, we added a new node to the cluster, so I used replace-brick
to move a brick from node 4 to the new node 5. Everything was going great,
but the operation completed before moving all the files it was
supposed to move. It barely moved 1 TB of the 2.3 TB it should have moved.

What should I do? Abort the replacement? Commit force it and let self heal
move the rest of the files?
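
For reference, a sketch of the relevant sub-commands in the 3.3 syntax (the hostnames and brick paths below are placeholders, not the actual ones from this cluster):

gluster volume replace-brick VOLNAME node4:/brick/path node5:/brick/path status
gluster volume replace-brick VOLNAME node4:/brick/path node5:/brick/path abort
gluster volume replace-brick VOLNAME node4:/brick/path node5:/brick/path commit force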

Regards

-- 
Pavlik Salles Juan José
Blog - http://viviendolared.blogspot.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Installing glusterfs 3.6 client on EL7

2014-11-07 Thread Alastair Neil
I'm looking for repos to install on EL7, since RH only supplies 3.4.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Installing glusterfs 3.6 client on EL7

2014-11-07 Thread Volnei Puttini

Hi,

You can download from:

http://download.gluster.org/pub/gluster/glusterfs/LATEST/
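
A hedged sketch of wiring that up on EL7 (the exact .repo file name under that directory is an assumption; adjust it to whatever is actually published there):

wget -O /etc/yum.repos.d/glusterfs-epel.repo \
    http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/glusterfs-epel.repo
yum install glusterfs glusterfs-fuse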


On 07-11-2014 15:21, Alastair Neil wrote:

I'm looking for repos to install on EL7 since  RH only supplies 3.4.




___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Installing glusterfs 3.6 client on EL7

2014-11-07 Thread Volnei Puttini

Sorry.

I use this repository and didn't pay attention to the version you want.
However, unless you really need version 3.6 for a particular reason,
version 3.5.2 works very well.
I use it on CentOS 7 without problems.


On 07-11-2014 16:43, Alastair Neil wrote:

That only has 3.5.2, I was hoping to find 3.6


On 7 November 2014 12:34, Volnei Puttini vol...@vcplinux.com.br wrote:


Hi,

You can download from:

http://download.gluster.org/pub/gluster/glusterfs/LATEST/


On 07-11-2014 15:21, Alastair Neil wrote:

I'm looking for repos to install on EL7 since  RH only supplies 3.4.







___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster 3.3 replace-brick

2014-11-07 Thread Juan José Pavlik Salles
What I finally did was:

- commit force the replace-brick operation, and the brick was changed.
- to force healing, I ran: find /mnt/gvol/ -print0 | xargs --null stat > /dev/null

If the source disk dies I still have the old copy on the replaced brick,
but going through a full directory listing is quite a time/resource-consuming task.
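
A hedged sketch of checking whether the self-heal actually caught up after that sweep (assuming the volume behind /mnt/gvol is named gvol):

gluster volume heal gvol info
gluster volume heal gvol info heal-failed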

2014-11-07 9:50 GMT-03:00 Juan José Pavlik Salles jjpav...@gmail.com:

 Hi guys, we added a new node to the cluster we have so I used
 replace-brick to move a brick from node 4 to the new node 5. Everything was
 going great but the  operation was completed before moving all the files it
 was supposed to move. It barely moved 1TB from 2.3TB it should have moved.

 What should I do? Abort the replacement? Commit force it and let self heal
 move the rest of the files?

 Regards

 --
 Pavlik Salles Juan José
 Blog - http://viviendolared.blogspot.com




-- 
Pavlik Salles Juan José
Blog - http://viviendolared.blogspot.com
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Geo-replication fails on self.slave.server.set_stime() with OSError: [Errno 2] No such file or directory

2014-11-07 Thread Morten Johansen
Hi, list

We’re having some issues with geo-replication, which I _think_ are related to 
delete operations.
Sometimes the replication goes into a faulty state, and then after a while comes
back again.
Changelog change detection fails, and it falls back to xsync. The slave volume 
does not replicate deleted files.

My research led me to this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1073844

The bug lists a traceback which is very similar to the one we’re seeing in our 
logs.

We’re running version 3.5.2, which has this bug fix in it, and inspecting the 
master.py file on our actual servers confirms we do have this patch: 
http://review.gluster.org/#/c/7207/2/geo-replication/syncdaemon/master.py

In our case, something fails in the call on the line BEFORE the patched one, 
i.e. the call to self.slave.server.set_stime() on line 152 in master.py

This is an example traceback from our logs:
SNIP
[2014-11-07 12:47:07.516124] I [master(/media/slot2/geotest):1124:crawl] _GMaster: starting hybrid crawl...
[2014-11-07 12:47:07.518146] I [master(/media/slot2/geotest):1133:crawl] _GMaster: processing xsync changelog /var/run/gluster/geotest/ssh%3A%2F%2Froot%4010.32.0.101%3Agluster%3A%2F%2F127.0.0.1%3Ageotest/d531d53915b53c130ad434b5295ebf7c/xsync/XSYNC-CHANGELOG.1415360827
[2014-11-07 12:47:07.520725] E [syncdutils(/media/slot2/geotest):240:log_raise_exception] top: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 150, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 542, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1177, in service_loop
    g2.crawlwrap()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 467, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1137, in crawl
    self.upd_stime(item[1][1], item[1][0])
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 884, in upd_stime
    self.sendmark(path, stime)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 658, in sendmark
    self.set_slave_xtime(path, mark)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 152, in set_slave_xtime
    self.slave.server.set_stime(path, self.uuid, mark)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1163, in <lambda>
    slave.server.set_stime = types.MethodType(lambda _self, path, uuid, mark: brickserver.set_stime(path, uuid + '.' + gconf.slave_id, mark), slave.server)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 299, in ff
    return f(*a)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 496, in set_stime
    Xattr.lsetxattr(path, '.'.join([cls.GX_NSPACE, uuid, 'stime']), struct.pack('!II', *mark))
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", line 66, in lsetxattr
    cls.raise_oserr()
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", line 25, in raise_oserr
    raise OSError(errn, os.strerror(errn))
OSError: [Errno 2] No such file or directory
[2014-11-07 12:47:07.522511] I [syncdutils(/media/slot2/geotest):192:finalize] top: exiting.
/SNIP

Any ideas on this one? What breaks if I comment out line 152 too?
Any quick fixes on this would be much appreciated.
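
A hedged diagnostic sketch (the path below is a placeholder): since set_stime() ends in an lsetxattr() on the slave-side path, Errno 2 usually means that directory no longer exists on the slave, so it is worth checking the entry the crawl was stamping on the slave side and the stime xattrs it carries:

ls -ld /path/on/slave/brick/some/dir
getfattr -d -m 'trusted.glusterfs' -e hex /path/on/slave/brick/some/dir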

Best regards,

-- 
Morten Johansen
Systems developer, Cerum AS
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] NFS not start on localhost

2014-11-07 Thread Sven Achtelik
Hi everyone,

I’m facing the exact same issue on my installation. nfs.log entries indicate
that something is blocking the Gluster NFS server from registering with
rpcbind.

[root@ovirt-one ~]# rpcinfo -p
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100005    3   tcp  38465  mountd
    100005    1   tcp  38466  mountd
    100003    3   tcp   2049  nfs
    100227    3   tcp   2049  nfs_acl
    100021    3   udp  34343  nlockmgr
    100021    4   udp  34343  nlockmgr
    100021    3   tcp  54017  nlockmgr
    100021    4   tcp  54017  nlockmgr
    100024    1   udp  39097  status
    100024    1   tcp  53471  status
    100021    1   udp    715  nlockmgr

I’m sure that I’m not using the system NFS Server and I didn’t mount any nfs 
share.

@Tibor: Did you solve that issue somehow ?

Best,

Sven



Hi,

Thank you for your reply.

I did your recommendations, but there are no changes.

There is nothing new in nfs.log.

[root@node0 glusterfs]# reboot
Connection to 172.16.0.10 closed by remote host.
Connection to 172.16.0.10 closed.
[tdemeter@sirius-31 ~]$ ssh root@172.16.0.10
root@172.16.0.10's password:
Last login: Mon Oct 20 11:02:13 2014 from 192.168.133.106

[root@node0 ~]# systemctl status nfs.target
nfs.target - Network File System Server
   Loaded: loaded (/usr/lib/systemd/system/nfs.target; disabled)
   Active: inactive (dead)

[root@node0 ~]# gluster volume status engine
Status of volume: engine
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick gs00.itsmart.cloud:/gluster/engine0               50160   Y       3271
Brick gs01.itsmart.cloud:/gluster/engine1               50160   Y       595
NFS Server on localhost                                 N/A     N       N/A
Self-heal Daemon on localhost                           N/A     Y       3286
NFS Server on gs01.itsmart.cloud                        2049    Y       6951
Self-heal Daemon on gs01.itsmart.cloud                  N/A     Y       6958

Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks

[root@node0 ~]# systemctl status
Display all 262 possibilities? (y or n)
[root@node0 ~]# systemctl status nfs-lock
nfs-lock.service - NFS file locking service.
   Loaded: loaded (/usr/lib/systemd/system/nfs-lock.service; enabled)
   Active: inactive (dead)

[root@node0 ~]# systemctl stop nfs-lock
[root@node0 ~]# systemctl restart gluster
glusterd.service    glusterfsd.service  gluster.mount
[root@node0 ~]# systemctl restart gluster
glusterd.service    glusterfsd.service  gluster.mount
[root@node0 ~]# systemctl restart glusterfsd.service
[root@node0 ~]# systemctl restart glusterd.service
[root@node0 ~]# gluster volume status engine
Status of volume: engine
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick gs00.itsmart.cloud:/gluster/engine0               50160   Y       5140
Brick gs01.itsmart.cloud:/gluster/engine1               50160   Y       2037
NFS Server on localhost                                 N/A     N       N/A
Self-heal Daemon on localhost                           N/A     N       N/A
NFS Server on gs01.itsmart.cloud                        2049    Y       6951
Self-heal Daemon on gs01.itsmart.cloud                  N/A     Y       6958

Any other idea?

Tibor

- Original Message -
 On Mon, Oct 20, 2014 at 09:04:28AM +0200, Demeter Tibor wrote:
  Hi,
 
  This is the full nfs.log after delete & reboot.
  It refers to a portmap registering problem.
 
  [root@node0 glusterfs]# cat nfs.log
  [2014-10-20 06:48:43.221136] I [glusterfsd.c:1959:main]
  0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.2
  (/usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p
  /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S
  /var/run/567e0bba7ad7102eae3049e2ad6c3ed7.socket)
  [2014-10-20 06:48:43.22] I [socket.c:3561:socket_init]
  0-socket.glusterfsd: SSL support is NOT enabled
  [2014-10-20 06:48:43.224475] I [socket.c:3576:socket_init]
  0-socket.glusterfsd: using system polling thread

Re: [Gluster-users] NFS not start on localhost

2014-11-07 Thread Jason Russler
I've run into this as well, after installing hosted-engine for oVirt on a
gluster volume. The only way to get things working again for me was to manually
de-register (rpcinfo -d ...) nlockmgr from the portmapper and then restart
glusterd. Then gluster's NFS successfully registers. I don't really get what's
going on, though.
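
A hedged sketch of that workaround, with the nlockmgr program/version numbers taken from the rpcinfo -p output quoted earlier in this thread rather than hard-coded (always check rpcinfo -p on the affected host first):

rpcinfo -p | grep nlockmgr
rpcinfo -d 100021 1
rpcinfo -d 100021 3
rpcinfo -d 100021 4
systemctl restart glusterd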

- Original Message -
From: Sven Achtelik sven.achte...@mailpool.us
To: gluster-users@gluster.org
Sent: Friday, November 7, 2014 5:28:32 PM
Subject: Re: [Gluster-users] NFS not start on localhost




Re: [Gluster-users] NFS not start on localhost

2014-11-07 Thread Niels de Vos
On Fri, Nov 07, 2014 at 07:51:47PM -0500, Jason Russler wrote:
 I've run into this as well. After installing hosted-engine for ovirt
 on a gluster volume. The only way to get things working again for me
 was to manually de-register (rpcinfo -d ...) nlockmgr from the
 portmapper and then restart glusterd. Then gluster's NFS successfully
 registers. I don't really get what's going on though.

Is this on RHEL/CentOS 7? A couple of days back someone on IRC had an
issue with this as well. We found out that rpcbind.service uses the
-w option by default (for warm-restart). Registered services are
written to a cache file, and upon reboot those services get
re-registered automatically, even when not running.

The solution was something like this:

# cp /usr/lib/systemd/system/rpcbind.service /etc/systemd/system/
* edit /etc/systemd/system/rpcbind.service and remove the -w
  option
# systemctl daemon-reload
# systemctl restart rpcbind.service
# systemctl restart glusterd.service

I am not sure why -w was added by default, but it does not seem to
play nice with Gluster/NFS. Gluster/NFS does not want to break other
registered services, so it bails out when something is registered
already.
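
A sketch of the edit in the second step, assuming the stock unit file passes -w in its ExecStart line (the exact arguments may differ between builds):

# /etc/systemd/system/rpcbind.service (the copy made above)
# before:  ExecStart=/sbin/rpcbind -w ${RPCBIND_ARGS}
# after:   ExecStart=/sbin/rpcbind ${RPCBIND_ARGS}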

HTH,
Niels

 

[Gluster-users] [Gluster-devel] info heal-failed shown as gfid

2014-11-07 Thread Peter Auyeung
I had a node go down while GlusterFS was still open for writing.

Got tons of heal-failed entries on a replicated volume, showing up as gfids.

Tried gfid-resolver and got the following:

# ./gfid-resolver.sh /brick02/gfs/ 88417c43-7d0f-4ec5-8fcd-f696617b5bc1
88417c43-7d0f-4ec5-8fcd-f696617b5bc1==File:11/07/14 18:47:19 [ 
/root/scripts ]

Does anyone have a clue how to resolve and fix these heal-failed entries?


# gluster volume heal sas02 info heal-failed
Gathering list of heal failed entries on volume sas02 has been successful

Brick glusterprod001.bo.shopzilla.sea:/brick02/gfs
Number of entries: 0

Brick glusterprod002.bo.shopzilla.sea:/brick02/gfs
Number of entries: 1024
at                      path on brick
---
2014-11-08 01:22:23 gfid:88417c43-7d0f-4ec5-8fcd-f696617b5bc1
2014-11-08 01:22:23 gfid:d0d9e3aa-45ef-4a5c-8ccb-1a5859ff658c
2014-11-08 01:22:23 gfid:81a8867c-f0ef-4390-ad84-a4b1b3393842
2014-11-08 01:22:23 gfid:f7ed8fd8-f8bc-4c0c-8803-98a8e31ee46e
2014-11-08 01:22:23 gfid:df451eec-e3e3-4e9a-9cb4-4991c68f0d62
2014-11-08 01:22:23 gfid:4e759749-bf6c-4355-af48-17ca64c20428
2014-11-08 01:22:23 gfid:72c5a979-3106-4130-b3a2-32a18cef7a39
2014-11-08 01:22:23 gfid:15db8683-733c-4af4-ab97-e35af581c06f
2014-11-08 01:22:23 gfid:ff86e3b4-b7ad-46a6-aa97-2437679a4c7f
2014-11-08 01:22:23 gfid:574223c0-b705-4672-b387-6c5aa7f63d31
2014-11-08 01:22:23 gfid:ca522f4f-3fc7-4dfa-823f-888e2269085e
2014-11-08 01:22:23 gfid:56feab84-b4fa-414a-8173-1472913ed50c
2014-11-08 01:22:23 gfid:c66bfc5e-8631-4422-aca4-c9f8f99c2ef2
2014-11-08 01:22:23 gfid:69a20ce2-27dc-41fa-9883-83c1a2842ba8
2014-11-08 01:22:23 gfid:1ab58d44-570e-4ac2-82b6-568dcfe9e530
2014-11-08 01:22:23 gfid:face55f7-e8dc-48e9-a3fb-1aa126f0e829
2014-11-08 01:22:23 gfid:a8fb721d-aaae-4bef-8fcc-d7da950a8c2f
2014-11-08 01:22:23 gfid:45a35318-d208-4bb2-9826-3b37083ae4ab
2014-11-08 01:22:23 gfid:d8174f82-15b2-4318-8953-2873c38bb628
2014-11-08 01:22:23 gfid:689c26f7-2442-4c84-be98-e75a81d916ac
2014-11-08 01:22:23 gfid:acd5e0b4-e12b-4af2-878e-1d197ef90061
2014-11-08 01:22:23 gfid:90260d6f-cfe2-4949-a7e1-15c1cc3a914d
2014-11-08 01:22:23 gfid:5bdd3c63-1873-4cfc-89c3-7ac21b9bb684
2014-11-08 01:22:23 gfid:48b3f29b-fd3c-4147-b849-bac5ec3dd22b
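
A hedged sketch of resolving one of these by hand on the brick, using the first GFID above as the example (the inode number below is a placeholder to fill in from the ls output): for a regular file the GFID entry under .glusterfs is a hard link, so its inode number leads back to the real path:

ls -li /brick02/gfs/.glusterfs/88/41/88417c43-7d0f-4ec5-8fcd-f696617b5bc1
find /brick02/gfs -inum INODE_NUMBER_FROM_ABOVE

Stat-ing the resolved path through a client mount then normally queues it for self-heal again.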
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] NFS not start on localhost

2014-11-07 Thread Jason Russler
Thanks, Niels. Yes, CentOS7. It's been driving me nuts. Much better.

- Original Message -
From: Niels de Vos nde...@redhat.com
To: Jason Russler jruss...@redhat.com
Cc: Sven Achtelik sven.achte...@mailpool.us, gluster-users@gluster.org
Sent: Friday, November 7, 2014 9:32:11 PM
Subject: Re: [Gluster-users] NFS not start on localhost
