Re: [Gluster-devel] missing files

2015-02-18 Thread Shyam
Due to other priorities, I have not made any progress on the internal systems since Pranith's investigation into the inode releases causing this slowness on an aged volume.


We need to get back on track with this one; let me discuss it with Pranith and see how best to move ahead.


Shyam

On 02/17/2015 04:50 PM, David F. Robinson wrote:

Any updates on this issue?  Thanks in advance...

David


-- Original Message --
From: Shyam srang...@redhat.com
To: David F. Robinson david.robin...@corvidtec.com; Justin Clift
jus...@gluster.org
Cc: Gluster Devel gluster-devel@gluster.org
Sent: 2/11/2015 10:02:09 PM
Subject: Re: [Gluster-devel] missing files


On 02/11/2015 08:28 AM, David F. Robinson wrote:

My base filesystem has 40-TB and the tar takes 19 minutes. I copied
over 10-TB and it took the tar extraction from 1-minute to 7-minutes.

My suspicion is that it is related to number of files and not
necessarily file size. Shyam is looking into reproducing this
behavior on a redhat system.


I am able to reproduce the issue on a similar setup internally (at
least at the surface it seems to be similar to what David is facing).

I will continue the investigation for the root cause.

Shyam



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-17 Thread David F. Robinson

Any updates on this issue?  Thanks in advance...

David


-- Original Message --
From: Shyam srang...@redhat.com
To: David F. Robinson david.robin...@corvidtec.com; Justin Clift 
jus...@gluster.org

Cc: Gluster Devel gluster-devel@gluster.org
Sent: 2/11/2015 10:02:09 PM
Subject: Re: [Gluster-devel] missing files


On 02/11/2015 08:28 AM, David F. Robinson wrote:
My base filesystem has 40-TB and the tar takes 19 minutes. I copied 
over 10-TB and it took the tar extraction from 1-minute to 7-minutes.


My suspicion is that it is related to number of files and not 
necessarily file size. Shyam is looking into reproducing this behavior 
on a redhat system.


I am able to reproduce the issue on a similar setup internally (at 
least at the surface it seems to be similar to what David is facing).


I will continue the investigation for the root cause.

Shyam


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-12 Thread Pranith Kumar Karampuri


On 02/12/2015 03:05 PM, Pranith Kumar Karampuri wrote:


On 02/12/2015 09:14 AM, Justin Clift wrote:

On 12 Feb 2015, at 03:02, Shyam srang...@redhat.com wrote:

On 02/11/2015 08:28 AM, David F. Robinson wrote:
My base filesystem has 40-TB and the tar takes 19 minutes. I copied 
over 10-TB and it took the tar extraction from 1-minute to 7-minutes.


My suspicion is that it is related to number of files and not 
necessarily file size. Shyam is looking into reproducing this 
behavior on a redhat system.
I am able to reproduce the issue on a similar setup internally (at 
least at the surface it seems to be similar to what David is facing).


I will continue the investigation for the root cause.
Here is the initial analysis from my investigation. (Thanks for providing me with the setup, Shyam; please keep it, as we may need it for further analysis.)


On bad volume:
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ---
  0.00   0.00 us   0.00 us   0.00 us 937104 FORGET
  0.00   0.00 us   0.00 us   0.00 us 872478 RELEASE
  0.00   0.00 us   0.00 us   0.00 us 23668 RELEASEDIR
  0.00  41.86 us  23.00 us  86.00 us 92 STAT
  0.01  39.40 us  24.00 us 104.00 us 218 STATFS
  0.28  55.99 us  43.00 us1152.00 us 4065 SETXATTR
  0.58  56.89 us  25.00 us4505.00 us 8236 OPENDIR
  0.73  26.80 us  11.00 us 257.00 us 22238 FLUSH
  0.77 152.83 us  92.00 us8819.00 us 4065 RMDIR
  2.57  62.00 us  21.00 us 409.00 us 33643 WRITE
  5.46 199.16 us 108.00 us  469938.00 us 22238 UNLINK
  6.70  69.83 us  43.00 us.00 us 77809 LOOKUP
  6.97 447.60 us  21.00 us   54875.00 us 12631 READDIRP
  7.73  79.42 us  33.00 us1535.00 us 78909 SETATTR
 14.112815.00 us 176.00 us 2106305.00 us 4065 MKDIR
 54.091972.62 us 138.00 us 1520773.00 us 22238 CREATE

On good volume:
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ---
  0.00   0.00 us   0.00 us   0.00 us 58870 FORGET
  0.00   0.00 us   0.00 us   0.00 us 66016 RELEASE
  0.00   0.00 us   0.00 us   0.00 us 16480 RELEASEDIR
  0.00  61.50 us  58.00 us  65.00 us 2OPEN
  0.01  39.56 us  16.00 us 112.00 us 71 STAT
  0.02  41.29 us  27.00 us  79.00 us 163 STATFS
  0.03  36.06 us  17.00 us  98.00 us 301 FSTAT
  0.79  62.38 us  39.00 us 269.00 us 4065 SETXATTR
  1.14 242.99 us  25.00 us   28636.00 us 1497 READ
  1.54  59.76 us  25.00 us6325.00 us 8236 OPENDIR
  1.70 133.75 us  89.00 us 374.00 us 4065 RMDIR
  2.25  32.65 us  15.00 us 265.00 us 22006 FLUSH
  3.37 265.05 us 172.00 us2349.00 us 4065 MKDIR
  7.14  68.34 us  21.00 us   21902.00 us 33357 WRITE
 11.00 159.68 us 107.00 us2567.00 us 22003 UNLINK
 13.82 200.54 us 133.00 us   21762.00 us 22003 CREATE
 17.85 448.85 us  22.00 us   54046.00 us 12697 READDIRP
 18.37  76.12 us  45.00 us 294.00 us 77044 LOOKUP
 20.95  85.54 us  35.00 us1404.00 us 78204 SETATTR

As we can see here, there are far more FORGET/RELEASE calls on the brick from the full volume than on the brick from the empty volume. This suggests that the inode table on the volume with lots of data is carrying too many passive inodes, which need to be displaced to create new ones. I need to check whether this happens in the fop path; I will continue my investigation and let you know.
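For reference, this is roughly how the numbers above and the brick inode tables can be inspected (a sketch only: profiling has to be enabled first, the volume name is a placeholder, and the statedump directory and file naming can vary by version; /var/run/gluster is the usual default):

# enable io-stats profiling on the volume and dump the per-fop latency table
gluster volume profile testvol start
gluster volume profile testvol info

# take a statedump of the brick processes and look at the inode table counters
gluster volume statedump testvol
grep -E "active_size|lru_size" /var/run/gluster/*.dump.*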
Just to increase confidence, I performed one more test: I stopped and restarted the volumes. Now the numbers on both volumes are almost the same:


[root@gqac031 gluster-mount]# time rm -rf boost_1_57_0 ; time tar xf 
boost_1_57_0.tar.gz


real1m15.074s
user0m0.550s
sys 0m4.656s

real2m46.866s
user0m5.347s
sys 0m16.047s

[root@gqac031 gluster-mount]# cd /gluster-emptyvol/
[root@gqac031 gluster-emptyvol]# ls
boost_1_57_0.tar.gz
[root@gqac031 gluster-emptyvol]# time tar xf boost_1_57_0.tar.gz

real2m31.467s
user0m5.475s
sys 0m15.471s

gqas015.sbu.lab.eng.bos.redhat.com:testvol on /gluster-mount type 
fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
gqas015.sbu.lab.eng.bos.redhat.com:emotyvol on /gluster-emptyvol type 
fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)


Pranith


Pranith

Thanks Shyam. :)

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

Re: [Gluster-devel] missing files

2015-02-12 Thread Pranith Kumar Karampuri


On 02/12/2015 09:14 AM, Justin Clift wrote:

On 12 Feb 2015, at 03:02, Shyam srang...@redhat.com wrote:

On 02/11/2015 08:28 AM, David F. Robinson wrote:

My base filesystem has 40-TB and the tar takes 19 minutes. I copied over 10-TB 
and it took the tar extraction from 1-minute to 7-minutes.

My suspicion is that it is related to number of files and not necessarily file 
size. Shyam is looking into reproducing this behavior on a redhat system.

I am able to reproduce the issue on a similar setup internally (at least at the 
surface it seems to be similar to what David is facing).

I will continue the investigation for the root cause.
Here is the initial analysis from my investigation. (Thanks for providing me with the setup, Shyam; please keep it, as we may need it for further analysis.)


On bad volume:
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ---
  0.00   0.00 us   0.00 us   0.00 us 937104  FORGET
  0.00   0.00 us   0.00 us   0.00 us 872478 RELEASE
  0.00   0.00 us   0.00 us   0.00 us  23668  RELEASEDIR
  0.00  41.86 us  23.00 us  86.00 us 92STAT
  0.01  39.40 us  24.00 us 104.00 us 218  STATFS
  0.28  55.99 us  43.00 us1152.00 us 4065SETXATTR
  0.58  56.89 us  25.00 us4505.00 us 8236 OPENDIR
  0.73  26.80 us  11.00 us 257.00 us 22238   FLUSH
  0.77 152.83 us  92.00 us8819.00 us 4065   RMDIR
  2.57  62.00 us  21.00 us 409.00 us 33643   WRITE
  5.46 199.16 us 108.00 us  469938.00 us 22238  UNLINK
  6.70  69.83 us  43.00 us.00 us 77809  LOOKUP
  6.97 447.60 us  21.00 us   54875.00 us 12631READDIRP
  7.73  79.42 us  33.00 us1535.00 us 78909 SETATTR
 14.112815.00 us 176.00 us 2106305.00 us 4065   MKDIR
 54.091972.62 us 138.00 us 1520773.00 us 22238  CREATE

On good volume:
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ---
  0.00   0.00 us   0.00 us   0.00 us 58870  FORGET
  0.00   0.00 us   0.00 us   0.00 us 66016 RELEASE
  0.00   0.00 us   0.00 us   0.00 us  16480  RELEASEDIR
  0.00  61.50 us  58.00 us  65.00 us 2OPEN
  0.01  39.56 us  16.00 us 112.00 us 71STAT
  0.02  41.29 us  27.00 us  79.00 us 163  STATFS
  0.03  36.06 us  17.00 us  98.00 us 301   FSTAT
  0.79  62.38 us  39.00 us 269.00 us 4065SETXATTR
  1.14 242.99 us  25.00 us   28636.00 us 1497READ
  1.54  59.76 us  25.00 us6325.00 us 8236 OPENDIR
  1.70 133.75 us  89.00 us 374.00 us 4065   RMDIR
  2.25  32.65 us  15.00 us 265.00 us 22006   FLUSH
  3.37 265.05 us 172.00 us2349.00 us 4065   MKDIR
  7.14  68.34 us  21.00 us   21902.00 us 33357   WRITE
 11.00 159.68 us 107.00 us2567.00 us 22003  UNLINK
 13.82 200.54 us 133.00 us   21762.00 us 22003  CREATE
 17.85 448.85 us  22.00 us   54046.00 us 12697READDIRP
 18.37  76.12 us  45.00 us 294.00 us 77044  LOOKUP
 20.95  85.54 us  35.00 us1404.00 us 78204 SETATTR

As we can see here, there are far more FORGET/RELEASE calls on the brick from the full volume than on the brick from the empty volume. This suggests that the inode table on the volume with lots of data is carrying too many passive inodes, which need to be displaced to create new ones. I need to check whether this happens in the fop path; I will continue my investigation and let you know.


Pranith

Thanks Shyam. :)

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-12 Thread David F. Robinson

Shyam,

You asked me to stop/start the slow volume to see if it fixed the timing issue.  I stopped/started homegfs_backup (the production volume with 40+ TB) and it didn't make it faster.  I hadn't stopped/started the fast volume to see if it made it slower; I just did that and sent out an email.  I saw a similar result to Pranith's.


However, I tried the test below and saw no issues.  So I don't know why restarting the older test3brick volume slowed it down, while the test below shows no slowdown.



#... Create 2-new bricks
gluster volume create test4brick 
gfsib01bkp.corvidtec.com:/data/brick01bkp/test4brick 
gfsib01bkp.corvidtec.com:/data/brick02bkp/test4brick
gluster volume create test5brick 
gfsib01bkp.corvidtec.com:/data/brick01bkp/test5brick 
gfsib01bkp.corvidtec.com:/data/brick02bkp/test5brick

gluster volume start test4brick
gluster volume start test5brick

mount /test4brick
mount /test5brick

cp /root/boost_1_57_0.tar /test4brick
cp /root/boost_1_57_0.tar /test5brick

#... Stop/start test4brick to see if this causes a timing issue
umount /test4brick
gluster volume stop test4brick
gluster volume start test4brick
mount /test4brick


#... Run test on both new bricks
cd /test4brick
time tar -xPf boost_1_57_0.tar; time rm -rf boost_1_57_0

real1m29.712s
user0m0.415s
sys 0m2.772s

real0m18.866s
user0m0.087s
sys 0m0.556s

cd /test5brick
time tar -xPf boost_1_57_0.tar; time rm -rf boost_1_57_0

real 1m28.243s
user 0m0.366s
sys 0m2.502s

real 0m18.193s
user 0m0.075s
sys 0m0.543s

#... Repeat again after stop/start of test4brick
umount /test4brick
gluster volume stop test4brick
gluster volume start test4brick
mount /test4brick
cd /test4brick
time tar -xPf boost_1_57_0.tar; time rm -rf boost_1_57_0

real1m25.277s
user0m0.466s
sys 0m3.107s

real0m16.575s
user0m0.084s
sys 0m0.577s

-- Original Message --
From: Shyam srang...@redhat.com
To: Pranith Kumar Karampuri pkara...@redhat.com; Justin Clift 
jus...@gluster.org
Cc: Gluster Devel gluster-devel@gluster.org; David F. Robinson 
david.robin...@corvidtec.com

Sent: 2/12/2015 10:46:14 AM
Subject: Re: [Gluster-devel] missing files


On 02/12/2015 06:22 AM, Pranith Kumar Karampuri wrote:


On 02/12/2015 03:05 PM, Pranith Kumar Karampuri wrote:


On 02/12/2015 09:14 AM, Justin Clift wrote:

On 12 Feb 2015, at 03:02, Shyam srang...@redhat.com wrote:

On 02/11/2015 08:28 AM, David F. Robinson wrote:
Just to increase confidence performed one more test. Stopped the volumes and re-started. Now on both the volumes, the numbers are almost same:

[root@gqac031 gluster-mount]# time rm -rf boost_1_57_0 ; time tar xf
boost_1_57_0.tar.gz

real 1m15.074s
user 0m0.550s
sys 0m4.656s

real 2m46.866s
user 0m5.347s
sys 0m16.047s

[root@gqac031 gluster-mount]# cd /gluster-emptyvol/
[root@gqac031 gluster-emptyvol]# ls
boost_1_57_0.tar.gz
[root@gqac031 gluster-emptyvol]# time tar xf boost_1_57_0.tar.gz

real 2m31.467s
user 0m5.475s
sys 0m15.471s

gqas015.sbu.lab.eng.bos.redhat.com:testvol on /gluster-mount type
fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
gqas015.sbu.lab.eng.bos.redhat.com:emotyvol on /gluster-emptyvol type
fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)


If I remember right, we performed a similar test on David's setup, but 
I believe there was no significant performance gain there. David could 
you clarify?


Just so we know where we are headed :)

Shyam


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-12 Thread Shyam

On 02/12/2015 11:18 AM, David F. Robinson wrote:

Shyam,

You asked me to stop/start the slow volume to see if it fixed the timing
issue.  I stopped/started homegfs_backup (the production volume with 40+
TB) and it didn't make it faster.  I didn't stop/start the fast volume
to see if it made it slower.  I just did  that and sent out an email.  I
saw a similar result as Pranith.


Just to be clear: even after a restart of the slow volume, we see ~19 minutes for the tar to complete, correct?


Versus on the fast volume, where it is anywhere between 0:55 and 3:00 minutes, irrespective of restart, fresh create, etc., correct?


Shyam
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-12 Thread Justin Clift
On 12 Feb 2015, at 11:22, Pranith Kumar Karampuri pkara...@redhat.com wrote:
snip
 Just to increase confidence performed one more test. Stopped the volumes and 
 re-started. Now on both the volumes, the numbers are almost same:

Oh.  So it's a problem that turns up after a certain amount of
activity has happened on a volume?

eg a lot of intensive activity would show up quickly, but a
   less intense amount of activity would take longer to show
   the effect

Kaleb's long running cluster might be useful to catch this
kind of thing in future, depending on the workload running on
it, and the kind of pre/post tests we run. (eg to catch performance
regressions)
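e.g. something as simple as a timed untar/remove loop, run periodically against a long-lived volume, would make the drift visible. A sketch (the mount point and tarball below are just the ones from this thread):

# repeatedly extract and remove the tree, logging wall-clock time per run
cd /gluster-mount
for i in $(seq 1 20); do
    /usr/bin/time -f "run $i: %e s" tar xf boost_1_57_0.tar.gz
    rm -rf boost_1_57_0
done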

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-12 Thread David F. Robinson
That is very interesting.  I tried this test and got a similar result: stopping/starting the volume causes a timing issue on the blank volume.  It seems like there is some parameter that gets set when you create a volume and gets reset when you stop/start it, or something gets set during the stop/start operation that causes the problem.  Is there a way to list all parameters that are set for a volume?  'gluster volume info' only shows the ones that the user has changed from the defaults.
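(For reference, a couple of hedged options, depending on the installed version; the first command is only available in newer gluster releases:

# newer releases only; prints every option with its effective value for the volume
gluster volume get test3brick all

# available on older releases; lists settable options with their defaults and descriptions
gluster volume set help
)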


[root@gfs01bkp ~]# gluster volume stop test3brick
Stopping volume will make its data inaccessible. Do you want to 
continue? (y/n) y

volume stop: test3brick: success
[root@gfs01bkp ~]# gluster volume start test3brick
volume start: test3brick: success
[root@gfs01bkp ~]# mount /test3brick
[root@gfs01bkp ~]# cd /test3brick/
[root@gfs01bkp test3brick]# date; time tar -xPf boost_1_57_0.tar ; time 
rm -rf boost_1_57_0

Thu Feb 12 10:42:43 EST 2015

real3m46.002s
user0m0.421s
sys 0m2.812s

real0m15.406s
user0m0.092s
sys 0m0.549s


-- Original Message --
From: Pranith Kumar Karampuri pkara...@redhat.com
To: Justin Clift jus...@gluster.org; Shyam srang...@redhat.com
Cc: Gluster Devel gluster-devel@gluster.org; David F. Robinson 
david.robin...@corvidtec.com

Sent: 2/12/2015 6:22:23 AM
Subject: Re: [Gluster-devel] missing files



On 02/12/2015 03:05 PM, Pranith Kumar Karampuri wrote:


On 02/12/2015 09:14 AM, Justin Clift wrote:

On 12 Feb 2015, at 03:02, Shyam srang...@redhat.com wrote:

On 02/11/2015 08:28 AM, David F. Robinson wrote:
My base filesystem has 40-TB and the tar takes 19 minutes. I copied 
over 10-TB and it took the tar extraction from 1-minute to 
7-minutes.


My suspicion is that it is related to number of files and not 
necessarily file size. Shyam is looking into reproducing this 
behavior on a redhat system.
I am able to reproduce the issue on a similar setup internally (at 
least at the surface it seems to be similar to what David is 
facing).


I will continue the investigation for the root cause.
Here is the initial analysis of my investigation: (Thanks for 
providing me with the setup shyam, keep the setup we may need it for 
further analysis)


On bad volume:
 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop
 - --- --- ---  
  0.00 0.00 us 0.00 us 0.00 us 937104 FORGET
  0.00 0.00 us 0.00 us 0.00 us 872478 RELEASE
  0.00 0.00 us 0.00 us 0.00 us 23668 RELEASEDIR
  0.00 41.86 us 23.00 us 86.00 us 92 STAT
  0.01 39.40 us 24.00 us 104.00 us 218 STATFS
  0.28 55.99 us 43.00 us 1152.00 us 4065 SETXATTR
  0.58 56.89 us 25.00 us 4505.00 us 8236 OPENDIR
  0.73 26.80 us 11.00 us 257.00 us 22238 FLUSH
  0.77 152.83 us 92.00 us 8819.00 us 4065 RMDIR
  2.57 62.00 us 21.00 us 409.00 us 33643 WRITE
  5.46 199.16 us 108.00 us 469938.00 us 22238 UNLINK
  6.70 69.83 us 43.00 us .00 us 77809 LOOKUP
  6.97 447.60 us 21.00 us 54875.00 us 12631 READDIRP
  7.73 79.42 us 33.00 us 1535.00 us 78909 SETATTR
 14.11 2815.00 us 176.00 us 2106305.00 us 4065 MKDIR
 54.09 1972.62 us 138.00 us 1520773.00 us 22238 CREATE

On good volume:
 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop
 - --- --- ---  
  0.00 0.00 us 0.00 us 0.00 us 58870 FORGET
  0.00 0.00 us 0.00 us 0.00 us 66016 RELEASE
  0.00 0.00 us 0.00 us 0.00 us 16480 RELEASEDIR
  0.00 61.50 us 58.00 us 65.00 us 2 OPEN
  0.01 39.56 us 16.00 us 112.00 us 71 STAT
  0.02 41.29 us 27.00 us 79.00 us 163 STATFS
  0.03 36.06 us 17.00 us 98.00 us 301 FSTAT
  0.79 62.38 us 39.00 us 269.00 us 4065 SETXATTR
  1.14 242.99 us 25.00 us 28636.00 us 1497 READ
  1.54 59.76 us 25.00 us 6325.00 us 8236 OPENDIR
  1.70 133.75 us 89.00 us 374.00 us 4065 RMDIR
  2.25 32.65 us 15.00 us 265.00 us 22006 FLUSH
  3.37 265.05 us 172.00 us 2349.00 us 4065 MKDIR
  7.14 68.34 us 21.00 us 21902.00 us 33357 WRITE
 11.00 159.68 us 107.00 us 2567.00 us 22003 UNLINK
 13.82 200.54 us 133.00 us 21762.00 us 22003 CREATE
 17.85 448.85 us 22.00 us 54046.00 us 12697 READDIRP
 18.37 76.12 us 45.00 us 294.00 us 77044 LOOKUP
 20.95 85.54 us 35.00 us 1404.00 us 78204 SETATTR

As we can see here, FORGET/RELEASE are way more in the brick from full 
volume compared to the brick from empty volume. It seems to suggest 
that the inode-table on the volume with lots of data is carrying too 
many passive inodes in the table which need to be displaced to create 
new ones. Need to check if they come in the fop-path. Need to continue 
my investigations further, will let you know.
Just to increase confidence performed one more test. Stopped the 
volumes and re-started. Now on both the volumes, the numbers are almost 
same:


[root@gqac031 gluster-mount]# time rm -rf boost_1_57_0 ; time tar xf 
boost_1_57_0.tar.gz


real 1m15.074s

Re: [Gluster-devel] missing files

2015-02-12 Thread David F. Robinson



-- Original Message --
From: Shyam srang...@redhat.com
To: David F. Robinson david.robin...@corvidtec.com; Pranith Kumar 
Karampuri pkara...@redhat.com; Justin Clift jus...@gluster.org

Cc: Gluster Devel gluster-devel@gluster.org
Sent: 2/12/2015 11:26:51 AM
Subject: Re: [Gluster-devel] missing files


On 02/12/2015 11:18 AM, David F. Robinson wrote:

Shyam,

You asked me to stop/start the slow volume to see if it fixed the 
timing
issue. I stopped/started homegfs_backup (the production volume with 
40+

TB) and it didn't make it faster. I didn't stop/start the fast volume
to see if it made it slower. I just did that and sent out an email. I
saw a similar result as Pranith.


Just to be clear even after restart of the slow volume, we see ~19 
minutes for the tar to complete, correct?

Correct



Versus, on the fast volume it is anywhere between 00:55 - 3:00 minutes, 
irrespective of start, fresh create, etc. correct?

Correct



Shyam


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-12 Thread David F. Robinson
FWIW, starting/stopping a volume that is fast doesn't consistently make 
it slow.  I just tried it again on an older volume... It doesn't make it 
slow.  I also went back and re-ran the test on test3brick and it isn't 
slow any longer.  Maybe there is a time lag after stopping/starting a 
volume before it becomes fast.


Either way, stopping/starting a fast volume only makes it slow for some period of time, and it doesn't do so consistently.  I don't think this is the issue; it's a red herring.


[root@gfs01bkp /]# gluster volume stop test2brick
Stopping volume will make its data inaccessible. Do you want to 
continue? (y/n) y

[root@gfs01bkp /]# gluster volume start test2brick
volume start: test2brick: success
[root@gfs01bkp /]# mount /test2brick
[root@gfs01bkp /]# cd /test2brick
[root@gfs01bkp test2brick]# time tar -xPf boost_1_57_0.tar; time rm -rf 
boost_1_57_0


real1m1.124s
user0m0.432s
sys 0m3.136s

real0m16.630s
user0m0.083s
sys 0m0.570s


#... Retest on test3brick after it has been up for 20 minutes following a volume restart... Compare this to running the test immediately after a restart, which gave a time of 3.5 minutes.
[root@gfs01bkp test3brick]#  time tar -xPf boost_1_57_0.tar; time rm -rf 
boost_1_57_0


real1m17.786s
user0m0.502s
sys 0m3.278s

real0m18.103s
user0m0.101s
sys 0m0.684s



-- Original Message --
From: Shyam srang...@redhat.com
To: David F. Robinson david.robin...@corvidtec.com; Pranith Kumar 
Karampuri pkara...@redhat.com; Justin Clift jus...@gluster.org

Cc: Gluster Devel gluster-devel@gluster.org
Sent: 2/12/2015 11:26:51 AM
Subject: Re: [Gluster-devel] missing files


On 02/12/2015 11:18 AM, David F. Robinson wrote:

Shyam,

You asked me to stop/start the slow volume to see if it fixed the 
timing
issue. I stopped/started homegfs_backup (the production volume with 
40+

TB) and it didn't make it faster. I didn't stop/start the fast volume
to see if it made it slower. I just did that and sent out an email. I
saw a similar result as Pranith.


Just to be clear even after restart of the slow volume, we see ~19 
minutes for the tar to complete, correct?


Versus, on the fast volume it is anywhere between 00:55 - 3:00 minutes, 
irrespective of start, fresh create, etc. correct?


Shyam


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-11 Thread David F. Robinson
My base filesystem has 40-TB and the tar takes 19 minutes. I copied over 10-TB 
and it took the tar extraction from 1-minute to 7-minutes. 

My suspicion is that it is related to number of files and not necessarily file 
size. Shyam is looking into reproducing this behavior on a redhat system. 

David  (Sent from mobile)

===
David F. Robinson, Ph.D. 
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310  [cell]
704.799.7974  [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com

 On Feb 11, 2015, at 7:38 AM, Justin Clift jus...@gluster.org wrote:
 
 On 11 Feb 2015, at 12:31, David F. Robinson david.robin...@corvidtec.com 
 wrote:
 
 Some time ago I had a similar performance problem (with 3.4 if I remember 
 correctly): a just created volume started to work fine, but after some time 
 using it performance was worse. Removing all files from the volume didn't 
 improve the performance again.
 
 I guess my problem is a little better depending on how you look at it. If I 
 delete the data from the volume, the performance goes back to that of an empty 
 volume. I don't have to delete the .glusterfs entries to regain my 
 performance. I only have to delete the data from the mount point.
 
 Interesting.  Do you have somewhat accurate stats on how much data (eg # of 
 entries, size
 of files) was in the data set that did this?
 
 Wondering if it's repeatable, so we can replicate the problem and solve. :)
 
 + Justin
 
 --
 GlusterFS - http://www.gluster.org
 
 An open source, distributed file system scaling to several
 petabytes, and handling thousands of clients.
 
 My personal twitter: twitter.com/realjustinclift
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-11 Thread David F. Robinson
I don't think it is the underlying file system. /data/brickxx is the underlying XFS, and performance to it is fine. When I create a volume, it just puts the data in /data/brick/test2; the underlying filesystem shouldn't know or care that it is in a new directory.

Also, if I create a /data/brick/test2 volume and put data on it, it gets slow 
in gluster. But, writing to /data/brick is still fine. And, after test2 gets 
slow, I can create a /data/test3 volume that is empty and its speed is fine. 

My knowledge is admittedly very limited here, but I don't see how it could be 
the underlying filesystem if the slowdown only occurs on the gluster mount and 
not on the underlying xfs filesystem. 
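For what it's worth, a rough way to check the raw XFS layout on a brick and rule it in or out would be something like the following (the device and file paths are placeholders; xfs_db is run read-only here):

# overall fragmentation factor of the brick filesystem
xfs_db -r -c frag /dev/sdb1

# extent layout of one of the files that is slow through gluster
xfs_bmap -v /data/brick/test2/path/to/slow/file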

David  (Sent from mobile)

===
David F. Robinson, Ph.D. 
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310  [cell]
704.799.7974  [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com

 On Feb 11, 2015, at 12:18 AM, Justin Clift jus...@gluster.org wrote:
 
 On 11 Feb 2015, at 03:06, Shyam srang...@redhat.com wrote:
 snip
 2) We ran an strace of tar and also collected io-stats output from these 
 volumes; both show that create and mkdir are slower on the slow volume as compared to 
 the fast volume. This seems to be the overall reason for the slowness.
 
 Any idea's on why the create and mkdir is slower?
 
 Wondering if it's a case of underlying filesystem parameters (for the bricks)
 + maybe physical storage structure having become badly optimised over time.
 eg if its on spinning rust, not ssd, and sector placement is now bad
 
 Any idea if there are tools that can analyse this kind of thing?  eg meta
 data placement / fragmentation / on a drive for XFS/ext4
 
 + Justin
 
 --
 GlusterFS - http://www.gluster.org
 
 An open source, distributed file system scaling to several
 petabytes, and handling thousands of clients.
 
 My personal twitter: twitter.com/realjustinclift
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-11 Thread Xavier Hernandez
Some time ago I had a similar performance problem (with 3.4, if I remember correctly): a just-created volume started out working fine, but after some time of use the performance got worse. Removing all files from the volume didn't bring the performance back.


The only way I had to recover a performance similar to the initial one 
without recreating the volume was to remove all volume contents and also 
delete all 256 .glusterfs/xx/ directories from all bricks.


The backend filesystem was XFS.

Could you check whether this is the same case?

Xavi

On 02/11/2015 12:22 PM, David F. Robinson wrote:

Don't think it is the underlying file system. /data/brickxx is the underlying 
xfs. Performance to this is fine. When I created a volume it just puts the data 
in /data/brick/test2. The underlying filesystem shouldn't know/care that it is 
in a new directory.

Also, if I create a /data/brick/test2 volume and put data on it, it gets slow 
in gluster. But, writing to /data/brick is still fine. And, after test2 gets 
slow, I can create a /data/test3 volume that is empty and its speed is fine.

My knowledge is admittedly very limited here, but I don't see how it could be 
the underlying filesystem if the slowdown only occurs on the gluster mount and 
not on the underlying xfs filesystem.

David  (Sent from mobile)

===
David F. Robinson, Ph.D.
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310  [cell]
704.799.7974  [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com


On Feb 11, 2015, at 12:18 AM, Justin Clift jus...@gluster.org wrote:


On 11 Feb 2015, at 03:06, Shyam srang...@redhat.com wrote:
snip
2) We ran an strace of tar and also collected io-stats output from these 
volumes; both show that create and mkdir are slower on the slow volume as compared to the 
fast volume. This seems to be the overall reason for the slowness.


Any idea's on why the create and mkdir is slower?

Wondering if it's a case of underlying filesystem parameters (for the bricks)
+ maybe physical storage structure having become badly optimised over time.
eg if its on spinning rust, not ssd, and sector placement is now bad

Any idea if there are tools that can analyse this kind of thing?  eg meta
data placement / fragmentation / on a drive for XFS/ext4

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-05 Thread Pranith Kumar Karampuri
:15 FEASABILITY STUDY.docx
-rwxrw 2 streadway sbir 3826704 Jan 21 14:57 FEASABILITY STUDY.one

/data/brick02b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References: 


total 0
drwxrws--- 2 root root 10 Feb 4 18:12 .
drwxrws--x 6 root root 95 Feb 4 18:12 ..

[root@gfs02a ~]# ls -alR 
/data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References
/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References: 


total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 80 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR: 


total 72
drwxrws--- 2 streadway sbir 80 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw 2 streadway sbir 17248 Jun 19 2014 COMPARISON OF SOLUTIONS.one
-rwxrw 2 streadway sbir 49736 Jan 21 13:18 GIVEN TRADE SPACE.one

/data/brick02a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References: 


total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 79 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick02a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR: 


total 84
drwxrws--- 2 streadway sbir 79 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw 2 streadway sbir 42440 Jun 19 2014 ARMOR PACKAGES.one
-rwxrw 2 streadway sbir 38184 Jun 19 2014 CURRENT STANDARD 
ARMORING.one


[root@gfs02b ~]# ls -alR 
/data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References
/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References: 


total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 80 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR: 


total 72
drwxrws--- 2 streadway sbir 80 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw 2 streadway sbir 17248 Jun 19 2014 COMPARISON OF SOLUTIONS.one
-rwxrw 2 streadway sbir 49736 Jan 21 13:18 GIVEN TRADE SPACE.one

/data/brick02b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References: 


total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 79 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick02b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR: 


total 84
drwxrws--- 2 streadway sbir 79 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw 2 streadway sbir 42440 Jun 19 2014 ARMOR PACKAGES.one
-rwxrw 2 streadway sbir 38184 Jun 19 2014 CURRENT STANDARD 
ARMORING.one






-- Original Message --
From: Xavier Hernandez xhernan...@datalab.es
To: David F. Robinson david.robin...@corvidtec.com; Benjamin 
Turner bennytu...@gmail.com; Pranith Kumar Karampuri 
pkara...@redhat.com
Cc: gluster-us...@gluster.org gluster-us...@gluster.org; Gluster 
Devel gluster-devel@gluster.org

Sent: 2/5/2015 5:14:22 AM
Subject: Re: [Gluster-devel] missing files


Is the failure repeatable ? with the same directories ?

It's very weird that the directories appear on the volume when you do 
an 'ls' on the bricks. Could it be that you only made a single 'ls' 
on fuse mount which not showed the directory ? Is it possible that 
this 'ls' triggered a self-heal that repaired the problem, whatever 
it was, and when you did another 'ls' on the fuse mount after the 
'ls' on the bricks, the directories were there ?


The first 'ls' could have healed the files, causing that the 
following 'ls' on the bricks showed the files as if nothing were 
damaged. If that's the case, it's possible that there were some 
disconnections during the copy.


Added Pranith because he knows better replication and self-heal details.

Xavi

On 02/04/2015 07:23 PM, David F. Robinson wrote:

Distributed/replicated

Volume Name: homegfs
Type: Distributed-Replicate
Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
Options Reconfigured:
performance.io

Re: [Gluster-devel] missing files

2015-02-05 Thread David F. Robinson
/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 2 root root 10 Feb 4 18:12 .
drwxrws--x 6 root root 95 Feb 4 18:12 ..

[root@gfs02a ~]# ls -alR 
/data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References

/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 80 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 72
drwxrws--- 2 streadway sbir 80 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw 2 streadway sbir 17248 Jun 19 2014 COMPARISON OF 
SOLUTIONS.one

-rwxrw 2 streadway sbir 49736 Jan 21 13:18 GIVEN TRADE SPACE.one

/data/brick02a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 79 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick02a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 84
drwxrws--- 2 streadway sbir 79 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw 2 streadway sbir 42440 Jun 19 2014 ARMOR PACKAGES.one
-rwxrw 2 streadway sbir 38184 Jun 19 2014 CURRENT STANDARD 
ARMORING.one


[root@gfs02b ~]# ls -alR 
/data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References

/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 80 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 72
drwxrws--- 2 streadway sbir 80 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw 2 streadway sbir 17248 Jun 19 2014 COMPARISON OF 
SOLUTIONS.one

-rwxrw 2 streadway sbir 49736 Jan 21 13:18 GIVEN TRADE SPACE.one

/data/brick02b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 79 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick02b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 84
drwxrws--- 2 streadway sbir 79 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw 2 streadway sbir 42440 Jun 19 2014 ARMOR PACKAGES.one
-rwxrw 2 streadway sbir 38184 Jun 19 2014 CURRENT STANDARD 
ARMORING.one






-- Original Message --
From: Xavier Hernandez xhernan...@datalab.es
To: David F. Robinson david.robin...@corvidtec.com; Benjamin 
Turner bennytu...@gmail.com; Pranith Kumar Karampuri 
pkara...@redhat.com
Cc: gluster-us...@gluster.org gluster-us...@gluster.org; Gluster 
Devel gluster-devel@gluster.org

Sent: 2/5/2015 5:14:22 AM
Subject: Re: [Gluster-devel] missing files


Is the failure repeatable ? with the same directories ?

It's very weird that the directories appear on the volume when you do 
an 'ls' on the bricks. Could it be that you only made a single 'ls' on 
fuse mount which not showed the directory ? Is it possible that this 
'ls' triggered a self-heal that repaired the problem, whatever it was, 
and when you did another 'ls' on the fuse mount after the 'ls' on the 
bricks, the directories were there ?


The first 'ls' could have healed the files, causing that the following 
'ls' on the bricks showed the files as if nothing were damaged. If 
that's the case, it's possible that there were some disconnections 
during the copy.


Added Pranith because he knows better replication and self-heal 
details.


Xavi

On 02/04/2015 07:23 PM, David F. Robinson wrote:

Distributed/replicated

Volume Name: homegfs
Type: Distributed-Replicate
Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
Options Reconfigured:
performance.io-thread-count: 32
performance.cache-size: 128MB
performance.write-behind-window-size: 128MB
server.allow-insecure: on
network.ping-timeout: 10
storage.owner-gid: 100
geo-replication.indexing: off
geo

Re: [Gluster-devel] missing files

2015-02-05 Thread Xavier Hernandez

Is the failure repeatable? With the same directories?

It's very weird that the directories appear on the volume when you do an 'ls' on the bricks. Could it be that you only did a single 'ls' on the FUSE mount, which did not show the directory? Is it possible that this 'ls' triggered a self-heal that repaired the problem, whatever it was, and that when you did another 'ls' on the FUSE mount after the 'ls' on the bricks, the directories were there?

The first 'ls' could have healed the files, so the following 'ls' on the bricks showed the files as if nothing were damaged. If that's the case, it's possible that there were some disconnections during the copy.
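A quick way to check for such disconnections, assuming the default log locations on the servers (file names can differ slightly between versions), would be:

# look for client disconnects/reconnects on each brick server during the rsync window
grep -i disconnect /var/log/glusterfs/bricks/*.log

# check what the self-heal daemon did afterwards, and what is still pending
grep -i heal /var/log/glusterfs/glustershd.log
gluster volume heal homegfs info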


Added Pranith because he knows better replication and self-heal details.

Xavi

On 02/04/2015 07:23 PM, David F. Robinson wrote:

Distributed/replicated

Volume Name: homegfs
Type: Distributed-Replicate
Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
Options Reconfigured:
performance.io-thread-count: 32
performance.cache-size: 128MB
performance.write-behind-window-size: 128MB
server.allow-insecure: on
network.ping-timeout: 10
storage.owner-gid: 100
geo-replication.indexing: off
geo-replication.ignore-pid-check: on
changelog.changelog: on
changelog.fsync-interval: 3
changelog.rollover-time: 15
server.manage-gids: on


-- Original Message --
From: Xavier Hernandez xhernan...@datalab.es
To: David F. Robinson david.robin...@corvidtec.com; Benjamin
Turner bennytu...@gmail.com
Cc: gluster-us...@gluster.org gluster-us...@gluster.org; Gluster
Devel gluster-devel@gluster.org
Sent: 2/4/2015 6:03:45 AM
Subject: Re: [Gluster-devel] missing files


On 02/04/2015 01:30 AM, David F. Robinson wrote:

Sorry. Thought about this a little more. I should have been clearer.
The files were on both bricks of the replica, not just one side. So,
both bricks had to have been up... The files/directories just don't show
up on the mount.
I was reading and saw a related bug
(https://bugzilla.redhat.com/show_bug.cgi?id=1159484). I saw it
suggested to run:
 find mount -d -exec getfattr -h -n trusted.ec.heal {} \;


This command is specific for a dispersed volume. It won't do anything
(aside from the error you are seeing) on a replicated volume.

I think you are using a replicated volume, right ?

In this case I'm not sure what can be happening. Is your volume a pure
replicated one or a distributed-replicated ? on a pure replicated it
doesn't make sense that some entries do not show in an 'ls' when the
file is in both replicas (at least without any error message in the
logs). On a distributed-replicated it could be caused by some problem
while combining contents of each replica set.

What's the configuration of your volume ?

Xavi



I get a bunch of errors for operation not supported:
[root@gfs02a homegfs]# find wks_backup -d -exec getfattr -h -n
trusted.ec.heal {} \;
find: warning: the -d option is deprecated; please use -depth instead,
because the latter is a POSIX-compliant feature.
wks_backup/homer_backup/backup: trusted.ec.heal: Operation not supported
wks_backup/homer_backup/logs/2014_05_20.log: trusted.ec.heal: Operation
not supported
wks_backup/homer_backup/logs/2014_05_21.log: trusted.ec.heal: Operation
not supported
wks_backup/homer_backup/logs/2014_05_18.log: trusted.ec.heal: Operation
not supported
wks_backup/homer_backup/logs/2014_05_19.log: trusted.ec.heal: Operation
not supported
wks_backup/homer_backup/logs/2014_05_22.log: trusted.ec.heal: Operation
not supported
wks_backup/homer_backup/logs: trusted.ec.heal: Operation not supported
wks_backup/homer_backup: trusted.ec.heal: Operation not supported
-- Original Message --
From: Benjamin Turner bennytu...@gmail.com
mailto:bennytu...@gmail.com
To: David F. Robinson david.robin...@corvidtec.com
mailto:david.robin...@corvidtec.com
Cc: Gluster Devel gluster-devel@gluster.org
mailto:gluster-devel@gluster.org; gluster-us...@gluster.org
gluster-us...@gluster.org mailto:gluster-us...@gluster.org
Sent: 2/3/2015 7:12:34 PM
Subject: Re: [Gluster-devel] missing files

It sounds to me like the files were only copied to one replica, werent
there for the initial for the initial ls which triggered a self heal,
and were there for the last ls because they were healed. Is there any
chance that one of the replicas was down during the rsync? It could
be that you lost a brick during copy or something like that. To
confirm I would look for disconnects in the brick logs as well as
checking glusterfshd.log

Re: [Gluster-devel] missing files

2015-02-04 Thread David F. Robinson

Distributed/replicated

Volume Name: homegfs
Type: Distributed-Replicate
Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
Options Reconfigured:
performance.io-thread-count: 32
performance.cache-size: 128MB
performance.write-behind-window-size: 128MB
server.allow-insecure: on
network.ping-timeout: 10
storage.owner-gid: 100
geo-replication.indexing: off
geo-replication.ignore-pid-check: on
changelog.changelog: on
changelog.fsync-interval: 3
changelog.rollover-time: 15
server.manage-gids: on


-- Original Message --
From: Xavier Hernandez xhernan...@datalab.es
To: David F. Robinson david.robin...@corvidtec.com; Benjamin 
Turner bennytu...@gmail.com
Cc: gluster-us...@gluster.org gluster-us...@gluster.org; Gluster 
Devel gluster-devel@gluster.org

Sent: 2/4/2015 6:03:45 AM
Subject: Re: [Gluster-devel] missing files


On 02/04/2015 01:30 AM, David F. Robinson wrote:

Sorry. Thought about this a little more. I should have been clearer.
The files were on both bricks of the replica, not just one side. So,
both bricks had to have been up... The files/directories just don't 
show

up on the mount.
I was reading and saw a related bug
(https://bugzilla.redhat.com/show_bug.cgi?id=1159484). I saw it
suggested to run:
 find mount -d -exec getfattr -h -n trusted.ec.heal {} \;


This command is specific for a dispersed volume. It won't do anything 
(aside from the error you are seeing) on a replicated volume.


I think you are using a replicated volume, right ?

In this case I'm not sure what can be happening. Is your volume a pure 
replicated one or a distributed-replicated ? on a pure replicated it 
doesn't make sense that some entries do not show in an 'ls' when the 
file is in both replicas (at least without any error message in the 
logs). On a distributed-replicated it could be caused by some problem 
while combining contents of each replica set.


What's the configuration of your volume ?

Xavi



I get a bunch of errors for operation not supported:
[root@gfs02a homegfs]# find wks_backup -d -exec getfattr -h -n
trusted.ec.heal {} \;
find: warning: the -d option is deprecated; please use -depth instead,
because the latter is a POSIX-compliant feature.
wks_backup/homer_backup/backup: trusted.ec.heal: Operation not 
supported
wks_backup/homer_backup/logs/2014_05_20.log: trusted.ec.heal: 
Operation

not supported
wks_backup/homer_backup/logs/2014_05_21.log: trusted.ec.heal: 
Operation

not supported
wks_backup/homer_backup/logs/2014_05_18.log: trusted.ec.heal: 
Operation

not supported
wks_backup/homer_backup/logs/2014_05_19.log: trusted.ec.heal: 
Operation

not supported
wks_backup/homer_backup/logs/2014_05_22.log: trusted.ec.heal: 
Operation

not supported
wks_backup/homer_backup/logs: trusted.ec.heal: Operation not supported
wks_backup/homer_backup: trusted.ec.heal: Operation not supported
-- Original Message --
From: Benjamin Turner bennytu...@gmail.com 
mailto:bennytu...@gmail.com

To: David F. Robinson david.robin...@corvidtec.com
mailto:david.robin...@corvidtec.com
Cc: Gluster Devel gluster-devel@gluster.org
mailto:gluster-devel@gluster.org; gluster-us...@gluster.org
gluster-us...@gluster.org mailto:gluster-us...@gluster.org
Sent: 2/3/2015 7:12:34 PM
Subject: Re: [Gluster-devel] missing files
It sounds to me like the files were only copied to one replica, 
werent

there for the initial for the initial ls which triggered a self heal,
and were there for the last ls because they were healed. Is there any
chance that one of the replicas was down during the rsync? It could
be that you lost a brick during copy or something like that. To
confirm I would look for disconnects in the brick logs as well as
checking glusterfshd.log to verify the missing files were actually
healed.

-b

On Tue, Feb 3, 2015 at 5:37 PM, David F. Robinson
david.robin...@corvidtec.com mailto:david.robin...@corvidtec.com
wrote:

I rsync'd 20-TB over to my gluster system and noticed that I had
some directories missing even though the rsync completed 
normally.

The rsync logs showed that the missing files were transferred.
I went to the bricks and did an 'ls -al
/data/brick*/homegfs/dir/*' the files were on the bricks. After I
did this 'ls', the files then showed up on the FUSE mounts.
1) Why are the files hidden on the fuse mount?
2) Why does the ls make them show up on the FUSE mount?
3) How can I prevent this from happening again?
Note, I also mounted the gluster volume using NFS and saw

Re: [Gluster-devel] missing files

2015-02-04 Thread Xavier Hernandez

On 02/04/2015 01:30 AM, David F. Robinson wrote:

Sorry.  Thought about this a little more. I should have been clearer.
The files were on both bricks of the replica, not just one side.  So,
both bricks had to have been up... The files/directories just don't show
up on the mount.
I was reading and saw a related bug
(https://bugzilla.redhat.com/show_bug.cgi?id=1159484).  I saw it
suggested to run:
 find mount -d -exec getfattr -h -n trusted.ec.heal {} \;


This command is specific for a dispersed volume. It won't do anything 
(aside from the error you are seeing) on a replicated volume.


I think you are using a replicated volume, right?

In this case I'm not sure what can be happening. Is your volume a pure replicated one or distributed-replicated? On a pure replicated volume it doesn't make sense that some entries do not show up in an 'ls' when the file is in both replicas (at least not without any error message in the logs). On a distributed-replicated volume it could be caused by some problem while combining the contents of each replica set.
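For a replicated volume, a rough equivalent of that check is to look at the replication xattrs directly on the bricks (a sketch; the brick path below is just an example from this thread, and the exact trusted.afr.* key names depend on the volume):

# dump all xattrs of the entry on each brick; non-zero trusted.afr.* values
# would indicate pending self-heal between the replicas
getfattr -d -m . -e hex /data/brick01a/homegfs/wks_backup/homer_backup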


What's the configuration of your volume?

Xavi



I get a bunch of errors for operation not supported:
[root@gfs02a homegfs]# find wks_backup -d -exec getfattr -h -n
trusted.ec.heal {} \;
find: warning: the -d option is deprecated; please use -depth instead,
because the latter is a POSIX-compliant feature.
wks_backup/homer_backup/backup: trusted.ec.heal: Operation not supported
wks_backup/homer_backup/logs/2014_05_20.log: trusted.ec.heal: Operation
not supported
wks_backup/homer_backup/logs/2014_05_21.log: trusted.ec.heal: Operation
not supported
wks_backup/homer_backup/logs/2014_05_18.log: trusted.ec.heal: Operation
not supported
wks_backup/homer_backup/logs/2014_05_19.log: trusted.ec.heal: Operation
not supported
wks_backup/homer_backup/logs/2014_05_22.log: trusted.ec.heal: Operation
not supported
wks_backup/homer_backup/logs: trusted.ec.heal: Operation not supported
wks_backup/homer_backup: trusted.ec.heal: Operation not supported
-- Original Message --
From: Benjamin Turner bennytu...@gmail.com mailto:bennytu...@gmail.com
To: David F. Robinson david.robin...@corvidtec.com
mailto:david.robin...@corvidtec.com
Cc: Gluster Devel gluster-devel@gluster.org
mailto:gluster-devel@gluster.org; gluster-us...@gluster.org
gluster-us...@gluster.org mailto:gluster-us...@gluster.org
Sent: 2/3/2015 7:12:34 PM
Subject: Re: [Gluster-devel] missing files

It sounds to me like the files were only copied to one replica, werent
there for the initial for the initial ls which triggered a self heal,
and were there for the last ls because they were healed.  Is there any
chance that one of the replicas was down during the rsync?  It could
be that you lost a brick during copy or something like that.  To
confirm I would look for disconnects in the brick logs as well as
checking glusterfshd.log to verify the missing files were actually
healed.

-b

On Tue, Feb 3, 2015 at 5:37 PM, David F. Robinson
david.robin...@corvidtec.com mailto:david.robin...@corvidtec.com
wrote:

I rsync'd 20-TB over to my gluster system and noticed that I had
some directories missing even though the rsync completed normally.
The rsync logs showed that the missing files were transferred.
I went to the bricks and did an 'ls -al
/data/brick*/homegfs/dir/*' the files were on the bricks.  After I
did this 'ls', the files then showed up on the FUSE mounts.
1) Why are the files hidden on the fuse mount?
2) Why does the ls make them show up on the FUSE mount?
3) How can I prevent this from happening again?
Note, I also mounted the gluster volume using NFS and saw the same
behavior.  The files/directories were not shown until I did the
ls on the bricks.
David
===
David F. Robinson, Ph.D.
President - Corvid Technologies
704.799.6944 x101 tel:704.799.6944%20x101 [office]
704.252.1310 tel:704.252.1310 [cell]
704.799.7974 tel:704.799.7974 [fax]
david.robin...@corvidtec.com mailto:david.robin...@corvidtec.com
http://www.corvidtechnologies.com http://www.corvidtechnologies.com/

___
Gluster-devel mailing list
Gluster-devel@gluster.org mailto:Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel





___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] missing files

2015-02-03 Thread David F. Robinson
I rsync'd 20-TB over to my gluster system and noticed that I had some 
directories missing even though the rsync completed normally.

The rsync logs showed that the missing files were transferred.

I went to the bricks and did an 'ls -al /data/brick*/homegfs/dir/*', and the 
files were on the bricks.  After I did this 'ls', the files then showed 
up on the FUSE mounts.


1) Why are the files hidden on the fuse mount?
2) Why does the ls make them show up on the FUSE mount?
3) How can I prevent this from happening again?

Note, I also mounted the gluster volume using NFS and saw the same 
behavior.  The files/directories were not shown until I did the ls on 
the bricks.


David



===
David F. Robinson, Ph.D.
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310 [cell]
704.799.7974 [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-03 Thread David F. Robinson
Sorry.  Thought about this a little more. I should have been clearer.  
The files were on both bricks of the replica, not just one side.  So, 
both bricks had to have been up... The files/directories just don't show 
up on the mount.


I was reading and came across a related bug
(https://bugzilla.redhat.com/show_bug.cgi?id=1159484), where it was
suggested to run:


find mount -d -exec getfattr -h -n trusted.ec.heal {} \;


I get a bunch of 'Operation not supported' errors:

[root@gfs02a homegfs]# find wks_backup -d -exec getfattr -h -n 
trusted.ec.heal {} \;
find: warning: the -d option is deprecated; please use -depth instead, 
because the latter is a POSIX-compliant feature.

wks_backup/homer_backup/backup: trusted.ec.heal: Operation not supported
wks_backup/homer_backup/logs/2014_05_20.log: trusted.ec.heal: Operation 
not supported
wks_backup/homer_backup/logs/2014_05_21.log: trusted.ec.heal: Operation 
not supported
wks_backup/homer_backup/logs/2014_05_18.log: trusted.ec.heal: Operation 
not supported
wks_backup/homer_backup/logs/2014_05_19.log: trusted.ec.heal: Operation 
not supported
wks_backup/homer_backup/logs/2014_05_22.log: trusted.ec.heal: Operation 
not supported

wks_backup/homer_backup/logs: trusted.ec.heal: Operation not supported
wks_backup/homer_backup: trusted.ec.heal: Operation not supported
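
(The 'Operation not supported' responses here are likely because
trusted.ec.heal is the xattr used by the disperse/erasure-coded translator,
and homegfs is a replicated volume as described above, so that workaround
does not apply directly.  A rough sketch of the equivalent checks for a
replica volume, reusing the brick path pattern from earlier in this thread;
the volume name is assumed to be homegfs:)

# list entries the self-heal daemon still considers pending
gluster volume heal homegfs info

# on the bricks, dump all xattrs for a suspect path; non-zero trusted.afr.*
# counters on one copy indicate self-heal is still pending for it
getfattr -d -m . -e hex /data/brick*/homegfs/wks_backup/homer_backup/backup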

-- Original Message --
From: Benjamin Turner bennytu...@gmail.com
To: David F. Robinson david.robin...@corvidtec.com
Cc: Gluster Devel gluster-devel@gluster.org; 
gluster-us...@gluster.org gluster-us...@gluster.org

Sent: 2/3/2015 7:12:34 PM
Subject: Re: [Gluster-devel] missing files

It sounds to me like the files were only copied to one replica, weren't
there for the initial ls which triggered a self heal, and were there
for the last ls because they were healed.  Is there any chance that one
of the replicas was down during the rsync?  It could be that you lost a
brick during copy or something like that.  To confirm I would look for
disconnects in the brick logs as well as checking glustershd.log to
verify the missing files were actually healed.


-b

On Tue, Feb 3, 2015 at 5:37 PM, David F. Robinson 
david.robin...@corvidtec.com wrote:
I rsync'd 20-TB over to my gluster system and noticed that I had some 
directories missing even though the rsync completed normally.

The rsync logs showed that the missing files were transferred.

I went to the bricks and did an 'ls -al /data/brick*/homegfs/dir/*' and
the files were on the bricks.  After I did this 'ls', the files then
showed up on the FUSE mounts.


1) Why are the files hidden on the fuse mount?
2) Why does the ls make them show up on the FUSE mount?
3) How can I prevent this from happening again?

Note, I also mounted the gluster volume using NFS and saw the same 
behavior.  The files/directories were not shown until I did the ls 
on the bricks.


David



===
David F. Robinson, Ph.D.
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310 [cell]
704.799.7974 [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-03 Thread Benjamin Turner
It sounds to me like the files were only copied to one replica, weren't
there for the initial ls which triggered a self heal, and were there for
the last ls because they were healed.  Is there any chance that one of
the replicas was down during the rsync?  It could be that you lost a
brick during copy or something like that.  To confirm I would look for
disconnects in the brick logs as well as checking glustershd.log to
verify the missing files were actually healed.

-b
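
(A minimal sketch of those two checks, assuming the default log locations
under /var/log/glusterfs.  The disconnect message text matches the brick
log excerpts quoted elsewhere in this thread; the glustershd wording may
vary between releases.)

# count client disconnects recorded by each brick
grep -c "disconnecting connection" /var/log/glusterfs/bricks/*.log

# look for self-heal activity around the rsync window
grep -i "selfheal" /var/log/glusterfs/glustershd.log | tail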

On Tue, Feb 3, 2015 at 5:37 PM, David F. Robinson 
david.robin...@corvidtec.com wrote:

  I rsync'd 20-TB over to my gluster system and noticed that I had some
 directories missing even though the rsync completed normally.
 The rsync logs showed that the missing files were transferred.

 I went to the bricks and did an 'ls -al /data/brick*/homegfs/dir/*' and
 the files were on the bricks.  After I did this 'ls', the files then
 showed up on the FUSE mounts.

 1) Why are the files hidden on the fuse mount?
 2) Why does the ls make them show up on the FUSE mount?
 3) How can I prevent this from happening again?

 Note, I also mounted the gluster volume using NFS and saw the same
 behavior.  The files/directories were not shown until I did the ls on the
 bricks.

 David



  ===
 David F. Robinson, Ph.D.
 President - Corvid Technologies
 704.799.6944 x101 [office]
 704.252.1310 [cell]
 704.799.7974 [fax]
 david.robin...@corvidtec.com
 http://www.corvidtechnologies.com



 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-03 Thread David F. Robinson

Like these?

data-brick02a-homegfs.log:[2015-02-03 19:09:34.568842] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs02a.corvidtec.com-18563-2015/02/03-19:07:58:519134-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 19:09:41.286551] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs01a.corvidtec.com-12804-2015/02/03-19:09:38:497808-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 19:16:35.906412] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs02b.corvidtec.com-27190-2015/02/03-19:15:53:458467-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 19:51:22.761293] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs01a.corvidtec.com-25926-2015/02/03-19:51:02:89070-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1
data-brick02a-homegfs.log:[2015-02-03 22:44:47.458905] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs01a.corvidtec.com-29467-2015/02/03-22:44:05:838129-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 22:47:42.830866] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs01a.corvidtec.com-30069-2015/02/03-22:47:37:209436-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 22:48:26.785931] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs01a.corvidtec.com-30256-2015/02/03-22:47:55:203659-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 22:53:25.530836] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs01a.corvidtec.com-30658-2015/02/03-22:53:21:627538-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 22:56:14.033823] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs01a.corvidtec.com-30893-2015/02/03-22:56:01:450507-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 22:56:55.622800] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs01a.corvidtec.com-31080-2015/02/03-22:56:32:665370-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 22:59:11.445742] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs01a.corvidtec.com-31383-2015/02/03-22:58:45:190874-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 23:06:26.482709] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs01a.corvidtec.com-31720-2015/02/03-23:06:11:340012-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 23:10:54.807725] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs01a.corvidtec.com-32083-2015/02/03-23:10:22:131678-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 23:13:35.545513] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs01a.corvidtec.com-32284-2015/02/03-23:13:21:26552-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-03 23:14:19.065271] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs01a.corvidtec.com-32471-2015/02/03-23:13:48:221126-homegfs-client-2-0-0
data-brick02a-homegfs.log:[2015-02-04 00:18:20.261428] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs01a.corvidtec.com-1369-2015/02/04-00:16:53:613570-homegfs-client-2-0-0
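
(As a rough sketch, assuming the default brick log directory: the client
host is the first dash-separated token of the last field in each of these
lines, so the disconnects can be tallied per client to tell an isolated
blip from a client that was down for a whole transfer:)

# tally disconnects per client host across the brick logs
grep "disconnecting connection" /var/log/glusterfs/bricks/*.log \
  | awk '{print $NF}' | cut -d- -f1 | sort | uniq -c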


-- Original Message --
From: Benjamin Turner bennytu...@gmail.com
To: David F. Robinson david.robin...@corvidtec.com
Cc: Gluster Devel gluster-devel@gluster.org; 
gluster-us...@gluster.org gluster-us...@gluster.org

Sent: 2/3/2015 7:12:34 PM
Subject: Re: [Gluster-devel] missing files

It sounds to me like the files were only copied to one replica, weren't
there for the initial ls which triggered a self heal, and were there
for the last ls because they were healed.  Is there any chance that one
of the replicas was down during the rsync?  It could be that you lost a
brick during copy or something like that.  To confirm I would look for
disconnects in the brick logs as well as checking glustershd.log to
verify the missing files were actually healed.


-b

On Tue, Feb 3, 2015 at 5:37 PM, David F. Robinson 
david.robin...@corvidtec.com wrote:
I rsync'd 20-TB over to my gluster system and noticed that I had some 
directories missing even though the rsync completed normally.

The rsync logs showed that the missing files were transferred.

I went to the bricks and did an 'ls -al /data/brick*/homegfs/dir/*' and
the files were on the bricks.  After I did this 'ls', the files then
showed up on the FUSE mounts.


1) Why are the files hidden on the fuse mount?
2) Why does the ls