Re: [Gluster-devel] Gluster testing matrix

2015-12-19 Thread Benjamin Turner
WRT performance: I would be happy to kick off daily performance regression
runs on whatever you guys would like.  My current daily runs don't take
more than 6-8 hours, so I could fit another 4 hours in there for
upstream.  I could also run them on RDMA to help out with coverage
there.  I already have this implemented; it's just a matter of deciding how
often we want it run and on what config / builds.  Let me know what you
think and we can have something in place real quick.

-b

On Thu, Dec 17, 2015 at 6:03 AM, Deepak Shetty  wrote:

> Where / how do you indicate whether the underlying GlusterFS is
> used/tested via the FUSE mount, the libgfapi method, or both?
>
> On Wed, Nov 25, 2015 at 5:39 PM, Raghavendra Talur 
> wrote:
>
>> Hi All,
>>
>> Here is a table representing the current state of Gluster testing.
>>
>> * Things in green are categories for which we have some kind of testing
>> in place.
>> * Things in red are the ones which don't have any tests.
>> * Things in yellow are the ones which have no known tests or are not
>> managed by Gluster community.
>>
>>
>>
>> Test Category          Sub-Categories / Coverage
>>
>> smoke                   source build + POSIX compliance + Dbench
>> functional              tests/basic
>> regression              tests/bugs
>> performance regression  N/A
>> integration             Backends/FS: xfs, ext4, btrfs, zfs
>>                         Protocols: smb, nfs, swift
>>                         Consumers/Cloud Environments: qemu, openstack/cinder,
>>                             openshift/docker/containers, aws, azure, hadoop
>>                         Tools: gdeploy, heketi
>>                         libgfapi bindings: java, python, ruby, go
>>                         OS environment: firewalld, ufw, selinux, apparmor
>> update                  a. major version upgrades
>>                         b. minor version upgrades
>> longevity               a. memory leaks
>>                         b. log accumulation
>> distro packaging        a. pkg build + smoke
>>                         b. sysvinit/systemd
>>
>>
>> I will send separate mails for each of the categories above highlighting
>> plans for them.
>>
>> Use this thread to indicate additions/deletions/changes to this matrix.
>> We will put this on the gluster.org website once it is final.
>>
>> Thanks,
>> Raghavendra Talur & MSV Bhat
>>
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>
>
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Front end for Gluster

2015-12-19 Thread Benjamin Turner
As long as the CLI is still fully available, I think it could help some less
experienced users.  There are a couple of open-source projects for managing
gluster, ceph, etc.  Would it make more sense to invest the development
time / resources in the web-based GUI, or would we want a TUI as well?  From
my personal perspective, if we can get a good web GUI that covers building
a cluster from end to end plus some perf metrics / monitoring, I would favor
the web GUI over the TUI.  If the time spent building the TUI could instead be
spent enhancing / finishing the GUI, I would vote for that.

-b

On Fri, Dec 18, 2015 at 10:22 PM, Prasanna Kumar Kalever <
pkale...@redhat.com> wrote:

> Hello Team,
>
> How many of us think an ncurses-based front end for Gluster would increase
> its ease of use?
>
> Find the attached images showing a basic POC (view them in ascending order).
>
> Thank you,
> -Prasanna
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Issues with Random read/write

2015-07-24 Thread Benjamin Turner
Pre-3.7, glusterfs had a single-threaded event listener that would peg a
CPU at 100%, causing it to become CPU bound.  With the 3.7 release we
changed to a multi-threaded event listener that lets the CPU load be
spread across multiple threads / cores.  In my experience I still see
workloads becoming CPU bound with the default of 2, so I monitor things
with top -H while running my tests to look for hot threads (threads sitting
at 100% CPU).  In my testing I find that event-threads = 4 works best for
me, but each env is different and it is worth doing some tuning to see what
works best for you.
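For reference, here is a minimal sketch of that tuning (the volume name
testvol is a placeholder):

# watch for hot threads (any thread pinned at 100% CPU) on clients and servers
top -H
# raise the event thread counts from the default of 2 to 4
gluster volume set testvol server.event-threads 4
gluster volume set testvol client.event-threads 4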

HTH

-b

On Fri, Jul 24, 2015 at 9:44 AM, Subrata Ghosh 
wrote:

>  Hi All,
>
> Thank you very much for the help. We  will  experiment and check .Your
> last two suggestions might help for better clarity.
>
>
>
> We are using gluster 3.3.2. Right now we have some limitations in the code
> base that prevent an upgrade. We have a plan to upgrade to the latest later.
>
>
>
> Regards,
>
> Subrata
>
>
>
> *From:* Benjamin Turner [mailto:bennytu...@gmail.com]
> *Sent:* Thursday, July 23, 2015 11:22 PM
> *To:* Susant Palai
> *Cc:* Subrata Ghosh; Gluster Devel
> *Subject:* Re: [Gluster-devel] Issues with Random read/write
>
>
>
> I run a lot of random IO tests with gluster and it has really come a long
> way in the 3.7 release.  What version are you running on?  I have a couple
> of suggestions:
>
>
>
> -Run on 3.7 if you are not already.
>
> -Run the IOZone test you are running on the back end without gluster to
> verify that your HW meets your perf needs.  SSDs really scream with random
> IO if your current HW won't meet your needs.
>
> -When you are running tests watch top -H on both clients and servers, look
> for any threads hitting 100%
>
> -If you see hot threads bump up the server.event-threads and/or
> client.event-threads from the default of 2
>
>
>
> HTH!
>
>
>
> -b
>
>
>
>
>
> On Thu, Jul 23, 2015 at 3:04 AM, Susant Palai  wrote:
>
> ++CCing gluster-devel to have more eyes on this problem.
>
> Susant
>
> - Original Message -
> > From: "Subrata Ghosh" 
> > To: "Susant Palai  (spa...@redhat.com)" <
> spa...@redhat.com>, "Vijay Bellur 
> > (vbel...@redhat.com)" 
> > Cc: "Subrata Ghosh" 
> > Sent: Sunday, 19 July, 2015 7:57:28 PM
> > Subject: Issues with Random read/write
> >
> > Hi Vijay/Prashant,
> >
> > How are you? :)
> >
> > We need your immediate help / suggestions to meet our random I/O
> > performance metrics.
> > Currently we have performance issues with random read/write - our basic
> > requirement is 20 MB/sec for random I/O.
> >
> > We tried both "iozone" and "fio" and received almost the same (random I/O)
> > performance, which does not meet our fundamental I/O requirements.
> >
> > Our use case is as below.
> >
> > "Application running on different cards Writes/Reads (random) continuous
> > files to the volume comprising with storage belonging from different
> cards
> > in the distributed system, where replica presence  across cards and
> > applications are using non-local storages."
> > We have verified and identified the bottleneck mostly on the Gluster client
> > side inside the application; gluster server-to-server I/O speed looks good
> > enough. Performance tuning on the gluster server side would not be expected
> > to help.
> >
> > We also cross-verified using the NFS client and get far better
> > performance, but we cannot use the NFS client / libgfapi because of use case
> > limitations (brick failure cases etc.).
> >
> > Please share some thoughts on improving the gluster client to achieve >
> > 20 MB/sec.
> >
> > Observations:
> >
> > Fio:
> >
> >
> > Please find the test results of random write & read in the 2 APP scenarios.
> >
> > Scenario       APP_1       APP_2       File size   No of AMC's
> > Random-Write   3.06 MB/s   3.02 MB/s   100 MB      4
> > Random-Read    8.1 MB/s    8.4 MB/s    100 MB      4
> >
> >
> >
> > Iozone:
> >
> > ./iozone -R -l 1 -u 1 -r 4k -s 2G -F /home/cdr/f1 | tee -a
> > /tmp/iozone_results.txt &
> >
> >
> > APP 1
> >
> > APP2
> >
> > File Size : 2GB
> >
> > File

Re: [Gluster-devel] Issues with Random read/write

2015-07-23 Thread Benjamin Turner
I run a lot of random IO tests with gluster and it has really come a long
way in the 3.7 release.  What version are you running on?  I have a couple
of suggestions:

-Run on 3.7 if you are not already.
-Run the IOZone test you are running on the backend without gluster to
verify that your HW meets your perf needs; SSDs really scream with random
IO if your current HW won't meet your needs (see the example fio job below).
-When you are running tests, watch top -H on both clients and servers and
look for any threads hitting 100%.
-If you see hot threads, bump up server.event-threads and/or
client.event-threads from the default of 2.
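As an example of the backend check above, here is a minimal fio random-I/O
job (a sketch; the directory, size, and runtime are placeholders you would
size to your own HW):

fio --name=randrw-check --directory=/bricks/brick1/fio-test \
    --rw=randrw --rwmixread=70 --bs=4k --size=2g --numjobs=4 \
    --ioengine=libaio --direct=1 --runtime=300 --time_based --group_reporting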

HTH!

-b


On Thu, Jul 23, 2015 at 3:04 AM, Susant Palai  wrote:

> ++CCing gluster-devel to have more eyes on this problem.
>
> Susant
>
> - Original Message -
> > From: "Subrata Ghosh" 
> > To: "Susant Palai  (spa...@redhat.com)" <
> spa...@redhat.com>, "Vijay Bellur 
> > (vbel...@redhat.com)" 
> > Cc: "Subrata Ghosh" 
> > Sent: Sunday, 19 July, 2015 7:57:28 PM
> > Subject: Issues with Random read/write
> >
> > Hi Vijay/Prashant,
> >
> > How are you? :)
> >
> > We need your immediate help / suggestions to meet our random I/O
> > performance metrics.
> > Currently we have performance issues with random read/write - our basic
> > requirement is 20 MB/sec for random I/O.
> >
> > We tried both "iozone" and "fio" and received almost the same (random I/O)
> > performance, which does not meet our fundamental I/O requirements.
> >
> > Our use case is as below.
> >
> > "Application running on different cards Writes/Reads (random) continuous
> > files to the volume comprising with storage belonging from different
> cards
> > in the distributed system, where replica presence  across cards and
> > applications are using non-local storages."
> > We have verified and identified the bottleneck mostly on the Gluster client
> > side inside the application; gluster server-to-server I/O speed looks good
> > enough. Performance tuning on the gluster server side would not be expected
> > to help.
> >
> > We also cross-verified using the NFS client and get far better
> > performance, but we cannot use the NFS client / libgfapi because of use case
> > limitations (brick failure cases etc.).
> >
> > Please share some thoughts on improving the gluster client to achieve >
> > 20 MB/sec.
> >
> > Observations:
> >
> > Fio:
> >
> >
> > Please find the test results of random write & read in the 2 APP scenarios.
> >
> > Scenario       APP_1       APP_2       File size   No of AMC's
> > Random-Write   3.06 MB/s   3.02 MB/s   100 MB      4
> > Random-Read    8.1 MB/s    8.4 MB/s    100 MB      4
> >
> >
> >
> > Iozone:
> >
> > ./iozone -R -l 1 -u 1 -r 4k -s 2G -F /home/cdr/f1 | tee -a
> > /tmp/iozone_results.txt &
> >
> >
> >                     APP 1                   APP 2
> > File size           2GB                     2GB
> > Record size         4 Kbytes                4 Kbytes
> > Output is in        Kbytes/sec              Kbytes/sec
> >
> > Initial write       41061.78                41167.36
> > Rewrite             40395.64                40810.41
> > Read                262685.69               269644.62
> > Re-read             263751.66               270760.62
> > Reverse Read        27715.72                28604.22
> > Stride read         83776.44                84347.88
> > Random read         16239.74 (15.8 MB/s)    15815.94 (15.4 MB/s)
> > Mixed workload      16260.95                15787.55
> > Random write        3356.57 (3.3 MB/s)      3365.17 (3.3 MB/s)
> > Pwrite              40914.55                40692.34
> > Pread               260613.83               269850.59
> > Fwrite              40412.40                40369.78
> > Fread               261506.61               267142.41
> >
> >
> >
> > Some of the info on performance testing is at
> >
> http://www.gluster.org/community/documentation/index.php/Performance_Testing
> > Also pls check iozone limitations listed there.
> >
> > "WARNING: random I/O testing in iozone is very restricted by iozone
> > constraint that it must randomly read then randomly write the entire
> file!
> > This is not what we want - instead it should randomly read/write for some
> > fraction of file size or time duration, allowing us to spread out more on
> > the disk while not waiting too long for test to finish. This is why fio
> > (below) is the preferred test tool for random I/O workloads."
> >
> >
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rebalance improvement design

2015-05-04 Thread Benjamin Turner
I see:

#define GF_DECIDE_DEFRAG_THROTTLE_COUNT(throttle_count, conf) {        \
                                                                        \
        throttle_count = MAX ((get_nprocs() - 4), 4);                   \
                                                                        \
        if (!strcmp (conf->dthrottle, "lazy"))                          \
                conf->defrag->rthcount = 1;                             \
                                                                        \
        if (!strcmp (conf->dthrottle, "normal"))                        \
                conf->defrag->rthcount = (throttle_count / 2);          \
                                                                        \
        if (!strcmp (conf->dthrottle, "aggressive"))                    \
                conf->defrag->rthcount = throttle_count;                \

So aggressive will give us the default of (20 + 16), normal is that divided
by 2, and lazy is 1 - is that correct?  If so, that is what I was looking to
see.  The only other thing I can think of here is making the tunable a
number like event-threads, but I like this.  I don't know if I saw it
documented, but if it's not we should note it in the help text.
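For reference, a sketch of how I would expect to flip that knob once the
throttle patch lands (I am assuming it gets exposed as cluster.rebal-throttle
per the review, so treat the option name as an assumption):

# assumption: throttle exposed as cluster.rebal-throttle (lazy | normal | aggressive)
gluster volume set testvol cluster.rebal-throttle lazy
gluster volume rebalance testvol start force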

Also to note, the old time was 98500.00 the new one is 55088.00, that is a
44% improvement!

-b


On Mon, May 4, 2015 at 9:06 AM, Susant Palai  wrote:

> Ben,
> On no. of threads:
>  Sent the throttle patch here: http://review.gluster.org/#/c/10526/ to
> limit thread numbers [not merged]. The rebalance process in the current model
> spawns 20 threads, and in addition to that there will be at most 16 syncop
> threads.
>
> Crash:
>  The crash should be fixed by this:
> http://review.gluster.org/#/c/10459/.
>
>  Rebalance time taken is a factor of the number of files and their size.
> The higher the frequency of files getting added to the global queue [on which
> the migrator threads act], the faster the rebalance will be. I guess here
> we are mostly seeing the effect of the local crawl, as only 81GB is migrated
> out of 500GB.
>
> Thanks,
> Susant
>
> - Original Message -
> > From: "Benjamin Turner" 
> > To: "Vijay Bellur" 
> > Cc: "Gluster Devel" 
> > Sent: Monday, May 4, 2015 5:18:13 PM
> > Subject: Re: [Gluster-devel] Rebalance improvement design
> >
> > Thanks Vijay! I forgot to upgrade the kernel(thinp 6.6 perf bug gah)
> before I
> > created this data set, so its a bit smaller:
> >
> > total threads = 16
> > total files = 7,060,700 (64 kb files, 100 files per dir)
> > total data = 430.951 GB
> > 88.26% of requested files processed, minimum is 70.00
> > 10101.355737 sec elapsed time
> > 698.985382 files/sec
> > 698.985382 IOPS
> > 43.686586 MB/sec
> >
> > I updated everything and ran the rebalanace on
> > glusterfs-3.8dev-0.107.git275f724.el6.x86_64.:
> >
> > [root@gqas001 ~]# gluster v rebalance testvol status
> > Node Rebalanced-files size scanned failures skipped status run time in
> secs
> > - --- --- --- --- ---
> >  --
> > localhost 1327346 81.0GB 3999140 0 0 completed 55088.00
> > gqas013.sbu.lab.eng.bos.redhat.com 0 0Bytes 1 0 0 completed 26070.00
> > gqas011.sbu.lab.eng.bos.redhat.com 0 0Bytes 0 0 0 failed 0.00
> > gqas014.sbu.lab.eng.bos.redhat.com 0 0Bytes 0 0 0 failed 0.00
> > gqas016.sbu.lab.eng.bos.redhat.com 1325857 80.9GB 4000865 0 0 completed
> > 55088.00
> > gqas015.sbu.lab.eng.bos.redhat.com 0 0Bytes 0 0 0 failed 0.00
> > volume rebalance: testvol: success:
> >
> >
> > A couple observations:
> >
> > I am seeing lots of threads / processes running:
> >
> > [root@gqas001 ~]# ps -eLf | grep glu | wc -l
> > 96 <- 96 gluster threads
> > [root@gqas001 ~]# ps -eLf | grep rebal | wc -l
> > 36 <- 36 rebal threads.
> >
> > Is this tunible? Is there a use case where we would need to limit this?
> Just
> > curious, how did we arrive at 36 rebal threads?
> >
> > # cat /var/log/glusterfs/testvol-rebalance.log | wc -l
> > 4,577,583
> > [root@gqas001 ~]# ll /var/log/glusterfs/testvol-rebalance.log -h
> > -rw--- 1 root root 1.6G May 3 12:29
> > /var/log/glusterfs/testvol-rebalance.log
> >
> > :) How big is this going to get when I do the 10-20 TB? I'll keep tabs on
> > this, my default test setup only has:
> >
> > [root@gqas001 ~]# df -h
> > Filesystem Size Used Avail Use% Mounted on
> > /dev/mapper/vg_gqas001-lv_root 50G 4.8G 42G 11% /
> > tmpfs 24G 0 24G 0% /dev/shm
> > /dev/sda1 477M 65M 387M 

Re: [Gluster-devel] Rebalance improvement design

2015-05-04 Thread Benjamin Turner
Thanks Vijay!  I forgot to upgrade the kernel (thinp 6.6 perf bug, gah)
before I created this data set, so it's a bit smaller:

total threads = 16
total files = 7,060,700 (64 kb files, 100 files per dir)
total data =   430.951 GB
 88.26% of requested files processed, minimum is  70.00
10101.355737 sec elapsed time
698.985382 files/sec
698.985382 IOPS
43.686586 MB/sec

I updated everything and ran the rebalance
on glusterfs-3.8dev-0.107.git275f724.el6.x86_64:

[root@gqas001 ~]# gluster v rebalance testvol status
Node                                 Rebalanced-files     size     scanned   failures   skipped      status   run time in secs
localhost                                     1327346   81.0GB     3999140          0         0   completed           55088.00
gqas013.sbu.lab.eng.bos.redhat.com                  0   0Bytes           1          0         0   completed           26070.00
gqas011.sbu.lab.eng.bos.redhat.com                  0   0Bytes           0          0         0      failed               0.00
gqas014.sbu.lab.eng.bos.redhat.com                  0   0Bytes           0          0         0      failed               0.00
gqas016.sbu.lab.eng.bos.redhat.com            1325857   80.9GB     4000865          0         0   completed           55088.00
gqas015.sbu.lab.eng.bos.redhat.com                  0   0Bytes           0          0         0      failed               0.00
volume rebalance: testvol: success:


A couple observations:

I am seeing lots of threads / processes running:

[root@gqas001 ~]# ps -eLf | grep glu | wc -l
96 <- 96 gluster threads
[root@gqas001 ~]# ps -eLf | grep rebal | wc -l
36 <- 36 rebal threads.

Is this tunable?  Is there a use case where we would need to limit this?
Just curious, how did we arrive at 36 rebal threads?

# cat /var/log/glusterfs/testvol-rebalance.log | wc -l
4,577,583
[root@gqas001 ~]# ll /var/log/glusterfs/testvol-rebalance.log -h
-rw--- 1 root root 1.6G May  3 12:29
/var/log/glusterfs/testvol-rebalance.log

:) How big is this going to get when I do the 10-20 TB?  I'll keep tabs on
this, my default test setup only has:

[root@gqas001 ~]# df -h
FilesystemSize  Used Avail Use% Mounted on
/dev/mapper/vg_gqas001-lv_root   50G  4.8G   42G  11% /
tmpfs  24G 0   24G   0% /dev/shm
/dev/sda1 477M   65M  387M  15% /boot
/dev/mapper/vg_gqas001-lv_home  385G   71M  366G   1% /home
/dev/mapper/gluster_vg-lv_bricks  9.5T  219G  9.3T   3% /bricks

Next run I want to fill up a 10TB cluster and double the # of bricks to
simulate running out of space and doubling capacity.  Any other fixes or
changes that need to go in before I try a larger data set?  Before that I
may run my performance regression suite against a system while a rebal is
in progress and check how it affects performance.  I'll turn both these
cases into perf regression tests that I run with iozone, smallfile and such;
any other use cases I should add?  Should I add hard / soft links or
whatever else to the data set (see the sketch below)?
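For the hard / soft link coverage, something as simple as this sprinkled over
the data set would probably do (a throwaway sketch; the path is a placeholder):

# add a hard link and a symlink next to a sample of existing files
find /gluster-mount -type f | head -1000 | while read f; do
    ln "$f" "${f}.hard"
    ln -s "$f" "${f}.sym"
done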

-b


On Sun, May 3, 2015 at 11:48 AM, Vijay Bellur  wrote:

> On 05/01/2015 10:23 AM, Benjamin Turner wrote:
>
>> Ok I have all my data created and I just started the rebalance.  One
>> thing to not in the client log I see the following spamming:
>>
>> [root@gqac006 ~]# cat /var/log/glusterfs/gluster-mount-.log | wc -l
>> 394042
>>
>> [2015-05-01 00:47:55.591150] I [MSGID: 109036]
>> [dht-common.c:6478:dht_log_new_layout_for_dir_selfheal] 0-testvol-dht:
>> Setting layout of
>> /file_dstdir/
>> gqac006.sbu.lab.eng.bos.redhat.com/thrd_05/d_001/d_000/d_004/d_006
>> <
>> http://gqac006.sbu.lab.eng.bos.redhat.com/thrd_05/d_001/d_000/d_004/d_006
>> >
>> with [Subvol_name: testvol-replicate-0, Err: -1 , Start: 0 , Stop:
>> 2141429669 ], [Subvol_name: testvol-replicate-1, Err: -1 , Start:
>> 2141429670 , Stop: 4294967295 ],
>> [2015-05-01 00:47:55.596147] I
>> [dht-selfheal.c:1587:dht_selfheal_layout_new_directory] 0-testvol-dht:
>> chunk size = 0x / 19920276 = 0xd7
>> [2015-05-01 00:47:55.596177] I
>> [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-testvol-dht:
>> assigning range size 0x7fa39fa6 to testvol-replicate-1
>>
>
>
> I also noticed the same set of excessive logs in my tests. Have sent
> across a patch [1] to address this problem.
>
> -Vijay
>
> [1] http://review.gluster.org/10281
>
>
>
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rebalance improvement design

2015-05-03 Thread Benjamin Turner
Current run is segfault free(yay!) so far:

[root@gqas001 ~]# gluster v rebalance testvol status
Node                                 Rebalanced-files     size     scanned   failures   skipped        status   run time in secs
localhost                                      893493   54.5GB     2692286          0         0   in progress           38643.00
gqas013.sbu.lab.eng.bos.redhat.com                  0   0Bytes           1          0         0     completed           26070.00
gqas011.sbu.lab.eng.bos.redhat.com                  0   0Bytes           0          0         0        failed               0.00
gqas014.sbu.lab.eng.bos.redhat.com                  0   0Bytes           0          0         0        failed               0.00
gqas016.sbu.lab.eng.bos.redhat.com             892110   54.4GB     2692295          0         0   in progress           38643.00
gqas015.sbu.lab.eng.bos.redhat.com                  0   0Bytes           0          0         0        failed               0.00
volume rebalance: testvol: success:

The baseline ran for 98,500.00 seconds.  This one is at 38,643.00 (about 1/3
the number of seconds) with 54 GB transferred so far.  The same data set last
run transferred 81 GB, so at 54 GB we are 66% there.  By my estimation we
should run for ~10,000-20,000 more seconds, which would give us a 40-50%
improvement!  Let's see how it finishes out :)

Any idea why I am getting the "failed" for three of them?  This has been
consistent across each run I have tried.

-b




On Fri, May 1, 2015 at 3:05 AM, Ravishankar N 
wrote:

>  I sent  a fix <http://review.gluster.org/#/c/10478/> but abandoned it
> since Susant (CC'ed) has already sent one
> http://review.gluster.org/#/c/10459/
> I think it needs re-submission, but more review-eyes are welcome.
> -Ravi
>
>
> On 05/01/2015 12:18 PM, Benjamin Turner wrote:
>
> There was a segfault on gqas001, have a look when you get a sec:
>
>  Core was generated by `/usr/sbin/glusterfs -s localhost --volfile-id
> rebalance/testvol --xlator-option'.
> Program terminated with signal 11, Segmentation fault.
> #0  gf_defrag_get_entry (this=0x7f26f8011180, defrag=0x7f26f8031ef0,
> loc=0x7f26f4dbbfd0, migrate_data=0x7f2707874be8) at dht-rebalance.c:2032
> 2032GF_FREE (tmp_container->parent_loc);
> (gdb) bt
> #0  gf_defrag_get_entry (this=0x7f26f8011180, defrag=0x7f26f8031ef0,
> loc=0x7f26f4dbbfd0, migrate_data=0x7f2707874be8) at dht-rebalance.c:2032
> #1  gf_defrag_process_dir (this=0x7f26f8011180, defrag=0x7f26f8031ef0,
> loc=0x7f26f4dbbfd0, migrate_data=0x7f2707874be8) at dht-rebalance.c:2207
> #2  0x7f26fdae1eb8 in gf_defrag_fix_layout (this=0x7f26f8011180,
> defrag=0x7f26f8031ef0, loc=0x7f26f4dbbfd0, fix_layout=0x7f2707874b5c,
> migrate_data=0x7f2707874be8)
> at dht-rebalance.c:2299
> #3  0x7f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180,
> defrag=0x7f26f8031ef0, loc=0x7f26f4dbc200, fix_layout=0x7f2707874b5c,
> migrate_data=0x7f2707874be8)
> at dht-rebalance.c:2416
> #4  0x7f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180,
> defrag=0x7f26f8031ef0, loc=0x7f26f4dbc430, fix_layout=0x7f2707874b5c,
> migrate_data=0x7f2707874be8)
> at dht-rebalance.c:2416
> #5  0x7f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180,
> defrag=0x7f26f8031ef0, loc=0x7f26f4dbc660, fix_layout=0x7f2707874b5c,
> migrate_data=0x7f2707874be8)
> at dht-rebalance.c:2416
> #6  0x7f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180,
> defrag=0x7f26f8031ef0, loc=0x7f26f4dbc890, fix_layout=0x7f2707874b5c,
> migrate_data=0x7f2707874be8)
> at dht-rebalance.c:2416
> #7  0x7f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180,
> defrag=0x7f26f8031ef0, loc=0x7f26f4dbcac0, fix_layout=0x7f2707874b5c,
> migrate_data=0x7f2707874be8)
> at dht-rebalance.c:2416
> #8  0x7f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180,
> defrag=0x7f26f8031ef0, loc=0x7f26f4dbccf0, fix_layout=0x7f2707874b5c,
> migrate_data=0x7f2707874be8)
> at dht-rebalance.c:2416
> #9  0x7f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180,
> defrag=0x7f26f8031ef0, loc=0x7f26f4dbcf60, fix_layout=0x7f2707874b5c,
> migrate_data=0x7f2707874be8)
> at dht-rebalance.c:2416
> #10 0x7f26fdae2524 in gf_defrag_start_crawl (data=0x7f26f8011180) at
> dht-rebalance.c:2599
> #11 0x7f2709024f62 in synctask_wrap (old_task=)
> at syncop.c:375
> #12 0x003648c438f0 in ?? () from /lib64/libc-2.12.so
> #13 0x in ?? ()
>
>
> On Fri, May 1, 2015 at 12:53 AM, Benjamin Turner 
> wrote:
&

Re: [Gluster-devel] Gluster Slogans Revisited

2015-05-01 Thread Benjamin Turner
I liked:

Gluster: Redefine storage.
Gluster: Software-Defined Storage. Redefined
Gluster: RAID G
Gluster: RAISE (redundant array of inexpensive storage equipment)
Gluster: Software {re}defined storage+

And suggested:

Gluster {DS|FS|RAISE|RAIDG}: Software Defined Storage Redefined(some
combination of lines 38-42)

My thinking is:

<Gluster name>: <tagline>

Gluster * - This is the "change Gluster FS to Gluster DS" idea we were
already discussing.  Maybe we could even come up with a
longer acronym, like RAIDG or RAISE, or something more definitive of what we
are.  I think instead of just using "Gluster:" we come up with a new way to
refer to GlusterFS and use this as a way to push that as well.

Tagline - Whatever cool saying gets people excited to check out
glusterDS / glusterRAIDG / whatever

So ex:

Gluster RAISE: Software Defined Storage Redefined
Gluster DS: Software defined storage defined your way
Gluster RAIDG: Storage from the ground up

Just my $0.02

-b


On Fri, May 1, 2015 at 2:51 PM, Tom Callaway  wrote:

> Hello Gluster Ants!
>
> Thanks for all the slogan suggestions that you've provided. I've made an
> etherpad page which collected them all, along with some additional
> suggestions made by Red Hat's Brand team:
>
> https://public.pad.fsfe.org/p/gluster-slogans
>
> Feel free to discuss them (either here or on the etherpad). If you like
> a particular slogan, feel free to put a + next to it on the etherpad.
>
> Before we can pick a new slogan, it needs to be cleared by Red Hat
> Legal, this is a small formality to make sure that we're not infringing
> someone else's trademark or doing anything that would cause Red Hat
> undue risk. We don't want to waste their time by having them clear every
> possible suggestion, so your feedback is very helpful to allow us to
> narrow down the list. At the end of the day, barring legal clearance,
> the slogan selection is up to the community.
>
> Thanks!
>
> ~tom
>
> ==
> Red Hat
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rebalance improvement design

2015-04-30 Thread Benjamin Turner
There was a segfault on gqas001, have a look when you get a sec:

Core was generated by `/usr/sbin/glusterfs -s localhost --volfile-id
rebalance/testvol --xlator-option'.
Program terminated with signal 11, Segmentation fault.
#0  gf_defrag_get_entry (this=0x7f26f8011180, defrag=0x7f26f8031ef0,
loc=0x7f26f4dbbfd0, migrate_data=0x7f2707874be8) at dht-rebalance.c:2032
2032GF_FREE (tmp_container->parent_loc);
(gdb) bt
#0  gf_defrag_get_entry (this=0x7f26f8011180, defrag=0x7f26f8031ef0,
loc=0x7f26f4dbbfd0, migrate_data=0x7f2707874be8) at dht-rebalance.c:2032
#1  gf_defrag_process_dir (this=0x7f26f8011180, defrag=0x7f26f8031ef0,
loc=0x7f26f4dbbfd0, migrate_data=0x7f2707874be8) at dht-rebalance.c:2207
#2  0x7f26fdae1eb8 in gf_defrag_fix_layout (this=0x7f26f8011180,
defrag=0x7f26f8031ef0, loc=0x7f26f4dbbfd0, fix_layout=0x7f2707874b5c,
migrate_data=0x7f2707874be8)
at dht-rebalance.c:2299
#3  0x7f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180,
defrag=0x7f26f8031ef0, loc=0x7f26f4dbc200, fix_layout=0x7f2707874b5c,
migrate_data=0x7f2707874be8)
at dht-rebalance.c:2416
#4  0x7f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180,
defrag=0x7f26f8031ef0, loc=0x7f26f4dbc430, fix_layout=0x7f2707874b5c,
migrate_data=0x7f2707874be8)
at dht-rebalance.c:2416
#5  0x7f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180,
defrag=0x7f26f8031ef0, loc=0x7f26f4dbc660, fix_layout=0x7f2707874b5c,
migrate_data=0x7f2707874be8)
at dht-rebalance.c:2416
#6  0x7f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180,
defrag=0x7f26f8031ef0, loc=0x7f26f4dbc890, fix_layout=0x7f2707874b5c,
migrate_data=0x7f2707874be8)
at dht-rebalance.c:2416
#7  0x7f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180,
defrag=0x7f26f8031ef0, loc=0x7f26f4dbcac0, fix_layout=0x7f2707874b5c,
migrate_data=0x7f2707874be8)
at dht-rebalance.c:2416
#8  0x7f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180,
defrag=0x7f26f8031ef0, loc=0x7f26f4dbccf0, fix_layout=0x7f2707874b5c,
migrate_data=0x7f2707874be8)
at dht-rebalance.c:2416
#9  0x7f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180,
defrag=0x7f26f8031ef0, loc=0x7f26f4dbcf60, fix_layout=0x7f2707874b5c,
migrate_data=0x7f2707874be8)
at dht-rebalance.c:2416
#10 0x7f26fdae2524 in gf_defrag_start_crawl (data=0x7f26f8011180) at
dht-rebalance.c:2599
#11 0x7f2709024f62 in synctask_wrap (old_task=) at
syncop.c:375
#12 0x003648c438f0 in ?? () from /lib64/libc-2.12.so
#13 0x in ?? ()


On Fri, May 1, 2015 at 12:53 AM, Benjamin Turner 
wrote:

> Ok I have all my data created and I just started the rebalance.  One thing
> to not in the client log I see the following spamming:
>
> [root@gqac006 ~]# cat /var/log/glusterfs/gluster-mount-.log | wc -l
> 394042
>
> [2015-05-01 00:47:55.591150] I [MSGID: 109036]
> [dht-common.c:6478:dht_log_new_layout_for_dir_selfheal] 0-testvol-dht:
> Setting layout of /file_dstdir/
> gqac006.sbu.lab.eng.bos.redhat.com/thrd_05/d_001/d_000/d_004/d_006 with
> [Subvol_name: testvol-replicate-0, Err: -1 , Start: 0 , Stop: 2141429669 ],
> [Subvol_name: testvol-replicate-1, Err: -1 , Start: 2141429670 , Stop:
> 4294967295 ],
> [2015-05-01 00:47:55.596147] I
> [dht-selfheal.c:1587:dht_selfheal_layout_new_directory] 0-testvol-dht:
> chunk size = 0x / 19920276 = 0xd7
> [2015-05-01 00:47:55.596177] I
> [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-testvol-dht:
> assigning range size 0x7fa39fa6 to testvol-replicate-1
> [2015-05-01 00:47:55.596189] I
> [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-testvol-dht:
> assigning range size 0x7fa39fa6 to testvol-replicate-0
> [2015-05-01 00:47:55.597081] I [MSGID: 109036]
> [dht-common.c:6478:dht_log_new_layout_for_dir_selfheal] 0-testvol-dht:
> Setting layout of /file_dstdir/
> gqac006.sbu.lab.eng.bos.redhat.com/thrd_05/d_001/d_000/d_004/d_005 with
> [Subvol_name: testvol-replicate-0, Err: -1 , Start: 2141429670 , Stop:
> 4294967295 ], [Subvol_name: testvol-replicate-1, Err: -1 , Start: 0 , Stop:
> 2141429669 ],
> [2015-05-01 00:47:55.601853] I
> [dht-selfheal.c:1587:dht_selfheal_layout_new_directory] 0-testvol-dht:
> chunk size = 0x / 19920276 = 0xd7
> [2015-05-01 00:47:55.601882] I
> [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-testvol-dht:
> assigning range size 0x7fa39fa6 to testvol-replicate-1
> [2015-05-01 00:47:55.601895] I
> [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-testvol-dht:
> assigning range size 0x7fa39fa6 to testvol-replicate-0
>
> Just to confirm the patch is
> in, glusterfs-3.8dev-0.71.gita7f8482.el6.x86_64.  Correct?
>
> Here is the info on the data set:
>
> hosts in test : ['gqac006.sbu.lab.eng.bos.redhat.com', '
> gqas003.sbu.lab.eng.bos.redhat.com']
> top test directory(s) : ['

Re: [Gluster-devel] Rebalance improvement design

2015-04-30 Thread Benjamin Turner
Ok, I have all my data created and I just started the rebalance.  One thing
to note: in the client log I see the following spamming:

[root@gqac006 ~]# cat /var/log/glusterfs/gluster-mount-.log | wc -l
394042

[2015-05-01 00:47:55.591150] I [MSGID: 109036]
[dht-common.c:6478:dht_log_new_layout_for_dir_selfheal] 0-testvol-dht:
Setting layout of /file_dstdir/
gqac006.sbu.lab.eng.bos.redhat.com/thrd_05/d_001/d_000/d_004/d_006 with
[Subvol_name: testvol-replicate-0, Err: -1 , Start: 0 , Stop: 2141429669 ],
[Subvol_name: testvol-replicate-1, Err: -1 , Start: 2141429670 , Stop:
4294967295 ],
[2015-05-01 00:47:55.596147] I
[dht-selfheal.c:1587:dht_selfheal_layout_new_directory] 0-testvol-dht:
chunk size = 0x / 19920276 = 0xd7
[2015-05-01 00:47:55.596177] I
[dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-testvol-dht:
assigning range size 0x7fa39fa6 to testvol-replicate-1
[2015-05-01 00:47:55.596189] I
[dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-testvol-dht:
assigning range size 0x7fa39fa6 to testvol-replicate-0
[2015-05-01 00:47:55.597081] I [MSGID: 109036]
[dht-common.c:6478:dht_log_new_layout_for_dir_selfheal] 0-testvol-dht:
Setting layout of /file_dstdir/
gqac006.sbu.lab.eng.bos.redhat.com/thrd_05/d_001/d_000/d_004/d_005 with
[Subvol_name: testvol-replicate-0, Err: -1 , Start: 2141429670 , Stop:
4294967295 ], [Subvol_name: testvol-replicate-1, Err: -1 , Start: 0 , Stop:
2141429669 ],
[2015-05-01 00:47:55.601853] I
[dht-selfheal.c:1587:dht_selfheal_layout_new_directory] 0-testvol-dht:
chunk size = 0x / 19920276 = 0xd7
[2015-05-01 00:47:55.601882] I
[dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-testvol-dht:
assigning range size 0x7fa39fa6 to testvol-replicate-1
[2015-05-01 00:47:55.601895] I
[dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-testvol-dht:
assigning range size 0x7fa39fa6 to testvol-replicate-0

Just to confirm the patch is
in, glusterfs-3.8dev-0.71.gita7f8482.el6.x86_64.  Correct?

Here is the info on the data set:

hosts in test : ['gqac006.sbu.lab.eng.bos.redhat.com', '
gqas003.sbu.lab.eng.bos.redhat.com']
top test directory(s) : ['/gluster-mount']
operation : create
files/thread : 50
threads : 8
record size (KB, 0 = maximum) : 0
file size (KB) : 64
file size distribution : fixed
files per dir : 100
dirs per dir : 10
total threads = 16
total files = 7222600
total data =   440.833 GB
 90.28% of requested files processed, minimum is  70.00
8107.852862 sec elapsed time
890.815377 files/sec
890.815377 IOPS
55.675961 MB/sec
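For the record, that data set came from smallfile; a rough reconstruction of
the invocation is below (option names are from memory of smallfile_cli.py and
should be treated as assumptions, not a verbatim copy of my command line):

python smallfile_cli.py --operation create --threads 8 --files 50 \
    --file-size 64 --record-size 0 --files-per-dir 100 --dirs-per-dir 10 \
    --top /gluster-mount \
    --host-set gqac006.sbu.lab.eng.bos.redhat.com,gqas003.sbu.lab.eng.bos.redhat.com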

Here is the rebalance run after about 5 or so minutes:

[root@gqas001 ~]# gluster v rebalance testvol status
Node                                 Rebalanced-files      size     scanned   failures   skipped        status   run time in secs
localhost                                       32203     2.0GB      120858          0      5184   in progress            1294.00
gqas011.sbu.lab.eng.bos.redhat.com                  0    0Bytes           0          0         0        failed               0.00
gqas016.sbu.lab.eng.bos.redhat.com               9364   585.2MB       53121          0         0   in progress            1294.00
gqas013.sbu.lab.eng.bos.redhat.com                  0    0Bytes       14750          0         0   in progress            1294.00
gqas014.sbu.lab.eng.bos.redhat.com                  0    0Bytes           0          0         0        failed               0.00
gqas015.sbu.lab.eng.bos.redhat.com                  0    0Bytes      196382          0         0   in progress            1294.00
volume rebalance: testvol: success:

The hostnames are there if you want to poke around.  I had a problem with
one of the added systems being on a different version of glusterfs so I had
to update everything to glusterfs-3.8dev-0.99.git7d7b80e.el6.x86_64, remove
the bricks I just added, and add them back.  Something may have went wrong
in that process but I thought I did everything correctly.  I'll start fresh
tomorrow.  I figured I'd let this run over night.

-b




On Wed, Apr 29, 2015 at 9:48 PM, Benjamin Turner 
wrote:

> Sweet!  Here is the baseline:
>
> [root@gqas001 ~]# gluster v rebalance testvol status
> Node Rebalanced-files  size
> scanned  failures   skipped   status   run time in
> secs
>-  ---   ---
> ---   ---   --- 
> --
>localhost  132857581.1GB
> 9402953 0 0completed
> 98500.00
>   gqas012.sbu.lab.eng.bos.redhat.com00

Re: [Gluster-devel] Rebalance improvement design

2015-04-29 Thread Benjamin Turner
Sweet!  Here is the baseline:

[root@gqas001 ~]# gluster v rebalance testvol status
Node                                 Rebalanced-files     size     scanned   failures   skipped      status   run time in secs
localhost                                     1328575   81.1GB     9402953          0         0   completed           98500.00
gqas012.sbu.lab.eng.bos.redhat.com                  0   0Bytes         811          0         0   completed           51982.00
gqas003.sbu.lab.eng.bos.redhat.com                  0   0Bytes         811          0         0   completed           51982.00
gqas004.sbu.lab.eng.bos.redhat.com            1326290   81.0GB     9708625          0         0   completed           98500.00
gqas013.sbu.lab.eng.bos.redhat.com                  0   0Bytes         811          0         0   completed           51982.00
gqas014.sbu.lab.eng.bos.redhat.com                  0   0Bytes         811          0         0   completed           51982.00
volume rebalance: testvol: success:

I'll have a run on the patch started tomorrow.

-b

On Wed, Apr 29, 2015 at 12:51 PM, Nithya Balachandran 
wrote:

>
> Doh my mistake, I thought it was merged.  I was just running with the
> upstream 3.7 daily.  Can I use this run as my baseline and then I can run
> next time on the patch to show the % improvement?  I'll wipe everything and
> try on the patch, any idea when it will be merged?
>
> Yes, it would be very useful to have this run as the baseline. The patch
> has just been merged in master. It should be backported to 3.7 in a day or
> so.
>
> Regards,
> Nithya
>
>
> > > > >
> > > > > >
> > > > > > On Wed, Apr 22, 2015 at 1:10 AM, Nithya Balachandran
> > > > > > 
> > > > > > wrote:
> > > > > >
> > > > > > > That sounds great. Thanks.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Nithya
> > > > > > >
> > > > > > > - Original Message -
> > > > > > > From: "Benjamin Turner" 
> > > > > > > To: "Nithya Balachandran" 
> > > > > > > Cc: "Susant Palai" , "Gluster Devel" <
> > > > > > > gluster-devel@gluster.org>
> > > > > > > Sent: Wednesday, 22 April, 2015 12:14:14 AM
> > > > > > > Subject: Re: [Gluster-devel] Rebalance improvement design
> > > > > > >
> > > > > > > I am setting up a test env now, I'll have some feedback for you
> > this
> > > > > > > week.
> > > > > > >
> > > > > > > -b
> > > > > > >
> > > > > > > On Tue, Apr 21, 2015 at 11:36 AM, Nithya Balachandran
> > > > > > >  > > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Ben,
> > > > > > > >
> > > > > > > > Did you get a chance to try this out?
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Nithya
> > > > > > > >
> > > > > > > > - Original Message -
> > > > > > > > From: "Susant Palai" 
> > > > > > > > To: "Benjamin Turner" 
> > > > > > > > Cc: "Gluster Devel" 
> > > > > > > > Sent: Monday, April 13, 2015 9:55:07 AM
> > > > > > > > Subject: Re: [Gluster-devel] Rebalance improvement design
> > > > > > > >
> > > > > > > > Hi Ben,
> > > > > > > >   Uploaded a new patch here:
> > http://review.gluster.org/#/c/9657/.
> > > > > > > >   We
> > > > > > > >   can
> > > > > > > > start perf test on it. :)
> > > > > > > >
> > > > > > > > Susant
> > > > > > > >
> > > > > > > > - Original Message -
> > > > > > > > From: "Susant Palai" 
> > > > > > > > To: "Benjamin Turner" 
> > > > > > > > Cc: "Gluster 

Re: [Gluster-devel] Rebalance improvement design

2015-04-29 Thread Benjamin Turner
Doh my mistake, I thought it was merged.  I was just running with the
upstream 3.7 daily.  Can I use this run as my baseline and then I can run
next time on the patch to show the % improvement?  I'll wipe everything and
try on the patch, any idea when it will be merged?

-b

On Wed, Apr 29, 2015 at 5:34 AM, Susant Palai  wrote:

> Hi Ben
>    I checked the glusterfs process by attaching gdb and I could not find
> the newer code. Can you confirm whether you took the new patch? The patch is:
> http://review.gluster.org/#/c/9657/
>
> Thanks,
> Susant
>
>
> - Original Message -
> > From: "Susant Palai" 
> > To: "Benjamin Turner" , "Nithya Balachandran" <
> nbala...@redhat.com>
> > Cc: "Shyamsundar Ranganathan" 
> > Sent: Wednesday, April 29, 2015 1:22:02 PM
> > Subject: Re: [Gluster-devel] Rebalance improvement design
> >
> > This is how it looks for 2000 files, each 1 MB. Rebalance was done on 2*2 + 2.
> >
> > OLDER:
> > [root@gprfs030 ~]# gluster v rebalance test1 status
> > Node            Rebalanced-files     size   scanned   failures   skipped      status   run time in secs
> > localhost                   2000    1.9GB      3325          0         0   completed              63.00
> > gprfs032-10ge                  0   0Bytes      2158          0         0   completed               6.00
> > volume rebalance: test1: success:
> > [root@gprfs030 ~]#
> >
> >
> > NEW:
> > [root@gprfs030 upstream_rebalance]# gluster v rebalance test1 status
> > Node            Rebalanced-files     size   scanned   failures   skipped      status   run time in secs
> > localhost                   2000    1.9GB      2011          0         0   completed              12.00
> > gprfs032-10ge                  0   0Bytes         0          0         0      failed               0.00
> >                 [Failed because of a crash which I will address in the next patch]
> > volume rebalance: test1: success:
> >
> >
> > Just trying out replica behaviour for rebalance.
> >
> > Here is the volume info.
> > [root@gprfs030 ~]# gluster v i
> >
> > Volume Name: test1
> > Type: Distributed-Replicate
> > Volume ID: e12ef289-86f2-454a-beaa-72ea763dbada
> > Status: Started
> > Number of Bricks: 3 x 2 = 6
> > Transport-type: tcp
> > Bricks:
> > Brick1: gprfs030-10ge:/bricks/gprfs030/brick1
> > Brick2: gprfs032-10ge:/bricks/gprfs032/brick1
> > Brick3: gprfs030-10ge:/bricks/gprfs030/brick2
> > Brick4: gprfs032-10ge:/bricks/gprfs032/brick2
> > Brick5: gprfs030-10ge:/bricks/gprfs030/brick3
> > Brick6: gprfs032-10ge:/bricks/gprfs032/brick3
> >
> >
> >
> > - Original Message -
> > > From: "Susant Palai" 
> > > To: "Benjamin Turner" 
> > > Cc: "Gluster Devel" 
> > > Sent: Wednesday, April 29, 2015 1:13:04 PM
> > > Subject: Re: [Gluster-devel] Rebalance improvement design
> > >
> > > Ben, will you be able to give rebal stat for the same configuration and
> > > data
> > > set with older rebalance infra ?
> > >
> > > Thanks,
> > > Susant
> > >
> > > - Original Message -
> > > > From: "Susant Palai" 
> > > > To: "Benjamin Turner" 
> > > > Cc: "Gluster Devel" 
> > > > Sent: Wedn

Re: [Gluster-devel] Rebalance improvement design

2015-04-28 Thread Benjamin Turner
I am not seeing the performance you were.  I am running on 500GB of data:

[root@gqas001 ~]# gluster v rebalance testvol status
Node                                 Rebalanced-files     size     scanned   failures   skipped        status   run time in secs
localhost                                      129021    7.9GB      912104          0         0   in progress           10100.00
gqas012.sbu.lab.eng.bos.redhat.com                  0   0Bytes     1930312          0         0   in progress           10100.00
gqas003.sbu.lab.eng.bos.redhat.com                  0   0Bytes     1930312          0         0   in progress           10100.00
gqas004.sbu.lab.eng.bos.redhat.com             128903    7.9GB      946730          0         0   in progress           10100.00
gqas013.sbu.lab.eng.bos.redhat.com                  0   0Bytes     1930312          0         0   in progress           10100.00
gqas014.sbu.lab.eng.bos.redhat.com                  0   0Bytes     1930312          0         0   in progress           10100.00

Based on what I am seeing I expect this to take 2 days.  Was your rebal run
on a pure dist volume?  I am trying on 2x2 + 2 new bricks.  Any idea why
mine is taking so long?

-b



On Wed, Apr 22, 2015 at 1:10 AM, Nithya Balachandran 
wrote:

> That sounds great. Thanks.
>
> Regards,
> Nithya
>
> - Original Message -
> From: "Benjamin Turner" 
> To: "Nithya Balachandran" 
> Cc: "Susant Palai" , "Gluster Devel" <
> gluster-devel@gluster.org>
> Sent: Wednesday, 22 April, 2015 12:14:14 AM
> Subject: Re: [Gluster-devel] Rebalance improvement design
>
> I am setting up a test env now, I'll have some feedback for you this week.
>
> -b
>
> On Tue, Apr 21, 2015 at 11:36 AM, Nithya Balachandran  >
> wrote:
>
> > Hi Ben,
> >
> > Did you get a chance to try this out?
> >
> > Regards,
> > Nithya
> >
> > - Original Message -
> > From: "Susant Palai" 
> > To: "Benjamin Turner" 
> > Cc: "Gluster Devel" 
> > Sent: Monday, April 13, 2015 9:55:07 AM
> > Subject: Re: [Gluster-devel] Rebalance improvement design
> >
> > Hi Ben,
> >   Uploaded a new patch here: http://review.gluster.org/#/c/9657/. We can
> > start perf test on it. :)
> >
> > Susant
> >
> > - Original Message -
> > From: "Susant Palai" 
> > To: "Benjamin Turner" 
> > Cc: "Gluster Devel" 
> > Sent: Thursday, 9 April, 2015 3:40:09 PM
> > Subject: Re: [Gluster-devel] Rebalance improvement design
> >
> > Thanks Ben. RPM is not available and I am planning to refresh the patch
> in
> > two days with some more regression fixes. I think we can run the tests
> post
> > that. Any larger data-set will be good(say 3 to 5 TB).
> >
> > Thanks,
> > Susant
> >
> > - Original Message -
> > From: "Benjamin Turner" 
> > To: "Vijay Bellur" 
> > Cc: "Susant Palai" , "Gluster Devel" <
> > gluster-devel@gluster.org>
> > Sent: Thursday, 9 April, 2015 2:10:30 AM
> > Subject: Re: [Gluster-devel] Rebalance improvement design
> >
> >
> > I have some rebalance perf regression stuff I have been working on, is
> > there an RPM with these patches anywhere so that I can try it on my
> > systems? If not I'll just build from:
> >
> >
> > git fetch git:// review.gluster.org/glusterfs refs/changes/57/9657/8 &&
> > git cherry-pick FETCH_HEAD
> >
> >
> >
> > I will have _at_least_ 10TB of storage, how many TBs of data should I run
> > with?
> >
> >
> > -b
> >
> >
> > On Tue, Apr 7, 2015 at 9:07 AM, Vijay Bellur < vbel...@redhat.com >
> wrote:
> >
> >
> >
> >
> > On 04/07/2015 03:08 PM, Susant Palai wrote:
> >
> >
> > Here is one test performed on a 300GB data set and around 100%(1/2 the
> > time) improvement was seen.
> >
> > [root@gprfs031 ~]# gluster v i
> >
> > Volume Name: rbperf
> > Type: Distribute
> > Volume ID: 35562662-337e-4923-b862- d0bbb0748003
> > Status: Started
> > Number of Bricks: 4
> > Transport-type: tcp
> > Bricks:
> > Brick1: gprfs029-10ge:/bricks/ gprfs029/brick1
> > Brick2: gprfs030-10ge:/bricks/ gprfs030/brick1
> > Brick3: gprfs0

Re: [Gluster-devel] Gluster Benchmark Kit

2015-04-27 Thread Benjamin Turner
Hi Kiran, thanks for the feedback!  I already put up a repo on github:

https://github.com/bennyturns/gluster-bench

On my TODO list is:

-The benchmark is currently RHEL / RHGS (Red Hat Gluster Storage) specific;
I want to make things work with at least non-paid RPM distros and Ubuntu.
-Other filesystems (like you mentioned).
-No-LVM and non-thinp config options (see the brick-setup sketch below).
-EC, tiering, and snapshot capabilities.
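For the no-LVM / non-thinp item, the brick setup I have in mind is roughly
this (a sketch; the device and mount point are placeholders):

# plain XFS brick, no LVM / thin provisioning
mkfs.xfs -f -i size=512 /dev/sdb
mkdir -p /bricks/brick1
mount -t xfs /dev/sdb /bricks/brick1
echo "/dev/sdb /bricks/brick1 xfs defaults 0 0" >> /etc/fstab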

I'll probably fork things and have a Red Hat specific version and an
upstream version.  As soon as I have everything working on CentOS I'll let
the list know and we can enhance things to do whatever we need.  I have
always thought it would be interesting to have a page where people could
submit their benchmark data and the HW / config used.  Having a standard
tool / tool set will help there.

-b


On Mon, Apr 27, 2015 at 3:31 AM, Kiran Patil  wrote:

> Hi,
>
> I came across "Gluster Benchmark Kit" while reading [Gluster-users]
> Disastrous performance with rsync to mounted Gluster volume thread.
>
> http://54.82.237.211/gluster-benchmark/gluster-bench-README
>
> http://54.82.237.211/gluster-benchmark
>
> The Kit includes tools such as iozone, smallfile and fio.
>
> This Kit is not documented, and we need to baseline this tool for Gluster
> benchmark testing.
>
> The community would benefit from adopting and extending it as per
> their needs, and the kit should be hosted on GitHub.
>
> The init.sh script in the Kit supports only the XFS filesystem, which can be
> extended to BTRFS and ZFS.
>
> Thanks Ben Turner for sharing it.
>
> Kiran.
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rebalance improvement design

2015-04-21 Thread Benjamin Turner
I am setting up a test env now, I'll have some feedback for you this week.

-b

On Tue, Apr 21, 2015 at 11:36 AM, Nithya Balachandran 
wrote:

> Hi Ben,
>
> Did you get a chance to try this out?
>
> Regards,
> Nithya
>
> - Original Message -
> From: "Susant Palai" 
> To: "Benjamin Turner" 
> Cc: "Gluster Devel" 
> Sent: Monday, April 13, 2015 9:55:07 AM
> Subject: Re: [Gluster-devel] Rebalance improvement design
>
> Hi Ben,
>   Uploaded a new patch here: http://review.gluster.org/#/c/9657/. We can
> start perf test on it. :)
>
> Susant
>
> - Original Message -
> From: "Susant Palai" 
> To: "Benjamin Turner" 
> Cc: "Gluster Devel" 
> Sent: Thursday, 9 April, 2015 3:40:09 PM
> Subject: Re: [Gluster-devel] Rebalance improvement design
>
> Thanks Ben. RPM is not available and I am planning to refresh the patch in
> two days with some more regression fixes. I think we can run the tests post
> that. Any larger data-set will be good(say 3 to 5 TB).
>
> Thanks,
> Susant
>
> - Original Message -
> From: "Benjamin Turner" 
> To: "Vijay Bellur" 
> Cc: "Susant Palai" , "Gluster Devel" <
> gluster-devel@gluster.org>
> Sent: Thursday, 9 April, 2015 2:10:30 AM
> Subject: Re: [Gluster-devel] Rebalance improvement design
>
>
> I have some rebalance perf regression stuff I have been working on, is
> there an RPM with these patches anywhere so that I can try it on my
> systems? If not I'll just build from:
>
>
> git fetch git:// review.gluster.org/glusterfs refs/changes/57/9657/8 &&
> git cherry-pick FETCH_HEAD
>
>
>
> I will have _at_least_ 10TB of storage, how many TBs of data should I run
> with?
>
>
> -b
>
>
> On Tue, Apr 7, 2015 at 9:07 AM, Vijay Bellur < vbel...@redhat.com > wrote:
>
>
>
>
> On 04/07/2015 03:08 PM, Susant Palai wrote:
>
>
> Here is one test performed on a 300GB data set and around 100%(1/2 the
> time) improvement was seen.
>
> [root@gprfs031 ~]# gluster v i
>
> Volume Name: rbperf
> Type: Distribute
> Volume ID: 35562662-337e-4923-b862- d0bbb0748003
> Status: Started
> Number of Bricks: 4
> Transport-type: tcp
> Bricks:
> Brick1: gprfs029-10ge:/bricks/ gprfs029/brick1
> Brick2: gprfs030-10ge:/bricks/ gprfs030/brick1
> Brick3: gprfs031-10ge:/bricks/ gprfs031/brick1
> Brick4: gprfs032-10ge:/bricks/ gprfs032/brick1
>
>
> Added server 32 and started rebalance force.
>
> Rebalance stat for new changes:
> [root@gprfs031 ~]# gluster v rebalance rbperf status
> Node Rebalanced-files size scanned failures skipped status run time in secs
> - --- --- --- --- ---
>  --
> localhost 74639 36.1GB 297319 0 0 completed 1743.00
> 172.17.40.30 67512 33.5GB 269187 0 0 completed 1395.00
> gprfs029-10ge 79095 38.8GB 284105 0 0 completed 1559.00
> gprfs032-10ge 0 0Bytes 0 0 0 completed 402.00
> volume rebalance: rbperf: success:
>
> Rebalance stat for old model:
> [root@gprfs031 ~]# gluster v rebalance rbperf status
> Node Rebalanced-files size scanned failures skipped status run time in secs
> - --- --- --- --- ---
>  --
> localhost 86493 42.0GB 634302 0 0 completed 3329.00
> gprfs029-10ge 94115 46.2GB 687852 0 0 completed 3328.00
> gprfs030-10ge 74314 35.9GB 651943 0 0 completed 3072.00
> gprfs032-10ge 0 0Bytes 594166 0 0 completed 1943.00
> volume rebalance: rbperf: success:
>
>
> This is interesting. Thanks for sharing & well done! Maybe we should
> attempt a much larger data set and see how we fare there :).
>
> Regards,
>
>
> Vijay
>
>
> __ _
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/ mailman/listinfo/gluster-devel
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rebalance improvement design

2015-04-08 Thread Benjamin Turner
I have some rebalance perf regression stuff I have been working on, is
there an RPM with these patches anywhere so that I can try it on my
systems?  If not I'll just build from:

git fetch git://review.gluster.org/glusterfs refs/changes/57/9657/8 && git
cherry-pick FETCH_HEAD

I will have _at_least_ 10TB of storage, how many TBs of data should I run
with?

-b

On Tue, Apr 7, 2015 at 9:07 AM, Vijay Bellur  wrote:

> On 04/07/2015 03:08 PM, Susant Palai wrote:
>
>> Here is one test performed on a 300GB data set and around 100%(1/2 the
>> time) improvement was seen.
>>
>> [root@gprfs031 ~]# gluster v i
>>
>> Volume Name: rbperf
>> Type: Distribute
>> Volume ID: 35562662-337e-4923-b862-d0bbb0748003
>> Status: Started
>> Number of Bricks: 4
>> Transport-type: tcp
>> Bricks:
>> Brick1: gprfs029-10ge:/bricks/gprfs029/brick1
>> Brick2: gprfs030-10ge:/bricks/gprfs030/brick1
>> Brick3: gprfs031-10ge:/bricks/gprfs031/brick1
>> Brick4: gprfs032-10ge:/bricks/gprfs032/brick1
>>
>>
>> Added server 32 and started rebalance force.
>>
>> Rebalance stat for new changes:
>> [root@gprfs031 ~]# gluster v rebalance rbperf status
>>  Node Rebalanced-files  size
>>  scanned  failures   skipped   status   run time in
>> secs
>> -  ---   ---
>>  ---   ---   --- 
>>  --
>> localhost7463936.1GB
>>   297319 0 0completed
>> 1743.00
>>  172.17.40.306751233.5GB
>>   269187 0 0completed
>> 1395.00
>> gprfs029-10ge7909538.8GB
>>   284105 0 0completed
>> 1559.00
>> gprfs032-10ge00Bytes
>>0 0 0completed
>>  402.00
>> volume rebalance: rbperf: success:
>>
>> Rebalance stat for old model:
>> [root@gprfs031 ~]# gluster v rebalance rbperf status
>>  Node Rebalanced-files  size
>>  scanned  failures   skipped   status   run time in
>> secs
>> -  ---   ---
>>  ---   ---   --- 
>>  --
>> localhost8649342.0GB
>>   634302 0 0completed
>> 3329.00
>> gprfs029-10ge9411546.2GB
>>   687852 0 0completed
>> 3328.00
>> gprfs030-10ge7431435.9GB
>>   651943 0 0completed
>> 3072.00
>> gprfs032-10ge00Bytes
>>   594166 0 0completed
>> 1943.00
>> volume rebalance: rbperf: success:
>>
>>
> This is interesting. Thanks for sharing & well done! Maybe we should
> attempt a much larger data set and see how we fare there :).
>
> Regards,
>
> Vijay
>
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Looking for volunteer to write up official "How to do GlusterFS in the Cloud: The Right Way" for Rackspace...

2015-02-17 Thread Benjamin Turner
This is interesting to me; I'd like the chance to run my performance tests
on a cloud provider's systems.  We could put together some recommendations
for configuration and tuning, plus performance numbers.  Also it would be cool
to enhance my setup scripts to work with cloud instances.  Sound like what
you are looking for, ish?

-b

On Tue, Feb 17, 2015 at 5:06 PM, Justin Clift  wrote:

> On 17 Feb 2015, at 21:49, Josh Boon  wrote:
> > Do we have use cases to focus on? Gluster is part of the answer to many
> different questions so if it's things like simple replication and
> distribution and basic performance tuning I could help. I also have a heavy
> Ubuntu tilt so if it's Red Hat oriented I'm not much help :)
>
> Jesse, thoughts on this?
>
> I kinda think it would be useful to have instructions which give
> correct steps for Ubuntu + Red Hat (and anything else suitable).
>
> Josh, if Jesse agrees, then your Ubuntu knowledge will probably
> be useful for this. ;)
>
> + Justin
>
>
> >
> > - Original Message -
> > From: "Justin Clift" 
> > To: "Gluster Users" , "Gluster Devel" <
> gluster-devel@gluster.org>
> > Cc: "Jesse Noller" 
> > Sent: Tuesday, February 17, 2015 9:37:05 PM
> > Subject: [Gluster-devel] Looking for volunteer to write up official "How
> to   do GlusterFS in the Cloud: The Right Way" for Rackspace...
> >
> > Yeah, huge subject line.  :)
> >
> > But it gets the message across... Rackspace provide us a *bunch* of
> online VM's
> > which we have our infrastructure in + run the majority of our regression
> tests
> > with.
> >
> > They've asked us if we could write up a "How to do GlusterFS in the
> Cloud: The
> > Right Way" (technical) doc, for them to add to their doc collection.
> > They get asked for this a lot by customers. :D
> >
> > Sooo... looking for volunteers to write this up.  And yep, you're
> welcome to
> > have your name all over it (eg this is good promo/CV material :>)
> >
> > VM's (in Rackspace obviously) will be provided of course.
> >
> > Anyone interested?
> >
> > (Note - not suitable for a GlusterFS newbie. ;))
> >
> > Regards and best wishes,
> >
> > Justin Clift
> >
> > --
> > GlusterFS - http://www.gluster.org
> >
> > An open source, distributed file system scaling to several
> > petabytes, and handling thousands of clients.
> >
> > My personal twitter: twitter.com/realjustinclift
> >
> > ___
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
>
> --
> GlusterFS - http://www.gluster.org
>
> An open source, distributed file system scaling to several
> petabytes, and handling thousands of clients.
>
> My personal twitter: twitter.com/realjustinclift
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] missing files

2015-02-05 Thread Benjamin Turner
Correct!  I have seen (back in the day, it's been 3-ish years since I have
seen it) having, say, 50+ volumes each with a geo-rep session take system
load levels to the point where pings couldn't be serviced within the ping
timeout.  So it is known to happen, but there has been a lot of work in the
geo-rep space to help here, some of which is discussed in:

https://medium.com/@msvbhat/distributed-geo-replication-in-glusterfs-ec95f4393c50

(think tar + ssh and other fixes).  Your symptoms remind me of that case of
50+ geo-rep'd volumes, which is why I mentioned it from the start.  My
current shoot-from-the-hip theory is that while rsyncing all that data the
servers got too busy to service the pings, and that led to the disconnects.
This is common across all of the clustering / distributed software I have
worked on: if the system gets too busy to service heartbeat within the
timeout, things go crazy (think fork bomb on a single host).  Now this
could be a case of me projecting symptoms from an old issue onto what you
are describing, but that's where my head is at.  If I'm correct I should be
able to repro using a similar workload.  I think the multi-threaded epoll
changes that _just_ landed in master will help resolve this, but they are
so new I haven't been able to test them.  I'll know more when I get a
chance to test tomorrow.
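
If it helps with the triage, here is roughly how I plan to check the
"too busy to answer pings" theory on my repro setup (log paths are the
stock ones and homegfs is just the volume name from your logs, so adjust
as needed):

    # look for ping-timeout style disconnects on the bricks during the rsync
    grep -i "disconnecting connection" /var/log/glusterfs/bricks/*.log

    # watch load on the servers while the rsync is running
    sar -q 5

    # if the servers really are too busy to answer pings, bumping the
    # timeout is one knob to try (42 seconds is the default)
    gluster volume set homegfs network.ping-timeout 60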

-b

On Thu, Feb 5, 2015 at 6:04 PM, David F. Robinson <
david.robin...@corvidtec.com> wrote:

> Isn't rsync what geo-rep uses?
>
> David  (Sent from mobile)
>
> ===
> David F. Robinson, Ph.D.
> President - Corvid Technologies
> 704.799.6944 x101 [office]
> 704.252.1310  [cell]
> 704.799.7974  [fax]
> david.robin...@corvidtec.com
> http://www.corvidtechnologies.com
>
> > On Feb 5, 2015, at 5:41 PM, Ben Turner  wrote:
> >
> > - Original Message -
> >> From: "Ben Turner" 
> >> To: "David F. Robinson" 
> >> Cc: "Pranith Kumar Karampuri" , "Xavier
> Hernandez" , "Benjamin Turner"
> >> , gluster-us...@gluster.org, "Gluster Devel" <
> gluster-devel@gluster.org>
> >> Sent: Thursday, February 5, 2015 5:22:26 PM
> >> Subject: Re: [Gluster-users] [Gluster-devel] missing files
> >>
> >> - Original Message -
> >>> From: "David F. Robinson" 
> >>> To: "Ben Turner" 
> >>> Cc: "Pranith Kumar Karampuri" , "Xavier
> Hernandez"
> >>> , "Benjamin Turner"
> >>> , gluster-us...@gluster.org, "Gluster Devel"
> >>> 
> >>> Sent: Thursday, February 5, 2015 5:01:13 PM
> >>> Subject: Re: [Gluster-users] [Gluster-devel] missing files
> >>>
> >>> I'll send you the emails I sent Pranith with the logs. What causes
> these
> >>> disconnects?
> >>
> >> Thanks David!  Disconnects happen when there is an interruption in
> >> communication between peers; normally it is a ping timeout that
> >> triggers them.  It could be anything from a flaky NW to the system
> >> being too busy to respond to the pings.  My initial take leans towards
> >> the latter, as rsync is absolutely the worst use case for gluster -
> >> IIRC it writes in 4kb blocks.  I try to keep my writes at least 64KB,
> >> as in my testing that is the smallest block size I can write with
> >> before perf starts to really drop off.  I'll try something similar in
> >> the lab.
> >
> > Ok, I do think that the file being self-healed is the RCA for what you
> > were seeing.  Let's look at one of the disconnects:
> >
> > data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I
> [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection
> from
> gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1
> >
> > And in the glustershd.log from the gfs01b_glustershd.log file:
> >
> > [2015-02-03 20:55:48.001797] I
> [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0:
> performing entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448
> > [2015-02-03 20:55:49.341996] I
> [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0:
> Completed entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448. source=1
> sinks=0
> > [2015-02-03 20:55:49.343093] I
> [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0:
> performing entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69
> > [2015-02-03 20:55:50.463652] I
> [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0:
> Completed entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69. source

Re: [Gluster-devel] missing files

2015-02-03 Thread Benjamin Turner
It sounds to me like the files were only copied to one replica, weren't
there for the initial ls (which triggered a self-heal), and were there for
the last ls because they had been healed.  Is there any chance that one of
the replicas was down during the rsync?  It could be that you lost a brick
during the copy or something like that.  To confirm, I would look for
disconnects in the brick logs as well as check glustershd.log to verify
that the missing files were actually healed.
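
Something along these lines should confirm it quickly (paths are the
default install locations; homegfs is your volume name, swap it in if I
got that wrong):

    # did any bricks drop off during the rsync?
    grep -i disconnect /var/log/glusterfs/bricks/*.log

    # did the self-heal daemon pick the missing files up afterwards?
    grep -i selfheal /var/log/glusterfs/glustershd.log
    gluster volume heal homegfs info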

-b

On Tue, Feb 3, 2015 at 5:37 PM, David F. Robinson <
david.robin...@corvidtec.com> wrote:

>  I rsync'd 20-TB over to my gluster system and noticed that I had some
> directories missing even though the rsync completed normally.
> The rsync logs showed that the missing files were transferred.
>
> I went to the bricks and did an 'ls -al /data/brick*/homegfs/dir/*' and
> the files were on the bricks.  After I did this 'ls', the files then
> showed up on the FUSE mounts.
>
> 1) Why are the files hidden on the fuse mount?
> 2) Why does the ls make them show up on the FUSE mount?
> 3) How can I prevent this from happening again?
>
> Note, I also mounted the gluster volume using NFS and saw the same
> behavior.  The files/directories were not shown until I did the "ls" on the
> bricks.
>
> David
>
>
>
>  ===
> David F. Robinson, Ph.D.
> President - Corvid Technologies
> 704.799.6944 x101 [office]
> 704.252.1310 [cell]
> 704.799.7974 [fax]
> david.robin...@corvidtec.com
> http://www.corvidtechnologies.com
>
>
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] NFS directory not empty log messages.

2014-07-21 Thread Benjamin Turner
From what I can tell I incorrectly assumed that I was seeing a problem
deleting directories as described in:

https://bugzilla.redhat.com/show_bug.cgi?id=1121347#c4

This is definitely different from what Anders is seeing in the duplicate
entries email thread.  Should I close my original issue and split it up as
I described in the above comment?

-b
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume

2014-07-19 Thread Benjamin Turner
On Fri, Jul 18, 2014 at 10:43 PM, Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

>
> On 07/18/2014 07:57 PM, Anders Blomdell wrote:
>
>> During testing of a 3*4 gluster (from master as of yesterday), I
>> encountered
>> two major weirdnesses:
>>
>>    1. A 'rm -rf ' needed several invocations to finish, each time
>>   reporting a number of lines like these:
>> rm: cannot remove ‘a/b/c/d/e/f’: Directory not empty
>>
>
This is reproducible for me when running dbench on nfs mounts.  I think I
may have seen it on glusterfs mounts as well but it seems more reproducible
on nfs.  I should have caught it sooner but it doesn't error out client
side when cleaning up, and the next test I run the deletes are successful.
 When this happens in the nfs.log I see:

This spams the log, from what I can tell it happens when dbench is creating
the files:
[2014-07-19 13:37:03.271651] I [MSGID: 109036]
[dht-common.c:5694:dht_log_new_layout_for_dir_selfheal] 0-testvol-dht:
Setting layout of /clients/client3/~dmtmp/SEED with [Subvol_name:
testvol-replicate-0, Err: -1 , Start: 2147483647 , Stop: 4294967295 ],
[Subvol_name: testvol-replicate-1, Err: -1 , Start: 0 , Stop: 2147483646 ],

Then when the deletes fail I see the following when the client is removing
the files:
[2014-07-18 23:31:44.272465] W [nfs3.c:3518:nfs3svc_rmdir_cbk] 0-nfs:
74a6541a: /run8063_dbench/clients => -1 (Directory not empty)
.
.
[2014-07-18 23:31:44.452988] W [nfs3.c:3518:nfs3svc_rmdir_cbk] 0-nfs:
7ea9541a: /run8063_dbench/clients => -1 (Directory not empty)
[2014-07-18 23:31:45.262651] W
[client-rpc-fops.c:1354:client3_3_access_cbk] 0-testvol-client-0: remote
operation failed: Stale file handle
[2014-07-18 23:31:45.263151] W [MSGID: 108008]
[afr-read-txn.c:218:afr_read_txn] 0-testvol-replicate-0: Unreadable
subvolume -1 found with e
vent generation 2. (Possible split-brain)
[2014-07-18 23:31:45.264196] W [nfs3.c:1532:nfs3svc_access_cbk] 0-nfs:
32ac541a:  => -1 (Stale fi
le handle)
[2014-07-18 23:31:45.264217] W [nfs3-helpers.c:3401:nfs3_log_common_res]
0-nfs-nfsv3: XID: 32ac541a, ACCESS: NFS: 70(Invalid file handle), P
OSIX: 116(Stale file handle)
[2014-07-18 23:31:45.266818] W [nfs3.c:1532:nfs3svc_access_cbk] 0-nfs:
33ac541a:  => -1 (Stale fi
le handle)
[2014-07-18 23:31:45.266853] W [nfs3-helpers.c:3401:nfs3_log_common_res]
0-nfs-nfsv3: XID: 33ac541a, ACCESS: NFS: 70(Invalid file handle), P
OSIX: 116(Stale file handle)

Occasionally I see:
[2014-07-19 13:50:46.091429] W [socket.c:529:__socket_rwv] 0-NLM-client:
readv on 192.168.11.102:45823 failed (No data available)
[2014-07-19 13:50:46.091570] E [rpc-transport.c:485:rpc_transport_unref]
(-->/usr/lib64/glusterfs/3.5qa2/xlator/nfs/server.so(nlm_rpcclnt_notify+0x5a)
[0x7f53775128ea]
(-->/usr/lib64/glusterfs/3.5qa2/xlator/nfs/server.so(nlm_unset_rpc_clnt+0x75)
[0x7f537750e3e5] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_unref+0x63)
[0x7f5388914693]))) 0-rpc_transport: invalid argument: this

I'm opening a BZ now, I'll leave systems up and put the repro steps +
hostnames in the BZ in case anyone wants to poke around.
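
The gist of it, for anyone who doesn't want to wait on the BZ (hostnames
and the run directory are placeholders; the volume is my 6x2 testvol
exported over gNFS):

    # NFSv3 mount from one of the servers
    mount -t nfs -o vers=3 server1:/testvol /mnt/nfs

    # build the usual dbench tree, then try to clean it up
    mkdir /mnt/nfs/run_dbench && cd /mnt/nfs/run_dbench
    dbench -D . -t 600 10
    cd / && rm -rf /mnt/nfs/run_dbench   # occasionally: "Directory not empty"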

-b



>
>>2. After having successfully deleted all files from the volume,
>>   I have a single directory that is duplicated in gluster-fuse,
>>   like this:
>> # ls -l /mnt/gluster
>>  total 24
>>  drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/
>>  drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/
>>
>> any idea on how to debug this issue?
>>
> What are the steps to recreate? We need to first find what lead to this.
> Then probably which xlator leads to this.
>

I have not seen this but I am running on a 6x2 volume.  I wonder if this
may only happen with replica > 2?


>
> Pranith
>
>>
>> /Anders
>>
>>
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Creating the Gluster Software Distribution Bundle

2014-07-09 Thread Benjamin Turner
On Wed, Jul 9, 2014 at 12:36 PM, John Mark Walker 
wrote:

> Greetings,
>
> One of the things I wanted to do with the forge, was use it as the basis
> for creating a software distribution of tools related to GlusterFS.
> GlusterFS would remain the centerpiece, and we would add all some tools
> into the mix that would be helpful for admins and developers. Some of the
> packages I had in mind include:
>
> - GlusterFS 3.5.x
> - oVirt, pre-configured for GlusterFS
> - puppet-gluster
> - pmux + gflocator - file-based map/reduce tools
> - pmux-gw - RESTful gateway for interacting with pmux
> - glubix - zabbix integration
> - nagios integration
> - Swift on File
> - GlusterFlow (ELK-based profiling and analysis web app for GlusterFS)
> - HDFS integration
> - various GFAPI bindings
> - GluPy (unless already included in 3.5)
> - gluster-deploy
> - dispersed volume (I know it's marked for 3.6, but you can use it with
> 3.5, and this encourages early adopters)
> - smallfile perf testing
>

IOzone and iperf.  What about Samba / CTDB packages as well?


>
> Any others? For many of these, we'll need some packaging help. For a first
> cut release, I don't think the packaging needs to be perfect. In fact, this
> may be the sort of thing we want to release as a self-contained and fully
> deployed virtual machine/container/appliance.
>
> Thoughts?
>
> -JM
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] FS Sanity daily results.

2014-07-05 Thread Benjamin Turner
Hi all.  I have been running FS sanity on daily builds (glusterfs mounts
only at this point) for a few days now, and I have been hitting a couple
of problems:

 final pass/fail report =
   Test Date: Sat Jul  5 01:53:00 EDT 2014
   Total : [44]
   Passed: [41]
   Failed: [3]
   Abort : [0]
   Crash : [0]
-
   [   PASS   ]  FS Sanity Setup
   [   PASS   ]  Running tests.
   [   PASS   ]  FS SANITY TEST - arequal
   [   PASS   ]  FS SANITY LOG SCAN - arequal
   [   PASS   ]  FS SANITY LOG SCAN - bonnie
   [   PASS   ]  FS SANITY TEST - glusterfs_build
   [   PASS   ]  FS SANITY LOG SCAN - glusterfs_build
   [   PASS   ]  FS SANITY TEST - compile_kernel
   [   PASS   ]  FS SANITY LOG SCAN - compile_kernel
   [   PASS   ]  FS SANITY TEST - dbench
   [   PASS   ]  FS SANITY LOG SCAN - dbench
   [   PASS   ]  FS SANITY TEST - dd
   [   PASS   ]  FS SANITY LOG SCAN - dd
   [   PASS   ]  FS SANITY TEST - ffsb
   [   PASS   ]  FS SANITY LOG SCAN - ffsb
   [   PASS   ]  FS SANITY TEST - fileop
   [   PASS   ]  FS SANITY LOG SCAN - fileop
   [   PASS   ]  FS SANITY TEST - fsx
   [   PASS   ]  FS SANITY LOG SCAN - fsx
   [   PASS   ]  FS SANITY LOG SCAN - fs_mark
   [   PASS   ]  FS SANITY TEST - iozone
   [   PASS   ]  FS SANITY LOG SCAN - iozone
   [   PASS   ]  FS SANITY TEST - locks
   [   PASS   ]  FS SANITY LOG SCAN - locks
   [   PASS   ]  FS SANITY TEST - ltp
   [   PASS   ]  FS SANITY LOG SCAN - ltp
   [   PASS   ]  FS SANITY TEST - multiple_files
   [   PASS   ]  FS SANITY LOG SCAN - multiple_files
   [   PASS   ]  FS SANITY TEST - posix_compliance
   [   PASS   ]  FS SANITY LOG SCAN - posix_compliance
   [   PASS   ]  FS SANITY TEST - postmark
   [   PASS   ]  FS SANITY LOG SCAN - postmark
   [   PASS   ]  FS SANITY TEST - read_large
   [   PASS   ]  FS SANITY LOG SCAN - read_large
   [   PASS   ]  FS SANITY TEST - rpc
   [   PASS   ]  FS SANITY LOG SCAN - rpc
   [   PASS   ]  FS SANITY TEST - syscallbench
   [   PASS   ]  FS SANITY LOG SCAN - syscallbench
   [   PASS   ]  FS SANITY TEST - tiobench
   [   PASS   ]  FS SANITY LOG SCAN - tiobench
   [   PASS   ]  FS Sanity Cleanup

   [   FAIL   ]  FS SANITY TEST - bonnie
   [   FAIL   ]  FS SANITY TEST - fs_mark
   [   FAIL   ]
/rhs-tests/beaker/rhs/auto-tests/components/sanity/fs-sanity-tests-v2


Bonnie++ is just very slow (running for 10+ hours on one 16 GB file) and
fs_mark has been failing.  The bonnie slowness is in the re-read/rewrite
pass; here is the best explanation I can find on it:

https://blogs.oracle.com/roch/entry/decoding_bonnie

*Rewriting...done*

This gets a little interesting.  It actually reads 8K, lseeks back to
the start of the block, overwrites the 8K with new data, and loops
(see the article for more).
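
In other words, per 8K chunk of the 16 GB file, the pattern is roughly
this (a toy sh re-creation, not the real bonnie++ code, which also
modifies the block before writing it back):

    # read chunk N, then write it straight back over the same offset
    size=$(stat -c %s testfile)
    chunks=$((size / 8192))
    for i in $(seq 0 $((chunks - 1))); do
        dd if=testfile bs=8k skip=$i count=1 2>/dev/null |
            dd of=testfile bs=8k seek=$i count=1 conv=notrunc 2>/dev/null
    done

so every 8K becomes a read plus an in-place overwrite, which is a lot of
small synchronous round trips for a network filesystem.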

On FS mark I am seeing:

#  fs_mark  -d  .  -D  4  -t  4  -S  5
#   Version 3.3, 4 thread(s) starting at Sat Jul  5 00:54:00 2014
#   Sync method: POST: Reopen and fsync() each file in order after main
write loop.
#   Directories:  Time based hash between directories across 4
subdirectories with 180 seconds per subdirectory.
#   File names: 40 bytes long, (16 initial bytes of time stamp with 24
random bytes at end of name)
#   Files info: size 51200 bytes, written with an IO size of 16384 bytes 
per write
#   App overhead is time in microseconds spent in the test not doing
file writing related system calls.

FSUse%Count SizeFiles/sec App Overhead
Error in unlink of ./00/53b784e8SKZ0QS9BO7O2EG1DIFQLRDYY : No
such file or directory
fopen failed to open: fs_log.txt.26676
fs-mark pass # 5 failed

I am working on reporting, so look for a daily status report email from
my Jenkins server soon.  How do we want to handle failures like this
moving forward?  Should I just open a BZ after I triage?  Do you guys
open a new BZ for every failure in the normal regression tests?


-b
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] 3.5.1 beta 2 Sanity tests

2014-06-20 Thread Benjamin Turner
On Fri, Jun 20, 2014 at 4:01 AM, Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

>
> On 06/19/2014 11:32 PM, Justin Clift wrote:
>
>> On 19/06/2014, at 6:55 PM, Benjamin Turner wrote:
>> 
>>
>>> I went through these a while back and removed anything that wasn't valid
>>> for GlusterFS.  This test was passing on 3.4.59 when it was released; I am
>>> thinking it may have something to do with a symlink-to-the-same-directory
>>> BZ I found a while back?  Idk, I'll get it sorted tomorrow.
>>>
>>> I got this sorted: I needed to add a sleep between the file create and
>>> the link.  I ran through it manually and it worked every time; it took me
>>> a few goes to think of a timing issue.  I didn't need this on 3.4.0.59, so
>>> is there anything that needs to be investigated?
>>>
>> Any ideas? :)
>>
> Nope :-(


Ok no problem.  I was unable to repro outside the script and I tried
everything I could think of.  I am just going to leave the sleep 1s in
there and keep an eye on these moving forward.  Thanks guys!

-b


>
> Pranith
>
>
>> + Justin
>>
>> --
>> GlusterFS - http://www.gluster.org
>>
>> An open source, distributed file system scaling to several
>> petabytes, and handling thousands of clients.
>>
>> My personal twitter: twitter.com/realjustinclift
>>
>>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] 3.5.1 beta 2 Sanity tests

2014-06-19 Thread Benjamin Turner
On Tue, Jun 17, 2014 at 7:54 PM, Benjamin Turner 
wrote:

> Yup,
> On Jun 17, 2014 7:45 PM, "Justin Clift"  wrote:
> >
> > On 17/06/2014, at 11:33 PM, Benjamin Turner wrote:
> > > Here are the tests that failed.  Note that n0 is a generated name,
> > > name255 is a 255-character string, and path1023 is a 1023-character path.
> > >
> > > /opt/qa/tools/posix-testsuite/tests/link/02.t (Wstat: 0 Tests: 10 Failed: 2)
> > >   Failed tests:  4, 6
> > >
> > > expect 0 link ${n0} ${name255}   #4
> > > expect 0 unlink ${n0} #5   <- this passed
> > > expect 0 unlink ${name255}   #6
> > >
> > > /opt/qa/tools/posix-testsuite/tests/link/03.t (Wstat: 0 Tests: 16 Failed: 2)
> > >   Failed tests:  8-9
> > >
> > > expect 0 link ${n0} ${path1023}  #8
> > > expect 0 unlink ${path1023}   #9
> > >
> > > I gotta go for the day, I'll try to repro outside the script tomorrow.
> >
> > As a data point, people have occasionally mentioned to me in IRC
> > and via email that these "posix" tests fail for them... even when
> > run against a (non-glustered) ext4/xfs filesystem.  So, it _could_
> > be just some weird spurious thing.  If you figure out what though,
> > that'd be cool. :)
> >
> > + Justin
> >
> > --
> > GlusterFS - http://www.gluster.org
> >
> > An open source, distributed file system scaling to several
> > petabytes, and handling thousands of clients.
> >
> > My personal twitter: twitter.com/realjustinclift
> >
> I went through these a while back and removed anything that wasn't valid
> for GlusterFS.  This test was passing on 3.4.59 when it was released; I am
> thinking it may have something to do with a symlink-to-the-same-directory
> BZ I found a while back?  Idk, I'll get it sorted tomorrow.

I got this sorted: I needed to add a sleep between the file create and
the link.  I ran through it manually and it worked every time; it took me
a few goes to think of a timing issue.  I didn't need this on 3.4.0.59, so
is there anything that needs to be investigated?
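
The change is basically this in link/02.t (same idea in 03.t); the create
line and the numbering are from memory, so they may be off by one:

    expect 0 create ${n0} 0644      # 3 - create the source file
    sleep 1                         # without this, the link below fails intermittently
    expect 0 link ${n0} ${name255}  # 4
    expect 0 unlink ${n0}           # 5
    expect 0 unlink ${name255}      # 6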

-b
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Beta2 NFS sanity tests.

2014-06-18 Thread Benjamin Turner
I re-ran the individual test 20 times, re-ran the fs_mark test suite 10
times, and re-ran the whole FS sanity test suite again, and was unable to
reproduce.
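
For reference, the individual re-runs were just a loop over the same
invocation from the failing pass, run from a directory on the NFS mount:

    for i in $(seq 1 20); do
        fs_mark -d . -D 4 -t 4 -S 1 || echo "pass $i failed"
    done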

-b


On Tue, Jun 17, 2014 at 4:23 PM, Benjamin Turner 
wrote:

> I saw 1 failure on NFS mounts, I am investigating:
>
>  final pass/fail report =
>Test Date: Tue Jun 17 16:15:38 EDT 2014
>Total : [43]
>Passed: [41]
>Failed: [2]
>Abort : [0]
>Crash : [0]
> -
>[   PASS   ]  FS Sanity Setup
>[   PASS   ]  Running tests.
>[   PASS   ]  FS SANITY TEST - arequal
>[   PASS   ]  FS SANITY LOG SCAN - arequal
>[   PASS   ]  FS SANITY TEST - bonnie
>[   PASS   ]  FS SANITY LOG SCAN - bonnie
>[   PASS   ]  FS SANITY TEST - glusterfs_build
>[   PASS   ]  FS SANITY LOG SCAN - glusterfs_build
>[   PASS   ]  FS SANITY TEST - compile_kernel
>[   PASS   ]  FS SANITY LOG SCAN - compile_kernel
>[   PASS   ]  FS SANITY TEST - dbench
>[   PASS   ]  FS SANITY LOG SCAN - dbench
>[   PASS   ]  FS SANITY TEST - dd
>[   PASS   ]  FS SANITY LOG SCAN - dd
>[   PASS   ]  FS SANITY TEST - ffsb
>[   PASS   ]  FS SANITY LOG SCAN - ffsb
>[   PASS   ]  FS SANITY TEST - fileop
>[   PASS   ]  FS SANITY LOG SCAN - fileop
>[   PASS   ]  FS SANITY TEST - fsx
>[   PASS   ]  FS SANITY LOG SCAN - fsx
>[   PASS   ]  FS SANITY LOG SCAN - fs_mark
>[   PASS   ]  FS SANITY TEST - iozone
>[   PASS   ]  FS SANITY LOG SCAN - iozone
>[   PASS   ]  FS SANITY TEST - locks
>[   PASS   ]  FS SANITY LOG SCAN - locks
>[   PASS   ]  FS SANITY TEST - ltp
>[   PASS   ]  FS SANITY LOG SCAN - ltp
>[   PASS   ]  FS SANITY TEST - multiple_files
>[   PASS   ]  FS SANITY LOG SCAN - multiple_files
>[   PASS   ]  FS SANITY LOG SCAN - posix_compliance
>[   PASS   ]  FS SANITY TEST - postmark
>[   PASS   ]  FS SANITY LOG SCAN - postmark
>[   PASS   ]  FS SANITY TEST - read_large
>[   PASS   ]  FS SANITY LOG SCAN - read_large
>[   PASS   ]  FS SANITY TEST - rpc
>[   PASS   ]  FS SANITY LOG SCAN - rpc
>[   PASS   ]  FS SANITY TEST - syscallbench
>[   PASS   ]  FS SANITY LOG SCAN - syscallbench
>[   PASS   ]  FS SANITY TEST - tiobench
>[   PASS   ]  FS SANITY LOG SCAN - tiobench
>[   PASS   ]  FS Sanity Cleanup
>
>[   FAIL   ]  FS SANITY TEST - fs_mark
>[   FAIL   ]  
> /rhs-tests/beaker/rhs/auto-tests/components/sanity/fs-sanity-tests-v2
>
> The failed test was:
>
> #  fs_mark  -d  .  -D  4  -t  4  -S  1
> # Version 3.3, 4 thread(s) starting at Tue Jun 17 13:39:36 2014
> # Sync method: INBAND FSYNC: fsync() per file in write loop.
> # Directories:  Time based hash between directories across 4 
> subdirectories with 180 seconds per subdirectory.
> # File names: 40 bytes long, (16 initial bytes of time stamp with 24 
> random bytes at end of name)
> # Files info: size 51200 bytes, written with an IO size of 16384 bytes 
> per write
> # App overhead is time in microseconds spent in the test not doing file 
> writing related system calls.
>
> FSUse%Count SizeFiles/sec App Overhead
> Error in unlink of ./00/53a07d587SRWZLFBMIUOEVGM4RY9F5P3 : No such 
> file or directory
> fopen failed to open: fs_log.txt.19509
> fs-mark pass # 1 failed
>
> I will investigate and open a BZ if this is reproducible.
>
> -b
>
>
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] 3.5.1 beta 2 Sanity tests

2014-06-17 Thread Benjamin Turner
Here are the tests that failed.  Note that n0 is a generated name, name255
is a 255-character string, and path1023 is a 1023-character path.

/opt/qa/tools/posix-testsuite/tests/link/02.t    (Wstat: 0 Tests: 10 Failed: 2)
  Failed tests:  4, 6

expect 0 link ${n0} ${name255}   #4
expect 0 unlink ${n0} #5   <- this passed
expect 0 unlink ${name255}   #6

/opt/qa/tools/posix-testsuite/tests/link/03.t    (Wstat: 0 Tests: 16 Failed: 2)
  Failed tests:  8-9

expect 0 link ${n0} ${path1023}  #8
expect 0 unlink ${path1023}   #9

I gotta go for the day, I'll try to repro outside the script tomorrow.

-b


On Tue, Jun 17, 2014 at 11:09 AM, Benjamin Turner 
wrote:

> I ran through fs sanity on the beta 2 bits:
>
>  final pass/fail report =
>Test Date: Mon Jun 16 23:41:51 EDT 2014
>Total : [44]
>Passed: [42]
>Failed: [2]
>Abort : [0]
>Crash : [0]
> -
>[   PASS   ]  FS Sanity Setup
>[   PASS   ]  Running tests.
>[   PASS   ]  FS SANITY TEST - arequal
>[   PASS   ]  FS SANITY LOG SCAN - arequal
>[   PASS   ]  FS SANITY TEST - bonnie
>[   PASS   ]  FS SANITY LOG SCAN - bonnie
>[   PASS   ]  FS SANITY TEST - glusterfs_build
>[   PASS   ]  FS SANITY LOG SCAN - glusterfs_build
>[   PASS   ]  FS SANITY TEST - compile_kernel
>[   PASS   ]  FS SANITY LOG SCAN - compile_kernel
>[   PASS   ]  FS SANITY TEST - dbench
>[   PASS   ]  FS SANITY LOG SCAN - dbench
>[   PASS   ]  FS SANITY TEST - dd
>[   PASS   ]  FS SANITY LOG SCAN - dd
>[   PASS   ]  FS SANITY TEST - ffsb
>[   PASS   ]  FS SANITY LOG SCAN - ffsb
>[   PASS   ]  FS SANITY TEST - fileop
>[   PASS   ]  FS SANITY LOG SCAN - fileop
>[   PASS   ]  FS SANITY TEST - fsx
>[   PASS   ]  FS SANITY LOG SCAN - fsx
>[   PASS   ]  FS SANITY TEST - fs_mark
>[   PASS   ]  FS SANITY LOG SCAN - fs_mark
>[   PASS   ]  FS SANITY TEST - iozone
>[   PASS   ]  FS SANITY LOG SCAN - iozone
>[   PASS   ]  FS SANITY TEST - locks
>[   PASS   ]  FS SANITY LOG SCAN - locks
>[   PASS   ]  FS SANITY TEST - ltp
>[   PASS   ]  FS SANITY LOG SCAN - ltp
>[   PASS   ]  FS SANITY TEST - multiple_files
>[   PASS   ]  FS SANITY LOG SCAN - multiple_files
>[   PASS   ]  FS SANITY LOG SCAN - posix_compliance
>[   PASS   ]  FS SANITY TEST - postmark
>[   PASS   ]  FS SANITY LOG SCAN - postmark
>[   PASS   ]  FS SANITY TEST - read_large
>[   PASS   ]  FS SANITY LOG SCAN - read_large
>[   PASS   ]  FS SANITY TEST - rpc
>[   PASS   ]  FS SANITY LOG SCAN - rpc
>[   PASS   ]  FS SANITY TEST - syscallbench
>[   PASS   ]  FS SANITY LOG SCAN - syscallbench
>[   PASS   ]  FS SANITY TEST - tiobench
>[   PASS   ]  FS SANITY LOG SCAN - tiobench
>[   PASS   ]  FS Sanity Cleanup
>
>[   FAIL   ]  FS SANITY TEST - posix_compliance
>[   FAIL   ]  
> /rhs-tests/beaker/rhs/auto-tests/components/sanity/fs-sanity-tests-v2
>
>
> The posix_compliance failures are:
>
> /opt/qa/tools/posix-testsuite/tests/link/02.t ..
>
> Failed 2/10 subtests
>
> /opt/qa/tools/posix-testsuite/tests/link/03.t ..
> Failed 2/16 subtests
>
> I am looking into the failures now as well as running on NFS mounts, I will 
> open a BZ if they are valid.
>
> -b
>
>
>
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Beta2 NFS sanity tests.

2014-06-17 Thread Benjamin Turner
I saw 1 failure on NFS mounts, I am investigating:

 final pass/fail report =
   Test Date: Tue Jun 17 16:15:38 EDT 2014
   Total : [43]
   Passed: [41]
   Failed: [2]
   Abort : [0]
   Crash : [0]
-
   [   PASS   ]  FS Sanity Setup
   [   PASS   ]  Running tests.
   [   PASS   ]  FS SANITY TEST - arequal
   [   PASS   ]  FS SANITY LOG SCAN - arequal
   [   PASS   ]  FS SANITY TEST - bonnie
   [   PASS   ]  FS SANITY LOG SCAN - bonnie
   [   PASS   ]  FS SANITY TEST - glusterfs_build
   [   PASS   ]  FS SANITY LOG SCAN - glusterfs_build
   [   PASS   ]  FS SANITY TEST - compile_kernel
   [   PASS   ]  FS SANITY LOG SCAN - compile_kernel
   [   PASS   ]  FS SANITY TEST - dbench
   [   PASS   ]  FS SANITY LOG SCAN - dbench
   [   PASS   ]  FS SANITY TEST - dd
   [   PASS   ]  FS SANITY LOG SCAN - dd
   [   PASS   ]  FS SANITY TEST - ffsb
   [   PASS   ]  FS SANITY LOG SCAN - ffsb
   [   PASS   ]  FS SANITY TEST - fileop
   [   PASS   ]  FS SANITY LOG SCAN - fileop
   [   PASS   ]  FS SANITY TEST - fsx
   [   PASS   ]  FS SANITY LOG SCAN - fsx
   [   PASS   ]  FS SANITY LOG SCAN - fs_mark
   [   PASS   ]  FS SANITY TEST - iozone
   [   PASS   ]  FS SANITY LOG SCAN - iozone
   [   PASS   ]  FS SANITY TEST - locks
   [   PASS   ]  FS SANITY LOG SCAN - locks
   [   PASS   ]  FS SANITY TEST - ltp
   [   PASS   ]  FS SANITY LOG SCAN - ltp
   [   PASS   ]  FS SANITY TEST - multiple_files
   [   PASS   ]  FS SANITY LOG SCAN - multiple_files
   [   PASS   ]  FS SANITY LOG SCAN - posix_compliance
   [   PASS   ]  FS SANITY TEST - postmark
   [   PASS   ]  FS SANITY LOG SCAN - postmark
   [   PASS   ]  FS SANITY TEST - read_large
   [   PASS   ]  FS SANITY LOG SCAN - read_large
   [   PASS   ]  FS SANITY TEST - rpc
   [   PASS   ]  FS SANITY LOG SCAN - rpc
   [   PASS   ]  FS SANITY TEST - syscallbench
   [   PASS   ]  FS SANITY LOG SCAN - syscallbench
   [   PASS   ]  FS SANITY TEST - tiobench
   [   PASS   ]  FS SANITY LOG SCAN - tiobench
   [   PASS   ]  FS Sanity Cleanup

   [   FAIL   ]  FS SANITY TEST - fs_mark
   [   FAIL   ]
/rhs-tests/beaker/rhs/auto-tests/components/sanity/fs-sanity-tests-v2

The failed test was:

#  fs_mark  -d  .  -D  4  -t  4  -S  1
#   Version 3.3, 4 thread(s) starting at Tue Jun 17 13:39:36 2014
#   Sync method: INBAND FSYNC: fsync() per file in write loop.
#   Directories:  Time based hash between directories across 4
subdirectories with 180 seconds per subdirectory.
#   File names: 40 bytes long, (16 initial bytes of time stamp with 24
random bytes at end of name)
#   Files info: size 51200 bytes, written with an IO size of 16384 bytes 
per write
#   App overhead is time in microseconds spent in the test not doing
file writing related system calls.

FSUse%Count SizeFiles/sec App Overhead
Error in unlink of ./00/53a07d587SRWZLFBMIUOEVGM4RY9F5P3 : No
such file or directory
fopen failed to open: fs_log.txt.19509
fs-mark pass # 1 failed

I will investigate and open a BZ if this is reproducible.

-b
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] 3.5.1 beta 2 Sanity tests

2014-06-17 Thread Benjamin Turner
I ran through fs sanity on the beta 2 bits:

 final pass/fail report =
   Test Date: Mon Jun 16 23:41:51 EDT 2014
   Total : [44]
   Passed: [42]
   Failed: [2]
   Abort : [0]
   Crash : [0]
-
   [   PASS   ]  FS Sanity Setup
   [   PASS   ]  Running tests.
   [   PASS   ]  FS SANITY TEST - arequal
   [   PASS   ]  FS SANITY LOG SCAN - arequal
   [   PASS   ]  FS SANITY TEST - bonnie
   [   PASS   ]  FS SANITY LOG SCAN - bonnie
   [   PASS   ]  FS SANITY TEST - glusterfs_build
   [   PASS   ]  FS SANITY LOG SCAN - glusterfs_build
   [   PASS   ]  FS SANITY TEST - compile_kernel
   [   PASS   ]  FS SANITY LOG SCAN - compile_kernel
   [   PASS   ]  FS SANITY TEST - dbench
   [   PASS   ]  FS SANITY LOG SCAN - dbench
   [   PASS   ]  FS SANITY TEST - dd
   [   PASS   ]  FS SANITY LOG SCAN - dd
   [   PASS   ]  FS SANITY TEST - ffsb
   [   PASS   ]  FS SANITY LOG SCAN - ffsb
   [   PASS   ]  FS SANITY TEST - fileop
   [   PASS   ]  FS SANITY LOG SCAN - fileop
   [   PASS   ]  FS SANITY TEST - fsx
   [   PASS   ]  FS SANITY LOG SCAN - fsx
   [   PASS   ]  FS SANITY TEST - fs_mark
   [   PASS   ]  FS SANITY LOG SCAN - fs_mark
   [   PASS   ]  FS SANITY TEST - iozone
   [   PASS   ]  FS SANITY LOG SCAN - iozone
   [   PASS   ]  FS SANITY TEST - locks
   [   PASS   ]  FS SANITY LOG SCAN - locks
   [   PASS   ]  FS SANITY TEST - ltp
   [   PASS   ]  FS SANITY LOG SCAN - ltp
   [   PASS   ]  FS SANITY TEST - multiple_files
   [   PASS   ]  FS SANITY LOG SCAN - multiple_files
   [   PASS   ]  FS SANITY LOG SCAN - posix_compliance
   [   PASS   ]  FS SANITY TEST - postmark
   [   PASS   ]  FS SANITY LOG SCAN - postmark
   [   PASS   ]  FS SANITY TEST - read_large
   [   PASS   ]  FS SANITY LOG SCAN - read_large
   [   PASS   ]  FS SANITY TEST - rpc
   [   PASS   ]  FS SANITY LOG SCAN - rpc
   [   PASS   ]  FS SANITY TEST - syscallbench
   [   PASS   ]  FS SANITY LOG SCAN - syscallbench
   [   PASS   ]  FS SANITY TEST - tiobench
   [   PASS   ]  FS SANITY LOG SCAN - tiobench
   [   PASS   ]  FS Sanity Cleanup

   [   FAIL   ]  FS SANITY TEST - posix_compliance
   [   FAIL   ]
/rhs-tests/beaker/rhs/auto-tests/components/sanity/fs-sanity-tests-v2


The posix_compliance failures are:

/opt/qa/tools/posix-testsuite/tests/link/02.t ..

Failed 2/10 subtests

/opt/qa/tools/posix-testsuite/tests/link/03.t ..
Failed 2/16 subtests

I am looking into the failures now as well as running on NFS mounts, I
will open a BZ if they are valid.

-b
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] glusterfs-3.5.1beta2 released

2014-06-10 Thread Benjamin Turner
I'll kick off some automated runs, I didn't see any RPMs in:

http://download.gluster.org/pub/gluster/glusterfs/qa-releases/

Any idea when they will get built?  It's not a blocker, just easier with
RPMs.

-b


On Tue, Jun 10, 2014 at 1:16 PM, Niels de Vos  wrote:

> On Tue, Jun 10, 2014 at 09:56:38AM -0700, Gluster Build System wrote:
> >
> >
> > SRC:
> http://bits.gluster.org/pub/gluster/glusterfs/src/glusterfs-3.5.1beta2.tar.gz
> >
>
> This beta release supposedly fixes the bugs listed below since 3.5.0 was
> made available.  Thanks to all who provided patches and reviewed these
> changes.  Now it's time to test before a final release of 3.5.1 can get
> shipped.
>
> Cheers,
> Niels
>
>
> #765202 - lgetxattr called with invalid keys on the bricks
> #833586 - inodelk hang from marker_rename_release_newp_lock
> #859581 - self-heal process can sometimes create directories instead of
> symlinks for the root gfid file in .glusterfs
> #986429 - Backupvolfile server option should work internal to GlusterFS
> framework
> #1039544 - [FEAT] "gluster volume heal info" should list the entries that
> actually required to be healed.
> #1046624 - Unable to heal symbolic Links
> #1046853 - AFR : For every file self-heal there are warning messages
> reported in glustershd.log file
> #1063190 - [RHEV-RHS] Volume was not accessible after server side quorum
> was met
> #1064096 - The old Python Translator code (not Glupy) should be removed
> #1066996 - Using sanlock on a gluster mount with replica 3 (quorum-type
> auto) leads to a split-brain
> #1071191 - [3.5.1] Sporadic SIGBUS with mmap() on a sparse file created
> with open(), seek(), write()
> #1078061 - Need ability to heal mismatching user extended attributes
> without any changelogs
> #1078365 - New xlators are linked as versioned .so files, creating
> .so.0.0.0
> #1086743 - Add documentation for the Feature: RDMA-connection manager
> (RDMA-CM)
> #1086748 - Add documentation for the Feature: AFR CLI enhancements
> #1086749 - Add documentation for the Feature: Exposing Volume Capabilities
> #1086750 - Add documentation for the Feature: File Snapshots in GlusterFS
> #1086751 - Add documentation for the Feature: gfid-access
> #1086752 - Add documentation for the Feature: On-Wire
> Compression/Decompression
> #1086754 - Add documentation for the Feature: Quota Scalability
> #1086755 - Add documentation for the Feature: readdir-ahead
> #1086756 - Add documentation for the Feature: zerofill API for GlusterFS
> #1086758 - Add documentation for the Feature: Changelog based parallel
> geo-replication
> #1086760 - Add documentation for the Feature: Write Once Read Many (WORM)
> volume
> #1086762 - Add documentation for the Feature: BD Xlator - Block Device
> translator
> #1086766 - Add documentation for the Feature: Libgfapi
> #1086774 - Add documentation for the Feature: Access Control List -
> Version 3 support for Gluster NFS
> #1086782 - Add documentation for the Feature: glusterfs and  oVirt
> integration
> #1086783 - Add documentation for the Feature: qemu 1.3 - libgfapi
> integration
> #1088848 - Spelling errors in rpc/rpc-transport/rdma/src/rdma.c
> #1089054 - gf-error-codes.h is missing from source tarball
> #1089470 - SMB: Crash on brick process during compile kernel.
> #1089934 - list dir with more than N files results in Input/output error
> #1091340 - Doc: Add glfs_fini known issue to release notes 3.5
> #1091392 - glusterfs.spec.in: minor/nit changes to sync with Fedora spec
> #1095775 - Add support in libgfapi to fetch volume info from glusterd.
> #1095971 - Stopping/Starting a Gluster volume resets ownership
> #1096040 - AFR : self-heal-daemon not clearing the change-logs of all the
> sources after self-heal
> #1096425 - i/o error when one user tries to access RHS volume over NFS
> with 100+ GIDs
> #1099878 - Need support for handle based Ops to fetch/modify extended
> attributes of a file
> #1102306 - license: xlators/features/glupy dual license GPLv2 and LGPLv3+
> #1103413 - Failure in gf_log_init reopening stderr
> #1104592 - heal info may give Success instead of transport end point not
> connected when a brick is down.
> #1104919 - Fix memory leaks in gfid-access xlator.
> #1104959 - Dist-geo-rep : some of the files not accessible on slave after
> the geo-rep sync from master to slave.
> #1105188 - Two instances each, of brick processes, glusterfs-nfs and
> quotad seen after glusterd restart
> #1105524 - Disable nfs.drc by default
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel