Re: [Gluster-users] how to get the true used capacity of the volume

2018-04-13 Thread Alastair Neil
 You will get weird results like these if you put two bricks on a single
filesystem.  In use case one (presumably replica 2) the data gets written
to both bricks, which means there are two copies on the disk and so twice
the disk space consumed.  In the second case there is some overhead
involved in creating a volume that will consume some disk space even absent
any user data; how much will depend on factors like the block size you used
to create the filesystem.

Best practice is that each brick should be on its own block device with
its own filesystem, not shared with other bricks or applications.  If
you must share physical devices then use LVM (or partitions - but LVM is
more flexible) to create separate volumes, each with its own filesystem,
for each brick.
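
For what it's worth, a minimal sketch of that kind of layout using thinly
provisioned LVM (the device, VG/LV names and sizes below are placeholders
to adapt):

pvcreate /dev/sdb
vgcreate gluster_vg /dev/sdb
lvcreate -L 900G -T gluster_vg/brickpool               # thin pool
lvcreate -V 450G -T gluster_vg/brickpool -n brick1_lv  # one thin LV per brick
lvcreate -V 450G -T gluster_vg/brickpool -n brick2_lv
mkfs.xfs -i size=512 /dev/gluster_vg/brick1_lv
mkfs.xfs -i size=512 /dev/gluster_vg/brick2_lv
mkdir -p /export/brick1 /export/brick2
mount /dev/gluster_vg/brick1_lv /export/brick1
mount /dev/gluster_vg/brick2_lv /export/brick2

With each brick on its own filesystem, df on the gluster mount (and on the
individual brick mounts) then reflects that volume's usage rather than the
usage of the whole shared disk.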



On 11 April 2018 at 23:32, hannan...@shudun.com 
wrote:

> I created a volume, mounted it, and used the df command to view the
> volume's available and used space.
>
> After some testing, I think the used value displayed by df is the sum of
> the used capacity of the disks on which the bricks are located, not the
> sum of the used space of the brick directories.
> (I know the available capacity is the physical space of all disks if no
> quota is set, but the used space should not be the sum of the space used
> by the hard disks; it should be the sum of the sizes of the brick
> directories, because bricks of different volumes may share one disk.)
>
> In my case:
> I want to create multiple volumes on the same disks (for better performance,
> each volume will use all disks of our server cluster): one volume for NFS
> with replica 2, one volume for NFS with replica 3, and one volume for Samba.
> I want to get the capacity already used by each volume, but when data is
> written to one of the volumes, the used space of the other volumes also
> increases when viewed with the df command.
>
> Example:
> eg1:
> I create a volume with two bricks, and the two bricks are on one disk. I
> write 1TB of data to the volume, then use the df command to view the space
> used by the volume: it shows the volume using 2TB of space.
>
> eg2:
> When I create a volume on the root partition and don't write any data to
> the volume, df still shows that the volume has used some space. In fact,
> that space is not the size of the brick directory, but the space used on
> the disk on which the brick is located.
>
> How do I get the capacity of each volume in this case?
>
> [root@f08n29glusterfs-3.7.20]# df -hT | grep f08n29
> *f08n29:/usage_test fuse.glusterfs   50G   24G   27G  48% /mnt*
>
> [root@f08n29glusterfs-3.7.20]# gluster volume info usage_test
> Volume Name: usage_test
> Type: Distribute
> Volume ID: d9b5abff-9f69-41ce-80b3-3dc4ba1d77b3
> Status: Started
> Number of Bricks: 1
> Transport-type: tcp
> Bricks:
> *Brick1: f08n29:/brick1*
> Options Reconfigured:
> performance.readdir-ahead: on
>
> [root@f08n29glusterfs-3.7.20]# du -sh /brick1
> *100K/brick1*
>
> Is there any command that can check the actual space used by each volume
> in this situation?
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Announcing Glusterfs release 3.12.2 (Long Term Maintenance)

2017-12-11 Thread Alastair Neil
Niels, I don't know if this is adequate but I did run a simple smoke test
today on the 3.12.3-1 bits.  I installed the 3.12.3-1 bits on 3 freshly
installed CentOS 7 VMs,

created a 2G image file and wrote an XFS filesystem onto it on each
system,

mounted each under /export/brick1, and created /export/brick1/test on each
node,
probed the two other systems from one node (a), and created a replica 3
volume using the bricks at /export/brick1/test on each node,

started the volume and mounted it under /mnt/glustertest on node a,

did some brief tests using dd into the mount point on node a; all seemed
fine - no errors, nothing unexpected.
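
In rough outline, that amounted to the following (host names, image
locations, and the volume name here are stand-ins for what I actually
used):

# on each of the three nodes:
truncate -s 2G /var/tmp/brick1.img
mkfs.xfs /var/tmp/brick1.img
mkdir -p /export/brick1
mount -o loop /var/tmp/brick1.img /export/brick1
mkdir -p /export/brick1/test

# from node a:
gluster peer probe node-b
gluster peer probe node-c
gluster volume create smoketest replica 3 \
    node-a:/export/brick1/test node-b:/export/brick1/test node-c:/export/brick1/test
gluster volume start smoketest
mkdir -p /mnt/glustertest
mount -t glusterfs node-a:/smoketest /mnt/glustertest
dd if=/dev/zero of=/mnt/glustertest/ddtest bs=1M count=512 oflag=sync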

On 23 October 2017 at 17:42, Niels de Vos <nde...@redhat.com> wrote:

> On Mon, Oct 23, 2017 at 02:12:53PM -0400, Alastair Neil wrote:
> > Any idea when these packages will be in the CentOS mirrors? there is no
> > sign of them on download.gluster.org.
>
> We're waiting for someone other than me to test the new packages at
> least a little. Installing the packages and run something on top of a
> Gluster volume is already sufficient, just describe a bit what was
> tested. Once a confirmation is sent that it works for someone, we can
> mark the packages for releasing to the mirrors.
>
> Getting the (unsigned) RPMs is easy, run this on your test environment:
>
>   # yum --enablerepo=centos-gluster312-test update glusterfs
>
> This does not restart the brick processes so I/O is not affected with
> the installation. Make sure to restart the processes (or just reboot)
> and do whatever validation you deem sufficient.
>
> Thanks,
> Niels
>
>
> >
> > On 13 October 2017 at 08:45, Jiffin Tony Thottan <jthot...@redhat.com>
> > wrote:
> >
> > > The Gluster community is pleased to announce the release of Gluster
> 3.12.2
> > > (packages available at [1,2,3]).
> > >
> > > Release notes for the release can be found at [4].
> > >
> > > We still carry following major issues that is reported in the
> > > release-notes as follows,
> > >
> > > 1.) - Expanding a gluster volume that is sharded may cause file
> corruption
> > >
> > > Sharded volumes are typically used for VM images, if such volumes
> are
> > > expanded or possibly contracted (i.e add/remove bricks and rebalance)
> there
> > > are reports of VM images getting corrupted.
> > >
> > > The last known cause for corruption (Bug #1465123) has a fix with
> this
> > > release. As further testing is still in progress, the issue is
> retained as
> > > a major issue.
> > >
> > > Status of this bug can be tracked here, #1465123
> > >
> > >
> > > 2 .) Gluster volume restarts fail if the sub directory export feature
> is
> > > in use. Status of this issue can be tracked here, #1501315
> > >
> > > 3.) Mounting a gluster snapshot will fail, when attempting a FUSE based
> > > mount of the snapshot. So for the current users, it is recommend to
> only
> > > access snapshot via
> > >
> > > ".snaps" directory on a mounted gluster volume. Status of this issue
> can
> > > be tracked here, #1501378
> > >
> > > Thanks,
> > >  Gluster community
> > >
> > >
> > > [1] https://download.gluster.org/pub/gluster/glusterfs/3.12/3.12.2/
> > > <https://download.gluster.org/pub/gluster/glusterfs/3.12/3.12.1/>
> > > [2] https://launchpad.net/~gluster/+archive/ubuntu/glusterfs-3.12
> > > <https://launchpad.net/%7Egluster/+archive/ubuntu/glusterfs-3.11>
> > > [3] https://build.opensuse.org/project/subprojects/home:glusterfs
> > >
> > > [4] Release notes: https://gluster.readthedocs.
> > > io/en/latest/release-notes/3.12.2/
> > > <https://gluster.readthedocs.io/en/latest/release-notes/3.11.3/>
> > >
> > > ___
> > > Gluster-devel mailing list
> > > gluster-de...@gluster.org
> > > http://lists.gluster.org/mailman/listinfo/gluster-devel
> > >
>
> > ___
> > Gluster-devel mailing list
> > gluster-de...@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-devel
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] SMB copies failing with GlusterFS 3.10

2017-11-28 Thread Alastair Neil
What is the volume configuration? Is it replicated, distributed,
distribute-replicate, or disperse?

Have you tried setting performance.strict-write-ordering to on?
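
For example (the volume name is a placeholder for yours):

gluster volume set <volname> performance.strict-write-ordering on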

On 14 November 2017 at 06:24, Brett Randall  wrote:

> Hi all
>
> We've got a brand new 6-node GlusterFS 3.10 deployment (previous 20 nodes
> were GlusterFS 3.6). Running on CentOS 7 using legit repos, so
> glusterfs-3.10.7-1.el7.x86_64 is the base.
>
> Our issue is that when we create a file with a Gluster client, e.g. a Mac
> or Windows machine, it works fine. However if we copy a file from a Mac or
> Windows machine to the Samba share, it fails with a complaint that the file
> already exists, even though it doesn't (or didn't). It appears as though
> the file is tested to see if it exists, and it doesn't, so then the client
> goes and tries to create the file but at that stage it DOES exist, maybe
> something to do with the previous stat? Anyway, it is repeatable and
> killing us! NFS clients work fine on any platform, but SMB does not. There
> aren't that many client-side options for SMB mounts so the solution has to
> be server-side.
>
> Here is a pcap of the copy attempt from one computer:
>
> https://www.dropbox.com/s/yhn3s1qbxtdvnoh/sambacap.pcapng?dl=0
>
> You'll see a the request to look for the file which results in a
> STATUS_OBJECT_NAME_NOT_FOUND (good), followed by a STATUS_SHARING_VIOLATION
> (???) followed by a STATUS_OBJECT_NAME_COLLISION (bad).
>
> Here are the options from the volume:
>
> Options Reconfigured:
>
> nfs.acl: off
>
> features.cache-invalidation: on
>
> storage.batch-fsync-delay-usec: 0
>
> transport.address-family: inet6
>
> nfs.disable: on
>
> performance.stat-prefetch: off
>
> server.allow-insecure: on
>
> ganesha.enable: ganesha-nfs
>
> user.smb: enable
>
> Any thoughts on why samba isn't enjoying our copies?
>
> Thanks!
>
> Brett.
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Request for Comments: Upgrades from 3.x to 4.0+

2017-11-06 Thread Alastair Neil
Ahh OK I see, thanks


On 6 November 2017 at 00:54, Kaushal M <kshlms...@gmail.com> wrote:

> On Fri, Nov 3, 2017 at 8:50 PM, Alastair Neil <ajneil.t...@gmail.com>
> wrote:
> > Just so I am clear the upgrade process will be as follows:
> >
> > upgrade all clients to 4.0
> >
> > rolling upgrade all servers to 4.0 (with GD1)
> >
> > kill all GD1 daemons on all servers and run upgrade script (new clients
> > unable to connect at this point)
> >
> > start GD2 ( necessary or does the upgrade script do this?)
> >
> >
> > I assume that once the cluster had been migrated to GD2 the glusterd
> startup
> > script will be smart enough to start the correct version?
> >
>
> This should be the process, mostly.
>
> The upgrade script needs to GD2 running on all nodes before it can
> begin migration.
> But they don't need to have a cluster formed, the script should take
> care of forming the cluster.
>
>
> > -Thanks
> >
> >
> >
> >
> >
> > On 3 November 2017 at 04:06, Kaushal M <kshlms...@gmail.com> wrote:
> >>
> >> On Thu, Nov 2, 2017 at 7:53 PM, Darrell Budic <bu...@onholyground.com>
> >> wrote:
> >> > Will the various client packages (centos in my case) be able to
> >> > automatically handle the upgrade vs new install decision, or will we
> be
> >> > required to do something manually to determine that?
> >>
> >> We should be able to do this with CentOS (and other RPM based distros)
> >> which have well split glusterfs packages currently.
> >> At this moment, I don't know exactly how much can be handled
> >> automatically, but I expect the amount of manual intervention to be
> >> minimal.
> >> The least minimum amount of manual work needed would be enabling and
> >> starting GD2 and starting the migration script.
> >>
> >> >
> >> > It’s a little unclear that things will continue without interruption
> >> > because
> >> > of the way you describe the change from GD1 to GD2, since it sounds
> like
> >> > it
> >> > stops GD1.
> >>
> >> With the described upgrade strategy, we can ensure continuous volume
> >> access to clients during the whole process (provided volumes have been
> >> setup with replication or ec).
> >>
> >> During the migration from GD1 to GD2, any existing clients still
> >> retain access, and can continue to work without interruption.
> >> This is possible because gluster keeps the management  (glusterds) and
> >> data (bricks and clients) parts separate.
> >> So it is possible to interrupt the management parts, without
> >> interrupting data access to existing clients.
> >> Clients and the server side brick processes need GlusterD to start up.
> >> But once they're running, they can run without GlusterD. GlusterD is
> >> only required again if something goes wrong.
> >> Stopping GD1 during the migration process, will not lead to any
> >> interruptions for existing clients.
> >> The brick process continue to run, and any connected clients continue
> >> to remain connected to the bricks.
> >> Any new clients which try to mount the volumes during this migration
> >> will fail, as a GlusterD will not be available (either GD1 or GD2).
> >>
> >> > Early days, obviously, but if you could clarify if that’s what
> >> > we’re used to as a rolling upgrade or how it works, that would be
> >> > appreciated.
> >>
> >> A Gluster rolling upgrade process, allows data access to volumes
> >> during the process, while upgrading the brick processes as well.
> >> Rolling upgrades with uninterrupted access requires that volumes have
> >> redundancy (replicate or ec).
> >> Rolling upgrades involves upgrading servers belonging to a redundancy
> >> set (replica set or ec set), one at a time.
> >> One at a time,
> >> - A server is picked from a redundancy set
> >> - All Gluster processes are killed on the server, glusterd, bricks and
> >> other daemons included.
> >> - Gluster is upgraded and restarted on the server
> >> - A heal is performed to heal new data onto the bricks.
> >> - Move onto next server after heal finishes.
> >>
> >> Clients maintain uninterrupted access, because a full redundancy set
> >> is never taken offline all at once.
> >>
> >> > Also clarification that we’ll be able to upgrade from 3.x
> >

Re: [Gluster-users] [Gluster-devel] Request for Comments: Upgrades from 3.x to 4.0+

2017-11-03 Thread Alastair Neil
Just so I am clear the upgrade process will be as follows:

upgrade all clients to 4.0

rolling upgrade all servers to 4.0 (with GD1)

kill all GD1 daemons on all servers and run upgrade script (new clients
unable to connect at this point)

start GD2 (necessary, or does the upgrade script do this?)


I assume that once the cluster has been migrated to GD2, the glusterd
startup script will be smart enough to start the correct version?

-Thanks

On 3 November 2017 at 04:06, Kaushal M  wrote:

> On Thu, Nov 2, 2017 at 7:53 PM, Darrell Budic 
> wrote:
> > Will the various client packages (centos in my case) be able to
> > automatically handle the upgrade vs new install decision, or will we be
> > required to do something manually to determine that?
>
> We should be able to do this with CentOS (and other RPM based distros)
> which have well split glusterfs packages currently.
> At this moment, I don't know exactly how much can be handled
> automatically, but I expect the amount of manual intervention to be
> minimal.
> The least minimum amount of manual work needed would be enabling and
> starting GD2 and starting the migration script.
>
> >
> > It’s a little unclear that things will continue without interruption
> because
> > of the way you describe the change from GD1 to GD2, since it sounds like
> it
> > stops GD1.
>
> With the described upgrade strategy, we can ensure continuous volume
> access to clients during the whole process (provided volumes have been
> setup with replication or ec).
>
> During the migration from GD1 to GD2, any existing clients still
> retain access, and can continue to work without interruption.
> This is possible because gluster keeps the management  (glusterds) and
> data (bricks and clients) parts separate.
> So it is possible to interrupt the management parts, without
> interrupting data access to existing clients.
> Clients and the server side brick processes need GlusterD to start up.
> But once they're running, they can run without GlusterD. GlusterD is
> only required again if something goes wrong.
> Stopping GD1 during the migration process, will not lead to any
> interruptions for existing clients.
> The brick process continue to run, and any connected clients continue
> to remain connected to the bricks.
> Any new clients which try to mount the volumes during this migration
> will fail, as a GlusterD will not be available (either GD1 or GD2).
>
> > Early days, obviously, but if you could clarify if that’s what
> > we’re used to as a rolling upgrade or how it works, that would be
> > appreciated.
>
> A Gluster rolling upgrade process, allows data access to volumes
> during the process, while upgrading the brick processes as well.
> Rolling upgrades with uninterrupted access requires that volumes have
> redundancy (replicate or ec).
> Rolling upgrades involves upgrading servers belonging to a redundancy
> set (replica set or ec set), one at a time.
> One at a time,
> - A server is picked from a redundancy set
> - All Gluster processes are killed on the server, glusterd, bricks and
> other daemons included.
> - Gluster is upgraded and restarted on the server
> - A heal is performed to heal new data onto the bricks.
> - Move onto next server after heal finishes.
>
> Clients maintain uninterrupted access, because a full redundancy set
> is never taken offline all at once.
>
> > Also clarification that we’ll be able to upgrade from 3.x
> > (3.1x?) to 4.0, manually or automatically?
>
> Rolling upgrades from 3.1x to 4.0 are a manual process. But I believe,
> gdeploy has playbooks to automate it.
> At the end of this you will be left with a 4.0 cluster, but still be
> running GD1.
> Upgrading from GD1 to GD2, in 4.0 will be a manual process. A script
> that automates this is planned only for 4.1.
>
> >
> >
> > 
> > From: Kaushal M 
> > Subject: [Gluster-users] Request for Comments: Upgrades from 3.x to 4.0+
> > Date: November 2, 2017 at 3:56:05 AM CDT
> > To: gluster-users@gluster.org; Gluster Devel
> >
> > We're fast approaching the time for Gluster-4.0. And we would like to
> > set out the expected upgrade strategy and try to polish it to be as
> > user friendly as possible.
> >
> > We're getting this out here now, because there was quite a bit of
> > concern and confusion regarding the upgrades between 3.x and 4.0+.
> >
> > ---
> > ## Background
> >
> > Gluster-4.0 will bring a newer management daemon, GlusterD-2.0 (GD2),
> > which is backwards incompatible with the GlusterD (GD1) in
> > GlusterFS-3.1+.  As a hybrid cluster of GD1 and GD2 cannot be
> > established, rolling upgrades are not possible. This meant that
> > upgrades from 3.x to 4.0 would require a volume downtime and possible
> > client downtime.
> >
> > This was a cause of concern among many during the recently concluded
> > Gluster Summit 2017.
> >
> > We would like to keep pains experienced by our users to a 

Re: [Gluster-users] brick is down but gluster volume status says it's fine

2017-10-24 Thread Alastair Neil
It looks like this is to do with the stale port issue.

I think it's pretty clear from the output below that the digitalcorpora brick
process is shown by volume status as having the same TCP port as the public
volume brick on gluster-2, 49156, but it is actually listening on 49154.  So
although the brick process is technically up, nothing is talking to it.  I
am surprised I don't see more errors in the brick log for brick8/public.
It also explains the whack-a-mole problem: every time I kill and restart
the daemon it must be grabbing the port of another brick, and then that
volume's brick goes silent.

I killed all the brick processes and restarted glusterd and everything came
up ok.
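
For anyone hitting the same thing, the recovery was roughly the following
on the affected node (a sketch, not a polished procedure - note it briefly
takes every brick on that node offline):

pkill glusterfsd                         # stop all brick processes on this node
systemctl restart glusterd               # bricks get respawned on fresh ports
gluster volume status | grep -v ^Self    # re-check the advertised ports
netstat -pant | grep glusterfsd          # ...against what the bricks actually listen on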


[root@gluster-2 ~]# glv status digitalcorpora | grep -v ^Self
Status of volume: digitalcorpora
Gluster process TCP Port  RDMA Port  Online  Pid
--
Brick gluster-2:/export/brick7/digitalcorpo
ra  49156 0  Y
125708
Brick gluster1.vsnet.gmu.edu:/export/brick7
/digitalcorpora 49152 0  Y
12345
Brick gluster0:/export/brick7/digitalcorpor
a   49152 0  Y
16098

Task Status of Volume digitalcorpora
--
There are no active volume tasks

[root@gluster-2 ~]# glv status public  | grep -v ^Self
Status of volume: public
Gluster process TCP Port  RDMA Port  Online  Pid
--
Brick gluster1:/export/brick8/public49156 0  Y
3519
Brick gluster2:/export/brick8/public49156 0  Y
8578
Brick gluster0:/export/brick8/public49156 0  Y
3176

Task Status of Volume public
--
There are no active volume tasks

[root@gluster-2 ~]# netstat -pant | grep 8578 | grep 0.0.0.0
tcp0  0 0.0.0.0:49156   0.0.0.0:*
LISTEN  8578/glusterfsd
[root@gluster-2 ~]# netstat -pant | grep 125708 | grep 0.0.0.0
tcp0  0 0.0.0.0:49154   0.0.0.0:*
LISTEN  125708/glusterfsd
[root@gluster-2 ~]# ps -c  --pid  125708 8578
   PID CLS PRI TTY  STAT   TIME COMMAND
  8578 TS   19 ?Ssl  224:20 /usr/sbin/glusterfsd -s gluster2
--volfile-id public.gluster2.export-brick8-public -p
/var/lib/glusterd/vols/public/run/gluster2-export-bric
125708 TS   19 ?Ssl0:08 /usr/sbin/glusterfsd -s gluster-2
--volfile-id digitalcorpora.gluster-2.export-brick7-digitalcorpora -p
/var/lib/glusterd/vols/digitalcorpor
[root@gluster-2 ~]#


On 24 October 2017 at 13:56, Atin Mukherjee <amukh...@redhat.com> wrote:

>
>
> On Tue, Oct 24, 2017 at 11:13 PM, Alastair Neil <ajneil.t...@gmail.com>
> wrote:
>
>> gluster version 3.10.6, replica 3 volume, daemon is present but does not
>> appear to be functioning
>>
>> peculiar behaviour.  If I kill the glusterfs brick daemon and restart
>> glusterd then the brick becomes available - but one of my other volumes
>> bricks on the same server goes down in the same way it's like wack-a-mole.
>>
>> any ideas?
>>
>
> The subject and the data looks to be contradictory to me. Brick log (what
> you shared) doesn't have a cleanup_and_exit () trigger for a shutdown. Are
> you sure brick is down? OTOH, I see a mismatch of port for
> brick7/digitalcorpora where the brick process has 49154 but gluster volume
> status shows 49152. There is an issue with stale port which we're trying to
> address through https://review.gluster.org/18541 . But could you specify
> what exactly the problem is? Is it the stale port  or the conflict between
> volume status output and actual brick health? If it's the latter, I'd need
> further information like output of "gluster get-state" command from the
> same node.
>
>
>>
>> [root@gluster-2 bricks]# glv status digitalcorpora
>>
>>> Status of volume: digitalcorpora
>>> Gluster process TCP Port  RDMA Port
>>> Online  Pid
>>> 
>>> --
>>> Brick gluster-2:/export/brick7/digitalcorpo
>>> ra  49156 0
>>> Y   125708
>>> Brick gluster1.vsnet.gmu.edu:/export/brick7
>>> /digitalcorpora 49152 0
>>> Y   12345
>>> Brick gluster0:/export/brick7/digitalcorpor
>>> a   49152 0
>>> Y   16098
>>> S

[Gluster-users] brick is down but gluster volume status says it's fine

2017-10-24 Thread Alastair Neil
gluster version 3.10.6, replica 3 volume, daemon is present but does not
appear to be functioning

Peculiar behaviour: if I kill the glusterfs brick daemon and restart
glusterd then the brick becomes available - but one of my other volumes'
bricks on the same server goes down in the same way. It's like whack-a-mole.

any ideas?


[root@gluster-2 bricks]# glv status digitalcorpora

> Status of volume: digitalcorpora
> Gluster process TCP Port  RDMA Port  Online
> Pid
>
> --
> Brick gluster-2:/export/brick7/digitalcorpo
> ra  49156 0  Y
> 125708
> Brick gluster1.vsnet.gmu.edu:/export/brick7
> /digitalcorpora 49152 0  Y
> 12345
> Brick gluster0:/export/brick7/digitalcorpor
> a   49152 0  Y
> 16098
> Self-heal Daemon on localhost   N/A   N/AY
> 126625
> Self-heal Daemon on gluster1N/A   N/AY
> 15405
> Self-heal Daemon on gluster0N/A   N/AY
> 18584
>
> Task Status of Volume digitalcorpora
>
> --
> There are no active volume tasks
>
> [root@gluster-2 bricks]# glv heal digitalcorpora info
> Brick gluster-2:/export/brick7/digitalcorpora
> Status: Transport endpoint is not connected
> Number of entries: -
>
> Brick gluster1.vsnet.gmu.edu:/export/brick7/digitalcorpora
> /.trashcan
> /DigitalCorpora/hello2.txt
> /DigitalCorpora
> Status: Connected
> Number of entries: 3
>
> Brick gluster0:/export/brick7/digitalcorpora
> /.trashcan
> /DigitalCorpora/hello2.txt
> /DigitalCorpora
> Status: Connected
> Number of entries: 3
>
> [2017-10-24 17:18:48.288505] W [glusterfsd.c:1360:cleanup_and_exit]
> (-->/lib64/libpthread.so.0(+0x7e25) [0x7f6f83c9de25]
> -->/usr/sbin/glusterfsd(glusterfs_sigwaiter+0xe5) [0x55a148eeb135]
> -->/usr/sbin/glusterfsd(cleanup_and_exit+0x6b) [0x55a148eeaf5b] ) 0-:
> received signum (15), shutting down
> [2017-10-24 17:18:59.270384] I [MSGID: 100030] [glusterfsd.c:2503:main]
> 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.10.6
> (args: /usr/sbin/glusterfsd -s gluster-2 --volfile-id
> digitalcorpora.gluster-2.export-brick7-digitalcorpora -p
> /var/lib/glusterd/vols/digitalcorpora/run/gluster-2-export-brick7-digitalcorpora.pid
> -S /var/run/gluster/f8e0b3393e47dc51a07c6609f9b40841.socket --brick-name
> /export/brick7/digitalcorpora -l
> /var/log/glusterfs/bricks/export-brick7-digitalcorpora.log --xlator-option
> *-posix.glusterd-uuid=032c17f5-8cc9-445f-aa45-897b5a066b43 --brick-port
> 49154 --xlator-option digitalcorpora-server.listen-port=49154)
> [2017-10-24 17:18:59.285279] I [MSGID: 101190]
> [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 1
> [2017-10-24 17:19:04.611723] I
> [rpcsvc.c:2237:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured
> rpc.outstanding-rpc-limit with value 64
> [2017-10-24 17:19:04.611815] W [MSGID: 101002]
> [options.c:954:xl_opt_validate] 0-digitalcorpora-server: option
> 'listen-port' is deprecated, preferred is 'transport.socket.listen-port',
> continuing with correction
> [2017-10-24 17:19:04.615974] W [MSGID: 101174]
> [graph.c:361:_log_if_unknown_option] 0-digitalcorpora-server: option
> 'rpc-auth.auth-glusterfs' is not recognized
> [2017-10-24 17:19:04.616033] W [MSGID: 101174]
> [graph.c:361:_log_if_unknown_option] 0-digitalcorpora-server: option
> 'rpc-auth.auth-unix' is not recognized
> [2017-10-24 17:19:04.616070] W [MSGID: 101174]
> [graph.c:361:_log_if_unknown_option] 0-digitalcorpora-server: option
> 'rpc-auth.auth-null' is not recognized
> [2017-10-24 17:19:04.616134] W [MSGID: 101174]
> [graph.c:361:_log_if_unknown_option] 0-digitalcorpora-server: option
> 'auth-path' is not recognized
> [2017-10-24 17:19:04.616177] W [MSGID: 101174]
> [graph.c:361:_log_if_unknown_option] 0-digitalcorpora-server: option
> 'ping-timeout' is not recognized
> [2017-10-24 17:19:04.616203] W [MSGID: 101174]
> [graph.c:361:_log_if_unknown_option] 0-/export/brick7/digitalcorpora:
> option 'rpc-auth-allow-insecure' is not recognized
> [2017-10-24 17:19:04.616215] W [MSGID: 101174]
> [graph.c:361:_log_if_unknown_option] 0-/export/brick7/digitalcorpora:
> option 'auth.addr./export/brick7/digitalcorpora.allow' is not recognized
> [2017-10-24 17:19:04.616226] W [MSGID: 101174]
> [graph.c:361:_log_if_unknown_option] 0-/export/brick7/digitalcorpora:
> option 'auth-path' is not recognized
> [2017-10-24 17:19:04.616237] W [MSGID: 101174]
> [graph.c:361:_log_if_unknown_option] 0-/export/brick7/digitalcorpora:
> option 'auth.login.b17f2513-7d9c-4174-a0c5-de4a752d46ca.password' is not
> recognized
> [2017-10-24 17:19:04.616248] W [MSGID: 101174]
> [graph.c:361:_log_if_unknown_option] 0-/export/brick7/digitalcorpora:
> option 

Re: [Gluster-users] [Gluster-devel] Announcing Glusterfs release 3.12.2 (Long Term Maintenance)

2017-10-23 Thread Alastair Neil
Any idea when these packages will be in the CentOS mirrors? There is no
sign of them on download.gluster.org.

On 13 October 2017 at 08:45, Jiffin Tony Thottan 
wrote:

> The Gluster community is pleased to announce the release of Gluster 3.12.2
> (packages available at [1,2,3]).
>
> Release notes for the release can be found at [4].
>
> We still carry following major issues that is reported in the
> release-notes as follows,
>
> 1.) - Expanding a gluster volume that is sharded may cause file corruption
>
> Sharded volumes are typically used for VM images, if such volumes are
> expanded or possibly contracted (i.e add/remove bricks and rebalance) there
> are reports of VM images getting corrupted.
>
> The last known cause for corruption (Bug #1465123) has a fix with this
> release. As further testing is still in progress, the issue is retained as
> a major issue.
>
> Status of this bug can be tracked here, #1465123
>
>
> 2 .) Gluster volume restarts fail if the sub directory export feature is
> in use. Status of this issue can be tracked here, #1501315
>
> 3.) Mounting a gluster snapshot will fail, when attempting a FUSE based
> mount of the snapshot. So for the current users, it is recommend to only
> access snapshot via
>
> ".snaps" directory on a mounted gluster volume. Status of this issue can
> be tracked here, #1501378
>
> Thanks,
>  Gluster community
>
>
> [1] https://download.gluster.org/pub/gluster/glusterfs/3.12/3.12.2/
> 
> [2] https://launchpad.net/~gluster/+archive/ubuntu/glusterfs-3.12
> 
> [3] https://build.opensuse.org/project/subprojects/home:glusterfs
>
> [4] Release notes: https://gluster.readthedocs.
> io/en/latest/release-notes/3.12.2/
> 
>
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] warning spam in the logs after tiering experiment

2017-10-18 Thread Alastair Neil
forgot to mention Gluster version 3.10.6

On 18 October 2017 at 13:26, Alastair Neil <ajneil.t...@gmail.com> wrote:

> a short while ago I experimented with tiering on one of my volumes.  I
> decided it was not working out, so I removed the tier.  I now have spam in
> glusterd.log every 7 seconds:
>
> [2017-10-18 17:17:29.578327] W [socket.c:3207:socket_connect] 0-tierd:
> Ignore failed connection attempt on /var/run/gluster/
> 2e3df1c501d0a19e5076304179d1e43e.socket, (No such file or directory)
> [2017-10-18 17:17:36.579276] W [socket.c:3207:socket_connect] 0-tierd:
> Ignore failed connection attempt on /var/run/gluster/
> 2e3df1c501d0a19e5076304179d1e43e.socket, (No such file or directory)
> [2017-10-18 17:17:43.580238] W [socket.c:3207:socket_connect] 0-tierd:
> Ignore failed connection attempt on /var/run/gluster/
> 2e3df1c501d0a19e5076304179d1e43e.socket, (No such file or directory)
> [2017-10-18 17:17:50.581185] W [socket.c:3207:socket_connect] 0-tierd:
> Ignore failed connection attempt on /var/run/gluster/
> 2e3df1c501d0a19e5076304179d1e43e.socket, (No such file or directory)
> [2017-10-18 17:17:57.582136] W [socket.c:3207:socket_connect] 0-tierd:
> Ignore failed connection attempt on /var/run/gluster/
> 2e3df1c501d0a19e5076304179d1e43e.socket, (No such file or directory)
> [2017-10-18 17:18:04.583148] W [socket.c:3207:socket_connect] 0-tierd:
> Ignore failed connection attempt on /var/run/gluster/
> 2e3df1c501d0a19e5076304179d1e43e.socket, (No such file or directory)
>
>
> gluster volume status is showing the tier daemon status on all the nodes
> as 'N', but lists the PIDs of nonexistent processes.
>
> Just wondering if I messed up removing the tier?
>
> -Alastair
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] warning spam in the logs after tiering experiment

2017-10-18 Thread Alastair Neil
A short while ago I experimented with tiering on one of my volumes.  I
decided it was not working out, so I removed the tier.  I now have spam in
glusterd.log every 7 seconds:

[2017-10-18 17:17:29.578327] W [socket.c:3207:socket_connect] 0-tierd:
Ignore failed connection attempt on
/var/run/gluster/2e3df1c501d0a19e5076304179d1e43e.socket, (No such file or
directory)
[2017-10-18 17:17:36.579276] W [socket.c:3207:socket_connect] 0-tierd:
Ignore failed connection attempt on
/var/run/gluster/2e3df1c501d0a19e5076304179d1e43e.socket, (No such file or
directory)
[2017-10-18 17:17:43.580238] W [socket.c:3207:socket_connect] 0-tierd:
Ignore failed connection attempt on
/var/run/gluster/2e3df1c501d0a19e5076304179d1e43e.socket, (No such file or
directory)
[2017-10-18 17:17:50.581185] W [socket.c:3207:socket_connect] 0-tierd:
Ignore failed connection attempt on
/var/run/gluster/2e3df1c501d0a19e5076304179d1e43e.socket, (No such file or
directory)
[2017-10-18 17:17:57.582136] W [socket.c:3207:socket_connect] 0-tierd:
Ignore failed connection attempt on
/var/run/gluster/2e3df1c501d0a19e5076304179d1e43e.socket, (No such file or
directory)
[2017-10-18 17:18:04.583148] W [socket.c:3207:socket_connect] 0-tierd:
Ignore failed connection attempt on
/var/run/gluster/2e3df1c501d0a19e5076304179d1e43e.socket, (No such file or
directory)


gluster volume status is showing the tier daemon status on all the nodes as
'N', but lists the PIDs of nonexistent processes.
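
For context, the tier was removed with the usual detach sequence, along
these lines (volume name is a placeholder):

gluster volume tier <volname> detach start
gluster volume tier <volname> detach status    # waited for it to finish
gluster volume tier <volname> detach commit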

Just wondering if I messed up removing the tier?

-Alastair
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gluster volume + lvm : recommendation or neccessity ?

2017-10-11 Thread Alastair Neil
LVM is also good if you want to add an SSD cache.  It is more flexible and
easier to manage and expand than bcache.
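
As a rough sketch, attaching an SSD as a cache to an existing brick LV with
lvmcache looks something like this (device, VG and LV names are
placeholders):

vgextend gluster_vg /dev/nvme0n1                 # add the SSD to the brick VG
lvcreate -L 200G -n brick1_cache gluster_vg /dev/nvme0n1
lvcreate -L 1G   -n brick1_cache_meta gluster_vg /dev/nvme0n1
lvconvert --type cache-pool --poolmetadata gluster_vg/brick1_cache_meta \
    gluster_vg/brick1_cache
lvconvert --type cache --cachepool gluster_vg/brick1_cache gluster_vg/brick1_lv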

On 11 October 2017 at 04:00, Mohammed Rafi K C  wrote:

>
> Volumes are aggregation of bricks, so I would consider bricks as a
> unique entity here rather than volumes. Taking the constraints from the
> blog [1].
>
> * All bricks should be carved out from an independent thinly provisioned
> logical volume (LV). In other words, no two brick should share a common
> LV. More details about thin provisioning and thin provisioned snapshot
> can be found here.
> * This thinly provisioned LV should only be used for forming a brick.
> * Thin pool from which the thin LVs are created should have sufficient
> space and also it should have sufficient space for pool metadata.
>
> You can refer the blog post here [1].
>
> [1] : http://rajesh-joseph.blogspot.in/p/gluster-volume-snapshot-
> howto.html
>
> Regards
> Rafi KC
>
>
> On 10/11/2017 01:23 PM, ML wrote:
> > Thanks Rafi, that's understood now :)
> >
> > I'm considering to deploy gluster on a 4 x 40 TB  bricks, do you think
> > it would better to make 1 LVM partition for each Volume I need or to
> > make one Big LVM partition and start multiple volumes on it ?
> >
> > We'll store mostly big files (videos) on this environement.
> >
> >
> >
> >
> >> On 11/10/2017 at 09:34, Mohammed Rafi K C wrote:
> >>
> >> On 10/11/2017 12:20 PM, ML wrote:
> >>> Hi everyone,
> >>>
> >>> I've read on the gluster & redhat documentation, that it seems
> >>> recommended to use XFS over LVM before creating & using gluster
> >>> volumes.
> >>>
> >>> Sources :
> >>> https://access.redhat.com/documentation/en-US/Red_Hat_
> Storage/3/html/Administration_Guide/Formatting_and_Mounting_Bricks.html
> >>>
> >>>
> >>> http://gluster.readthedocs.io/en/latest/Administrator%
> 20Guide/Setting%20Up%20Volumes/
> >>>
> >>>
> >>>
> >>> My point is : do we really need LVM ?
> >> This recommendations was added after gluster-snapshot. Gluster snapshot
> >> relays on LVM snapshot. So if you start with out lvm, in future if you
> >> want to use snapshot then it would be difficult, hence the
> >> recommendation to use xfs on top of lvm.
> >>
> >>
> >> Regards
> >> Rafi KC
> >>
> >>> For example , on a dedicated server with disks & partitions that will
> >>> not change of size, it doesn't seems necessary to use LVM.
> >>>
> >>> I can't understand clearly wich partitioning strategy would be the
> >>> best for "static size" hard drives :
> >>>
> >>> 1 LVM+XFS partition = multiple gluster volumes
> >>> or 1 LVM+XFS partition = 1 gluster volume per LVM+XFS partition
> >>> or 1 XFS partition = multiple gluster volumes
> >>> or 1 XFS partition = 1 gluster volume per XFS partition
> >>>
> >>> What do you use on your servers ?
> >>>
> >>> Thanks for your help! :)
> >>>
> >>> Quentin
> >>>
> >>>
> >>> ___
> >>> Gluster-users mailing list
> >>> Gluster-users@gluster.org
> >>> http://lists.gluster.org/mailman/listinfo/gluster-users
> >
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] small files performance

2017-10-10 Thread Alastair Neil
I just tried setting:

performance.parallel-readdir on
features.cache-invalidation on
features.cache-invalidation-timeout 600
performance.stat-prefetch
performance.cache-invalidation
performance.md-cache-timeout 600
network.inode-lru-limit 5
performance.cache-invalidation on

and clients could not see their files with ls when accessing via a fuse
mount.  The files and directories were there, however, if you accessed them
directly. Servers are 3.10.5 and the clients are 3.10 and 3.12.

Any ideas?


On 10 October 2017 at 10:53, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> 2017-10-10 8:25 GMT+02:00 Karan Sandha :
>
>> Hi Gandalf,
>>
>> We have multiple tuning to do for small-files which decrease the time for
>> negative lookups , meta-data caching, parallel readdir. Bumping the server
>> and client event threads will help you out in increasing the small file
>> performance.
>>
>> gluster v set   group metadata-cache
>> gluster v set  group nl-cache
>> gluster v set  performance.parallel-readdir on (Note : readdir
>> should be on)
>>
>
> This is what i'm getting with suggested parameters.
> I'm running "fio" from a mounted gluster client:
> 172.16.0.12:/gv0 on /mnt2 type fuse.glusterfs
> (rw,relatime,user_id=0,group_id=0,default_permissions,
> allow_other,max_read=131072)
>
>
>
> # fio --ioengine=libaio --filename=fio.test --size=256M
> --direct=1 --rw=randrw --refill_buffers --norandommap
> --bs=8k --rwmixread=70 --iodepth=16 --numjobs=16
> --runtime=60 --group_reporting --name=fio-test
> fio-test: (g=0): rw=randrw, bs=8K-8K/8K-8K/8K-8K, ioengine=libaio,
> iodepth=16
> ...
> fio-2.16
> Starting 16 processes
> fio-test: Laying out IO file(s) (1 file(s) / 256MB)
> Jobs: 14 (f=13): [m(5),_(1),m(8),f(1),_(1)] [33.9% done] [1000KB/440KB/0KB
> /s] [125/55/0 iops] [eta 01m:59s]
> fio-test: (groupid=0, jobs=16): err= 0: pid=2051: Tue Oct 10 16:51:46 2017
>   read : io=43392KB, bw=733103B/s, iops=89, runt= 60610msec
> slat (usec): min=14, max=1992.5K, avg=177873.67, stdev=382294.06
> clat (usec): min=768, max=6016.8K, avg=1871390.57, stdev=1082220.06
>  lat (usec): min=872, max=6630.6K, avg=2049264.23, stdev=1158405.41
> clat percentiles (msec):
>  |  1.00th=[   20],  5.00th=[  208], 10.00th=[  457], 20.00th=[  873],
>  | 30.00th=[ 1237], 40.00th=[ 1516], 50.00th=[ 1795], 60.00th=[ 2073],
>  | 70.00th=[ 2442], 80.00th=[ 2835], 90.00th=[ 3326], 95.00th=[ 3785],
>  | 99.00th=[ 4555], 99.50th=[ 4948], 99.90th=[ 5211], 99.95th=[ 5800],
>  | 99.99th=[ 5997]
>   write: io=18856KB, bw=318570B/s, iops=38, runt= 60610msec
> slat (usec): min=17, max=3428, avg=212.62, stdev=287.88
> clat (usec): min=59, max=6015.6K, avg=1693729.12, stdev=1003122.83
>  lat (usec): min=79, max=6015.9K, avg=1693941.74, stdev=1003126.51
> clat percentiles (usec):
>  |  1.00th=[  724],  5.00th=[144384], 10.00th=[403456],
> 20.00th=[765952],
>  | 30.00th=[1105920], 40.00th=[1368064], 50.00th=[1630208],
> 60.00th=[1875968],
>  | 70.00th=[2179072], 80.00th=[2572288], 90.00th=[3031040],
> 95.00th=[3489792],
>  | 99.00th=[4227072], 99.50th=[4423680], 99.90th=[4751360],
> 99.95th=[5210112],
>  | 99.99th=[5996544]
> lat (usec) : 100=0.15%, 250=0.05%, 500=0.06%, 750=0.09%, 1000=0.05%
> lat (msec) : 2=0.28%, 4=0.09%, 10=0.15%, 20=0.39%, 50=1.81%
> lat (msec) : 100=1.02%, 250=1.63%, 500=5.59%, 750=6.03%, 1000=7.31%
> lat (msec) : 2000=35.61%, >=2000=39.67%
>   cpu  : usr=0.01%, sys=0.01%, ctx=8218, majf=11, minf=295
>   IO depths: 1=0.2%, 2=0.4%, 4=0.8%, 8=1.6%, 16=96.9%, 32=0.0%,
> >=64=0.0%
>  submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>  complete  : 0=0.0%, 4=99.8%, 8=0.0%, 16=0.2%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>  issued: total=r=5424/w=2357/d=0, short=r=0/w=0/d=0,
> drop=r=0/w=0/d=0
>  latency   : target=0, window=0, percentile=100.00%, depth=16
>
> Run status group 0 (all jobs):
>READ: io=43392KB, aggrb=715KB/s, minb=715KB/s, maxb=715KB/s,
> mint=60610msec, maxt=60610msec
>   WRITE: io=18856KB, aggrb=311KB/s, minb=311KB/s, maxb=311KB/s,
> mint=60610msec, maxt=60610msec
>
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Access from multiple hosts where users have different uid/gid

2017-10-06 Thread Alastair Neil
NFSv4 ID mapping is based on username, not UID, so you could use
NFS-Ganesha to share the files.
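
A bare-bones NFS-Ganesha export block for a gluster volume looks roughly
like this (volume name, export id and paths are placeholders); the
username-based mapping itself relies on NFSv4 idmapping, so the clients and
the server need a matching idmapd domain:

EXPORT {
    Export_Id = 10;
    Path = "/myvol";
    Pseudo = "/myvol";
    Access_Type = RW;
    Squash = No_root_squash;
    SecType = "sys";
    FSAL {
        Name = GLUSTER;
        Hostname = localhost;
        Volume = "myvol";
    }
}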

On 5 October 2017 at 04:42, Frizz  wrote:

> I have a setup with multiple hosts, each of them are administered
> separately. So there are no unified uid/gid for the users.
>
> When mounting a GlusterFS volume, a file owned by user1 on host1 might
> become owned by user2 on host2.
>
> I was looking into POSIX ACL or bindfs, but that won't help me much.
>
> What did other people do with this kind of problem?
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-07 Thread Alastair Neil
True, but working your way into that problem with replica 3 is a lot harder
than with just replica 2 + arbiter.

On 7 September 2017 at 14:06, Pavel Szalbot <pavel.szal...@gmail.com> wrote:

> Hi Neil, docs mention two live nodes of replica 3 blaming each other and
> refusing to do IO.
>
> https://gluster.readthedocs.io/en/latest/Administrator%
> 20Guide/Split%20brain%20and%20ways%20to%20deal%20with%
> 20it/#1-replica-3-volume
>
>
>
> On Sep 7, 2017 17:52, "Alastair Neil" <ajneil.t...@gmail.com> wrote:
>
>> *shrug* I don't use arbiter for vm work loads just straight replica 3.
>> There are some gotchas with using an arbiter for VM workloads.  If
>> quorum-type is auto and a brick that is not the arbiter drop out then if
>> the up brick is dirty as far as the arbiter is concerned i.e. the only good
>> copy is on the down brick you will get ENOTCONN and your VMs will halt on
>> IO.
>>
>> On 6 September 2017 at 16:06, <lemonni...@ulrar.net> wrote:
>>
>>> Mh, I never had to do that and I never had that problem. Is that an
>>> arbiter specific thing ? With replica 3 it just works.
>>>
>>> On Wed, Sep 06, 2017 at 03:59:14PM -0400, Alastair Neil wrote:
>>> > you need to set
>>> >
>>> > cluster.server-quorum-ratio 51%
>>> >
>>> > On 6 September 2017 at 10:12, Pavel Szalbot <pavel.szal...@gmail.com>
>>> wrote:
>>> >
>>> > > Hi all,
>>> > >
>>> > > I have promised to do some testing and I finally find some time and
>>> > > infrastructure.
>>> > >
>>> > > So I have 3 servers with Gluster 3.10.5 on CentOS 7. I created
>>> > > replicated volume with arbiter (2+1) and VM on KVM (via Openstack)
>>> > > with disk accessible through gfapi. Volume group is set to virt
>>> > > (gluster volume set gv_openstack_1 virt). VM runs current (all
>>> > > packages updated) Ubuntu Xenial.
>>> > >
>>> > > I set up following fio job:
>>> > >
>>> > > [job1]
>>> > > ioengine=libaio
>>> > > size=1g
>>> > > loops=16
>>> > > bs=512k
>>> > > direct=1
>>> > > filename=/tmp/fio.data2
>>> > >
>>> > > When I run fio fio.job and reboot one of the data nodes, IO
>>> statistics
>>> > > reported by fio drop to 0KB/0KB and 0 IOPS. After a while, root
>>> > > filesystem gets remounted as read-only.
>>> > >
>>> > > If you care about infrastructure, setup details etc., do not
>>> hesitate to
>>> > > ask.
>>> > >
>>> > > Gluster info on volume:
>>> > >
>>> > > Volume Name: gv_openstack_1
>>> > > Type: Replicate
>>> > > Volume ID: 2425ae63-3765-4b5e-915b-e132e0d3fff1
>>> > > Status: Started
>>> > > Snapshot Count: 0
>>> > > Number of Bricks: 1 x (2 + 1) = 3
>>> > > Transport-type: tcp
>>> > > Bricks:
>>> > > Brick1: gfs-2.san:/export/gfs/gv_1
>>> > > Brick2: gfs-3.san:/export/gfs/gv_1
>>> > > Brick3: docker3.san:/export/gfs/gv_1 (arbiter)
>>> > > Options Reconfigured:
>>> > > nfs.disable: on
>>> > > transport.address-family: inet
>>> > > performance.quick-read: off
>>> > > performance.read-ahead: off
>>> > > performance.io-cache: off
>>> > > performance.stat-prefetch: off
>>> > > performance.low-prio-threads: 32
>>> > > network.remote-dio: enable
>>> > > cluster.eager-lock: enable
>>> > > cluster.quorum-type: auto
>>> > > cluster.server-quorum-type: server
>>> > > cluster.data-self-heal-algorithm: full
>>> > > cluster.locking-scheme: granular
>>> > > cluster.shd-max-threads: 8
>>> > > cluster.shd-wait-qlength: 1
>>> > > features.shard: on
>>> > > user.cifs: off
>>> > >
>>> > > Partial KVM XML dump:
>>> > >
>>> > > 
>>> > >   
>>> > >   >> > > name='gv_openstack_1/volume-77ebfd13-6a92-4f38-b036-e9e55d752e1e'>
>>> > > 
>>> > >   
>>> > >   
>>> > >   
>>> > >   77ebfd13-6a92-4f38-b036-e9

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-07 Thread Alastair Neil
*shrug* I don't use an arbiter for VM workloads, just straight replica 3.
There are some gotchas with using an arbiter for VM workloads: if
quorum-type is auto and a brick that is not the arbiter drops out, then, if
the up brick is dirty as far as the arbiter is concerned (i.e. the only good
copy is on the down brick), you will get ENOTCONN and your VMs will halt on
IO.
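
If you want to see how a volume is configured in that respect, something
like:

gluster volume get <volname> cluster.quorum-type
gluster volume get <volname> cluster.quorum-count
gluster volume get <volname> cluster.server-quorum-type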

On 6 September 2017 at 16:06, <lemonni...@ulrar.net> wrote:

> Mh, I never had to do that and I never had that problem. Is that an
> arbiter specific thing ? With replica 3 it just works.
>
> On Wed, Sep 06, 2017 at 03:59:14PM -0400, Alastair Neil wrote:
> > you need to set
> >
> > cluster.server-quorum-ratio 51%
> >
> > On 6 September 2017 at 10:12, Pavel Szalbot <pavel.szal...@gmail.com>
> wrote:
> >
> > > Hi all,
> > >
> > > I have promised to do some testing and I finally find some time and
> > > infrastructure.
> > >
> > > So I have 3 servers with Gluster 3.10.5 on CentOS 7. I created
> > > replicated volume with arbiter (2+1) and VM on KVM (via Openstack)
> > > with disk accessible through gfapi. Volume group is set to virt
> > > (gluster volume set gv_openstack_1 virt). VM runs current (all
> > > packages updated) Ubuntu Xenial.
> > >
> > > I set up following fio job:
> > >
> > > [job1]
> > > ioengine=libaio
> > > size=1g
> > > loops=16
> > > bs=512k
> > > direct=1
> > > filename=/tmp/fio.data2
> > >
> > > When I run fio fio.job and reboot one of the data nodes, IO statistics
> > > reported by fio drop to 0KB/0KB and 0 IOPS. After a while, root
> > > filesystem gets remounted as read-only.
> > >
> > > If you care about infrastructure, setup details etc., do not hesitate
> to
> > > ask.
> > >
> > > Gluster info on volume:
> > >
> > > Volume Name: gv_openstack_1
> > > Type: Replicate
> > > Volume ID: 2425ae63-3765-4b5e-915b-e132e0d3fff1
> > > Status: Started
> > > Snapshot Count: 0
> > > Number of Bricks: 1 x (2 + 1) = 3
> > > Transport-type: tcp
> > > Bricks:
> > > Brick1: gfs-2.san:/export/gfs/gv_1
> > > Brick2: gfs-3.san:/export/gfs/gv_1
> > > Brick3: docker3.san:/export/gfs/gv_1 (arbiter)
> > > Options Reconfigured:
> > > nfs.disable: on
> > > transport.address-family: inet
> > > performance.quick-read: off
> > > performance.read-ahead: off
> > > performance.io-cache: off
> > > performance.stat-prefetch: off
> > > performance.low-prio-threads: 32
> > > network.remote-dio: enable
> > > cluster.eager-lock: enable
> > > cluster.quorum-type: auto
> > > cluster.server-quorum-type: server
> > > cluster.data-self-heal-algorithm: full
> > > cluster.locking-scheme: granular
> > > cluster.shd-max-threads: 8
> > > cluster.shd-wait-qlength: 1
> > > features.shard: on
> > > user.cifs: off
> > >
> > > Partial KVM XML dump:
> > >
> > > 
> > >   
> > >> > name='gv_openstack_1/volume-77ebfd13-6a92-4f38-b036-e9e55d752e1e'>
> > > 
> > >   
> > >   
> > >   
> > >   77ebfd13-6a92-4f38-b036-e9e55d752e1e
> > >   
> > >> > function='0x0'/>
> > > 
> > >
> > > Networking is LACP on data nodes, stack of Juniper EX4550's (10Gbps
> > > SFP+), separate VLAN for Gluster traffic, SSD only on Gluster all
> > > nodes (including arbiter).
> > >
> > > I would really love to know what am I doing wrong, because this is my
> > > experience with Gluster for a long time a and a reason I would not
> > > recommend it as VM storage backend in production environment where you
> > > cannot start/stop VMs on your own (e.g. providing private clouds for
> > > customers).
> > > -ps
> > >
> > >
> > > On Sun, Sep 3, 2017 at 10:21 PM, Gionatan Danti <g.da...@assyoma.it>
> > > wrote:
> > > > Il 30-08-2017 17:07 Ivan Rossi ha scritto:
> > > >>
> > > >> There has ben a bug associated to sharding that led to VM corruption
> > > >> that has been around for a long time (difficult to reproduce I
> > > >> understood). I have not seen reports on that for some time after the
> > > >> last fix, so hopefully now VM hosting is stable.
> > > >
> > > >
> &

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-06 Thread Alastair Neil
you need to set

cluster.server-quorum-ratio 51%
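
i.e. something along the lines of (it is a cluster-wide option, so it is
set on "all"):

gluster volume set all cluster.server-quorum-ratio 51%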

On 6 September 2017 at 10:12, Pavel Szalbot  wrote:

> Hi all,
>
> I have promised to do some testing and I finally find some time and
> infrastructure.
>
> So I have 3 servers with Gluster 3.10.5 on CentOS 7. I created
> replicated volume with arbiter (2+1) and VM on KVM (via Openstack)
> with disk accessible through gfapi. Volume group is set to virt
> (gluster volume set gv_openstack_1 virt). VM runs current (all
> packages updated) Ubuntu Xenial.
>
> I set up following fio job:
>
> [job1]
> ioengine=libaio
> size=1g
> loops=16
> bs=512k
> direct=1
> filename=/tmp/fio.data2
>
> When I run fio fio.job and reboot one of the data nodes, IO statistics
> reported by fio drop to 0KB/0KB and 0 IOPS. After a while, root
> filesystem gets remounted as read-only.
>
> If you care about infrastructure, setup details etc., do not hesitate to
> ask.
>
> Gluster info on volume:
>
> Volume Name: gv_openstack_1
> Type: Replicate
> Volume ID: 2425ae63-3765-4b5e-915b-e132e0d3fff1
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: gfs-2.san:/export/gfs/gv_1
> Brick2: gfs-3.san:/export/gfs/gv_1
> Brick3: docker3.san:/export/gfs/gv_1 (arbiter)
> Options Reconfigured:
> nfs.disable: on
> transport.address-family: inet
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> performance.low-prio-threads: 32
> network.remote-dio: enable
> cluster.eager-lock: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-max-threads: 8
> cluster.shd-wait-qlength: 1
> features.shard: on
> user.cifs: off
>
> Partial KVM XML dump:
>
> 
>   
>name='gv_openstack_1/volume-77ebfd13-6a92-4f38-b036-e9e55d752e1e'>
> 
>   
>   
>   
>   77ebfd13-6a92-4f38-b036-e9e55d752e1e
>   
>function='0x0'/>
> 
>
> Networking is LACP on data nodes, stack of Juniper EX4550's (10Gbps
> SFP+), separate VLAN for Gluster traffic, SSD only on Gluster all
> nodes (including arbiter).
>
> I would really love to know what am I doing wrong, because this is my
> experience with Gluster for a long time a and a reason I would not
> recommend it as VM storage backend in production environment where you
> cannot start/stop VMs on your own (e.g. providing private clouds for
> customers).
> -ps
>
>
> On Sun, Sep 3, 2017 at 10:21 PM, Gionatan Danti 
> wrote:
> > Il 30-08-2017 17:07 Ivan Rossi ha scritto:
> >>
> >> There has ben a bug associated to sharding that led to VM corruption
> >> that has been around for a long time (difficult to reproduce I
> >> understood). I have not seen reports on that for some time after the
> >> last fix, so hopefully now VM hosting is stable.
> >
> >
> > Mmmm... this is precisely the kind of bug that scares me... data
> corruption
> > :|
> > Any more information on what causes it and how to resolve? Even if in
> newer
> > Gluster releases it is a solved bug, knowledge on how to treat it would
> be
> > valuable.
> >
> >
> > Thanks.
> >
> > --
> > Danti Gionatan
> > Supporto Tecnico
> > Assyoma S.r.l. - www.assyoma.it
> > email: g.da...@assyoma.it - i...@assyoma.it
> > GPG public key ID: FF5F32A8
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-users
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] RECOMMENDED CONFIGURATIONS - DISPERSED VOLUME

2017-07-31 Thread Alastair Neil
Dmitri, the recommendation from Red Hat is likely because it is recommended
to have the number of data stripes be a power of two; otherwise there is a
performance penalty.
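
For example, an 8+3 layout would be created with something like (server
names and brick paths are placeholders):

gluster volume create bigvol disperse-data 8 redundancy 3 \
    server{1..11}:/export/brick1/bigvol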

On 31 July 2017 at 14:28, Dmitri Chebotarov <4dim...@gmail.com> wrote:

> Hi
>
> I'm looking for an advise to configure a dispersed volume.
> I have 12 servers and would like to use 10:2 ratio.
>
> Yet RH recommends 8:3 or 8:4 in this case:
>
> https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/
> Administration_Guide/chap-Recommended-Configuration_Dispersed.html
>
> My goal is to create 2PT volume, and going with 10:2 vs 8:3/4 saves a few
> bricks. With 10:2 I'll use 312 8TB bricks and with 8:3 it's 396 8TB bricks
> (36 8:3 slices to evenly distribute between all servers/bricks)
>
> As I see it 8:3/4 vs 10:2 gives more data redundancy (3 servers vs 2
> servers can be offline), but is critical with 12 nodes? Nodes are new and
> under warranty, it's unlikely I will lose 3 servers  at the same time (10:2
> goes offline). Or should I follow RH recommended configuration and use
> 8:3/4?
>
> Thank you.
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Multi petabyte gluster

2017-06-30 Thread Alastair Neil
I can ask our other engineer but I don't have those figures.

-Alastair


On 30 June 2017 at 13:52, Serkan Çoban <cobanser...@gmail.com> wrote:

> Did you test healing by increasing disperse.shd-max-threads?
> What is your heal times per brick now?
>
> On Fri, Jun 30, 2017 at 8:01 PM, Alastair Neil <ajneil.t...@gmail.com>
> wrote:
> > We are using 3.10 and have a 7 PB cluster.  We decided against 16+3 as
> the
> > rebuild time are bottlenecked by matrix operations which scale as the
> square
> > of the number of data stripes.  There are some savings because of larger
> > data chunks but we ended up using 8+3 and heal times are about half
> compared
> > to 16+3.
> >
> > -Alastair
> >
> > On 30 June 2017 at 02:22, Serkan Çoban <cobanser...@gmail.com> wrote:
> >>
> >> >Thanks for the reply. We will mainly use this for archival - near-cold
> >> > storage.
> >> Archival usage is good for EC
> >>
> >> >Anything, from your experience, to keep in mind while planning large
> >> > installations?
> >> I am using 3.7.11 and only problem is slow rebuild time when a disk
> >> fails. It takes 8 days to heal a 8TB disk.(This might be related with
> >> my EC configuration 16+4)
> >> 3.9+ versions has some improvements about this but I cannot test them
> >> yet...
> >>
> >> On Thu, Jun 29, 2017 at 2:49 PM, jkiebzak <jkieb...@gmail.com> wrote:
> >> > Thanks for the reply. We will mainly use this for archival - near-cold
> >> > storage.
> >> >
> >> >
> >> > Anything, from your experience, to keep in mind while planning large
> >> > installations?
> >> >
> >> >
> >> > Sent from my Verizon, Samsung Galaxy smartphone
> >> >
> >> >  Original message 
> >> > From: Serkan Çoban <cobanser...@gmail.com>
> >> > Date: 6/29/17 4:39 AM (GMT-05:00)
> >> > To: Jason Kiebzak <jkieb...@gmail.com>
> >> > Cc: Gluster Users <gluster-users@gluster.org>
> >> > Subject: Re: [Gluster-users] Multi petabyte gluster
> >> >
> >> > I am currently using 10PB single volume without problems. 40PB is on
> >> > the way. EC is working fine.
> >> > You need to plan ahead with large installations like this. Do complete
> >> > workload tests and make sure your use case is suitable for EC.
> >> >
> >> >
> >> > On Wed, Jun 28, 2017 at 11:18 PM, Jason Kiebzak <jkieb...@gmail.com>
> >> > wrote:
> >> >> Has anyone scaled to a multi petabyte gluster setup? How well does
> >> >> erasure
> >> >> code do with such a large setup?
> >> >>
> >> >> Thanks
> >> >>
> >> >> ___
> >> >> Gluster-users mailing list
> >> >> Gluster-users@gluster.org
> >> >> http://lists.gluster.org/mailman/listinfo/gluster-users
> >> ___
> >> Gluster-users mailing list
> >> Gluster-users@gluster.org
> >> http://lists.gluster.org/mailman/listinfo/gluster-users
> >
> >
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Multi petabyte gluster

2017-06-30 Thread Alastair Neil
We are using 3.10 and have a 7 PB cluster.  We decided against 16+3 as the
rebuild times are bottlenecked by matrix operations, which scale as the
square of the number of data stripes.  There are some savings because of
larger data chunks, but we ended up using 8+3 and heal times are about half
compared to 16+3.

-Alastair

On 30 June 2017 at 02:22, Serkan Çoban  wrote:

> >Thanks for the reply. We will mainly use this for archival - near-cold
> storage.
> Archival usage is good for EC
>
> >Anything, from your experience, to keep in mind while planning large
> installations?
> I am using 3.7.11 and only problem is slow rebuild time when a disk
> fails. It takes 8 days to heal a 8TB disk.(This might be related with
> my EC configuration 16+4)
> 3.9+ versions has some improvements about this but I cannot test them
> yet...
>
> On Thu, Jun 29, 2017 at 2:49 PM, jkiebzak  wrote:
> > Thanks for the reply. We will mainly use this for archival - near-cold
> > storage.
> >
> >
> > Anything, from your experience, to keep in mind while planning large
> > installations?
> >
> >
> > Sent from my Verizon, Samsung Galaxy smartphone
> >
> >  Original message 
> > From: Serkan Çoban 
> > Date: 6/29/17 4:39 AM (GMT-05:00)
> > To: Jason Kiebzak 
> > Cc: Gluster Users 
> > Subject: Re: [Gluster-users] Multi petabyte gluster
> >
> > I am currently using 10PB single volume without problems. 40PB is on
> > the way. EC is working fine.
> > You need to plan ahead with large installations like this. Do complete
> > workload tests and make sure your use case is suitable for EC.
> >
> >
> > On Wed, Jun 28, 2017 at 11:18 PM, Jason Kiebzak 
> wrote:
> >> Has anyone scaled to a multi petabyte gluster setup? How well does
> erasure
> >> code do with such a large setup?
> >>
> >> Thanks
> >>
> >> ___
> >> Gluster-users mailing list
> >> Gluster-users@gluster.org
> >> http://lists.gluster.org/mailman/listinfo/gluster-users
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] issue with trash feature and arbiter volumes

2017-06-29 Thread Alastair Neil
Gluster 3.10.2

I have a replica 3 (2+1) volume and I have just seen both data bricks go
down (arbiter stayed up).  I had to disable trash feature to get the bricks
to start.  I had a quick look on bugzilla but did not see anything that
looked similar.  I just wanted to check that I was not hitting some known
issue and/or doing something stupid, before I open a bug. This is from the
brick log:

[2017-06-28 17:38:43.565378] E [posix.c:3327:_fill_writev_xdata]
> (-->/usr/lib64/glusterfs/3.10.2/xlator/features/trash.so(+0x2bd3)
> [0x7ff81964ebd3]
> -->/usr/lib64/glusterfs/3.10.2/xlator/storage/posix.so(+0x1e546)
> [0x7ff819e96546]
> -->/usr/lib64/glusterfs/3.10.2/xlator/storage/posix.so(+0x1e2ff)
> [0x7ff819e962ff]
> ) 0-homes-posix: fd: 0x7ff7b4121bf0 inode:
> 0x7ff7b41222b0gfid:---- [Invalid argument]
> pending frames:
> frame : type(0) op(24)
> patchset: git://git.gluster.org/glusterfs.git
> signal received: 11
> time of crash:
> 2017-06-28 17:38:49
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1

libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.10.2
> /lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7ff8274ed4d0]
> /lib64/libglusterfs.so.0(gf_print_trace+0x324)[0x7ff8274f6dd4]
> /lib64/libc.so.6(+0x35250)[0x7ff825bd1250]
> /lib64/libc.so.6(+0x163ea1)[0x7ff825cffea1]
>
> /usr/lib64/glusterfs/3.10.2/xlator/features/trash.so(+0x11c29)[0x7ff81965dc29]
>
> /usr/lib64/glusterfs/3.10.2/xlator/storage/posix.so(+0x7d5a)[0x7ff819e7fd5a]
>
> /usr/lib64/glusterfs/3.10.2/xlator/features/trash.so(+0x13676)[0x7ff81965f676]
>
> /usr/lib64/glusterfs/3.10.2/xlator/features/changetimerecorder.so(+0x810d)[0x7ff81943510d]
>
> /usr/lib64/glusterfs/3.10.2/xlator/features/changelog.so(+0xbf40)[0x7ff818d4ff40]
>
> /usr/lib64/glusterfs/3.10.2/xlator/features/bitrot-stub.so(+0xeafd)[0x7ff818924afd]
> /lib64/libglusterfs.so.0(default_ftruncate+0xc8)[0x7ff827568ec8]
>
> /usr/lib64/glusterfs/3.10.2/xlator/features/locks.so(+0x182a5)[0x7ff8184ea2a5]
>
> /usr/lib64/glusterfs/3.10.2/xlator/storage/posix.so(+0x7d5a)[0x7ff819e7fd5a]
> /lib64/libglusterfs.so.0(default_fstat+0xbe)[0x7ff82756848e]
> /lib64/libglusterfs.so.0(default_fstat+0xbe)[0x7ff82756848e]
> /lib64/libglusterfs.so.0(default_fstat+0xbe)[0x7ff82756848e]
>
> /usr/lib64/glusterfs/3.10.2/xlator/features/bitrot-stub.so(+0x9f4f)[0x7ff81891ff4f]
> /lib64/libglusterfs.so.0(default_fstat+0xbe)[0x7ff82756848e]
>
> /usr/lib64/glusterfs/3.10.2/xlator/features/locks.so(+0x7d8a)[0x7ff8184d9d8a]
>
> /usr/lib64/glusterfs/3.10.2/xlator/features/worm.so(+0x898e)[0x7ff8182cc98e]
>
> /usr/lib64/glusterfs/3.10.2/xlator/features/read-only.so(+0x2ca3)[0x7ff8180beca3]
>
> /usr/lib64/glusterfs/3.10.2/xlator/features/leases.so(+0xad5f)[0x7ff813df5d5f]
>
> /usr/lib64/glusterfs/3.10.2/xlator/features/upcall.so(+0x13209)[0x7ff813be3209]
> /lib64/libglusterfs.so.0(default_ftruncate_resume+0x1b7)[0x7ff827585d77]
> /lib64/libglusterfs.so.0(call_resume+0x75)[0x7ff82755]
>
> /usr/lib64/glusterfs/3.10.2/xlator/performance/io-threads.so(+0x4dd4)[0x7ff8139c9dd4]
> /lib64/libpthread.so.0(+0x7dc5)[0x7ff82634edc5]
> /lib64/libc.so.6(clone+0x6d)[0x7ff825c9376d]
>

output from gluster volume info | sort :

auth.allow: 192.168.0.*
> auto-delete: enable
> Brick1: gluster2:/export/brick2/home
> Brick2: gluster1:/export/brick2/home
> Brick3: gluster0:/export/brick9/homes-arbiter (arbiter)
> Bricks:
> client.event-threads: 4
> cluster.background-self-heal-count: 8
> cluster.consistent-metadata: no
> cluster.data-self-heal-algorithm: diff
> cluster.data-self-heal: off
> cluster.eager-lock: on
> cluster.enable-shared-storage: enable
> cluster.entry-self-heal: off
> cluster.heal-timeout: 180
> cluster.lookup-optimize: off
> cluster.metadata-self-heal: off
> cluster.min-free-disk: 5%
> cluster.quorum-type: auto
> cluster.readdir-optimize: on
> cluster.read-hash-mode: 2
> cluster.rebalance-stats: on
> cluster.self-heal-daemon: on
> cluster.self-heal-readdir-size: 64KB
> cluster.self-heal-window-size: 4
> cluster.server-quorum-ratio: 51%
> diagnostics.brick-log-level: WARNING
> diagnostics.client-log-level: ERROR
> diagnostics.count-fop-hits: on
> diagnostics.latency-measurement: off
> features.barrier: disable
> features.quota: off
> features.show-snapshot-directory: enable
> features.trash-internal-op: off
> features.trash-max-filesize: 1GB
> features.trash: off
> features.uss: off
> network.ping-timeout: 20
> nfs.disable: on
> nfs.export-dirs: on
> nfs.export-volumes: on
> nfs.rpc-auth-allow: 192.168.0.*
> Number of Bricks: 1 x (2 + 1) = 3
> Options Reconfigured:
> performance.cache-size: 256MB
> performance.client-io-threads: on
> performance.io-thread-count: 16
> performance.strict-write-ordering: off
> performance.write-behind: off
> server.allow-insecure: on
> server.event-threads: 8
> server.root-squash: off
> server.statedump-path: /tmp
> snap-activate-on-create: enable
> Snapshot 

Re: [Gluster-users] disperse volume brick counts limits in RHES

2017-05-08 Thread Alastair Neil
So the bottleneck is that computations with a 16x20 matrix require ~4 times
the cycles?  It seems, then, that there is ample room for improvement, as
there are many linear algebra packages out there that scale better than
O(nxm).  Is the healing time dominated by the EC compute time?  If Serkan
saw a hard 2x scaling then it seems likely.
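
For what it's worth, the back-of-the-envelope arithmetic behind a hard 2x
would be the following (my own sketch, assuming the per-stripe decode cost
grows as k^2 in the number of data fragments k, while the per-stripe payload
only grows as k):

    cost per byte        ~  k^2 / k  =  k
    16+4 relative to 8+2 =  16 / 8   =  2

which matches the factor of two Serkan measured.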

-Alastair




On 8 May 2017 at 03:02, Xavier Hernandez <xhernan...@datalab.es> wrote:

> On 05/05/17 13:49, Pranith Kumar Karampuri wrote:
>
>>
>>
>> On Fri, May 5, 2017 at 2:38 PM, Serkan Çoban <cobanser...@gmail.com
>> <mailto:cobanser...@gmail.com>> wrote:
>>
>> It is the over all time, 8TB data disk healed 2x faster in 8+2
>> configuration.
>>
>>
>> Wow, that is counter intuitive for me. I will need to explore about this
>> to find out why that could be. Thanks a lot for this feedback!
>>
>
> Matrix multiplication for encoding/decoding of 8+2 is 4 times faster than
> 16+4 (one matrix of 16x16 is composed by 4 submatrices of 8x8), however
> each matrix operation on a 16+4 configuration takes twice the amount of
> data of a 8+2, so net effect is that 8+2 is twice as fast as 16+4.
>
> An 8+2 also uses bigger blocks on each brick, processing the same amount
> of data in less I/O operations and bigger network packets.
>
> Probably these are the reasons why 16+4 is slower than 8+2.
>
> See my other email for more detailed description.
>
> Xavi
>
>
>>
>>
>> On Fri, May 5, 2017 at 10:00 AM, Pranith Kumar Karampuri
>> <pkara...@redhat.com <mailto:pkara...@redhat.com>> wrote:
>> >
>> >
>> > On Fri, May 5, 2017 at 11:42 AM, Serkan Çoban
>> <cobanser...@gmail.com <mailto:cobanser...@gmail.com>> wrote:
>> >>
>> >> Healing gets slower as you increase m in m+n configuration.
>> >> We are using 16+4 configuration without any problems other then
>> heal
>> >> speed.
>> >> I tested heal speed with 8+2 and 16+4 on 3.9.0 and see that heals
>> on
>> >> 8+2 is faster by 2x.
>> >
>> >
>> > As you increase number of nodes that are participating in an EC
>> set number
>> > of parallel heals increase. Is the heal speed you saw improved per
>> file or
>> > the over all time it took to heal the data?
>> >
>> >>
>> >>
>> >>
>> >> On Fri, May 5, 2017 at 9:04 AM, Ashish Pandey
>> <aspan...@redhat.com <mailto:aspan...@redhat.com>> wrote:
>> >> >
>> >> > 8+2 and 8+3 configurations are not the limitation but just
>> suggestions.
>> >> > You can create 16+3 volume without any issue.
>> >> >
>> >> > Ashish
>> >> >
>> >> > 
>> >> > From: "Alastair Neil" <ajneil.t...@gmail.com
>> <mailto:ajneil.t...@gmail.com>>
>> >> > To: "gluster-users" <gluster-users@gluster.org
>> <mailto:gluster-users@gluster.org>>
>> >> > Sent: Friday, May 5, 2017 2:23:32 AM
>> >> > Subject: [Gluster-users] disperse volume brick counts limits in
>> RHES
>> >> >
>> >> >
>> >> > Hi
>> >> >
>> >> > we are deploying a large (24node/45brick) cluster and noted
>> that the
>> >> > RHES
>> >> > guidelines limit the number of data bricks in a disperse set to
>> 8.  Is
>> >> > there
>> >> > any reason for this.  I am aware that you want this to be a
>> power of 2,
>> >> > but
>> >> > as we have a large number of nodes we were planning on going
>> with 16+3.
>> >> > Dropping to 8+2 or 8+3 will be a real waste for us.
>> >> >
>> >> > Thanks,
>> >> >
>> >> >
>> >> > Alastair
>> >> >
>> >> >
>> >> > ___
>> >> > Gluster-users mailing list
>> >> > Gluster-users@gluster.org <mailto:Gluster-users@gluster.org>
>> >> > http://lists.gluster.org/mailman/listinfo/gluster-users
>> <http://lists.gluster.org/mailman/listinfo/gluster-users>
>

Re: [Gluster-users] disperse volume brick counts limits in RHES

2017-05-05 Thread Alastair Neil
What network do you have?


On 5 May 2017 at 09:51, Serkan Çoban <cobanser...@gmail.com> wrote:

> In our use case every node has 26 bricks. I am using 60 nodes, one 9PB
> volume with 16+4 EC configuration, each brick in a sub-volume is on
> different host.
> We put 15-20k 2GB files every day into 10-15 folders. So it is 1500K
> files/folder. Our gluster version is 3.7.11.
> Heal speed in this environment is 8-10MB/sec/brick.
>
> I did some tests for parallel self heal feature with version 3.9, two
> servers 26 bricks each, 8+2 and 16+4 EC configuration.
> This was a small test environment and the results are as I said 8+2 is
> 2x faster then 16+4 with parallel self heal threads set to 2/4.
> In 1-2 months our new servers arriving, I will do detailed tests for
> heal performance for 8+2 and 16+4 and inform you the results.
>
>
> On Fri, May 5, 2017 at 2:54 PM, Pranith Kumar Karampuri
> <pkara...@redhat.com> wrote:
> >
> >
> > On Fri, May 5, 2017 at 5:19 PM, Pranith Kumar Karampuri
> > <pkara...@redhat.com> wrote:
> >>
> >>
> >>
> >> On Fri, May 5, 2017 at 2:38 PM, Serkan Çoban <cobanser...@gmail.com>
> >> wrote:
> >>>
> >>> It is the over all time, 8TB data disk healed 2x faster in 8+2
> >>> configuration.
> >>
> >>
> >> Wow, that is counter intuitive for me. I will need to explore about this
> >> to find out why that could be. Thanks a lot for this feedback!
> >
> >
> > From memory I remember you said you have a lot of small files hosted on
> the
> > volume, right? It could be because of the bug
> > https://review.gluster.org/17151 is fixing. That is the only reason I
> could
> > guess right now. We will try to test this kind of case if you could give
> us
> > a bit more details about average file-size/depth of directories etc to
> > simulate similar looking directory structure.
> >
> >>
> >>
> >>>
> >>>
> >>> On Fri, May 5, 2017 at 10:00 AM, Pranith Kumar Karampuri
> >>> <pkara...@redhat.com> wrote:
> >>> >
> >>> >
> >>> > On Fri, May 5, 2017 at 11:42 AM, Serkan Çoban <cobanser...@gmail.com
> >
> >>> > wrote:
> >>> >>
> >>> >> Healing gets slower as you increase m in m+n configuration.
> >>> >> We are using 16+4 configuration without any problems other then heal
> >>> >> speed.
> >>> >> I tested heal speed with 8+2 and 16+4 on 3.9.0 and see that heals on
> >>> >> 8+2 is faster by 2x.
> >>> >
> >>> >
> >>> > As you increase number of nodes that are participating in an EC set
> >>> > number
> >>> > of parallel heals increase. Is the heal speed you saw improved per
> file
> >>> > or
> >>> > the over all time it took to heal the data?
> >>> >
> >>> >>
> >>> >>
> >>> >>
> >>> >> On Fri, May 5, 2017 at 9:04 AM, Ashish Pandey <aspan...@redhat.com>
> >>> >> wrote:
> >>> >> >
> >>> >> > 8+2 and 8+3 configurations are not the limitation but just
> >>> >> > suggestions.
> >>> >> > You can create 16+3 volume without any issue.
> >>> >> >
> >>> >> > Ashish
> >>> >> >
> >>> >> > 
> >>> >> > From: "Alastair Neil" <ajneil.t...@gmail.com>
> >>> >> > To: "gluster-users" <gluster-users@gluster.org>
> >>> >> > Sent: Friday, May 5, 2017 2:23:32 AM
> >>> >> > Subject: [Gluster-users] disperse volume brick counts limits in
> RHES
> >>> >> >
> >>> >> >
> >>> >> > Hi
> >>> >> >
> >>> >> > we are deploying a large (24node/45brick) cluster and noted that
> the
> >>> >> > RHES
> >>> >> > guidelines limit the number of data bricks in a disperse set to 8.
> >>> >> > Is
> >>> >> > there
> >>> >> > any reason for this.  I am aware that you want this to be a power
> of
> >>> >> > 2,
> >>> >> > but
> >>> >> > as we have a large number of nodes we were planning on going with
> >>> >> > 16+3.
> >>> >> > Dropping to 8+2 or 8+3 will be a real waste for us.
> >>> >> >
> >>> >> > Thanks,
> >>> >> >
> >>> >> >
> >>> >> > Alastair
> >>> >> >
> >>> >> >
> >>> >> > ___
> >>> >> > Gluster-users mailing list
> >>> >> > Gluster-users@gluster.org
> >>> >> > http://lists.gluster.org/mailman/listinfo/gluster-users
> >>> >> >
> >>> >> >
> >>> >> > ___
> >>> >> > Gluster-users mailing list
> >>> >> > Gluster-users@gluster.org
> >>> >> > http://lists.gluster.org/mailman/listinfo/gluster-users
> >>> >> ___
> >>> >> Gluster-users mailing list
> >>> >> Gluster-users@gluster.org
> >>> >> http://lists.gluster.org/mailman/listinfo/gluster-users
> >>> >
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> > Pranith
> >>
> >>
> >>
> >>
> >> --
> >> Pranith
> >
> >
> >
> >
> > --
> > Pranith
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] disperse volume brick counts limits in RHES

2017-05-04 Thread Alastair Neil
Hi

We are deploying a large (24-node/45-brick) cluster and noted that the RHES
guidelines limit the number of data bricks in a disperse set to 8.  Is
there any reason for this?  I am aware that you want this to be a power of
2, but as we have a large number of nodes we were planning on going with
16+3.  Dropping to 8+2 or 8+3 will be a real waste for us.

Thanks,


Alastair
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] advice needed on configuring large gluster cluster

2017-03-15 Thread Alastair Neil
Hi

We have a new gluster cluster we are planning on deploying.  We will have
24 nodes, each with a JBOD of 39 8TB drives and six 900GB SSDs, and FDR IB.

We will not be using all of this as one volume, but I thought initially of
using a distributed disperse volume.

Never having attempted anything on this scale I have a couple of questions
regarding EC and distibuted disperse volumes.

Does a distributed dispersed volume have to start life as distributed
dispersed, or can I  take a disperse volume and make it distributed by
adding bricks?

Does an EC scheme of 24+4 seem reasonable?  One requirement we have is the
need to tolerate two nodes down at once, as the nodes share a chassis.
I assume that distributed disperse volumes can be expanded, in a similar
fashion to distributed replicate volumes, by adding additional disperse
brick sets?
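
(For reference, my understanding is that the expansion would look roughly
like the sketch below. The volume name, hostnames and brick paths are
placeholders, and it assumes the volume was created as disperse-data 24
redundancy 4, so bricks get added one complete 28-brick set at a time.)

# add one more 24+4 disperse subvolume; gluster expects a full set of 28 bricks
gluster volume add-brick bigvol node{1..28}:/bricks/brick2/bigvol
# then spread the existing data over the new set
gluster volume rebalance bigvol start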

I would also like to consider adding a hot tier using the SSDs; I confess
I have not done much reading on tiering, but am hoping I can use a
different volume form for the hot tier.  Can I create a disperse, or a
distributed replicated volume?  If I am smoking rainbows then I can consider
setting up an SSD-only distributed disperse volume.

I'd also appreciate any feedback on likely performance issues, and any
tuning tips.

Many Thanks

-Alastair
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 90 Brick/Server suggestions?

2017-02-16 Thread Alastair Neil
We have 12 on order.  Actually, the DSS7000 has two nodes in the chassis,
and each accesses 45 bricks.  We will be using an erasure code scheme,
probably 24:3 or 24:4; we have not sat down and really thought through the
exact scheme we will use.


On 15 February 2017 at 14:04, Serkan Çoban  wrote:

> Hi,
>
> We are evaluating dell DSS7000 chassis with 90 disks.
> Has anyone used that much brick per server?
> Any suggestions, advices?
>
> Thanks,
> Serkan
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] 3.8.5 replica 3 volumes: I/O error on file on fuse mounts

2016-12-21 Thread Alastair Neil
Would appreciate any insight into this issue:
A replica 3 volume is showing a number of files on two of the bricks as
needing to be healed; when you examine the files on the fuse mounts they
generate I/O errors.
No files are listed in split-brain, but if I look at one of the files it
looks to me like they have been updated on gluster-2 and gluster0 but not on
gluster1 (see below).
I see errors in /var/log/glusterfs/glustershd.log

-Thanks Alastair


[2016-12-20 07:25:06.018829] I [MSGID: 101190]
> [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 1
> [2016-12-20 07:25:06.018901] E [socket.c:2309:socket_connect_finish]
> 0-glusterfs: connection to ::1:24007 failed (Connection refused)
> [2016-12-20 07:25:06.018944] E [glusterfsd-mgmt.c:1902:mgmt_rpc_notify]
> 0-glusterfsd-mgmt: failed to connect with remote-host: localhost (Transport
> endpoint is not connected)
> [2016-12-20 07:25:07.187710] W [glusterfsd.c:1327:cleanup_and_exit]
> (-->/lib64/libpthread.so.0(+0x7dc5) [0x7fd93f669dc5]
> -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7fd940cfbcd5]
> -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x7fd940cfbb4b] ) 0-:
> received signum (15), shutting down
> [2016-12-20 07:25:08.197959] I [MSGID: 100030] [glusterfsd.c:2454:main]
> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.8.5
> (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p
> /var/lib/glusterd/glustershd/run/glustershd.pid -l
> /var/log/glusterfs/glustershd.log -S
> /var/run/gluster/3fe0b238bd46c38a95636f25cb5b9d8a.socket --xlator-option
> *replicate*.node-uuid=bcff5245-ea86-4384-a1bf-9219c8be8001)
> [2016-12-20 07:25:08.216336] I [MSGID: 101190]
> [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 1
> [2016-12-20 07:25:08.216419] E [socket.c:2309:socket_connect_finish]
> 0-glusterfs: connection to ::1:24007 failed (Connection refused)
> [2016-12-20 07:25:08.216464] E [glusterfsd-mgmt.c:1902:mgmt_rpc_notify]
> 0-glusterfsd-mgmt: failed to connect with remote-host: localhost (Transport
> endpoint is not connected)
> [2016-12-20 07:25:12.208092] I [MSGID: 101173]
> [graph.c:269:gf_add_cmdline_options] 0-digitalcorpora-replicate-0: adding
> option 'node-uuid' for volume 'digitalcorpora-replicate-0' with value
> 'bcff5245-ea86-4384-a1bf-9219c8be8001'
> [2016-12-20 07:25:12.208122] I [MSGID: 101173]
> [graph.c:269:gf_add_cmdline_options] 0-gluster_shared_storage-replicate-0:
> adding option 'node-uuid' for volume 'gluster_shared_storage-replicate-0'
> with value 'bcff5245-ea86-4384-a1bf-9219c8be8001'
> [2016-12-20 07:25:12.208140] I [MSGID: 101173]
> [graph.c:269:gf_add_cmdline_options] 0-homes-replicate-0: adding option
> 'node-uuid' for volume 'homes-replicate-0' with value
> 'bcff5245-ea86-4384-a1bf-9219c8be8001'
> [2016-12-20 07:25:12.208155] I [MSGID: 101173]
> [graph.c:269:gf_add_cmdline_options] 0-public-replicate-0: adding option
> 'node-uuid' for volume 'public-replicate-0' with value
> 'bcff5245-ea86-4384-a1bf-9219c8be8001'
> [2016-12-20 07:25:12.208173] I [MSGID: 101173]
> [graph.c:269:gf_add_cmdline_options] 0-static-web-replicate-0: adding
> option 'node-uuid' for volume 'static-web-replicate-0' with value
> 'bcff5245-ea86-4384-a1bf-9219c8be8001'
> [2016-12-20 07:25:12.208199] I [MSGID: 101173]
> [graph.c:269:gf_add_cmdline_options] 0-tmp-replicate-0: adding option
> 'node-uuid' for volume 'tmp-replicate-0' with value
> 'bcff5245-ea86-4384-a1bf-9219c8be8001'
> [2016-12-20 07:25:12.208215] I [MSGID: 101173]
> [graph.c:269:gf_add_cmdline_options] 0-usr-local-replicate-0: adding option
> 'node-uuid' for volume 'usr-local-replicate-0' with value
> 'bcff5245-ea86-4384-a1bf-9219c8be8001'
> [2016-12-20 18:32:06.121734] E [client-common.c:526:client_pre_getxattr]
> (-->/usr/lib64/glusterfs/3.8.5/xlator/protocol/client.so(+0xb5d8)
> [0x7f6bc4ba65d8]
> -->/usr/lib64/glusterfs/3.8.5/xlator/protocol/client.so(+0x26ebd)
> [0x7f6bc4bc1ebd]
> -->/usr/lib64/glusterfs/3.8.5/xlator/protocol/client.so(+0x393e3)
> [0x7f6bc4bd43e3] ) 0-: Assertion failed: 0
> [2016-12-20 18:32:06.121809] E [client-common.c:587:client_pre_opendir]
> (-->/usr/lib64/glusterfs/3.8.5/xlator/protocol/client.so(+0xa9d5)
> [0x7f6bc4ba59d5]
> -->/usr/lib64/glusterfs/3.8.5/xlator/protocol/client.so(+0x25a65)
> [0x7f6bc4bc0a65]
> -->/usr/lib64/glusterfs/3.8.5/xlator/protocol/client.so(+0x396b7)
> [0x7f6bc4bd46b7] ) 0-: Assertion failed: 0
> [2016-12-20 18:46:51.764776] E [client-common.c:526:client_pre_getxattr]
> (-->/usr/lib64/glusterfs/3.8.5/xlator/protocol/client.so(+0xb5d8)
> [0x7f6bc4ba65d8]
> -->/usr/lib64/glusterfs/3.8.5/xlator/protocol/client.so(+0x26ebd)
> [0x7f6bc4bc1ebd]
> -->/usr/lib64/glusterfs/3.8.5/xlator/protocol/client.so(+0x393e3)
> [0x7f6bc4bd43e3] ) 0-: Assertion failed: 0
> [2016-12-20 18:46:51.764850] E [client-common.c:587:client_pre_opendir]
> (-->/usr/lib64/glusterfs/3.8.5/xlator/protocol/client.so(+0xa9d5)
> [0x7f6bc4ba59d5]
> 

Re: [Gluster-users] Looking for use cases / opinions

2016-11-09 Thread Alastair Neil
Serkan

I'd be interested to know how your disks are attached (SAS?).  Do you use
any hardware RAID or ZFS, and do you have any SSDs in there?

On 9 November 2016 at 06:17, Serkan Çoban  wrote:

> Hi, I am using 26x8TB disks per server. There are 60 servers in gluster
> cluster.
> Each disk is a brick and configuration is 16+4 EC, 9PB single volume.
> Clients are using fuse mounts.
> Even with 1-2K files in a directory, ls from clients takes ~60 secs.
> So If you are sensitive to metadata operations, I suggest another
> approach...
>
>
> On Wed, Nov 9, 2016 at 1:05 PM, Frank Rothenstein
>  wrote:
> > As you said you want to have 3 or 4 replicas, so i would use the zfs
> > knowledge and build 1 zpool per node with whatever config you know is
> > fastest on this kind of hardware and as safe as you need (stripe,
> > mirror, raidz1..3 - resilvering zfs is faster than healing gluster, I
> > think) . 1 node -> 1 brick (per gluster volume).
> >
> > Frank
> > Am Dienstag, den 08.11.2016, 19:19 + schrieb Thomas Wakefield:
> >> We haven’t decided how the JBODS would be configured.  They would
> >> likely be SAS attached without a raid controller for improved
> >> performance.  I run large ZFS arrays this way, but only in single
> >> server NFS setups right now.
> >> Mounting each hard drive as it’s own brick would probably give the
> >> most usable space, but would need scripting to manage building all
> >> the bricks.  But does Gluster handle 1000’s of small bricks?
> >>
> >>
> >>
> >> > On Nov 8, 2016, at 9:18 AM, Frank Rothenstein  >> > -kliniken.de> wrote:
> >> >
> >> > Hi Thomas,
> >> >
> >> > thats a huge storage.
> >> > What I can say from my usecase - dont use Gluster directly if the
> >> > files
> >> > are small. I dont know, if the file count matters, but if the files
> >> > are
> >> > small (few KiB), Gluster takes ages to remove for example. Doing
> >> > the
> >> > same in a VM with e.g. ext4 disk on the very same Gluster gives a
> >> > big
> >> > speedup.
> >> > There are many options for a new Gluster volume, like Lindsay
> >> > mentioned.
> >> > And there are other options, like Ceph, OrangeFS.
> >> > How do you want to use the JBODs? I dont think you would use every
> >> > single drive as a brick... How are these connected to the servers?
> >> >
> >> > Im only dealing with about 10TiB Gluster volumes, so by far not at
> >> > your
> >> > planned level, but I really would like to see some results, if you
> >> > go
> >> > for Gluster!
> >> >
> >> > Frank
> >> >
> >> >
> >> > Am Dienstag, den 08.11.2016, 13:49 + schrieb Thomas Wakefield:
> >> > > I think we are leaning towards erasure coding with 3 or 4
> >> > > copies.  But open to suggestions.
> >> > >
> >> > >
> >> > > > On Nov 8, 2016, at 8:43 AM, Lindsay Mathieson  >> > > > n@gm
> >> > > > ail.com> wrote:
> >> > > >
> >> > > > On 8/11/2016 11:38 PM, Thomas Wakefield wrote:
> >> > > > > High Performance Computing, we have a small cluster on campus
> >> > > > > of
> >> > > > > about 50 linux compute servers.
> >> > > > >
> >> > > >
> >> > > > D'oh! I should have thought of that.
> >> > > >
> >> > > >
> >> > > > Are you looking at replication (2 or 3)/disperse or pure
> >> > > > disperse?
> >> > > >
> >> > > > --
> >> > > > Lindsay Mathieson
> >> > > >
> >> > >
> >> > > ___
> >> > > Gluster-users mailing list
> >> > > Gluster-users@gluster.org
> >> > > http://www.gluster.org/mailman/listinfo/gluster-users
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > ___
> >> > ___
> >> > BODDEN-KLINIKEN Ribnitz-Damgarten GmbH
> >> > Sandhufe 2
> >> > 18311 Ribnitz-Damgarten
> >> >
> >> > Telefon: 03821-700-0
> >> > Fax:   03821-700-240
> >> >
> >> > E-Mail: i...@bodden-kliniken.de   Internet: http://www.bodden-klini
> >> > ken.de
> >> >
> >> > Sitz: Ribnitz-Damgarten, Amtsgericht: Stralsund, HRB 2919, Steuer-
> >> > Nr.: 079/133/40188
> >> > Aufsichtsratsvorsitzende: Carmen Schröter, Geschäftsführer: Dr.
> >> > Falko Milski
> >> >
> >> > Der Inhalt dieser E-Mail ist ausschließlich für den bezeichneten
> >> > Adressaten bestimmt. Wenn Sie nicht der vorge-
> >> > sehene Adressat dieser E-Mail oder dessen Vertreter sein sollten,
> >> > beachten Sie bitte, dass jede Form der Veröf-
> >> > fentlichung, Vervielfältigung oder Weitergabe des Inhalts dieser E-
> >> > Mail unzulässig ist. Wir bitten Sie, sofort den
> >> > Absender zu informieren und die E-Mail zu löschen.
> >> >
> >> >
> >> > Bodden-Kliniken Ribnitz-Damgarten GmbH 2016
> >> > *** Virenfrei durch Kerio Mail Server und Sophos Antivirus ***
> >> >
> >>
> >>
> >
> >
> >
> >
> >
> > 
> __
> > BODDEN-KLINIKEN Ribnitz-Damgarten GmbH
> > Sandhufe 2
> > 18311 Ribnitz-Damgarten
> >
> > Telefon: 03821-700-0
> > Fax:   

Re: [Gluster-users] Performance

2016-10-31 Thread Alastair Neil
What version of Gluster?  Are you using a glusterfs or NFS mount?  Is there
any other traffic on the network - is the cluster quiescent apart from your
dd test?

It does seem slow.  I have a three-server cluster using straight XFS over
10GbE, with Gluster 3.8 and glusterfs mounts, and I see:

[root@sb-c 192.168.10.49:VM]# sync; dd if=/dev/zero of=nfsp2 bs=1M
count=1024; sync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 11.3322 s, 94.8 MB/s
[root@sb-c 192.168.10.49:VM]# sync; dd if=/dev/zero of=nfsp2 bs=1M
count=10240; sync
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 117.854 s, 91.1 MB/s

this is on a cluster serving 5 ovirt nodes and about 60 running VMs.



On 25 October 2016 at 12:50, Service Mail  wrote:

> Hello,
>
> I have the following setup:
>
> 3x zfs raidz2 servers with a single gluster 3.8 replicated volume across a
> 10G network
>
> Everything is working fine however performance looks very poor to me:
>
>
> root@Client:/test_mount# sync; dd if=/dev/zero of=nfsp2 bs=1M count=1024;
> sync
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 32.1786 s, 33.4 MB/s
>
> root@Client:/test_mount# sync; dd if=/dev/zero of=nfsp2 bs=1M
> count=10240; sync
> 10240+0 records in
> 10240+0 records out
> 10737418240 bytes (11 GB, 10 GiB) copied, 301.563 s, 35.6 MB/s
>
> Are those reading normal? Where should I look to increase performance?
>
> Thanks,
>
> Ciclope
>
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] How gluster parallelize reads

2016-10-03 Thread Alastair Neil
I think this might give you something like the behaviour you are looking
for; it will not balance blocks across different servers, but it will
distribute reads from clients across all the servers:

cluster.read-hash-mode 2

0 means use the first server to respond, I think - at least that's my guess
of what "first up server" means.
1 hashes by GFID, so clients will use the same server for a given file, but
different files may be accessed from different nodes.
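
Setting and reverting it is just the following (the volume name is a
placeholder):

gluster volume set myvol cluster.read-hash-mode 2
# put it back to the default if it doesn't help
gluster volume reset myvol cluster.read-hash-mode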

On 3 October 2016 at 05:50, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> 2016-10-03 11:33 GMT+02:00 Joe Julian :
> > By default, the client reads from localhost first, if the client is also
> a
> > server, or the first to respond. This can be tuned to balance the load
> > better (see "gluster volume set help") but that's not necessarily more
> > efficient. As always, it depends on the workload.
>
> So, is no true saying that gluster aggregate bandwidth in readings.
> Each client will always read from 1 node. Having 3 nodes means that
> I can support a number of clients increased by 3.
>
> Something like an ethernet bonding, each transfer is always subject to the
> single port speed, but I can support twice the connections by creating
> a bond of 2.
>
> > Reading as you suggested is actually far less efficient. The reads would
> > always be coming from disk and never in any readahead cache.
>
> What I mean is to read the same file in multiple parts from multiple
> servers and not
> reading the same file part from multiple servers.
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] New cluster - first experience

2016-07-18 Thread Alastair Neil
It does not seem to me that this is a gluster issue.  I just quickly
reviewed the thread: you said that you saw 60 MB/s with plain NFS to the
bricks, and with gluster and no sharding you got 59 MB/s:

With plain NFS (no gluster involved) i'm getting almost the same
> speed: about 60MB/s
>
> Without sharding:
> # echo 3 > /proc/sys/vm/drop_caches; dd if=/dev/zero of=test bs=1M
> count=1000 conv=fsync
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 17.759 s, 59.0 MB/s




On 18 July 2016 at 06:35, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> 2016-07-16 15:07 GMT+02:00 Gandalf Corvotempesta
> :
> > 2016-07-16 15:04 GMT+02:00 Gandalf Corvotempesta
> > :
> >> [ ID] Interval   Transfer Bandwidth
> >> [  3]  0.0-10.0 sec  2.31 GBytes  1.98 Gbits/sec
> >
> > Obviously i did the same test with all gluster server. Speed is always
> > near 2gbit, so, the network is not an issue here.
>
> Any help? I would like to start real-test with virtual machines and
> proxmox before the August holiday.
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] New cluster - first experience

2016-07-14 Thread Alastair Neil
I am not sure if your NICs support it, but you could try balance-alb
(bonding mode 6); this does not require special switch support and I have
had good results with it.  As Lindsay said, the switch configuration could
be limiting the bandwidth between nodes in balance-rr.
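
On CentOS/RHEL the change is a one-liner in the bond's config; a rough
sketch, assuming a bond0 interface that already enslaves your two NICs
(adjust to your own distro's network scripts):

# in /etc/sysconfig/network-scripts/ifcfg-bond0, the relevant line is:
BONDING_OPTS="mode=balance-alb miimon=100"
# then bounce the bond on each node, one node at a time
ifdown bond0 && ifup bond0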

On 14 July 2016 at 05:21, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> 2016-07-14 11:19 GMT+02:00, Gandalf Corvotempesta
> :
> > Yes, but my iperf test was made with a wrong bonding configuration.
>
> Anyway, even with direct NFS mount (not involving gluster) i'm stuck
> as 60MB/s (480mbit/s)
> about 50% of available bandwidth with a single nic/connection.
>
> Any change to get this cluster faster ?
> Which speed are you seeing with gluster or nfs ? I would like to
> archieve the best possible speed before buying more powerful hardware
> (10Gb switches)
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterFS Storage Interruption at Node Loss

2016-07-08 Thread Alastair Neil
Nic,
I believe this is normal, expected behaviour.  The network timeout is there
because it is expensive to tear down and re-establish the sockets, so you
only want to do it if a node has really failed and not for some transitory
network blip.

On 8 July 2016 at 20:29, Nic Seltzer  wrote:

> Hello list!
>
> I am experiencing an issue whereby mounted Gluster volumes are being made
> read-only until the network timeout interval has passed or the node comes
> back online. I  have reduced the network timeout to one second and was able
> to reduce the size of the outage window to two seconds. I am curious if
> anyone else has seen this issue and how they went about resolving it for
> their implementation. We are using a distributed-replicated volume, but
> have also tested _just_ replicated volume with the same results. I can
> provide the gluster volume info if it's helpful, but suffice to say that it
> is a pretty simple setup.
>
> Thanks!
>
> --
> Nic Seltzer
> Esports Ops Tech | Riot Games
> Cell: +1.402.431.2642 | NA Summoner: Riot Dankeboop
> http://www.riotgames.com
> http://www.leagueoflegends.com
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] New cluster - first experience

2016-07-08 Thread Alastair Neil
Also remember that with a single transfer you will not see 2000 Mbit/s, only
1000 Mbit/s.

On 8 July 2016 at 15:14, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> 2016-07-08 20:43 GMT+02:00  :
> > Gluster, and in particular the fuse mounter, do not operate on small
> file workloads anywhere near wire speed in their current arch.
>
> I know that i'll unable to reach wire speed, but with 2000gbit
> available, reaching only 88mbit with 1GB file is really low.
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting

2016-06-24 Thread Alastair Neil
I upgraded my Fedora 23 system to F24 a couple of days ago, and now I am
unable to mount my gluster cluster.

The update installed:

glusterfs-3.8.0-1.fc24.x86_64
glusterfs-libs-3.8.0-1.fc24.x86_64
glusterfs-fuse-3.8.0-1.fc24.x86_64
glusterfs-client-xlators-3.8.0-1.fc24.x86_64

the gluster is running 3.7.11

The volume is replica 3

I see these errors in the mount log:

[2016-06-24 17:55:34.016462] I [MSGID: 100030] [glusterfsd.c:2408:main]
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.8.0
(args: /usr/sbin/glusterfs --volfile-server=gluster1 --volfile-id=homes
/mnt/homes)
[2016-06-24 17:55:34.094345] I [MSGID: 101190]
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 1
[2016-06-24 17:55:34.240135] I [MSGID: 101190]
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 2
[2016-06-24 17:55:34.240130] I [MSGID: 101190]
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 4
[2016-06-24 17:55:34.240130] I [MSGID: 101190]
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 3
[2016-06-24 17:55:34.241499] I [MSGID: 114020] [client.c:2356:notify]
0-homes-client-2: parent translators are ready, attempting connect on
transport
[2016-06-24 17:55:34.249172] I [MSGID: 114020] [client.c:2356:notify]
0-homes-client-5: parent translators are ready, attempting connect on
transport
[2016-06-24 17:55:34.250186] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
0-homes-client-2: changing port to 49171 (from 0)
[2016-06-24 17:55:34.253347] I [MSGID: 114020] [client.c:2356:notify]
0-homes-client-6: parent translators are ready, attempting connect on
transport
[2016-06-24 17:55:34.254213] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
0-homes-client-5: changing port to 49154 (from 0)
[2016-06-24 17:55:34.255115] I [MSGID: 114057]
[client-handshake.c:1441:select_server_supported_programs]
0-homes-client-2: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2016-06-24 17:55:34.255861] W [MSGID: 114007]
[client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-2: failed to
find key 'child_up' in the options
[2016-06-24 17:55:34.259097] I [MSGID: 114057]
[client-handshake.c:1441:select_server_supported_programs]
0-homes-client-5: Using Program GlusterFS 3.3, Num (1298437), Version (330)
Final graph:
+--+
  1: volume homes-client-2
  2: type protocol/client
  3: option clnt-lk-version 1
  4: option volfile-checksum 0
  5: option volfile-key homes
  6: option client-version 3.8.0
  7: option process-uuid
Island-29185-2016/06/24-17:55:34:10054-homes-client-2-0-0
  8: option fops-version 1298437
  9: option ping-timeout 20
 10: option remote-host gluster-2
 11: option remote-subvolume /export/brick2/home
 12: option transport-type socket
 13: option event-threads 4
 14: option send-gids true
 15: end-volume
 16:
 17: volume homes-client-5
 18: type protocol/client
 19: option clnt-lk-version 1
 20: option volfile-checksum 0
 21: option volfile-key homes
 22: option client-version 3.8.0
 23: option process-uuid
Island-29185-2016/06/24-17:55:34:10054-homes-client-5-0-0
 24: option fops-version 1298437
 25: option ping-timeout 20
 26: option remote-host gluster1.vsnet.gmu.edu
 27: option remote-subvolume /export/brick2/home
 28: option transport-type socket
 29: option event-threads 4
 30: option send-gids true
 31: end-volume
 32:
 33: volume homes-client-6
 34: type protocol/client
 35: option ping-timeout 20
 36: option remote-host gluster0
 37: option remote-subvolume /export/brick2/home
 38: option transport-type socket
 39: option event-threads 4
 40: option send-gids true
 41: end-volume
 42:
 43: volume homes-replicate-0
 44: type cluster/replicate
 45: option background-self-heal-count 20
 46: option metadata-self-heal on
 47: option data-self-heal off
 48: option entry-self-heal on
 49: option data-self-heal-window-size 8
 50: option data-self-heal-algorithm diff
 51: option eager-lock on
 52: option quorum-type auto
 53: option self-heal-readdir-size 64KB
 54: subvolumes homes-client-2 homes-client-5 homes-client-6
 55: end-volume
 56:
 57: volume homes-dht
 58: type cluster/distribute
 59: option min-free-disk 5%
 60: option rebalance-stats on
 61: option readdir-optimize on
 62: subvolumes homes-replicate-0
 63: end-volume
 64:
 65: volume homes-read-ahead
 66: type performance/read-ahead
 67: subvolumes homes-dht
 68: end-volume
 69:
 70: volume homes-io-cache
 71: type performance/io-cache
 72: subvolumes homes-read-ahead
 73: end-volume
 74:
 75: volume homes-quick-read
 76: type performance/quick-read
 77: subvolumes homes-io-cache
 78: end-volume
 79:
 80: volume homes-open-behind
 81: type 

Re: [Gluster-users] snapshot removal failed on one node how to recover (3.7.11)

2016-06-06 Thread Alastair Neil
No one has any suggestions?  Would this scenario I have been toying with
work: remove the brick from the node with the out-of-sync snapshots,
destroy all associated logical volumes, and then add the brick back as an
arbiter node?
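
Concretely, something like the following is what I have in mind (the volume
name, node name and brick path are placeholders, and it assumes a gluster
version that supports adding an arbiter brick to an existing replica 2
volume, so treat it as a sketch rather than a tested recipe):

# drop the out-of-sync copy
gluster volume remove-brick myvol replica 2 badnode:/export/brick/myvol force
# on badnode: wipe the old brick directory and its snapshot LVs, then re-add
gluster volume add-brick myvol replica 3 arbiter 1 badnode:/export/brick/myvol
gluster volume heal myvol full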


On 1 June 2016 at 13:40, Alastair Neil <ajneil.t...@gmail.com> wrote:

> I have a replica 3 volume that has snapshot scheduled using
> snap_scheduler.py
>
> I recently tried to remove a snapshot and the command failed on one node:
>
> snapshot delete: failed: Commit failed on gluster0.vsnet.gmu.edu. Please
>> check log file for details.
>> Snapshot command failed
>
>
> How do I recover from this failure.  Clearly I need to remove the snapshot
> from the offending server but this does not seem possible as the snapshot
> no longer exists on the other two nodes.
> Suggestions welcome.
>
> -Alastair
>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] snapshot removal failed on one node how to recover (3.7.11)

2016-06-01 Thread Alastair Neil
I have a replica 3 volume that has snapshot scheduled using
snap_scheduler.py

I recently tried to remove a snapshot and the command failed on one node:

snapshot delete: failed: Commit failed on gluster0.vsnet.gmu.edu. Please
> check log file for details.
> Snapshot command failed


How do I recover from this failure?  Clearly I need to remove the snapshot
from the offending server, but this does not seem possible as the snapshot
no longer exists on the other two nodes.
Suggestions welcome.

-Alastair
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Questions about healing

2016-05-23 Thread Alastair Neil
Yes, it's configurable with:

network.ping-timeout

and the default is 42 seconds, I believe.
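
For example, to lower it on a volume (the volume name is a placeholder):

gluster volume set myvol network.ping-timeout 10
# to check the current value (volume get needs 3.7.9 or later, I think)
gluster volume get myvol network.ping-timeout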

On 22 May 2016 at 03:39, Kevin Lemonnier  wrote:

> > Let's assume 10.000 shard on a server being healed.
> > Gluster heal 1 shard at once, so the other 9.999 pieces would be read
> > from the other servers
> > to keep VM running ? If yes, this is good. If not, in this case, the
> > whole VM need to be healed
> > and thus, the whole VM would hangs
>
> Yes, that seems to be what's hapenning on 3.7.11.
> Couldn't notice any freez during heals, except for a brief one when
> a node just went down : looks like gluster hangs for a few seconds
> while waiting for the node before deciding to mark it down and continue
> without it.
>
> --
> Kevin Lemonnier
> PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Questions about healing

2016-05-20 Thread Alastair Neil
Well, it's not magic; there is an algorithm that is documented, and it is
trivial to script the recreation of a file from its shards if gluster were
truly unavailable:
>
>
> #!/bin/bash
> #
> # quick and dirty reconstruction of a file from its shards
> # takes the brick path and the file name as arguments
> # Copyright May 20th 2016 A. Neil
> #
> brick=$1
> filen=$2
> # locate the file on the brick and get its inode (stat is safer to parse than ls -i)
> file=$(find "$brick" -name "$filen")
> inode=$(stat -c %i "$file")
> # the .glusterfs tree holds a hard link named by gfid: .glusterfs/xx/yy/<gfid>
> pushd "$brick/.glusterfs" >/dev/null
> gfid=$(find . -inum "$inode" | cut -d'/' -f4)
> popd >/dev/null
> # shard 0 is the base file itself; the rest are .shard/<gfid>.1 .. <gfid>.N
> nshard=$(ls -1 "$brick/.shard/${gfid}."* | wc -l)
> cp "$file" "./${filen}.restored"
> for i in $(seq 1 "$nshard"); do
>     cat "$brick/.shard/${gfid}.$i" >> "./${filen}.restored"
> done


 Admittedly this is not as easy as pulling the image file straight from the
brick file system, but then the advantages are pretty big.

The point is that each shard is small and healing them is fast.  The
majority of the time, when you need to heal a VM it is only a few blocks
that have changed, and without sharding you might have to heal 10, 20 or
100 GB.  In my experience, if you have 30 or 40 VMs it can take hours to
heal.  With the limited testing I have done I have found that, yes, some VMs
will experience IO timeouts, freeze, and then need to be restarted.
However, at least you don't need to wait hours before you can do that.






On 20 May 2016 at 15:20, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> Il 20 mag 2016 20:14, "Alastair Neil" <ajneil.t...@gmail.com> ha scritto:
> >
> > I think you are confused about what sharding does.   In a sharded
> replica 3 volume all the shards exist on all the replicas so there is no
> distribution.  Might you be getting confused with erasure coding?  The
> upshot of sharding is that if you have a failure, instead of healing
> multiple gigabyte vm files for example, you only heal the shards that have
> changed. This generally shortens the heal time dramatically.
>
> I know what sharding is.
> it split each file in multiple, smaller,  chunks
>
> But if all is gonna bad, how can i reconstruct a file from each shard
> without gluster? It would be a pain.
> Let's assume tens of terabytes of shards to be manually reconstructed ...
>
> Anyway how is possible to keep VM up and running when healing is happening
> on a shard? That part of disk image is not accessible and thus the VM could
> have some issue on a filesystem.
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Questions about healing

2016-05-20 Thread Alastair Neil
I think you are confused about what sharding does.  In a sharded replica 3
volume all the shards exist on all the replicas, so there is no
distribution.  Might you be getting confused with erasure coding?  The
upshot of sharding is that if you have a failure, instead of healing
multi-gigabyte VM files, for example, you only heal the shards that have
changed. This generally shortens the heal time dramatically.
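
For reference, enabling it is just a couple of volume options (the volume
name is a placeholder); note that only files created after the option is
turned on get sharded:

gluster volume set myvol features.shard on
gluster volume set myvol features.shard-block-size 64MB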

Alastair

On 18 May 2016 at 12:54, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> Il 18/05/2016 13:55, Kevin Lemonnier ha scritto:
>
>> Yes, that's why you need to use sharding. With sharding, the heal is much
>> quicker and the whole VM isn't freezed during the heal, only the shard
>> being healed. I'm testing that right now myself and that's almost invisible
>> for the VM using 3.7.11. Use the latest version though, it really really
>> wasn't transparent in 3.7.6 :).
>>
> I don't like sharding. With sharing all "files" are split in shard and
> distributed across the whole cluster.
> If everything went bad, reconstructing a file from it shards could be hard.
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] VM disks corruption on 3.7.11

2016-05-19 Thread Alastair Neil
I am slightly confused: you say you have image file corruption, but then you
say that qemu-img check reports no corruption.  If what you mean is that you
see I/O errors during a heal, this is likely to be due to IO starvation,
something that is a well-known issue.

There is work happening to improve this in version 3.8:

https://bugzilla.redhat.com/show_bug.cgi?id=1269461



On 19 May 2016 at 09:58, Kevin Lemonnier  wrote:

> That's a different problem then, I have corruption without removing or
> adding bricks,
> as mentionned. Might be two separate issue
>
>
> On Thu, May 19, 2016 at 11:25:34PM +1000, Lindsay Mathieson wrote:
> >On 19/05/2016 12:17 AM, Lindsay Mathieson wrote:
> >
> >  One thought - since the VM's are active while the brick is
> >  removed/re-added, could it be the shards that are written while the
> >  brick is added that are the reverse healing shards?
> >
> >I tested by:
> >
> >- removing brick 3
> >
> >- erasing brick 3
> >
> >- closing down all VM's
> >
> >- adding new brick 3
> >
> >- waiting until heal number reached its max and started decreasing
> >
> >  There were no reverse heals
> >
> >- Started the VM's backup. No real issues there though one showed IO
> >errors, presumably due to shards being locked as they were healed.
> >
> >- VM's started ok, no reverse heals were noted and eventually Brick 3
> was
> >fully healed. The VM's do not appear to be corrupted.
> >
> >So it would appear the problem is adding a brick while the volume is
> being
> >written to.
> >
> >Cheers,
> >
> >  --
> >  Lindsay Mathieson
>
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-users
>
>
> --
> Kevin Lemonnier
> PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] "gluster volume heal full" locking all files after adding a brick

2016-04-27 Thread Alastair Neil
What are the quorum settings on the volumes?
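
Something like this should show them, assuming your gluster has the volume
get command (3.7.9 or later, I think):

gluster volume get gv1 cluster.quorum-type
gluster volume get gv1 cluster.server-quorum-type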


On 27 April 2016 at 08:08, Tomer Paretsky  wrote:

> Hi all
>
> i am currently running two replica 3 volumes acting as storage for VM
> images.
> due to some issues with glusterfs over ext4 filesystem (kernel panics), i
> tried removing one of the bricks from each volume from a single server, and
> than re adding them after re-formatting the underlying partition to xfs, on
> only one of the hosts for testing purposes.
>
> the commands used were:
>
> 1) gluster volume remove-brick gv1 replica 2  :/storage/gv1/brk
> force
> 2) gluster volume remove-brick gv2 replica 2 :/storage/gv2/brk
> force
>
> 3) reformatted /storage/gv1 and /storage/gv2 to xfs (these are the
> local/physical mountpoints of the gluster bricks)
>
> 4) gluster volume add-brick gv1 replica 3 :/storage/gv1/brk
> 5) gluster volume add-brick gv2 replica 3 :/storage/gv2/brk
>
> so far - so good -- both bricks were successfully re added to the volume.
>
> 6) gluster volume heal gv1 full
> 7) gluster volume heal gv2 full
>
> the heal operation started and i can see files being replicated into the
> newly added bricks BUT - all the files on the two nodes which were not
> touched are now locked (ReadOnly), i presume, until the heal operation
> finishes and replicates all the files to the newly added bricks (which
> might take a while..)
>
> now as far as i understood the documentation of the healing process - the
> files should not have been locked at all. or am i missing something
> fundemental here?
>
> is there a way to prevent locking of the source files during a heal -full
> operation?
>
> is there a better way to perform the process i just described?
>
> your help is enormously appreciated,
> Cheers,
> Tomer Paretsky
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] self service snapshot access broken with 3.7.11

2016-04-22 Thread Alastair Neil
I just upgraded my cluster to 3.7.11 from 3.7.10, and access to the .snaps
directories now fails with

bash: cd: .snaps: Transport endpoint is not connected


in the volume log file on the client I see:

[2016-04-22 21:08:28.005854] I [rpc-clnt.c:1847:rpc_clnt_reconfig]
> 2-homes-snapd-client: changing port to 49493 (from 0)
> [2016-04-22 21:08:28.009558] E [socket.c:2278:socket_connect_finish]
> 2-homes-snapd-client: connection to xx.xx.xx.xx.xx:49493 failed (No route
> to host)


I'm quite perplexed.  It's not a network issue or DNS as far as I can tell:
the glusterfs client is working fine, and the gluster servers all resolve
OK.  It seems to be happening on all the clients; I have tried different
systems with 3.7.8, 3.7.10, and 3.7.11 clients and see the same failure on
all of them.

On the servers the snapshots are being taken as expected and they are
started:

Snapshot  :
> Scheduled-Homes_Hourly-homes_GMT-2016.04.22-16.00.01
> Snap UUID : 91ba50b0-d8f2-4135-9ea5-edfdfe2ce61d
> Created   : 2016-04-22 16:00:01
> Snap Volumes:
> Snap Volume Name  : 5170144102814026a34f8f948738406f
> Origin Volume name: homes
> Snaps taken for homes  : 16
> Snaps available for homes  : 240
> Status: Started



The homes volume is replica 3; all the peers are up, and so are all the
bricks and services:

glv status homes
> Status of volume: homes
> Gluster process TCP Port  RDMA Port  Online
>  Pid
>
> --
> Brick gluster-2:/export/brick2/home 49171 0  Y
> 38298
> Brick gluster0:/export/brick2/home  49154 0  Y
> 23519
> Brick gluster1.vsnet.gmu.edu:/export/brick2
> /home   49154 0  Y
> 23794
> Snapshot Daemon on localhost49486 0  Y
> 23699
> NFS Server on localhost 2049  0  Y
> 23486
> Self-heal Daemon on localhost   N/A   N/AY
> 23496
> Snapshot Daemon on gluster-249261 0  Y
> 38479
> NFS Server on gluster-2 2049  0  Y
> 39640
> Self-heal Daemon on gluster-2   N/A   N/AY
> 39709
> Snapshot Daemon on gluster1 49480 0  Y
> 23982
> NFS Server on gluster1  2049  0  Y
> 23766
> Self-heal Daemon on gluster1N/A   N/AY
> 23776
>
> Task Status of Volume homes
>
> --
> There are no active volume tasks


I'd appreciate any ideas about troubleshooting this.  I tried disabling
.snaps access on the volume and re-enabling it, but it made no difference.
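
(For reference, the toggle I used was the uss option, which as I understand
it is what controls the .snaps directory:)

gluster volume set homes features.uss off
gluster volume set homes features.uss on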
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] non interactive use of glsuter snapshot delete

2016-04-08 Thread Alastair Neil
I am crafting a script (well, actually I am modifying gcron.py) to retain a
configurable number of hourly, daily, weekly and monthly snapshots.  One
issue is that gluster snapshot delete "snapname" does not seem to have a
command-line switch (like "-y" in yum) to attempt the operation
non-interactively.  Is there another way of performing this that is more
friendly to non-interactive use?  From the shell I can pipe "yes" to the
command, but my python fu is weak, so I thought I'd ask if there was a
simpler way.
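
For the record, the shell workaround I have looks like this (the snapshot
name below is a placeholder); I believe the CLI also has a global
--mode=script switch that answers the confirmation prompts, which may be
the cleaner route:

SNAPNAME="name-of-snapshot-to-delete"
# pipe the confirmation in
yes | gluster snapshot delete "$SNAPNAME"
# or, assuming --mode=script suppresses the prompt as I think it does
gluster --mode=script snapshot delete "$SNAPNAME"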

Thanks, Alastair
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] How to recover after one node breakdown

2016-03-18 Thread Alastair Neil
Hopefully you have a backup of /var/lib/glusterd/glusterd.info and
/var/lib/glusterd/peers; if so, I think you can copy them back and restart
glusterd, and the volume info should get populated from the other node.  If
not, you can probably reconstruct them from the same files on the other
node.

i.e.:
On the unaffected node, the peers directory should have an entry for the
failed node containing the UUID of the failed node.  The glusterd.info file
should enable you to recreate the peer file on the failed node.
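
A rough sketch of what I mean (assuming hostnames and IPs are unchanged; the
peer-file layout below - uuid/state/hostname1 - is what I recall seeing, so
compare it against a real file on the healthy node before relying on it):

  # on the healthy node: note its own UUID and the failed node's UUID
  cat /var/lib/glusterd/glusterd.info
  cat /var/lib/glusterd/peers/*

  # on the failed node, after reinstalling the gluster packages
  service glusterd stop
  echo "UUID=<uuid-of-failed-node>" > /var/lib/glusterd/glusterd.info
  # (newer releases also keep an operating-version line in glusterd.info -
  #  copy the format from the healthy node)
  printf "uuid=<uuid-of-healthy-node>\nstate=3\nhostname1=<healthy-node>\n" \
      > /var/lib/glusterd/peers/<uuid-of-healthy-node>
  service glusterd start

Once glusterd is happy with its peer, the volume definitions should sync over
and self-heal will bring the brick back up to date.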


On 16 March 2016 at 09:25, songxin  wrote:

> Hi,
> Now I face a problem.
> The reproduction steps are as below.
> 1. I create a replicated volume using two bricks on two boards.
> 2. Start the volume.
> 3. One board breaks down and all
> files in the rootfs, including /var/lib/glusterd/*, are lost.
> 4. Reboot the board; the IP does not change.
>
> My question:
> How do I recover the replicated volume?
>
> Thanks,
> Xin
>
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Issue with storage access when brick goes down

2015-12-14 Thread Alastair Neil
I thought it was to do with the expense of tearing down and setting up the
connections, so the timeout is there to avoid an expensive operation if it
is not really necessary.
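
For what it's worth, the timeout is tunable per volume if 42 seconds is too
long for a particular workload, at the cost of more reconnect churn on
transient network blips:

  gluster volume set <volname> network.ping-timeout 10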



On 7 December 2015 at 22:15, Bipin Kunal  wrote:

> I assume that this is because the host which went down is also the host
> which was used for mounting the client.
>
> Suppose there are 2 host. Host1 and Host2. And client is mounted as
> mount -t glusterfs host1:brick1 /mnt
>
> In this case if host1 goes down, client will wait till network ping
> timeout before it starts accessing volume using other host(host2).
>
> So I think this is expected behaviour.
>
> Thanks,
> Bipin Kunal
> On Dec 7, 2015 10:57 PM, "L, Sridhar (Nokia - IN/Bangalore)" <
> sridha...@nokia.com> wrote:
>
>> Hello,
>>
>> I am running gluster storage in distributed replicated mode. When one of
>> the bricks (hosts) goes offline, operations on the file system by the client
>> will hang for a while and resume after some time. I searched and found that
>> operations hang for the period set in network.ping-timeout.
>> Can anyone explain me why the client operations will hang even though the
>> other brick is available with all the data?
>>
>>
>> Regards,
>> Sridhar L
>>
>>
>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gluster 3.7.5 low speed on write and heavy load

2015-10-16 Thread Alastair Neil
is that a typo 120mb/s?  Is this volume replica 2 or distributed or both?


On 16 October 2015 at 08:05, Kandalf ®  wrote:

> Hi,
>
> I have 2 server with centos 7 and I want to make a replicate storage
> cluster for the esxi.
> In both server I have 4 disk in raid 5 mdadm. The write speed on that
> device is ~250-300MB/s all the time.
> I create xfs file system on the mdadm and I exported that gluster volume
> via native fuse nfs v3 to the esxi.
> Read seed is great but when I try to write to the guest vms I see
> 20-40MB/s only. I try the test also with gluster distributed with only one
> brick to one server. If I take one dd write from esxi to the cluster I
> receive 120mb/s the full ethernet link speed. I also try to mount to linux
> that volumes and try dd write and the write speed is 120mb/s.
> But if I try to mount an raw file image with losetup, or use vmdk files,
> and if I write to them, the speed is 20-40MB/s and I see 12 load on linux.
> Can someone help me?
>
> Thanks!
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gluster 3.7.5 low speed on write and heavy load

2015-10-16 Thread Alastair Neil
I'm getting confused: 120 mb/s is less than 20 MB/s, unless you mean
120MB/s?

On 16 October 2015 at 16:26, Kandalf ® <tin...@yahoo.com> wrote:

> No, is real speed. From any computer linux or esxi if I write via NFS with
> dd utility I have this speed. But if I try to write to one raw file like
> esxi does in vmdk, than the speed drops to 20-40mb/s. So my issue is low
> speed when I write into raw image file (using losetup and DD on linux , or
> esxi vmdk - vm guest)
>
>
>
> On Friday, October 16, 2015 8:31 PM, Alastair Neil <ajneil.t...@gmail.com>
> wrote:
>
>
> is that a typo 120mb/s?  Is this volume replica 2 or distributed or both?
>
>
> On 16 October 2015 at 08:05, Kandalf ® <tin...@yahoo.com> wrote:
>
> Hi,
>
> I have 2 server with centos 7 and I want to make a replicate storage
> cluster for the esxi.
> In both server I have 4 disk in raid 5 mdadm. The write speed on that
> device is ~250-300MB/s all the time.
> I create xfs file system on the mdadm and I exported that gluster volume
> via native fuse nfs v3 to the esxi.
> Read seed is great but when I try to write to the guest vms I see
> 20-40MB/s only. I try the test also with gluster distributed with only one
> brick to one server. If I take one dd write from esxi to the cluster I
> receive 120mb/s the full ethernet link speed. I also try to mount to linux
> that volumes and try dd write and the write speed is 120mb/s.
> But if I try to mount an raw file image with losetup, or use vmdk files,
> and if I write to them, the speed is 20-40MB/s and I see 12 load on linux.
> Can someone help me?
>
> Thanks!
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gluster 3.7.5 low speed on write and heavy load

2015-10-16 Thread Alastair Neil
I believe if this is replica 3 then you would expect 40 MB/s: the client
writes each block to every brick, so the usable write bandwidth is roughly
the 1 Gb/s link speed (~120 MB/s) divided by the replica count.  You only
mentioned 2 servers, so I assume this is at most replica 2, and 60 MB/s is
what I would expect.  That is still 1.5-3x the numbers you are reporting,
though.

On 16 October 2015 at 16:43, Kandalf ® <tin...@yahoo.com> wrote:

> I also tried more gluster version but the same situation.If I use iscsi
> for example to export a raw image... I have 120MB/s write speed.
>
>
>
> On Friday, October 16, 2015 11:42 PM, Kandalf ® <tin...@yahoo.com> wrote:
>
>
> If I write with DD in linux locally to the same cluster node... or via
> network I have 120MB/s. the full speed of te 1 Gbps.
> But if I write to an RAW Image File... like vmdk in vmware... I have
> average of 20-40MB/s only.
>
>
>
> On Friday, October 16, 2015 11:31 PM, Alastair Neil <ajneil.t...@gmail.com>
> wrote:
>
>
> I'm getting confused, 120 mb/s is less that 20 MB/s  unless you mean
> 120MB/s?
>
> On 16 October 2015 at 16:26, Kandalf ® <tin...@yahoo.com> wrote:
>
> No, is real speed. From any computer linux or esxi if I write via NFS with
> dd utility I have this speed. But if I try to write to one raw file like
> esxi does in vmdk, than the speed drops to 20-40mb/s. So my issue is low
> speed when I write into raw image file (using losetup and DD on linux , or
> esxi vmdk - vm guest)
>
>
>
> On Friday, October 16, 2015 8:31 PM, Alastair Neil <ajneil.t...@gmail.com>
> wrote:
>
>
> is that a typo 120mb/s?  Is this volume replica 2 or distributed or both?
>
>
> On 16 October 2015 at 08:05, Kandalf ® <tin...@yahoo.com> wrote:
>
> Hi,
>
> I have 2 server with centos 7 and I want to make a replicate storage
> cluster for the esxi.
> In both server I have 4 disk in raid 5 mdadm. The write speed on that
> device is ~250-300MB/s all the time.
> I create xfs file system on the mdadm and I exported that gluster volume
> via native fuse nfs v3 to the esxi.
> Read seed is great but when I try to write to the guest vms I see
> 20-40MB/s only. I try the test also with gluster distributed with only one
> brick to one server. If I take one dd write from esxi to the cluster I
> receive 120mb/s the full ethernet link speed. I also try to mount to linux
> that volumes and try dd write and the write speed is 120mb/s.
> But if I try to mount an raw file image with losetup, or use vmdk files,
> and if I write to them, the speed is 20-40MB/s and I see 12 load on linux.
> Can someone help me?
>
> Thanks!
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
>
>
>
>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Suggested method for replacing an entire node

2015-10-08 Thread Alastair Neil
I think you should back up /var/lib/glusterd and then restore it after the
reinstall and installation of the glusterfs packages.  Assuming the node
will have the same hostname and IP addresses and you are installing the
same version of the gluster bits, I think it should be fine.  I am assuming
you are not using SSL for the connections; if you are, you will need to back
up the keys for that too.
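
A minimal sketch of the backup I have in mind (the SSL paths below are the
usual defaults and only matter if SSL is actually enabled - adjust to your
setup):

  tar czf glusterd-config-$(hostname).tar.gz /var/lib/glusterd

  # only if SSL/TLS is enabled for management or data connections
  tar czf gluster-ssl-$(hostname).tar.gz \
      /etc/ssl/glusterfs.pem /etc/ssl/glusterfs.key /etc/ssl/glusterfs.ca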

-Alastair

On 8 October 2015 at 00:12, Atin Mukherjee  wrote:

>
>
> On 10/07/2015 10:28 PM, Gene Liverman wrote:
> > I want to replace my existing CentOS 6 nodes with CentOS 7 ones. Is
> > there a recommended way to go about this from the perspective of
> > Gluster? I am running a 3 node replicated cluster (3 servers each with 1
> > brick). In case it makes a difference, my bricks are on separate drives
> > formatted as XFS so it is possible that I can do my OS reinstall without
> > wiping out the data on two nodes (the third had a hardware failure so it
> > will be fresh from the ground up).
> That's possible. You could do the re-installation one at a time. Once
> the node comes back online self heal daemon will take care of healing
> the data. AFR team can correct me if I am wrong.
>
> Thanks,
> Atin
> >
> >
> >
> >
> > Thanks,
> > *Gene Liverman*
> > Systems Integration Architect
> > Information Technology Services
> > University of West Georgia
> > glive...@westga.edu 
> >
> > ITS: Making Technology Work for You!
> >
> >
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-users
> >
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Suggested method for replacing an entire node

2015-10-08 Thread Alastair Neil
Ahh that is good to know.

On 8 October 2015 at 09:50, Atin Mukherjee <atin.mukherje...@gmail.com>
wrote:

> -Atin
> Sent from one plus one
> On Oct 8, 2015 7:17 PM, "Alastair Neil" <ajneil.t...@gmail.com> wrote:
> >
> > I think you should back up /var/lib/glusterd and then restore it after
> the reinstall and installation of glusterfs packages.  Assuming the node
> will have the same hostname and ip addresses and you are installing the
> same version gluster bits, I think it should be fine.  I am assuming you
> are not using ssl for the connections if so you will need to back up the
> keys for that too.
> If the same machine is used without a hostname/IP change, backing up the
> glusterd configuration *is not* needed, as syncing the configuration will be
> taken care of by peer handshaking.
>
> >
> > -Alastair
> >
> > On 8 October 2015 at 00:12, Atin Mukherjee <amukh...@redhat.com> wrote:
> >>
> >>
> >>
> >> On 10/07/2015 10:28 PM, Gene Liverman wrote:
> >> > I want to replace my existing CentOS 6 nodes with CentOS 7 ones. Is
> >> > there a recommended way to go about this from the perspective of
> >> > Gluster? I am running a 3 node replicated cluster (3 servers each
> with 1
> >> > brick). In case it makes a difference, my bricks are on separate
> drives
> >> > formatted as XFS so it is possible that I can do my OS reinstall
> without
> >> > wiping out the data on two nodes (the third had a hardware failure so
> it
> >> > will be fresh from the ground up).
> >> That's possible. You could do the re-installation one at a time. Once
> >> the node comes back online self heal daemon will take care of healing
> >> the data. AFR team can correct me if I am wrong.
> >>
> >> Thanks,
> >> Atin
> >> >
> >> >
> >> >
> >> >
> >> > Thanks,
> >> > *Gene Liverman*
> >> > Systems Integration Architect
> >> > Information Technology Services
> >> > University of West Georgia
> >> > glive...@westga.edu <mailto:glive...@westga.edu>
> >> >
> >> > ITS: Making Technology Work for You!
> >> >
> >> >
> >> > ___
> >> > Gluster-users mailing list
> >> > Gluster-users@gluster.org
> >> > http://www.gluster.org/mailman/listinfo/gluster-users
> >> >
> >> ___
> >> Gluster-users mailing list
> >> Gluster-users@gluster.org
> >> http://www.gluster.org/mailman/listinfo/gluster-users
> >
> >
> >
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-users
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Scaling a network - Error while adding new brick to a existing cluster.

2015-09-30 Thread Alastair Neil
Yes, it seems likely you have a duplicate, conflicting gluster
configuration.  Fixing it should be fairly easy: clear out
/var/lib/glusterd, and then reinstall the glusterfs packages.  You will
also have to clear all the extended attributes and files from any brick
directories; it may be easier to reformat the filesystem if they are on
separate block devices.  I assume the system has a different hostname?  You
should then be able to peer probe the new server from one of the cluster
members.
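
Roughly the sequence I have in mind - a sketch, assuming the new host has its
own hostname/IP and you want to reuse the existing brick directories rather
than reformat them:

  # on the cloned server
  service glusterd stop
  rm -rf /var/lib/glusterd/*
  yum reinstall glusterfs-server    # or however you reinstall the packages

  # for every brick directory you intend to reuse
  setfattr -x trusted.glusterfs.volume-id /path/to/brick
  setfattr -x trusted.gfid /path/to/brick
  rm -rf /path/to/brick/.glusterfs

  service glusterd start

  # then, from one of the existing cluster members
  gluster peer probe <new-hostname>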



On 29 September 2015 at 01:00, Sreejith K B  wrote:

> Hi all,
>
> While i am trying to add a new brick in to an existing cluster it
> fails, our servers are on linode, i think the problem occurs because, we
> created new server by copying/cloning an existing server that already part
> of that cluster in to a new server space. So i think the already existing
> gluster configuration causes this issue, i need to find-out a solution for
> this situation. please confirm the error reason and a working solution.
>
> regards,
> sreejith.
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterFS 3.7

2015-09-22 Thread Alastair Neil
glusterd should handle syncing any changes you make with the "gluster"
command to the peers.  Obviously, if you make local changes to the volume
file on one server you are likely to break things, unless you copy or rsync
the changes to the other server.

On 22 September 2015 at 04:02, Andreas Hollaus  wrote:

> Hi,
>
> Are there any restrictions as to when I'm allowed to make changes to the
> GlusterFS
> volume (for instance: start/stop volume, add/remove brick or peer)? How
> will it
> handle such changes when one of my two replicated servers is down? How
> will GlusterFS
> know which set of configuration files it can trust when the other server
> is connected
> again and the files will contain different information about the volume?
> If these
> were data files on the GlusterFS volume that would have been handled by
> the extended
> file attributes, but how about the GlusterFS configuration itself?
>
> Regards
> Andreas
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] multiple volumes on same brick?

2015-09-22 Thread Alastair Neil
You mean create multiple top-level directories on a mounted filesystem
and use each directory as a brick in a different volume?

If the underlying block device is a thinly provisioned lvm2 volume and you
want to use snapshots, you cannot do this.  Otherwise I don't think there is
a technical reason you can't, other than the obvious one: it is in general
not a good idea, as use in one volume may impact the available storage for
a different volume, and it will not be transparent how much each volume is
consuming.

I have three servers, each with 40 TB of disk in an external SAS enclosure.
The storage is presented as a RAID 6 VD; I then make that a physical volume
and create an lvm2 volume group on top.  Individual bricks are thinly
provisioned logical volumes.
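
For concreteness, the rough shape of that setup - device names and sizes are
placeholders, and the mkfs options are just the ones commonly recommended for
gluster bricks:

  pvcreate /dev/sdb                                    # the RAID 6 virtual disk
  vgcreate vg_bricks /dev/sdb
  lvcreate -L 35T -T vg_bricks/thinpool                # one thin pool per VG
  lvcreate -V 10T -T vg_bricks/thinpool -n brick_vol1  # one thin LV per brick
  mkfs.xfs -i size=512 /dev/vg_bricks/brick_vol1
  mkdir -p /export/brick_vol1
  mount /dev/vg_bricks/brick_vol1 /export/brick_vol1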



On 21 September 2015 at 16:37, Gluster Admin  wrote:

> Gluster users,
>
> We have a multiple node setup where each server has a single XFS brick
> (underlying storage is hardware battery backed raid6).  Are there any
> issues creating multiple gluster volumes using the same underlying bricks
> from a performance or management standpoint?
>
> or would it be better to setup many smaller bricks via RAID1 to support
> multiple volumes.
>
>
> Current setup:
>
> NODE:/brick1(raid6 8 disk per brick)
>
> Smaller Brick setup:
>
> NODE:/brick1 /brick2 /brick3 /brick4  (raid1 2 disk per brick)
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] autosnap feature?

2015-09-15 Thread Alastair Neil
Not really.  That is useful, as it distributes the snapshot control over
all the cluster members, but I am looking for the ability to specify a
snapshot schedule like this:

frequent snapshots every 15 mins, keeping 4 snapshots
hourly snapshots every hour, keeping 24 snapshots
daily snapshots every day, keeping 31 snapshots
weekly snapshots every week, keeping 7 snapshots
monthly snapshots every month, keeping 12 snapshots.

Clearly this could be handled via the scheduling as described, but the
feature that is missing is user-friendly labeling, so that users don't have
to parse long timestamps in the snapshot name to figure out which is the
most recent snapshot.  Ideally they could have labels like "Now", "Fifteen
Minutes Ago", "Thirty Minutes Ago", "Sunday", "Last Week", etc.  The system
should handle rotating the labels automatically when necessary.  So some
sort of ability to create and manipulate labels on snapshots, and then
expose them as links in the .snaps directory, would probably be a start.
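
To illustrate just the retention half, the pruning I am scripting around
gcron.py boils down to something like this - it assumes the scheduler's
snapshot names sort chronologically because of the embedded GMT timestamp:

  # keep only the newest 24 hourly snapshots of the "homes" volume
  gluster snapshot list homes | grep Hourly | sort | head -n -24 |
  while read snap; do
      echo y | gluster snapshot delete "$snap"
  done

The labeling part is what has no obvious hook today.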

-Alastair



On 15 September 2015 at 01:35, Rajesh Joseph <rjos...@redhat.com> wrote:

>
>
> - Original Message -
> > From: "Alastair Neil" <ajneil.t...@gmail.com>
> > To: "gluster-users" <gluster-users@gluster.org>
> > Sent: Friday, September 11, 2015 2:24:32 AM
> > Subject: [Gluster-users] autosnap feature?
> >
> > Wondering if there were any plans for a fexible and easy to use
> snapshotting
> > feature along the lines of zfs autosnap scipts. I imagine at the least it
> > would need the ability to rename snapshots.
> >
>
> Are you looking for something like this ?
>
> http://www.gluster.org/community/documentation/index.php/Features/Scheduling_of_Snapshot
>
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterFS disk full.

2015-09-11 Thread Alastair Neil
If you have an active snapshot, I expect the space will not be freed until
you remove the snapshot.

On 11 September 2015 at 01:44, Fujii Yasuhiro  wrote:

> Hi.
>
> I have a question.
>
> GlusterFS hdd space can't be recovered automatically after glusterfs
> is disk full and I delete files.
> It will be recovered restarting glusterfsd.
> I can find deleted file and glusterfsd do not relese the file I don't know
> why.
> Is this OK?
>
> [version]
> CentOS release 6.7 (Final)
> Linux web2 2.6.32-573.3.1.el6.x86_64 #1 SMP Thu Aug 13 22:55:16 UTC
> 2015 x86_64 x86_64 x86_64 GNU/Linux
> glusterfs-3.6.5-1.el6.x86_64
> glusterfs-api-3.6.5-1.el6.x86_64
> glusterfs-fuse-3.6.5-1.el6.x86_64
> glusterfs-server-3.6.5-1.el6.x86_64
> glusterfs-libs-3.6.5-1.el6.x86_64
> glusterfs-cli-3.6.5-1.el6.x86_6
>
> [test]
> dd if=/dev/zero of=./test.dat bs=1M count=10
> dd: writing `./test.dat': Input/output error
> dd: closing output file `./test.dat': No space left on device
>
> [root@web2 www_virtual]# df
> Filesystem   1K-blocks  Used Available Use% Mounted on
> /dev/xvda1 8124856   2853600   4851880  38% /
> tmpfs   509256 0509256   0% /dev/shm
> /dev/xvdf1   101441468 101419972 0 100% /glusterfs/vol01
> web2:/vol_replica_01 101441408 101420032 0 100% /mnt/glusterfs
>
> [root@web2 www_virtual]# rm test.dat
> rm: remove regular file `test.dat'? y
>
> [root@web2 www_virtual]# sync
> [root@web2 www_virtual]# df
> Filesystem   1K-blocks  Used Available Use% Mounted on
> /dev/xvda1 8124856   2856744   4848736  38% /
> tmpfs   509256 0509256   0% /dev/shm
> /dev/xvdf1   101441468 101419972 0 100% /glusterfs/vol01
> web2:/vol_replica_01 101441408 101420032 0 100% /mnt/glusterfs
>
> Glusterfs is still disk full.
> The other glusterfs server is same.
>
> [find the deleted file]
> (server web2)
> [root@web2 www_virtual]# ls -l /proc/*/fd/* | grep deleted
> lr-x-- 1 root root 64 Sep 11 14:15 /proc/1753/fd/14 ->
> /var/lib/glusterd/snaps/missed_snaps_list (deleted)
> lrwx-- 1 root root 64 Sep 11 14:17 /proc/1775/fd/21 ->
>
> /glusterfs/vol01/brick/.glusterfs/1f/c2/1fc2b7b4-ecd6-4eff-a874-962c2283823f
> (deleted)
>
> [root@web2 .glusterfs]# ps ax | grep 1775 | grep -v grep
>  1775 ?Ssl5:43 /usr/sbin/glusterfsd -s web2 --volfile-id
> vol_replica_01.web2.glusterfs-vol01-brick -p
> /var/lib/glusterd/vols/vol_replica_01/run/web2-glusterfs-vol01-brick.pid
> -S /var/run/1cf98ee59b5dff8cfd793b8ec39851db.socket --brick-name
> /glusterfs/vol01/brick -l
> /var/log/glusterfs/bricks/glusterfs-vol01-brick.log --xlator-option
> *-posix.glusterd-uuid=029cf626-935f-4546-a8df-f9d79a6959da
> --brick-port 49152 --xlator-option
> vol_replica_01-server.listen-port=49152
>
> [root@web2 www_virtual]# lsof -p 1775
> COMMANDPID USER   FD   TYPE DEVICESIZE/OFFNODE NAME
> glusterfs 1775 root  cwdDIR  202,14096   2 /
> glusterfs 1775 root  rtdDIR  202,14096   2 /
> glusterfs 1775 root  txtREG  202,1   78056  266837
> /usr/sbin/glusterfsd
> glusterfs 1775 root  memREG  202,18560 854
> /usr/lib64/glusterfs/3.6.5/auth/login.so
> glusterfs 1775 root  memREG  202,1   13248 853
> /usr/lib64/glusterfs/3.6.5/auth/addr.so
> glusterfs 1775 root  memREG  202,1  2249204334
> /usr/lib64/glusterfs/3.6.5/xlator/protocol/server.so
> glusterfs 1775 root  memREG  202,1  1187602421
> /usr/lib64/glusterfs/3.6.5/xlator/debug/io-stats.so
> glusterfs 1775 root  memREG  202,1  1196882507
> /usr/lib64/glusterfs/3.6.5/xlator/features/quota.so
> glusterfs 1775 root  memREG  202,1  1397842491
> /usr/lib64/glusterfs/3.6.5/xlator/features/marker.so
> glusterfs 1775 root  memREG  202,1   423042475
> /usr/lib64/glusterfs/3.6.5/xlator/features/index.so
> glusterfs 1775 root  memREG  202,1   343602447
> /usr/lib64/glusterfs/3.6.5/xlator/features/barrier.so
> glusterfs 1775 root  memREG  202,1   469362561
> /usr/lib64/glusterfs/3.6.5/xlator/performance/io-threads.so
> glusterfs 1775 root  memREG  202,1  1079842476
> /usr/lib64/glusterfs/3.6.5/xlator/features/locks.so
> glusterfs 1775 root  memREG  202,1   584403940
> /usr/lib64/glusterfs/3.6.5/xlator/system/posix-acl.so
> glusterfs 1775 root  memREG  202,1   90880  266322
> /lib64/libgcc_s-4.4.7-20120601.so.1
> glusterfs 1775 root  memREG  202,1   962002460
> /usr/lib64/glusterfs/3.6.5/xlator/features/changelog.so
> glusterfs 1775 root  memREG  202,13944  272350
> /lib64/libaio.so.1.0.1
> glusterfs 1775 root  memREG 

[Gluster-users] autosnap feature?

2015-09-10 Thread Alastair Neil
Wondering if there were any plans for a fexible and  easy to use
snapshotting feature along the lines of zfs autosnap scipts.  I imagine at
the least it would need the ability to rename snapshots.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] questions about snapshots and cifs

2015-09-09 Thread Alastair Neil
I have been working on providing user-serviceable snapshots, and I am
confused by the documentation about allowing access from Windows Explorer:

For snapshots to be accessible from windows, below 2 options can be used.
> A) The glusterfs plugin for samba should give the option
> "snapdir-entry-path" while starting. The option is an indication to
> glusterfs, that samba is loading it and the value of the option should be
> the path that is being used as the share for windows. Ex: Say, there is a
> glusterfs volume and a directory called "export" from the root of the
> volume is being used as the samba share, then samba has to load glusterfs
> with this option as well.
> ret = glfs_set_xlator_option(fs, "*-snapview-client",
>  "snapdir-entry-path", "/export");
> The xlator option "snapdir-entry-path" is not exposed via volume set
> options and cannot be changed from the CLI. It's an option that has to be provided
> at the time of mounting glusterfs or when samba loads glusterfs.


How do you set "snapdir-entry-path"?  The docs say it needs to be set for
the gluster CIFS plugin to correctly pass the information to the client, so
I assume there must be some volume setting or gluster volume set command to
configure this?

Thanks, Alastair
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Cannot upgrade from 3.6.3 to 3.7.3

2015-08-28 Thread Alastair Neil
Did you mean the "option rpc-auth-allow-insecure on" setting?  I just did a
rolling upgrade from 3.6 to 3.7 without issue; however, I had enabled
insecure connections because I had some clients running 3.7.
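
For the record, the settings I had in place were along these lines (the same
ones called out in the 3.7.1 release notes mentioned below):

  gluster volume set <volname> server.allow-insecure on

  # plus, on each server, add this to /etc/glusterfs/glusterd.vol and
  # restart glusterd:
  #   option rpc-auth-allow-insecure on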

-Alastair


On 27 August 2015 at 10:04, Andreas Mather andr...@allaboutapps.at wrote:

 Hi Humble!

 Thanks for the reply. The docs do not mention anything related to 3.6-3.7
 upgrade that applies to my case.

 I could resolve the issue in the meantime by doing the steps mentioned in
 the 3.7.1 release notes (
 https://gluster.readthedocs.org/en/latest/release-notes/3.7.1/).

 Thanks,

 Andreas


 On Thu, Aug 27, 2015 at 3:22 PM, Humble Devassy Chirammal 
 humble.deva...@gmail.com wrote:

 Hi Andreas,

 
 Is it even possible to perform a rolling upgrade?
 


 The GlusterFS upgrade process is documented  @
 https://gluster.readthedocs.org/en/latest/Upgrade-Guide/README/



 --Humble


 On Thu, Aug 27, 2015 at 4:57 PM, Andreas Mather andr...@allaboutapps.at
 wrote:

 Hi All!

 I wanted to do a rolling upgrade of gluster from 3.6.3 to 3.7.3, but
 after the upgrade, the updated node won't connect.

 The cluster has 4 nodes (vhost[1-4]) and 4 volumes (vol[1-4]) with 2
 replicas each:
 vol1: vhost1/brick1, vhost2/brick2
 vol2: vhost2/brick1, vhost1/brick2
 vol3: vhost3/brick1, vhost4/brick2
 vol4: vhost4/brick1, vhost3/brick2

 I'm trying to start the upgrade on vhost4. After restarting glusterd,
 peer status shows all other peers as disconnected, the log has repeated
 entries like this:

 [2015-08-27 10:59:56.982254] E [MSGID: 106167]
 [glusterd-handshake.c:2078:__glusterd_peer_dump_version_cbk] 0-management:
 Error through RPC layer, retry again later
 [2015-08-27 10:59:56.982335] E [rpc-clnt.c:362:saved_frames_unwind] (--
 /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7f1a7a9579e6] (--
 /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f1a7a7229be] (--
 /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f1a7a722ace] (--
 /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7f1a7a72447c] (--
 /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f1a7a724c38] )
 0-management: forced unwinding frame type(GF-DUMP) op(NULL(2)) called at
 2015-08-27 10:59:56.981550 (xid=0x2)
 [2015-08-27 10:59:56.982346] W [rpc-clnt-ping.c:204:rpc_clnt_ping_cbk]
 0-management: socket disconnected
 [2015-08-27 10:59:56.982359] I [MSGID: 106004]
 [glusterd-handler.c:5051:__glusterd_peer_rpc_notify] 0-management: Peer
 vhost3-int (72e2078d-1ed9-4cdd-aad2-c86e418746d1), in state Peer in
 Cluster, has disconnected from glusterd.
 [2015-08-27 10:59:56.982491] W
 [glusterd-locks.c:677:glusterd_mgmt_v3_unlock] (--
 /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7f1a7a9579e6] (--
 /usr/lib64/glusterfs/3.7.3/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x541)[0x7f1a6f55ee91]
 (--
 /usr/lib64/glusterfs/3.7.3/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162)[0x7f1a6f4c6972]
 (--
 /usr/lib64/glusterfs/3.7.3/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c)[0x7f1a6f4bc90c]
 (-- /lib64/libgfrpc.so.0(rpc_clnt_notify+0x90)[0x7f1a7a724c80] )
 0-management: Lock for vol vol1 not held
 [2015-08-27 10:59:56.982504] W [MSGID: 106118]
 [glusterd-handler.c:5073:__glusterd_peer_rpc_notify] 0-management: Lock not
 released for vol1
 [2015-08-27 10:59:56.982608] W
 [glusterd-locks.c:677:glusterd_mgmt_v3_unlock] (--
 /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7f1a7a9579e6] (--
 /usr/lib64/glusterfs/3.7.3/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x541)[0x7f1a6f55ee91]
 (--
 /usr/lib64/glusterfs/3.7.3/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162)[0x7f1a6f4c6972]
 (--
 /usr/lib64/glusterfs/3.7.3/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c)[0x7f1a6f4bc90c]
 (-- /lib64/libgfrpc.so.0(rpc_clnt_notify+0x90)[0x7f1a7a724c80] )
 0-management: Lock for vol vol2 not held
 [2015-08-27 10:59:56.982618] W [MSGID: 106118]
 [glusterd-handler.c:5073:__glusterd_peer_rpc_notify] 0-management: Lock not
 released for vol2
 [2015-08-27 10:59:56.982728] W
 [glusterd-locks.c:677:glusterd_mgmt_v3_unlock] (--
 /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7f1a7a9579e6] (--
 /usr/lib64/glusterfs/3.7.3/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x541)[0x7f1a6f55ee91]
 (--
 /usr/lib64/glusterfs/3.7.3/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162)[0x7f1a6f4c6972]
 (--
 /usr/lib64/glusterfs/3.7.3/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c)[0x7f1a6f4bc90c]
 (-- /lib64/libgfrpc.so.0(rpc_clnt_notify+0x90)[0x7f1a7a724c80] )
 0-management: Lock for vol vol3 not held
 [2015-08-27 10:59:56.982739] W [MSGID: 106118]
 [glusterd-handler.c:5073:__glusterd_peer_rpc_notify] 0-management: Lock not
 released for vol3
 [2015-08-27 10:59:56.982844] W
 [glusterd-locks.c:677:glusterd_mgmt_v3_unlock] (--
 /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7f1a7a9579e6] (--
 /usr/lib64/glusterfs/3.7.3/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x541)[0x7f1a6f55ee91]
 (--
 

Re: [Gluster-users] backupvolfile-server fqdn vs short

2015-07-23 Thread Alastair Neil
Yes, certain - the mount command is in the fstab and it uses all fully
qualified domain names.
BTW Atin, Gmail showed me an empty reply - I had to use "view original" to
see your question, not sure why; are you sending plain text?


gluster0.vsnet.gmu.edu:/digitalcorpora /var/www/digitalcorpora glusterfs
 _netdev,use-readdirp=no,backupvolfile-server=gluster1.vsnet.gmu.edu:gluster-2.vsnet.gmu.edu 0 0


On 23 July 2015 at 12:30, Atin Mukherjee atin.mukherje...@gmail.com wrote:

 -Atin
 Sent from one plus one
 On Jul 23, 2015 9:07 PM, Alastair Neil ajneil.t...@gmail.com wrote:
 
  I just had a curious failure.  I have a gluster 3.6.3 replica 3 volume
 which was mounted via an 3.6.3 client  from one of the nodes with the other
 two specified in the backupvolfile-server mount option.  In the fstab entry
 all the nodes are referenced by their fully qualified domain names.
 
  When I rebooted the primary node, the mount became detached because the
 client was trying to use the short name to communicate with the backup
 nodes and failing to resolve it.  This was fixed by adding the domain to
 the search in resolv.conf.  However I am curious as to why it should try
 and use the short name instead of the fqdn specified in the fstab entry?
 The nodes all have peer entries for hostname, ip address and fqdn.
 Are you sure you didn't use short name in your mount command?
 
  Thanks,  Alastair
 
 
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] backupvolfile-server fqdn vs short

2015-07-23 Thread Alastair Neil
I just had a curious failure.  I have a gluster 3.6.3 replica 3 volume
which was mounted via an 3.6.3 client  from one of the nodes with the other
two specified in the backupvolfile-server mount option.  In the fstab entry
all the nodes are referenced by their fully qualified domain names.

When I rebooted the primary node, the mount became detached because the
client was trying to use the short name to communicate with the backup
nodes and failing to resolve it.  This was fixed by adding the domain to
the search in resolv.conf.  However I am curious as to why it should try
and use the short name instead of the fqdn specified in the fstab entry?
The nodes all have peer entries for hostname, ip address and fqdn.

Thanks,  Alastair
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 3.6.3 split brain on web browser cache dir w. replica 3 volume

2015-06-02 Thread Alastair Neil
Cheers that's a great help.  I am assuming the extra
trusted.afr.volname-client- entries are left over from the removed peers,
can I expect they will disappear after glusterfsd gets restarted?



On 1 June 2015 at 23:49, Ravishankar N ravishan...@redhat.com wrote:



 On 06/01/2015 08:15 PM, Alastair Neil wrote:


  I have a replica 3 volume I am using to serve my home directory.  I have
 notices a couple of split-brains recently on files used by browsers(for the
 most recent see below, I had an earlier one on
 .config/google-chrome/Default/Session Storage/) .  When I was running
 replica 2 I don't recall seeing more than two entries of the form:
 trusted.afr.volname.client-?.  I did have two other servers that I have
 removed from service recently but I am curious to know if there is some way
 to map  what the server reports as trusted.afr.volname-client-? to a
 hostname?



 Your volfile
 (/var/lib/glusterd/vols/volname/trusted-volname.tcp-fuse.vol) should
 contain which brick (remote-subvolume + remote-host) a given trusted.afr*
 maps to.
 Hope that helps,
 Ravi


  Thanks, Alastair


  # gluster volume heal homes info
 Brick gluster-2:/export/brick2/home/
 /a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair - Is in split-brain
 Number of entries: 1
 Brick gluster1:/export/brick2/home/
 /a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair - Is in split-brain
 Number of entries: 1
 Brick gluster0:/export/brick2/home/
 /a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair - Is in split-brain
 Number of entries: 1
 # getfattr -d -m . -e hex
 /export/brick2/home/a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair
 getfattr: Removing leading '/' from absolute path names
 # file:
 export/brick2/home/a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair

 security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
 trusted.afr.dirty=0x
 trusted.afr.homes-client-0=0x
 trusted.afr.homes-client-1=0x
 trusted.afr.homes-client-2=0x
 trusted.afr.homes-client-3=0x0002
 trusted.afr.homes-client-4=0x
 trusted.gfid=0x3ae398227cea4f208d7652dbfb93e3e5
 trusted.glusterfs.dht=0x0001
 trusted.glusterfs.quota.dirty=0x3000

 trusted.glusterfs.quota.edf41dc8-2122-4aa3-bc20-29225564ca8c.contri=0x162d2200
 trusted.glusterfs.quota.size=0x162d2200




 ___
 Gluster-users mailing 
 listGluster-users@gluster.orghttp://www.gluster.org/mailman/listinfo/gluster-users



___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] 3.6.3 split brain on web browser cache dir w. replica 3 volume

2015-06-01 Thread Alastair Neil
I have a replica 3 volume I am using to serve my home directory.  I have
noticed a couple of split-brains recently on files used by browsers (for the
most recent see below; I had an earlier one on
.config/google-chrome/Default/Session Storage/).  When I was running
replica 2 I don't recall seeing more than two entries of the form
trusted.afr.volname.client-?.  I did have two other servers that I have
removed from service recently, but I am curious to know if there is some way
to map what the server reports as trusted.afr.volname-client-? to a
hostname?

Thanks, Alastair


# gluster volume heal homes info
 Brick gluster-2:/export/brick2/home/
 /a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair - Is in split-brain
 Number of entries: 1
 Brick gluster1:/export/brick2/home/
 /a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair - Is in split-brain
 Number of entries: 1
 Brick gluster0:/export/brick2/home/
 /a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair - Is in split-brain
 Number of entries: 1
 # getfattr -d -m . -e hex
 /export/brick2/home/a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair
 getfattr: Removing leading '/' from absolute path names
 # file:
 export/brick2/home/a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair
 security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
 trusted.afr.dirty=0x
 trusted.afr.homes-client-0=0x
 trusted.afr.homes-client-1=0x
 trusted.afr.homes-client-2=0x
 trusted.afr.homes-client-3=0x0002
 trusted.afr.homes-client-4=0x
 trusted.gfid=0x3ae398227cea4f208d7652dbfb93e3e5
 trusted.glusterfs.dht=0x0001
 trusted.glusterfs.quota.dirty=0x3000

 trusted.glusterfs.quota.edf41dc8-2122-4aa3-bc20-29225564ca8c.contri=0x162d2200
 trusted.glusterfs.quota.size=0x162d2200
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] fault tolerance

2015-05-20 Thread Alastair Neil
For the clients not to notice the fault you would need to have three
servers in replica 3; otherwise the filesystem will become read-only with
one node down.

On 17 May 2015 at 03:25, Markus Ueberall markus.ueber...@gmail.com wrote:

 Dear Carlos,

 Have a look at http://bit.ly/gadmstp (section Mounting options); you
 can use the backupvolfile-server option for this.

 Kind regards, Markus

 Am 16.05.2015 um 22:17 schrieb Carlos J. Herrera:
  Hi people,
  Now I am using the version of gluster 3.6.2 and I want configure the
  system for fault tolerance. The point is that I want have two server
  in replication mode and if one server down the client do not note the
  fault. How I need import the system in the client for this purpose.


 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] fault tolerance

2015-05-20 Thread Alastair Neil
True, but since split-brain resolution is tedious, I assumed he would have
the quorum set to 51%
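
For reference, the quorum options I had in mind are along these lines (names
as I recall them - verify against "gluster volume set help" on your version):

  gluster volume set <volname> cluster.quorum-type auto          # client-side quorum
  gluster volume set <volname> cluster.server-quorum-type server
  gluster volume set all cluster.server-quorum-ratio 51%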

On 21 May 2015 at 00:57, Joe Julian j...@julianfamily.org wrote:



 On 05/20/2015 09:51 PM, Alastair Neil wrote:

 For the client's not to notice the fault you would need to have three
 servers in replica 3, otherwise the filesystem will become read-only with
 one node down.


 False, unless you specifically enable features to do that.


 On 17 May 2015 at 03:25, Markus Ueberall markus.ueber...@gmail.com
 wrote:

 Dear Carlos,

 Have a look at http://bit.ly/gadmstp (section Mounting options); you
 can use the backupvolfile-server option for this.

 Kind regards, Markus

 Am 16.05.2015 um 22:17 schrieb Carlos J. Herrera:
  Hi people,
  Now I am using the version of gluster 3.6.2 and I want configure the
  system for fault tolerance. The point is that I want have two server
  in replication mode and if one server down the client do not note the
  fault. How I need import the system in the client for this purpose.


 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users




 ___
 Gluster-users mailing 
 listGluster-users@gluster.orghttp://www.gluster.org/mailman/listinfo/gluster-users



 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Unable to add brick from new fresh installed server into upgraded cluster

2014-11-25 Thread Alastair Neil
I have a 2-server CentOS 6 replicated cluster which started out as version
3.3 and has been progressively updated, and is now on version 3.6.1.
Yesterday I added a new freshly installed CentOS 6.6 host and wanted to
convert one of my volumes to replica 3; however, I was unable to add the
brick, as it reported that all the brick hosts had to be at version 03060.
Presumably some minimum compatibility version is set on the volume, but I am
struggling to find where.  The info file under
/var/lib/glusterd/vols/volname has op-version=2, client-op-version=2.

Any suggestions?
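
One thing I am tempted to try - this assumes the cluster op-version simply
needs bumping now that every peer runs 3.6.x, which I have not confirmed, so
corrections welcome:

  gluster volume set all cluster.op-version 30600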


Thanks, Alastair
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Installing glusterfs 3.6 client on EL7

2014-11-07 Thread Alastair Neil
I'm looking for repos to install on EL7 since  RH only supplies 3.4.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] confused about replicated volumes and sparse files

2014-02-20 Thread Alastair Neil
I am trying to understand how to verify that a replicated volume is up to
date.


Here is my scenario.  I have a gluster cluster with two nodes serving vm
images to ovirt.

I have a volume called vm-store with a brick from each of the nodes:

Volume Name: vm-store
 Type: Replicate
 Volume ID: 379e52d3-2622-4834-8aef-b255db1c67af
 Status: Started
 Number of Bricks: 1 x 2 = 2
 Transport-type: tcp
 Bricks:
 Brick1: gluster1:/export/brick0
 Brick2: gluster0:/export/brick0
 Options Reconfigured:
 user.cifs: disable
 nfs.rpc-auth-allow: *
 auth.allow: *
 storage.owner-gid: 36
 storage.owner-uid: 36


The bricks are formatted with xfs using the same options on both servers and
the two servers are identical hardware and OS version and release (CentOS
6.5) with glusterfs v 3.4.2  from bits.gluster.org.

I have a 20GB sparse disk image for a VM but I am confused about why I see
different reported disk usage on each of the nodes:

[root@gluster0 ~]#  du -sh /export/brick0
 48G /export/brick0
 [root@gluster0 ~]# du -sh
 /export/brick0/6d637c7f-a4ab-4510-a0d9-63a04c55d6d8/images/5dfc7c6f-d35d-4831-b2fb-ed9ab8e3392b/5933a44e-77d6-4606-b6a9-bbf7e4235b13
 8.6G
 /export/brick0/6d637c7f-a4ab-4510-a0d9-63a04c55d6d8/images/5dfc7c6f-d35d-4831-b2fb-ed9ab8e3392b/5933a44e-77d6-4606-b6a9-bbf7e4235b13
 [root@gluster1 ~]# du -sh /export/brick0
 52G /export/brick0
 [root@gluster1 ~]# du -sh
 /export/brick0/6d637c7f-a4ab-4510-a0d9-63a04c55d6d8/images/5dfc7c6f-d35d-4831-b2fb-ed9ab8e3392b/5933a44e-77d6-4606-b6a9-bbf7e4235b13
 12G
 /export/brick0/6d637c7f-a4ab-4510-a0d9-63a04c55d6d8/images/5dfc7c6f-d35d-4831-b2fb-ed9ab8e3392b/5933a44e-77d6-4606-b6a9-bbf7e4235b13


Sure enough, stat also shows a different number of blocks:

[root@gluster0 ~]# stat
 /export/brick0/6d637c7f-a4ab-4510-a0d9-63a04c55d6d8/images/5dfc7c6f-d35d-4831-b2fb-ed9ab8e3392b/5933a44e-77d6-4606-b6a9-bbf7e4235b13
   File:
 `/export/brick0/6d637c7f-a4ab-4510-a0d9-63a04c55d6d8/images/5dfc7c6f-d35d-4831-b2fb-ed9ab8e3392b/5933a44e-77d6-4606-b6a9-bbf7e4235b13'
   Size: 21474836480 Blocks: 17927384   IO Block: 4096   regular file
 Device: fd03h/64771d Inode: 1610613256  Links: 2
 Access: (0660/-rw-rw)  Uid: (   36/vdsm)   Gid: (   36/ kvm)
 Access: 2014-02-18 17:06:30.661993000 -0500
 Modify: 2014-02-20 13:29:33.507966199 -0500
 Change: 2014-02-20 13:29:33.507966199 -0500
 [root@gluster1 ~]# stat
 /export/brick0/6d637c7f-a4ab-4510-a0d9-63a04c55d6d8/images/5dfc7c6f-d35d-4831-b2fb-ed9ab8e3392b/5933a44e-77d6-4606-b6a9-bbf7e4235b13
   File:
 `/export/brick0/6d637c7f-a4ab-4510-a0d9-63a04c55d6d8/images/5dfc7c6f-d35d-4831-b2fb-ed9ab8e3392b/5933a44e-77d6-4606-b6a9-bbf7e4235b13'
   Size: 21474836480 Blocks: 24735976   IO Block: 4096   regular file
 Device: fd03h/64771d Inode: 3758096942  Links: 2
 Access: (0660/-rw-rw)  Uid: (   36/vdsm)   Gid: (   36/ kvm)
 Access: 2014-02-20 09:30:38.490724245 -0500
 Modify: 2014-02-20 13:29:39.464913739 -0500
 Change: 2014-02-20 13:29:39.465913754 -0500



Can someone clear up my understanding?
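
For context, the checks I currently rely on are the ones below - part of my
question is whether they are actually sufficient given the differing block
counts:

  gluster volume heal vm-store info      # should list no entries when in sync

  # and on each brick, all-zero trusted.afr pending counters on the image file
  # should mean nothing is waiting to heal
  getfattr -d -m trusted.afr -e hex /export/brick0/<path-to-image-file>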

Thanks, Alastair
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users