Re: [Gluster-users] Quota limits gone after upgrading to 3.8

2017-05-08 Thread Sanoj Unnikrishnan
Hi mabi,

This bug was fixed recently: https://bugzilla.redhat.com/show_bug.cgi?id=1414346.
The fix will be available in the 3.11 release. I plan to backport it to
earlier releases as well.

Your quota limits are still set and honored; it is only the listing that
has gone wrong. Running the list command with a single path should display
the limit on that path. The printing of the list gets messed up when the
last gfid in the quota.conf file is not present in the filesystem (due to
an rmdir without a corresponding remove-limit).

You could use the following workaround to get rid of the issue:
 => Remove exactly the last 17 bytes of "/var/lib/glusterd/vols/<volname>/quota.conf"
  Note: keep a backup of quota.conf for safety
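
For convenience, here is a minimal Python sketch of that workaround (the
<volname> placeholder is yours to fill in; it assumes the 17-byte figure
above applies to your version and it keeps the backup first):

import os
import shutil

conf = "/var/lib/glusterd/vols/<volname>/quota.conf"  # replace <volname>

shutil.copy2(conf, conf + ".bak")   # keep a backup, as advised above
size = os.path.getsize(conf)
with open(conf, "r+b") as f:
    f.truncate(size - 17)           # drop exactly the last 17 bytes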

If this does not solve the issue, please get back to us with:
1) the quota.conf file
2) the output of the list command (when executed with a single path)
3) getfattr -d -m . -e hex <path> | grep limit
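
For item 3, the same check can also be scripted if that is more convenient;
a rough Python sketch (the path is a placeholder, and reading trusted.*
attributes generally requires root):

import os

path = "/path/to/directory"   # placeholder: the directory with the limit set
for name in os.listxattr(path):
    if "limit" in name:
        print(name, os.getxattr(path, name).hex())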

It would be great to have your feedback for quota on this thread (
http://lists.gluster.org/pipermail/gluster-users/2017-April/030676.html)

Thanks & Regards,
Sanoj


On Mon, May 8, 2017 at 7:58 PM, mabi  wrote:

> Hello,
>
> Last week I upgraded my 2-node replica GlusterFS cluster from 3.7.20 to
> 3.8.11, and on one of the volumes I use the quota feature of GlusterFS.
> Unfortunately, I just noticed via the usual command "gluster volume
> quota myvolume list" that all my quotas on that volume are gone. I had
> around 10 different quotas set on different directories.
>
> Does anyone have an idea where the quotas have vanished? Are they gone for
> good, and do I need to re-set them all?
>
> Regards,
> M.
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] disperse volume brick counts limits in RHES

2017-05-08 Thread Alastair Neil
So the bottleneck is that computations with a 16x20 matrix require ~4 times
the cycles? It seems then that there is ample room for improvement, as
there are many linear algebra packages out there that scale better than
O(n×m). Is the healing time dominated by the EC compute time? If Serkan
saw a hard 2x scaling then it seems likely.

-Alastair




On 8 May 2017 at 03:02, Xavier Hernandez  wrote:

> On 05/05/17 13:49, Pranith Kumar Karampuri wrote:
>
>>
>>
>> On Fri, May 5, 2017 at 2:38 PM, Serkan Çoban > > wrote:
>>
>> It is the over all time, 8TB data disk healed 2x faster in 8+2
>> configuration.
>>
>>
>> Wow, that is counter intuitive for me. I will need to explore about this
>> to find out why that could be. Thanks a lot for this feedback!
>>
>
> Matrix multiplication for encoding/decoding of 8+2 is 4 times faster than
> 16+4 (one matrix of 16x16 is composed of 4 submatrices of 8x8); however,
> each matrix operation on a 16+4 configuration takes twice the amount of
> data of an 8+2, so the net effect is that 8+2 is twice as fast as 16+4.
>
> An 8+2 also uses bigger blocks on each brick, processing the same amount
> of data in fewer I/O operations and bigger network packets.
>
> Probably these are the reasons why 16+4 is slower than 8+2.
>
> See my other email for more detailed description.
>
> Xavi
>
>
>>
>>
>> On Fri, May 5, 2017 at 10:00 AM, Pranith Kumar Karampuri
>> > wrote:
>> >
>> >
>> > On Fri, May 5, 2017 at 11:42 AM, Serkan Çoban
>> > wrote:
>> >>
>> >> Healing gets slower as you increase m in m+n configuration.
>> >> We are using 16+4 configuration without any problems other then
>> heal
>> >> speed.
>> >> I tested heal speed with 8+2 and 16+4 on 3.9.0 and see that heals
>> on
>> >> 8+2 is faster by 2x.
>> >
>> >
>> > As you increase number of nodes that are participating in an EC
>> set number
>> > of parallel heals increase. Is the heal speed you saw improved per
>> file or
>> > the over all time it took to heal the data?
>> >
>> >>
>> >>
>> >>
>> >> On Fri, May 5, 2017 at 9:04 AM, Ashish Pandey
>> > wrote:
>> >> >
>> >> > 8+2 and 8+3 configurations are not the limitation but just
>> suggestions.
>> >> > You can create 16+3 volume without any issue.
>> >> >
>> >> > Ashish
>> >> >
>> >> > 
>> >> > From: "Alastair Neil" > >
>> >> > To: "gluster-users" > >
>> >> > Sent: Friday, May 5, 2017 2:23:32 AM
>> >> > Subject: [Gluster-users] disperse volume brick counts limits in
>> RHES
>> >> >
>> >> >
>> >> > Hi
>> >> >
>> >> > we are deploying a large (24node/45brick) cluster and noted
>> that the
>> >> > RHES
>> >> > guidelines limit the number of data bricks in a disperse set to
>> 8.  Is
>> >> > there
>> >> > any reason for this.  I am aware that you want this to be a
>> power of 2,
>> >> > but
>> >> > as we have a large number of nodes we were planning on going
>> with 16+3.
>> >> > Dropping to 8+2 or 8+3 will be a real waste for us.
>> >> >
>> >> > Thanks,
>> >> >
>> >> >
>> >> > Alastair
>> >> >
>> >> >
>> >> > ___
>> >> > Gluster-users mailing list
>> >> > Gluster-users@gluster.org 
>> >> > http://lists.gluster.org/mailman/listinfo/gluster-users
>> 
>> >> >
>> >> >
>> >> > ___
>> >> > Gluster-users mailing list
>> >> > Gluster-users@gluster.org 
>> >> > http://lists.gluster.org/mailman/listinfo/gluster-users
>> 
>> >> ___
>> >> Gluster-users mailing list
>> >> Gluster-users@gluster.org 
>> >> http://lists.gluster.org/mailman/listinfo/gluster-users
>> 
>> >
>> >
>> >
>> >
>> > --
>> > Pranith
>>
>>
>>
>>
>> --
>> Pranith
>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>

Re: [Gluster-users] Elasticsearch facing CorruptIndexException exception with GlusterFs 3.10.1

2017-05-08 Thread Vijay Bellur
On Mon, May 8, 2017 at 1:46 PM, Abhijit Paul 
wrote:

> *@Prasanna & @Pranith & @Keithley* Thank you very much for this update.
> Let me try this out with a build from source and then use it with
> Elasticsearch in a Kubernetes environment; I will let you know the outcome.
>
> My whole intention in using this gluster-block solution is to *avoid the
> Elasticsearch index health turning RED due to the CorruptIndex issue* seen
> with GlusterFS & FUSE. *Any further pointers in this regard would be
> really appreciated.*
>
We expect gluster's block interface not to have the same problem encountered
by the file system interface with Elasticsearch. Our limited testing
validates that expectation, as outlined in Prasanna's blog post.

Feedback from your testing would be very welcome! Please feel free to let
us know if you require any help in getting the deployment working.

Thanks!
Vijay
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Quota limits gone after upgrading to 3.8

2017-05-08 Thread mabi
Hello,

Last week I upgraded my 2-node replica GlusterFS cluster from 3.7.20 to 3.8.11,
and on one of the volumes I use the quota feature of GlusterFS. Unfortunately,
I just noticed via the usual command "gluster volume quota myvolume list"
that all my quotas on that volume are gone. I had around 10 different quotas
set on different directories.

Does anyone have an idea where the quotas have vanished? Are they gone for
good, and do I need to re-set them all?

Regards,
M.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] VM going down

2017-05-08 Thread Alessandro Briosi
Il 08/05/2017 12:38, Krutika Dhananjay ha scritto:
> The newly introduced "SEEK" fop seems to be failing at the bricks.
>
> Adding Niels for his inputs/help.
>

I don't know if this is related, though: the SEEK is done only when the VM
is started, not when it is suddenly shut down.
And though it's an odd message (as the file really is there), the VM starts
correctly.
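
For what it's worth, "No such device or address" (ENXIO) from a seek does not
necessarily mean the file is missing: on Linux, lseek() with SEEK_DATA or
SEEK_HOLE returns ENXIO when the requested offset is at or past end-of-file.
The Python snippet below is only an illustration of how that errno can arise
in general, not a diagnosis of this particular setup:

import errno
import os
import tempfile

with tempfile.NamedTemporaryFile() as f:
    f.write(b"x" * 4096)
    f.flush()
    try:
        os.lseek(f.fileno(), 4096, os.SEEK_DATA)   # offset == file size
    except OSError as e:
        if e.errno == errno.ENXIO:
            # prints "No such device or address", same wording as the log
            print("seek failed:", os.strerror(e.errno))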

Alessandro
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] VM going down

2017-05-08 Thread Alessandro Briosi
Il 08/05/2017 12:57, Jesper Led Lauridsen TS Infra server ha scritto:
>
> I don't know if this has any relation to your issue, but I have seen
> several times during gluster healing that my VMs fail or are marked
> unresponsive in RHEV. My conclusion is that the load gluster puts on
> the VM images during checksumming while healing results in too much
> latency, and the VMs fail.
>
> My plan is to try using sharding, so the VM images/files are split
> into smaller files, changing the number of allowed concurrent heals
> ('cluster.background-self-heal-count') and disabling
> 'cluster.self-heal-daemon'.
>

The thing is that there are no heal processes running, and no log entries
either.
A few days ago I had a failure and the heal process started and finished
without any problems.

I do not use sharding yet.

Alessandro
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Bad perf for small files on large EC volume

2017-05-08 Thread Serkan Çoban
There are 300M files, right? Or am I counting wrong?
With that file profile I would never use EC in the first place.
Maybe you can pack the files into tar archives or something similar before
migrating to gluster?
It will take ages to heal a drive with that file count...
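
To illustrate the pack-into-tar idea, here is a rough Python sketch; the
source/staging paths and the per-archive size cap are made-up examples, not
a recommendation for any particular setup:

import os
import tarfile

SRC = "/data/to-migrate"     # hypothetical source tree of small files
DST = "/data/tar-staging"    # hypothetical staging area for the archives
MAX_BYTES = 1 << 30          # start a new archive after roughly 1 GiB

os.makedirs(DST, exist_ok=True)
archive_no, current_size, tar = 0, 0, None

for root, _, files in os.walk(SRC):
    for name in files:
        path = os.path.join(root, name)
        size = os.path.getsize(path)
        if tar is None or current_size + size > MAX_BYTES:
            if tar:
                tar.close()
            archive_no += 1
            tar = tarfile.open(os.path.join(DST, "batch-%05d.tar" % archive_no), "w")
            current_size = 0
        tar.add(path, arcname=os.path.relpath(path, SRC))  # keep relative layout
        current_size += size

if tar:
    tar.close()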

On Mon, May 8, 2017 at 3:59 PM, Ingard Mevåg  wrote:
> With attachments :)
>
> 2017-05-08 14:57 GMT+02:00 Ingard Mevåg :
>>
>> Hi
>>
>> We've got 3 servers with 60 drives each setup with an EC volume running on
>> gluster 3.10.0
>> The servers are connected via 10gigE.
>>
>> We've done the changes recommended here :
>> https://bugzilla.redhat.com/show_bug.cgi?id=1349953#c17 and we're able to
>> max out the network with the iozone tests referenced in the same ticket.
>>
>> However for small files we are getting 3-5 MB/s with the smallfile_cli.py
>> tool. For instance:
>> python smallfile_cli.py --operation create --threads 32 --file-size 100
>> --files 1000 --top /tmp/dfs-archive-001/
>> .
>> .
>> total threads = 32
>> total files = 31294
>> total data = 2.984 GB
>>  97.79% of requested files processed, minimum is  90.00
>> 785.542908 sec elapsed time
>> 39.837416 files/sec
>> 39.837416 IOPS
>> 3.890373 MB/sec
>> .
>>
>> We're going to use these servers for archive purposes, so the files will
>> be moved there and accessed very little. After noticing our migration tool
>> performing very badly we did some analyses on the data actually being moved
>> :
>>
>> Bucket 31808791 (16.27 GB) :: 0 bytes - 1.00 KB
>> Bucket 49448258 (122.89 GB) :: 1.00 KB - 5.00 KB
>> Bucket 13382242 (96.92 GB) :: 5.00 KB - 10.00 KB
>> Bucket 13557684 (195.15 GB) :: 10.00 KB - 20.00 KB
>> Bucket 22735245 (764.96 GB) :: 20.00 KB - 50.00 KB
>> Bucket 15101878 (1041.56 GB) :: 50.00 KB - 100.00 KB
>> Bucket 10734103 (1558.35 GB) :: 100.00 KB - 200.00 KB
>> Bucket 17695285 (5773.74 GB) :: 200.00 KB - 500.00 KB
>> Bucket 13632394 (10039.92 GB) :: 500.00 KB - 1.00 MB
>> Bucket 21815815 (32641.81 GB) :: 1.00 MB - 2.00 MB
>> Bucket 36940815 (117683.33 GB) :: 2.00 MB - 5.00 MB
>> Bucket 13580667 (91899.10 GB) :: 5.00 MB - 10.00 MB
>> Bucket 10945768 (232316.33 GB) :: 10.00 MB - 50.00 MB
>> Bucket 1723848 (542581.89 GB) :: 50.00 MB - 9223372036.85 GB
>>
>> So it turns out we've got a very large number of very small files being
>> written to this volume.
>> I've attached the volume config and 2 profiling runs so if someone wants
>> to take a look and maybe give us some hints in terms of what volume settings
>> will be best for writing a lot of small files that would be much
>> appreciated.
>>
>> kind regards
>> ingard
>
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Bad perf for small files on large EC volume

2017-05-08 Thread Ingard Mevåg
Hi

We've got 3 servers with 60 drives each setup with an EC volume running on
gluster 3.10.0
The servers are connected via 10gigE.

We've done the changes recommended here :
https://bugzilla.redhat.com/show_bug.cgi?id=1349953#c17 and we're able to
max out the network with the iozone tests referenced in the same ticket.

However for small files we are getting 3-5 MB/s with the smallfile_cli.py
tool. For instance:
python smallfile_cli.py --operation create --threads 32 --file-size 100
--files 1000 --top /tmp/dfs-archive-001/
.
.
total threads = 32
total files = 31294
total data = 2.984 GB
 97.79% of requested files processed, minimum is  90.00
785.542908 sec elapsed time
39.837416 files/sec
39.837416 IOPS
3.890373 MB/sec
.

We're going to use these servers for archive purposes, so the files will be
moved there and accessed very little. After noticing our migration tool
performing very badly we did some analyses on the data actually being moved
:

Bucket 31808791 (16.27 GB) :: 0 bytes - 1.00 KB
Bucket 49448258 (122.89 GB) :: 1.00 KB - 5.00 KB
Bucket 13382242 (96.92 GB) :: 5.00 KB - 10.00 KB
Bucket 13557684 (195.15 GB) :: 10.00 KB - 20.00 KB
Bucket 22735245 (764.96 GB) :: 20.00 KB - 50.00 KB
Bucket 15101878 (1041.56 GB) :: 50.00 KB - 100.00 KB
Bucket 10734103 (1558.35 GB) :: 100.00 KB - 200.00 KB
Bucket 17695285 (5773.74 GB) :: 200.00 KB - 500.00 KB
Bucket 13632394 (10039.92 GB) :: 500.00 KB - 1.00 MB
Bucket 21815815 (32641.81 GB) :: 1.00 MB - 2.00 MB
Bucket 36940815 (117683.33 GB) :: 2.00 MB - 5.00 MB
Bucket 13580667 (91899.10 GB) :: 5.00 MB - 10.00 MB
Bucket 10945768 (232316.33 GB) :: 10.00 MB - 50.00 MB
Bucket 1723848 (542581.89 GB) :: 50.00 MB - 9223372036.85 GB

So it turns out we've got a very large number of very small files being
written to this volume.
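
As an aside, a size histogram like the one above is easy to reproduce with a
short script; this is only a rough sketch (the root path and bucket
boundaries are placeholders, not necessarily the tool used for the numbers
shown):

import os
from collections import defaultdict

ROOT = "/data/to-migrate"   # placeholder for the tree being analysed

# Bucket upper bounds in bytes, roughly mirroring the buckets above.
BOUNDS = [1 << 10, 5 << 10, 10 << 10, 20 << 10, 50 << 10, 100 << 10,
          200 << 10, 500 << 10, 1 << 20, 2 << 20, 5 << 20, 10 << 20,
          50 << 20, float("inf")]

counts = defaultdict(int)
sizes = defaultdict(int)

for dirpath, _, files in os.walk(ROOT):
    for name in files:
        try:
            size = os.path.getsize(os.path.join(dirpath, name))
        except OSError:
            continue          # skip files that vanished mid-walk
        for bound in BOUNDS:
            if size <= bound:
                counts[bound] += 1
                sizes[bound] += size
                break

for bound in BOUNDS:
    print("Bucket %10d (%10.2f GB) :: up to %s bytes"
          % (counts[bound], sizes[bound] / 2**30, bound))
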
I've attached the volume config and 2 profiling runs so if someone wants to
take a look and maybe give us some hints in terms of what volume settings
will be best for writing a lot of small files that would be much
appreciated.

kind regards
ingard
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Elasticsearch facing CorruptIndexException exception with GlusterFs 3.10.1

2017-05-08 Thread Kaleb S. KEITHLEY

On 05/08/2017 05:32 AM, Pranith Kumar Karampuri wrote:

Abhijit,
  We released gluster-block v0.2 just this Friday for which RHEL 
packages are yet to be built.


+Kaleb,
  Could you help with this please?

Prasanna,
  Could you let Abhijit know the rpm versions for tcmu-runner and 
other packages so that this feature can be used?




For RHEL 7?

Do you want CentOS Storage SIG packages?

And what is the status of getting this into Fedora, which could include 
EPEL (instead of CentOS Storage SIG)?


In the meantime I can try scratch builds in Fedora/EPEL using the
src.rpm from the COPR build. If there are no dependency issues, that
should work for now.


--

Kaleb


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] VM going down

2017-05-08 Thread Jesper Led Lauridsen TS Infra server
I don't know if this has any relation to your issue, but I have seen several
times during gluster healing that my VMs fail or are marked unresponsive in
RHEV. My conclusion is that the load gluster puts on the VM images during
checksumming while healing results in too much latency, and the VMs fail.

My plan is to try using sharding, so the VM images/files are split into
smaller files, changing the number of allowed concurrent heals
('cluster.background-self-heal-count') and disabling 'cluster.self-heal-daemon'.

/Jesper

From: gluster-users-boun...@gluster.org
[mailto:gluster-users-boun...@gluster.org] On behalf of Krutika Dhananjay
Sent: 8 May 2017 12:38
To: Alessandro Briosi ; de Vos, Niels
Cc: gluster-users
Subject: Re: [Gluster-users] VM going down

The newly introduced "SEEK" fop seems to be failing at the bricks.
Adding Niels for his inputs/help.

-Krutika

On Mon, May 8, 2017 at 3:43 PM, Alessandro Briosi 
> wrote:
Hi all,
I have sporadic VM going down which files are on gluster FS.

If I look at the gluster logs the only events I find are:
/var/log/glusterfs/bricks/data-brick2-brick.log

[2017-05-08 09:51:17.661697] I [MSGID: 115036]
[server.c:548:server_rpc_notify] 0-datastore2-server: disconnecting
connection from
srvpve2-9074-2017/05/04-14:12:53:301448-datastore2-client-0-0-0
[2017-05-08 09:51:17.661697] I [MSGID: 115036]
[server.c:548:server_rpc_notify] 0-datastore2-server: disconnecting
connection from
srvpve2-9074-2017/05/04-14:12:53:367950-datastore2-client-0-0-0
[2017-05-08 09:51:17.661810] W [inodelk.c:399:pl_inodelk_log_cleanup]
0-datastore2-server: releasing lock on
66d9eefb-ee55-40ad-9f44-c55d1e809006 held by {client=0x7f4c7c004880,
pid=0 lk-owner=5c7099efc97f}
[2017-05-08 09:51:17.661810] W [inodelk.c:399:pl_inodelk_log_cleanup]
0-datastore2-server: releasing lock on
a8d82b3d-1cf9-45cf-9858-d8546710b49c held by {client=0x7f4c840f31d0,
pid=0 lk-owner=5c7019fac97f}
[2017-05-08 09:51:17.661835] I [MSGID: 115013]
[server-helpers.c:293:do_fd_cleanup] 0-datastore2-server: fd cleanup on
/images/201/vm-201-disk-2.qcow2
[2017-05-08 09:51:17.661838] I [MSGID: 115013]
[server-helpers.c:293:do_fd_cleanup] 0-datastore2-server: fd cleanup on
/images/201/vm-201-disk-1.qcow2
[2017-05-08 09:51:17.661953] I [MSGID: 101055]
[client_t.c:415:gf_client_unref] 0-datastore2-server: Shutting down
connection srvpve2-9074-2017/05/04-14:12:53:301448-datastore2-client-0-0-0
[2017-05-08 09:51:17.661953] I [MSGID: 101055]
[client_t.c:415:gf_client_unref] 0-datastore2-server: Shutting down
connection srvpve2-9074-2017/05/04-14:12:53:367950-datastore2-client-0-0-0
[2017-05-08 10:01:06.210392] I [MSGID: 115029]
[server-handshake.c:692:server_setvolume] 0-datastore2-server: accepted
client from
srvpve2-162483-2017/05/08-10:01:06:189720-datastore2-client-0-0-0
(version: 3.8.11)
[2017-05-08 10:01:06.237433] E [MSGID: 113107] [posix.c:1079:posix_seek]
0-datastore2-posix: seek failed on fd 18 length 42957209600 [No such
device or address]
[2017-05-08 10:01:06.237463] E [MSGID: 115089]
[server-rpc-fops.c:2007:server_seek_cbk] 0-datastore2-server: 18: SEEK-2
(a8d82b3d-1cf9-45cf-9858-d8546710b49c) ==> (No such device or address)
[No such device or address]
[2017-05-08 10:01:07.019974] I [MSGID: 115029]
[server-handshake.c:692:server_setvolume] 0-datastore2-server: accepted
client from
srvpve2-162483-2017/05/08-10:01:07:3687-datastore2-client-0-0-0
(version: 3.8.11)
[2017-05-08 10:01:07.041967] E [MSGID: 113107] [posix.c:1079:posix_seek]
0-datastore2-posix: seek failed on fd 19 length 859136720896 [No such
device or address]
[2017-05-08 10:01:07.041992] E [MSGID: 115089]
[server-rpc-fops.c:2007:server_seek_cbk] 0-datastore2-server: 18: SEEK-2
(66d9eefb-ee55-40ad-9f44-c55d1e809006) ==> (No such device or address)
[No such device or address]

The strange part is that I cannot seem to find any other error.
If I restart the VM everything works as expected (it stopped at ~9.51
UTC and was started at ~10.01 UTC) .

This is not the first time that this happened, and I do not see any
problems with networking or the hosts.

Gluster version is 3.8.11
this is the incriminated volume (though it happened on a different one too)

Volume Name: datastore2
Type: Replicate
Volume ID: c95ebb5f-6e04-4f09-91b9-bbbe63d83aea
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: srvpve2g:/data/brick2/brick
Brick2: srvpve3g:/data/brick2/brick
Brick3: srvpve1g:/data/brick2/brick (arbiter)
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet

Any hint on how to dig more deeply into the reason would be greatly
appreciated.

Alessandro
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] VM going down

2017-05-08 Thread Krutika Dhananjay
The newly introduced "SEEK" fop seems to be failing at the bricks.

Adding Niels for his inputs/help.

-Krutika

On Mon, May 8, 2017 at 3:43 PM, Alessandro Briosi  wrote:

> Hi all,
> I have sporadic VM going down which files are on gluster FS.
>
> If I look at the gluster logs the only events I find are:
> /var/log/glusterfs/bricks/data-brick2-brick.log
>
> [2017-05-08 09:51:17.661697] I [MSGID: 115036]
> [server.c:548:server_rpc_notify] 0-datastore2-server: disconnecting
> connection from
> srvpve2-9074-2017/05/04-14:12:53:301448-datastore2-client-0-0-0
> [2017-05-08 09:51:17.661697] I [MSGID: 115036]
> [server.c:548:server_rpc_notify] 0-datastore2-server: disconnecting
> connection from
> srvpve2-9074-2017/05/04-14:12:53:367950-datastore2-client-0-0-0
> [2017-05-08 09:51:17.661810] W [inodelk.c:399:pl_inodelk_log_cleanup]
> 0-datastore2-server: releasing lock on
> 66d9eefb-ee55-40ad-9f44-c55d1e809006 held by {client=0x7f4c7c004880,
> pid=0 lk-owner=5c7099efc97f}
> [2017-05-08 09:51:17.661810] W [inodelk.c:399:pl_inodelk_log_cleanup]
> 0-datastore2-server: releasing lock on
> a8d82b3d-1cf9-45cf-9858-d8546710b49c held by {client=0x7f4c840f31d0,
> pid=0 lk-owner=5c7019fac97f}
> [2017-05-08 09:51:17.661835] I [MSGID: 115013]
> [server-helpers.c:293:do_fd_cleanup] 0-datastore2-server: fd cleanup on
> /images/201/vm-201-disk-2.qcow2
> [2017-05-08 09:51:17.661838] I [MSGID: 115013]
> [server-helpers.c:293:do_fd_cleanup] 0-datastore2-server: fd cleanup on
> /images/201/vm-201-disk-1.qcow2
> [2017-05-08 09:51:17.661953] I [MSGID: 101055]
> [client_t.c:415:gf_client_unref] 0-datastore2-server: Shutting down
> connection srvpve2-9074-2017/05/04-14:12:53:301448-datastore2-client-0-0-0
> [2017-05-08 09:51:17.661953] I [MSGID: 101055]
> [client_t.c:415:gf_client_unref] 0-datastore2-server: Shutting down
> connection srvpve2-9074-2017/05/04-14:12:53:367950-datastore2-client-0-0-0
> [2017-05-08 10:01:06.210392] I [MSGID: 115029]
> [server-handshake.c:692:server_setvolume] 0-datastore2-server: accepted
> client from
> srvpve2-162483-2017/05/08-10:01:06:189720-datastore2-client-0-0-0
> (version: 3.8.11)
> [2017-05-08 10:01:06.237433] E [MSGID: 113107] [posix.c:1079:posix_seek]
> 0-datastore2-posix: seek failed on fd 18 length 42957209600 [No such
> device or address]
> [2017-05-08 10:01:06.237463] E [MSGID: 115089]
> [server-rpc-fops.c:2007:server_seek_cbk] 0-datastore2-server: 18: SEEK-2
> (a8d82b3d-1cf9-45cf-9858-d8546710b49c) ==> (No such device or address)
> [No such device or address]
> [2017-05-08 10:01:07.019974] I [MSGID: 115029]
> [server-handshake.c:692:server_setvolume] 0-datastore2-server: accepted
> client from
> srvpve2-162483-2017/05/08-10:01:07:3687-datastore2-client-0-0-0
> (version: 3.8.11)
> [2017-05-08 10:01:07.041967] E [MSGID: 113107] [posix.c:1079:posix_seek]
> 0-datastore2-posix: seek failed on fd 19 length 859136720896 [No such
> device or address]
> [2017-05-08 10:01:07.041992] E [MSGID: 115089]
> [server-rpc-fops.c:2007:server_seek_cbk] 0-datastore2-server: 18: SEEK-2
> (66d9eefb-ee55-40ad-9f44-c55d1e809006) ==> (No such device or address)
> [No such device or address]
>
> The strange part is that I cannot seem to find any other error.
> If I restart the VM everything works as expected (it stopped at ~9.51
> UTC and was started at ~10.01 UTC) .
>
> This is not the first time that this happened, and I do not see any
> problems with networking or the hosts.
>
> Gluster version is 3.8.11
> this is the incriminated volume (though it happened on a different one too)
>
> Volume Name: datastore2
> Type: Replicate
> Volume ID: c95ebb5f-6e04-4f09-91b9-bbbe63d83aea
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: srvpve2g:/data/brick2/brick
> Brick2: srvpve3g:/data/brick2/brick
> Brick3: srvpve1g:/data/brick2/brick (arbiter)
> Options Reconfigured:
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
>
> Any hint on how to dig more deeply into the reason would be greatly
> appreciated.
>
> Alessandro
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] VM going down

2017-05-08 Thread Alessandro Briosi
Hi all,
I have VMs sporadically going down whose files are on GlusterFS.

If I look at the gluster logs the only events I find are:
/var/log/glusterfs/bricks/data-brick2-brick.log

[2017-05-08 09:51:17.661697] I [MSGID: 115036]
[server.c:548:server_rpc_notify] 0-datastore2-server: disconnecting
connection from
srvpve2-9074-2017/05/04-14:12:53:301448-datastore2-client-0-0-0
[2017-05-08 09:51:17.661697] I [MSGID: 115036]
[server.c:548:server_rpc_notify] 0-datastore2-server: disconnecting
connection from
srvpve2-9074-2017/05/04-14:12:53:367950-datastore2-client-0-0-0
[2017-05-08 09:51:17.661810] W [inodelk.c:399:pl_inodelk_log_cleanup]
0-datastore2-server: releasing lock on
66d9eefb-ee55-40ad-9f44-c55d1e809006 held by {client=0x7f4c7c004880,
pid=0 lk-owner=5c7099efc97f}
[2017-05-08 09:51:17.661810] W [inodelk.c:399:pl_inodelk_log_cleanup]
0-datastore2-server: releasing lock on
a8d82b3d-1cf9-45cf-9858-d8546710b49c held by {client=0x7f4c840f31d0,
pid=0 lk-owner=5c7019fac97f}
[2017-05-08 09:51:17.661835] I [MSGID: 115013]
[server-helpers.c:293:do_fd_cleanup] 0-datastore2-server: fd cleanup on
/images/201/vm-201-disk-2.qcow2
[2017-05-08 09:51:17.661838] I [MSGID: 115013]
[server-helpers.c:293:do_fd_cleanup] 0-datastore2-server: fd cleanup on
/images/201/vm-201-disk-1.qcow2
[2017-05-08 09:51:17.661953] I [MSGID: 101055]
[client_t.c:415:gf_client_unref] 0-datastore2-server: Shutting down
connection srvpve2-9074-2017/05/04-14:12:53:301448-datastore2-client-0-0-0
[2017-05-08 09:51:17.661953] I [MSGID: 101055]
[client_t.c:415:gf_client_unref] 0-datastore2-server: Shutting down
connection srvpve2-9074-2017/05/04-14:12:53:367950-datastore2-client-0-0-0
[2017-05-08 10:01:06.210392] I [MSGID: 115029]
[server-handshake.c:692:server_setvolume] 0-datastore2-server: accepted
client from
srvpve2-162483-2017/05/08-10:01:06:189720-datastore2-client-0-0-0
(version: 3.8.11)
[2017-05-08 10:01:06.237433] E [MSGID: 113107] [posix.c:1079:posix_seek]
0-datastore2-posix: seek failed on fd 18 length 42957209600 [No such
device or address]
[2017-05-08 10:01:06.237463] E [MSGID: 115089]
[server-rpc-fops.c:2007:server_seek_cbk] 0-datastore2-server: 18: SEEK-2
(a8d82b3d-1cf9-45cf-9858-d8546710b49c) ==> (No such device or address)
[No such device or address]
[2017-05-08 10:01:07.019974] I [MSGID: 115029]
[server-handshake.c:692:server_setvolume] 0-datastore2-server: accepted
client from
srvpve2-162483-2017/05/08-10:01:07:3687-datastore2-client-0-0-0
(version: 3.8.11)
[2017-05-08 10:01:07.041967] E [MSGID: 113107] [posix.c:1079:posix_seek]
0-datastore2-posix: seek failed on fd 19 length 859136720896 [No such
device or address]
[2017-05-08 10:01:07.041992] E [MSGID: 115089]
[server-rpc-fops.c:2007:server_seek_cbk] 0-datastore2-server: 18: SEEK-2
(66d9eefb-ee55-40ad-9f44-c55d1e809006) ==> (No such device or address)
[No such device or address]

The strange part is that I cannot seem to find any other error.
If I restart the VM, everything works as expected (it stopped at ~9:51
UTC and was started at ~10:01 UTC).

This is not the first time this has happened, and I do not see any
problems with the networking or the hosts.

Gluster version is 3.8.11.
This is the volume in question (though it happened on a different one too):

Volume Name: datastore2
Type: Replicate
Volume ID: c95ebb5f-6e04-4f09-91b9-bbbe63d83aea
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: srvpve2g:/data/brick2/brick
Brick2: srvpve3g:/data/brick2/brick
Brick3: srvpve1g:/data/brick2/brick (arbiter)
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet

Any hint on how to dig more deeply into the reason would be greatly
appreciated.

Alessandro
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Elasticsearch facing CorruptIndexException exception with GlusterFs 3.10.1

2017-05-08 Thread Pranith Kumar Karampuri
Abhijit,
 We released gluster-block v0.2 just this Friday for which RHEL
packages are yet to be built.

+Kaleb,
 Could you help with this please?

Prasanna,
 Could you let Abhijit know the rpm versions for tcmu-runner and other
packages so that this feature can be used?


On Mon, May 8, 2017 at 2:49 PM, Pranith Kumar Karampuri  wrote:

> Wait, wait we are discussing this issue only. Expect a reply in some time
> :-)
>
> On Mon, May 8, 2017 at 2:19 PM, Abhijit Paul 
> wrote:
>
>> poking for previous mail reply
>>
>> On Sun, May 7, 2017 at 1:06 AM, Abhijit Paul 
>> wrote:
>>
>>>  https://pkalever.wordpress.com/2017/03/14/elasticsearch-wit
>>> h-gluster-block/
>>> here used tested environment is Fedora ,
>>> but i am using RHEL based Oracle linux so does gluster-block compatible
>>> with RHEL as well? What i needs to change & make it work?
>>>
>>> On Fri, May 5, 2017 at 5:42 PM, Pranith Kumar Karampuri <
>>> pkara...@redhat.com> wrote:
>>>


 On Fri, May 5, 2017 at 5:40 PM, Pranith Kumar Karampuri <
 pkara...@redhat.com> wrote:

>
>
> On Fri, May 5, 2017 at 5:36 PM, Abhijit Paul  > wrote:
>
>> So should i start using gluster-block with elasticsearch in kubernetes
>> environment?
>>
>> My expectation from gluster-block is, it should not CorruptIndex
>> of elasticsearch...and issue facing in previous mails.
>>
>> Please let me know whether should i processed with above mentioned
>> combination.
>>
>
> We are still in the process of fixing the failure scenarios of
> tcmu-runner dying and failingover in the multipath scenarios.
>

 Prasanna did test that elasticsearch itself worked fine in
 gluster-block environment when all the machines are up etc i.e. success
 path. We are doing failure path testing and fixing things at the moment.


>
>
>>
>> On Fri, May 5, 2017 at 5:06 PM, Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>>> Abhijit we just started making the efforts to get all of this stable.
>>>
>>> On Fri, May 5, 2017 at 4:45 PM, Abhijit Paul <
>>> er.abhijitp...@gmail.com> wrote:
>>>
 I yet to try gluster-block with elasticsearch...but carious to know
 does this combination plays well in kubernetes environment?

 On Fri, May 5, 2017 at 12:14 PM, Abhijit Paul <
 er.abhijitp...@gmail.com> wrote:

> thanks Krutika for the alternative.
>
> @*Prasanna @**Pranith*
> I was going thorough the mentioned blog post and saw that used
> tested environment was Fedora ,
> but i am using RHEL based Oracle linux so does gluster-block
> compatible with RHEL as well?
>
> On Fri, May 5, 2017 at 12:03 PM, Krutika Dhananjay <
> kdhan...@redhat.com> wrote:
>
>> Yeah, there are a couple of cache consistency issues with
>> performance translators that are causing these exceptions.
>> Some of them were fixed by 3.10.1. Some still remain.
>>
>> Alternatively you can give gluster-block + elasticsearch a try,
>> which doesn't require solving all these caching issues.
>> Here's a blog post on the same - https://pkalever.wordpress.com
>> /2017/03/14/elasticsearch-with-gluster-block/
>>
>> Adding Prasanna and Pranith who worked on this, in case you need
>> more info on this.
>>
>> -Krutika
>>
>> On Fri, May 5, 2017 at 12:15 AM, Abhijit Paul <
>> er.abhijitp...@gmail.com> wrote:
>>
>>> Thanks for the reply, i will try it out but i am also facing one
>>> more issue "i.e. replicated volumes returning different
>>> timestamps"
>>> so is this because of Bug 1426548 - Openshift Logging
>>> ElasticSearch FSLocks when using GlusterFS storage backend
>>>  ?
>>>
>>> *FYI i am using glusterfs 3.10.1 tar.gz*
>>>
>>> Regards,
>>> Abhijit
>>>
>>>
>>>
>>> On Thu, May 4, 2017 at 10:58 PM, Amar Tumballi <
>>> atumb...@redhat.com> wrote:
>>>


 On Thu, May 4, 2017 at 10:41 PM, Abhijit Paul <
 er.abhijitp...@gmail.com> wrote:

> Since i am new to gluster, can please provide how to turn 
> off/disable
> "perf xlator options"?
>
>
 $ gluster volume set  performance.stat-prefetch off
 $ gluster volume set  performance.read-ahead off
 $ gluster volume set  performance.write-behind off
 $ gluster volume set  performance.io-cache off
 $ gluster volume set  performance.quick-read off

Re: [Gluster-users] Elasticsearch facing CorruptIndexException exception with GlusterFs 3.10.1

2017-05-08 Thread Pranith Kumar Karampuri
Wait, wait, we are discussing exactly this issue. Expect a reply in some
time :-)

On Mon, May 8, 2017 at 2:19 PM, Abhijit Paul 
wrote:

> poking for previous mail reply
>
> On Sun, May 7, 2017 at 1:06 AM, Abhijit Paul 
> wrote:
>
>>  https://pkalever.wordpress.com/2017/03/14/elasticsearch-wit
>> h-gluster-block/
>> here used tested environment is Fedora ,
>> but i am using RHEL based Oracle linux so does gluster-block compatible
>> with RHEL as well? What i needs to change & make it work?
>>
>> On Fri, May 5, 2017 at 5:42 PM, Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>>>
>>>
>>> On Fri, May 5, 2017 at 5:40 PM, Pranith Kumar Karampuri <
>>> pkara...@redhat.com> wrote:
>>>


 On Fri, May 5, 2017 at 5:36 PM, Abhijit Paul 
 wrote:

> So should i start using gluster-block with elasticsearch in kubernetes
> environment?
>
> My expectation from gluster-block is, it should not CorruptIndex
> of elasticsearch...and issue facing in previous mails.
>
> Please let me know whether should i processed with above mentioned
> combination.
>

 We are still in the process of fixing the failure scenarios of
 tcmu-runner dying and failingover in the multipath scenarios.

>>>
>>> Prasanna did test that elasticsearch itself worked fine in gluster-block
>>> environment when all the machines are up etc i.e. success path. We are
>>> doing failure path testing and fixing things at the moment.
>>>
>>>


>
> On Fri, May 5, 2017 at 5:06 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>> Abhijit we just started making the efforts to get all of this stable.
>>
>> On Fri, May 5, 2017 at 4:45 PM, Abhijit Paul <
>> er.abhijitp...@gmail.com> wrote:
>>
>>> I yet to try gluster-block with elasticsearch...but carious to know
>>> does this combination plays well in kubernetes environment?
>>>
>>> On Fri, May 5, 2017 at 12:14 PM, Abhijit Paul <
>>> er.abhijitp...@gmail.com> wrote:
>>>
 thanks Krutika for the alternative.

 @*Prasanna @**Pranith*
 I was going thorough the mentioned blog post and saw that used
 tested environment was Fedora ,
 but i am using RHEL based Oracle linux so does gluster-block
 compatible with RHEL as well?

 On Fri, May 5, 2017 at 12:03 PM, Krutika Dhananjay <
 kdhan...@redhat.com> wrote:

> Yeah, there are a couple of cache consistency issues with
> performance translators that are causing these exceptions.
> Some of them were fixed by 3.10.1. Some still remain.
>
> Alternatively you can give gluster-block + elasticsearch a try,
> which doesn't require solving all these caching issues.
> Here's a blog post on the same - https://pkalever.wordpress.com
> /2017/03/14/elasticsearch-with-gluster-block/
>
> Adding Prasanna and Pranith who worked on this, in case you need
> more info on this.
>
> -Krutika
>
> On Fri, May 5, 2017 at 12:15 AM, Abhijit Paul <
> er.abhijitp...@gmail.com> wrote:
>
>> Thanks for the reply, i will try it out but i am also facing one
>> more issue "i.e. replicated volumes returning different
>> timestamps"
>> so is this because of Bug 1426548 - Openshift Logging
>> ElasticSearch FSLocks when using GlusterFS storage backend
>>  ?
>>
>> *FYI i am using glusterfs 3.10.1 tar.gz*
>>
>> Regards,
>> Abhijit
>>
>>
>>
>> On Thu, May 4, 2017 at 10:58 PM, Amar Tumballi <
>> atumb...@redhat.com> wrote:
>>
>>>
>>>
>>> On Thu, May 4, 2017 at 10:41 PM, Abhijit Paul <
>>> er.abhijitp...@gmail.com> wrote:
>>>
 Since i am new to gluster, can please provide how to turn 
 off/disable
 "perf xlator options"?


>>> $ gluster volume set  performance.stat-prefetch off
>>> $ gluster volume set  performance.read-ahead off
>>> $ gluster volume set  performance.write-behind off
>>> $ gluster volume set  performance.io-cache off
>>> $ gluster volume set  performance.quick-read off
>>>
>>>
>>> Regards,
>>> Amar
>>>

> On Wed, May 3, 2017 at 8:51 PM, Atin Mukherjee <
> amukh...@redhat.com> wrote:
>
>> I think there is still some pending stuffs in some of the
>> gluster perf xlators to make that work complete. Cced the 
>> relevant folks
>> for more information. Can you please turn off all the perf 
>> 

Re: [Gluster-users] Elasticsearch facing CorruptIndexException exception with GlusterFs 3.10.1

2017-05-08 Thread Abhijit Paul
Pinging for a reply to my previous mail.

On Sun, May 7, 2017 at 1:06 AM, Abhijit Paul 
wrote:

> https://pkalever.wordpress.com/2017/03/14/elasticsearch-with-gluster-block/
> The tested environment used there is Fedora,
> but I am using RHEL-based Oracle Linux, so is gluster-block compatible
> with RHEL as well? What do I need to change to make it work?
>
> On Fri, May 5, 2017 at 5:42 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>>
>>
>> On Fri, May 5, 2017 at 5:40 PM, Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>>>
>>>
>>> On Fri, May 5, 2017 at 5:36 PM, Abhijit Paul 
>>> wrote:
>>>
 So should i start using gluster-block with elasticsearch in kubernetes
 environment?

 My expectation from gluster-block is, it should not CorruptIndex
 of elasticsearch...and issue facing in previous mails.

 Please let me know whether should i processed with above mentioned
 combination.

>>>
>>> We are still in the process of fixing the failure scenarios of
>>> tcmu-runner dying and failingover in the multipath scenarios.
>>>
>>
>> Prasanna did test that elasticsearch itself worked fine in gluster-block
>> environment when all the machines are up etc i.e. success path. We are
>> doing failure path testing and fixing things at the moment.
>>
>>
>>>
>>>

 On Fri, May 5, 2017 at 5:06 PM, Pranith Kumar Karampuri <
 pkara...@redhat.com> wrote:

> Abhijit we just started making the efforts to get all of this stable.
>
> On Fri, May 5, 2017 at 4:45 PM, Abhijit Paul  > wrote:
>
>> I yet to try gluster-block with elasticsearch...but carious to know
>> does this combination plays well in kubernetes environment?
>>
>> On Fri, May 5, 2017 at 12:14 PM, Abhijit Paul <
>> er.abhijitp...@gmail.com> wrote:
>>
>>> thanks Krutika for the alternative.
>>>
>>> @*Prasanna @**Pranith*
>>> I was going thorough the mentioned blog post and saw that used
>>> tested environment was Fedora ,
>>> but i am using RHEL based Oracle linux so does gluster-block
>>> compatible with RHEL as well?
>>>
>>> On Fri, May 5, 2017 at 12:03 PM, Krutika Dhananjay <
>>> kdhan...@redhat.com> wrote:
>>>
 Yeah, there are a couple of cache consistency issues with
 performance translators that are causing these exceptions.
 Some of them were fixed by 3.10.1. Some still remain.

 Alternatively you can give gluster-block + elasticsearch a try,
 which doesn't require solving all these caching issues.
 Here's a blog post on the same - https://pkalever.wordpress.com
 /2017/03/14/elasticsearch-with-gluster-block/

 Adding Prasanna and Pranith who worked on this, in case you need
 more info on this.

 -Krutika

 On Fri, May 5, 2017 at 12:15 AM, Abhijit Paul <
 er.abhijitp...@gmail.com> wrote:

> Thanks for the reply, i will try it out but i am also facing one
> more issue "i.e. replicated volumes returning different
> timestamps"
> so is this because of Bug 1426548 - Openshift Logging
> ElasticSearch FSLocks when using GlusterFS storage backend
>  ?
>
> *FYI i am using glusterfs 3.10.1 tar.gz*
>
> Regards,
> Abhijit
>
>
>
> On Thu, May 4, 2017 at 10:58 PM, Amar Tumballi <
> atumb...@redhat.com> wrote:
>
>>
>>
>> On Thu, May 4, 2017 at 10:41 PM, Abhijit Paul <
>> er.abhijitp...@gmail.com> wrote:
>>
>>> Since i am new to gluster, can please provide how to turn 
>>> off/disable
>>> "perf xlator options"?
>>>
>>>
>> $ gluster volume set  performance.stat-prefetch off
>> $ gluster volume set  performance.read-ahead off
>> $ gluster volume set  performance.write-behind off
>> $ gluster volume set  performance.io-cache off
>> $ gluster volume set  performance.quick-read off
>>
>>
>> Regards,
>> Amar
>>
>>>
 On Wed, May 3, 2017 at 8:51 PM, Atin Mukherjee <
 amukh...@redhat.com> wrote:

> I think there is still some pending stuffs in some of the
> gluster perf xlators to make that work complete. Cced the 
> relevant folks
> for more information. Can you please turn off all the perf xlator 
> options
> as a work around to move forward?
>
> On Wed, May 3, 2017 at 8:04 PM, Abhijit Paul <
> er.abhijitp...@gmail.com> wrote:
>
>> Dear folks,
>>
>> I setup Glusterfs(3.10.1) NFS type as 

Re: [Gluster-users] gdeploy, Centos7 & Ansible 2.3

2017-05-08 Thread hvjunk

> On 08 May 2017, at 09:34 , knarra  wrote:
> Hi,
> 
> There is a new version of gdeploy built where the above seen issues are 
> fixed. Can you please update gdeploy to the version below [1] and run the 
> test again ?
> [1] 
> https://copr-be.cloud.fedoraproject.org/results/sac/gdeploy/epel-7-x86_64/00547404-gdeploy/gdeploy-2.0.2-6.noarch.rpm
>  
> 
> Thanks
> 
> kasturi
Thank you Kasturi,

Please see the run with errors below, using this conf file:
==
[hosts]
10.10.10.11
10.10.10.12
10.10.10.13

[backend-setup]
devices=/dev/sdb
mountpoints=/gluster/brick1
brick_dirs=/gluster/brick1/one
pools=pool1

#Installing nfs-ganesha
[yum]
action=install
repolist=
gpgcheck=no
update=no
packages=glusterfs-ganesha

#This will create a volume. Skip this section if your volume already exists
[volume]
action=create
volname=ganesha
transport=tcp
replica_count=3
arbiter=1
force=yes

#Creating a high availability cluster and exporting the volume
[nfs-ganesha]
action=create-cluster
ha-name=ganesha-ha-360
cluster-nodes=10.10.10.11,10.10.10.12
vip=10.10.10.31,10.10.10.41
volname=ganesha

==



[root@linked-clone-of-centos-linux ~]# gdeploy -c t.conf
ERROR! no action detected in task. This often indicates a misspelled module 
name, or incorrect module path.

The error appears to have been in '/tmp/tmpvhTM5i/pvcreate.yml': line 16, 
column 5, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  # Create pv on all the disks
  - name: Create Physical Volume
^ here


The error appears to have been in '/tmp/tmpvhTM5i/pvcreate.yml': line 16, 
column 5, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  # Create pv on all the disks
  - name: Create Physical Volume
^ here

Ignoring errors...
ERROR! no action detected in task. This often indicates a misspelled module 
name, or incorrect module path.

The error appears to have been in '/tmp/tmpvhTM5i/vgcreate.yml': line 8, column 
5, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  tasks:
  - name: Create volume group on the disks
^ here


The error appears to have been in '/tmp/tmpvhTM5i/vgcreate.yml': line 8, column 
5, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  tasks:
  - name: Create volume group on the disks
^ here

Ignoring errors...
ERROR! no action detected in task. This often indicates a misspelled module 
name, or incorrect module path.

The error appears to have been in 
'/tmp/tmpvhTM5i/auto_lvcreate_for_gluster.yml': line 7, column 5, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  tasks:
  - name: Create logical volume named metadata
^ here


The error appears to have been in 
'/tmp/tmpvhTM5i/auto_lvcreate_for_gluster.yml': line 7, column 5, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  tasks:
  - name: Create logical volume named metadata
^ here

Ignoring errors...

PLAY [gluster_servers] 
**

TASK [Create an xfs filesystem] 
*
failed: [10.10.10.11] (item=/dev/GLUSTER_vg1/GLUSTER_lv1) => {"failed": true, 
"item": "/dev/GLUSTER_vg1/GLUSTER_lv1", "msg": "Device 
/dev/GLUSTER_vg1/GLUSTER_lv1 not found."}
failed: [10.10.10.13] (item=/dev/GLUSTER_vg1/GLUSTER_lv1) => {"failed": true, 
"item": "/dev/GLUSTER_vg1/GLUSTER_lv1", "msg": "Device 
/dev/GLUSTER_vg1/GLUSTER_lv1 not found."}
failed: [10.10.10.12] (item=/dev/GLUSTER_vg1/GLUSTER_lv1) => {"failed": true, 
"item": "/dev/GLUSTER_vg1/GLUSTER_lv1", "msg": "Device 
/dev/GLUSTER_vg1/GLUSTER_lv1 not found."}
to retry, use: --limit @/tmp/tmpvhTM5i/fscreate.retry

PLAY RECAP 
**
10.10.10.11: ok=0  changed=0  unreachable=0  failed=1
10.10.10.12: ok=0  changed=0  unreachable=0  failed=1
10.10.10.13: ok=0  changed=0  unreachable=0  failed=1

Ignoring errors...

PLAY [gluster_servers] 
**

TASK [Create the backend disks, skips if present] 

Re: [Gluster-users] disperse volume brick counts limits in RHES

2017-05-08 Thread Serkan Çoban
>What network do you have?
We have 2X10G bonded interfaces on each server.

Thanks to Xavier for the detailed explanation of the EC internals.

On Sat, May 6, 2017 at 2:20 AM, Alastair Neil  wrote:
> What network do you have?
>
>
> On 5 May 2017 at 09:51, Serkan Çoban  wrote:
>>
>> In our use case every node has 26 bricks. I am using 60 nodes, one 9PB
>> volume with a 16+4 EC configuration; each brick in a sub-volume is on a
>> different host.
>> We put 15-20k 2GB files every day into 10-15 folders. So it is 1500K
>> files/folder. Our gluster version is 3.7.11.
>> Heal speed in this environment is 8-10MB/sec/brick.
>>
>> I did some tests for the parallel self-heal feature with version 3.9: two
>> servers with 26 bricks each, in 8+2 and 16+4 EC configurations.
>> This was a small test environment and, as I said, the result is that 8+2
>> is 2x faster than 16+4 with parallel self-heal threads set to 2/4.
>> In 1-2 months our new servers are arriving; I will do detailed tests of
>> heal performance for 8+2 and 16+4 and inform you of the results.
>>
>>
>> On Fri, May 5, 2017 at 2:54 PM, Pranith Kumar Karampuri
>>  wrote:
>> >
>> >
>> > On Fri, May 5, 2017 at 5:19 PM, Pranith Kumar Karampuri
>> >  wrote:
>> >>
>> >>
>> >>
>> >> On Fri, May 5, 2017 at 2:38 PM, Serkan Çoban 
>> >> wrote:
>> >>>
>> >>> It is the over all time, 8TB data disk healed 2x faster in 8+2
>> >>> configuration.
>> >>
>> >>
>> >> Wow, that is counter intuitive for me. I will need to explore about
>> >> this
>> >> to find out why that could be. Thanks a lot for this feedback!
>> >
>> >
>> > From memory I remember you said you have a lot of small files hosted on
>> > the
>> > volume, right? It could be because of the bug
>> > https://review.gluster.org/17151 is fixing. That is the only reason I
>> > could
>> > guess right now. We will try to test this kind of case if you could give
>> > us
>> > a bit more details about average file-size/depth of directories etc to
>> > simulate similar looking directory structure.
>> >
>> >>
>> >>
>> >>>
>> >>>
>> >>> On Fri, May 5, 2017 at 10:00 AM, Pranith Kumar Karampuri
>> >>>  wrote:
>> >>> >
>> >>> >
>> >>> > On Fri, May 5, 2017 at 11:42 AM, Serkan Çoban
>> >>> > 
>> >>> > wrote:
>> >>> >>
>> >>> >> Healing gets slower as you increase m in m+n configuration.
>> >>> >> We are using 16+4 configuration without any problems other then
>> >>> >> heal
>> >>> >> speed.
>> >>> >> I tested heal speed with 8+2 and 16+4 on 3.9.0 and see that heals
>> >>> >> on
>> >>> >> 8+2 is faster by 2x.
>> >>> >
>> >>> >
>> >>> > As you increase number of nodes that are participating in an EC set
>> >>> > number
>> >>> > of parallel heals increase. Is the heal speed you saw improved per
>> >>> > file
>> >>> > or
>> >>> > the over all time it took to heal the data?
>> >>> >
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> On Fri, May 5, 2017 at 9:04 AM, Ashish Pandey 
>> >>> >> wrote:
>> >>> >> >
>> >>> >> > 8+2 and 8+3 configurations are not the limitation but just
>> >>> >> > suggestions.
>> >>> >> > You can create 16+3 volume without any issue.
>> >>> >> >
>> >>> >> > Ashish
>> >>> >> >
>> >>> >> > 
>> >>> >> > From: "Alastair Neil" 
>> >>> >> > To: "gluster-users" 
>> >>> >> > Sent: Friday, May 5, 2017 2:23:32 AM
>> >>> >> > Subject: [Gluster-users] disperse volume brick counts limits in
>> >>> >> > RHES
>> >>> >> >
>> >>> >> >
>> >>> >> > Hi
>> >>> >> >
>> >>> >> > we are deploying a large (24node/45brick) cluster and noted that
>> >>> >> > the
>> >>> >> > RHES
>> >>> >> > guidelines limit the number of data bricks in a disperse set to
>> >>> >> > 8.
>> >>> >> > Is
>> >>> >> > there
>> >>> >> > any reason for this.  I am aware that you want this to be a power
>> >>> >> > of
>> >>> >> > 2,
>> >>> >> > but
>> >>> >> > as we have a large number of nodes we were planning on going with
>> >>> >> > 16+3.
>> >>> >> > Dropping to 8+2 or 8+3 will be a real waste for us.
>> >>> >> >
>> >>> >> > Thanks,
>> >>> >> >
>> >>> >> >
>> >>> >> > Alastair
>> >>> >> >
>> >>> >> >
>> >>> >> > ___
>> >>> >> > Gluster-users mailing list
>> >>> >> > Gluster-users@gluster.org
>> >>> >> > http://lists.gluster.org/mailman/listinfo/gluster-users
>> >>> >> >
>> >>> >> >
>> >>> >> > ___
>> >>> >> > Gluster-users mailing list
>> >>> >> > Gluster-users@gluster.org
>> >>> >> > http://lists.gluster.org/mailman/listinfo/gluster-users
>> >>> >> ___
>> >>> >> Gluster-users mailing list
>> >>> >> Gluster-users@gluster.org
>> >>> >> http://lists.gluster.org/mailman/listinfo/gluster-users
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> > --
>> >>> > Pranith
>> >>
>> >>
>> >>
>> >>
>> >> --
>> >> Pranith
>> >
>> >
>> >
>> >
>> > 

Re: [Gluster-users] gdeploy, Centos7 & Ansible 2.3

2017-05-08 Thread knarra

On 05/06/2017 03:14 PM, hvjunk wrote:

Hi there,

 So, busy testing/installing/etc. and was pointed last night in the 
direction of gdeploy. I did a quick try on Ubuntu 16.04, found some 
module related troubles, so I retried on Centos 7 this morning.


Seems that the playbooks aren’t 2.3 “compatible”…

The brick VMs are setup using the set-vms-centos.sh & 
sshkeys-centos.yml playbook from 
https://bitbucket.org/dismyne/gluster-ansibles/src/24b62dcc858364ee3744d351993de0e8e35c2680/?at=Centos-gdeploy-tests


The “installation”/gdeploy VM run:

The relevant history output:

   18  yum install epel-release
   19  yum install ansible
   20  yum search gdeploy
   21  yum install 
https://download.gluster.org/pub/gluster/gdeploy/LATEST/CentOS7/gdeploy-2.0.1-9.noarch.rpm

   22  vi t.conf
   23  gdeploy -c t.conf
   24  history
   25  mkdir .ssh
   26  cd .ssh
   27  ls
   28  vi id_rsa
   29  chmod 0600 id_rsa
   30  cd
   31  gdeploy -c t.conf
   32  ssh -v 10.10.10.11
   33  ssh -v 10.10.10.12
   34  ssh -v 10.10.10.13
   35  gdeploy -c t.conf

The t.conf:
==
 [hosts]
10.10.10.11
10.10.10.12
10.10.10.13

[backend-setup]
devices=/dev/sdb
mountpoints=/gluster/brick1
brick_dirs=/gluster/brick1/one

==

The gdeploy run:

=
[root@linked-clone-of-centos-linux ~]# gdeploy -c t.conf
ERROR! no action detected in task. This often indicates a misspelled 
module name, or incorrect module path.


The error appears to have been in '/tmp/tmpezTsyO/pvcreate.yml': line 
16, column 5, but may

be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  # Create pv on all the disks
  - name: Create Physical Volume
^ here


The error appears to have been in '/tmp/tmpezTsyO/pvcreate.yml': line 
16, column 5, but may

be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  # Create pv on all the disks
  - name: Create Physical Volume
^ here

Ignoring errors...
ERROR! no action detected in task. This often indicates a misspelled 
module name, or incorrect module path.


The error appears to have been in '/tmp/tmpezTsyO/vgcreate.yml': line 
8, column 5, but may

be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  tasks:
  - name: Create volume group on the disks
^ here


The error appears to have been in '/tmp/tmpezTsyO/vgcreate.yml': line 
8, column 5, but may

be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  tasks:
  - name: Create volume group on the disks
^ here

Ignoring errors...
ERROR! no action detected in task. This often indicates a misspelled 
module name, or incorrect module path.


The error appears to have been in 
'/tmp/tmpezTsyO/auto_lvcreate_for_gluster.yml': line 7, column 5, but may

be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  tasks:
  - name: Create logical volume named metadata
^ here


The error appears to have been in 
'/tmp/tmpezTsyO/auto_lvcreate_for_gluster.yml': line 7, column 5, but may

be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  tasks:
  - name: Create logical volume named metadata
^ here

Ignoring errors...

PLAY [gluster_servers] 
**


TASK [Create a xfs filesystem] 
**
failed: [10.10.10.13] (item=/dev/GLUSTER_vg1/GLUSTER_lv1) => 
{"failed": true, "item": "/dev/GLUSTER_vg1/GLUSTER_lv1", "msg": 
"Device /dev/GLUSTER_vg1/GLUSTER_lv1 not found."}
failed: [10.10.10.12] (item=/dev/GLUSTER_vg1/GLUSTER_lv1) => 
{"failed": true, "item": "/dev/GLUSTER_vg1/GLUSTER_lv1", "msg": 
"Device /dev/GLUSTER_vg1/GLUSTER_lv1 not found."}
failed: [10.10.10.11] (item=/dev/GLUSTER_vg1/GLUSTER_lv1) => 
{"failed": true, "item": "/dev/GLUSTER_vg1/GLUSTER_lv1", "msg": 
"Device /dev/GLUSTER_vg1/GLUSTER_lv1 not found."}

to retry, use: --limit @/tmp/tmpezTsyO/fscreate.retry

PLAY RECAP 
**

10.10.10.11: ok=0  changed=0  unreachable=0  failed=1
10.10.10.12: ok=0  changed=0  unreachable=0  failed=1
10.10.10.13: ok=0  changed=0  unreachable=0  failed=1

Ignoring errors...

PLAY [gluster_servers] 
**


TASK [Create the backend disks, skips if present] 

Re: [Gluster-users] disperse volume brick counts limits in RHES

2017-05-08 Thread Xavier Hernandez

On 05/05/17 13:49, Pranith Kumar Karampuri wrote:



On Fri, May 5, 2017 at 2:38 PM, Serkan Çoban > wrote:

It is the over all time, 8TB data disk healed 2x faster in 8+2
configuration.


Wow, that is counter intuitive for me. I will need to explore about this
to find out why that could be. Thanks a lot for this feedback!


Matrix multiplication for encoding/decoding of 8+2 is 4 times faster
than 16+4 (one matrix of 16x16 is composed of 4 submatrices of 8x8); however,
each matrix operation on a 16+4 configuration takes twice the amount of
data of an 8+2, so the net effect is that 8+2 is twice as fast as 16+4.


An 8+2 also uses bigger blocks on each brick, processing the same amount
of data in fewer I/O operations and bigger network packets.


Probably these are the reasons why 16+4 is slower than 8+2.

See my other email for more detailed description.

Xavi
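
As a rough back-of-the-envelope illustration of the scaling Xavi describes,
the Python/numpy sketch below applies a stand-in dense k x k matrix to the
same total payload for k=8 and k=16. It is not Gluster's actual EC code
(real encoders use Galois-field arithmetic, not floating point), but the
cost of the 16x16 case should come out at roughly 2x (modulo BLAS efficiency
differences), matching the explanation above; actual heal times of course
depend on much more than the matrix math.

import time
import numpy as np

TOTAL_WORDS = 8 * 1024 * 1024          # same total payload for both configs

def matrix_cost(k, trials=3):
    """Apply one k x k stand-in EC matrix to the payload split into k rows."""
    matrix = np.random.rand(k, k)
    payload = np.random.rand(k, TOTAL_WORDS // k)
    best = float("inf")
    for _ in range(trials):
        t0 = time.perf_counter()
        _ = matrix @ payload
        best = min(best, time.perf_counter() - t0)
    return best

t8 = matrix_cost(8)
t16 = matrix_cost(16)
print("8x8   over full payload: %.4f s" % t8)
print("16x16 over full payload: %.4f s" % t16)
print("ratio (16 vs 8): %.1fx" % (t16 / t8))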





On Fri, May 5, 2017 at 10:00 AM, Pranith Kumar Karampuri
> wrote:
>
>
> On Fri, May 5, 2017 at 11:42 AM, Serkan Çoban
> wrote:
>>
>> Healing gets slower as you increase m in m+n configuration.
>> We are using 16+4 configuration without any problems other then heal
>> speed.
>> I tested heal speed with 8+2 and 16+4 on 3.9.0 and see that heals on
>> 8+2 is faster by 2x.
>
>
> As you increase number of nodes that are participating in an EC
set number
> of parallel heals increase. Is the heal speed you saw improved per
file or
> the over all time it took to heal the data?
>
>>
>>
>>
>> On Fri, May 5, 2017 at 9:04 AM, Ashish Pandey
> wrote:
>> >
>> > 8+2 and 8+3 configurations are not the limitation but just
suggestions.
>> > You can create 16+3 volume without any issue.
>> >
>> > Ashish
>> >
>> > 
>> > From: "Alastair Neil" >
>> > To: "gluster-users" >
>> > Sent: Friday, May 5, 2017 2:23:32 AM
>> > Subject: [Gluster-users] disperse volume brick counts limits in
RHES
>> >
>> >
>> > Hi
>> >
>> > we are deploying a large (24node/45brick) cluster and noted
that the
>> > RHES
>> > guidelines limit the number of data bricks in a disperse set to
8.  Is
>> > there
>> > any reason for this.  I am aware that you want this to be a
power of 2,
>> > but
>> > as we have a large number of nodes we were planning on going
with 16+3.
>> > Dropping to 8+2 or 8+3 will be a real waste for us.
>> >
>> > Thanks,
>> >
>> >
>> > Alastair
>> >
>> >
>> > ___
>> > Gluster-users mailing list
>> > Gluster-users@gluster.org 
>> > http://lists.gluster.org/mailman/listinfo/gluster-users

>> >
>> >
>> > ___
>> > Gluster-users mailing list
>> > Gluster-users@gluster.org 
>> > http://lists.gluster.org/mailman/listinfo/gluster-users

>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org 
>> http://lists.gluster.org/mailman/listinfo/gluster-users

>
>
>
>
> --
> Pranith




--
Pranith


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users



___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users