Re: [Gluster-devel] One client can effectively hang entire gluster array

2016-07-11 Thread Raghavendra G
On Fri, Jul 8, 2016 at 8:02 PM, Jeff Darcy  wrote:

> > In either of these situations, one glusterfsd process on whatever peer
> > the client is currently talking to will skyrocket to *nproc* cpu usage
> > (800%, 1600%) and the storage cluster is essentially useless; all other
> > clients will eventually try to read or write data to the overloaded peer
> > and, when that happens, their connection will hang. Heals between peers
> > hang because the load on the peer is around 1.5x the number of cores or
> > more. This occurs in either gluster 3.6 or 3.7, is very repeatable, and
> > happens much too frequently.
>
> I have some good news and some bad news.
>
> The good news is that features to address this are already planned for the
> 4.0 release.  Primarily I'm referring to QoS enhancements, some parts of
> which were already implemented for the bitrot daemon.  I'm still working
> out the exact requirements for this as a general facility, though.  You
> can help!  :)  Also, some of the work on "brick multiplexing" (multiple
> bricks within one glusterfsd process) should help to prevent the thrashing
> that causes a complete freeze-up.
>
> Now for the bad news.  Did I mention that these are 4.0 features?  4.0 is
> not near term, and not getting any nearer as other features and releases
> keep "jumping the queue" to absorb all of the resources we need for 4.0
> to happen.  Not that I'm bitter or anything.  ;)  To address your more
> immediate concerns, I think we need to consider more modest changes that
> can be completed in more modest time.  For example:
>
>  * The load should *never* get to 1.5x the number of cores.  Perhaps we
>    could tweak the thread-scaling code in io-threads and epoll to check
>    system load and not scale up (or even scale down) if system load is
>    already high.
>
>  * We might be able to tweak io-threads (which already runs on the
>    bricks and already has a global queue) to schedule requests in a
>    fairer way across clients.  Right now it executes them in the
>    same order that they were read from the network.


This sounds like the easier fix. We can make io-threads factor in another
input, namely the client a request came through (essentially
frame->root->client), before scheduling it. That should at least make the
problem bearable, even if it doesn't eliminate it. As for which algorithm to
use, we could consider the leaky-bucket approach from the bitrot
implementation, or dmClock. I haven't really thought deeply about the
algorithm part yet; if the approach sounds OK, we can discuss algorithms in
more detail.
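To make the idea concrete, here is a rough sketch of per-client round-robin
scheduling. It is purely illustrative: the real io-threads structures,
priorities and locking are different, and every name below is hypothetical.

/* Sketch of per-client fair scheduling: keep one FIFO per client and have
 * workers pick the next request by cycling round-robin over the clients,
 * instead of serving requests in strict arrival order. Hypothetical types;
 * not the actual io-threads implementation. */
#include <stddef.h>

struct iot_request {
        struct iot_request *next;
        /* ... call stub, fop type, etc. ... */
};

struct client_queue {
        struct client_queue *next_client;  /* circular list of clients   */
        struct iot_request  *head;         /* this client's pending FIFO */
        struct iot_request  *tail;
};

struct scheduler {
        struct client_queue *current;      /* next client to be served   */
};

/* Enqueue under the client the request arrived from (frame->root->client
 * would identify the client in the real code). */
static void
enqueue (struct client_queue *cq, struct iot_request *req)
{
        req->next = NULL;
        if (cq->tail)
                cq->tail->next = req;
        else
                cq->head = req;
        cq->tail = req;
}

/* Dequeue one request, advancing to the next client each time, so that a
 * single busy client cannot starve everyone else. */
static struct iot_request *
dequeue_fair (struct scheduler *sched)
{
        struct client_queue *start = sched->current;
        struct client_queue *cq = start;

        do {
                if (cq->head) {
                        struct iot_request *req = cq->head;
                        cq->head = req->next;
                        if (!cq->head)
                                cq->tail = NULL;
                        sched->current = cq->next_client;
                        return req;
                }
                cq = cq->next_client;
        } while (cq != start);

        return NULL;  /* nothing pending anywhere */
}

A leaky-bucket or dmClock scheduler would replace the plain round-robin pick
in dequeue_fair() with a rate- or weight-aware one, but the per-client
queueing shown here is the common starting point for either.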

>    That tends to be a bit "unfair" and that should be fixed in the
>    network code, but that's a much harder task.
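On the first bullet above (not scaling io-threads up when the machine is
already loaded), here is a minimal sketch of such a check using plain libc
calls. The 0.75 threshold is arbitrary and this is not the actual io-threads
scaling code.

/* Compare the 1-minute load average against the number of online CPUs
 * before growing the worker pool. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static int
may_spawn_worker (void)
{
        double load[1];
        long   ncpu = sysconf (_SC_NPROCESSORS_ONLN);

        if (getloadavg (load, 1) != 1 || ncpu <= 0)
                return 1;  /* can't tell; fall back to the old behaviour */

        /* Refuse to grow the pool once load approaches the core count. */
        return load[0] < 0.75 * (double) ncpu;
}

int
main (void)
{
        printf ("spawn another worker? %s\n",
                may_spawn_worker () ? "yes" : "no");
        return 0;
}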
>
> These are only weak approximations of what we really should be doing,
> and will be doing in the long term, but (without making any promises)
> they might be sufficient and achievable in the near term.  Thoughts?
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Raghavendra G
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [puzzle] readv operation allocate iobuf twice

2016-07-11 Thread Raghavendra Gowdappa


- Original Message -
> From: "Zhengping Zhou" 
> To: gluster-devel@gluster.org
> Sent: Tuesday, July 12, 2016 9:28:01 AM
> Subject: [Gluster-devel] [puzzle] readv operation allocate iobuf twice
> 
> Hi all:
> 
> It is a puzzle to me that we allocate rsp buffers for the response
> content in function client3_3_readv, but these rsp parameters are never
> saved to struct saved_frame in the submit procedure.

Good catch :). We were aware of this issue, but the fix wasn't prioritized. Can 
you please file a bug on this? If you want to send a fix (which essentially 
stores the rsp payload ptr in saved-frame and passes it down during 
rpc_clnt_fill_request_info - as part of handling RPC_TRANSPORT_MAP_XID_REQUEST 
event in rpc-clnt), please post a patch to gerrit and I'll accept it. If you 
don't have bandwidth, one of us can send out a fix too.
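To sketch the idea (the real saved_frame and rpc-clnt structures differ from
this, and every name below is hypothetical):

/* Remember the caller-supplied response buffers at submit time and hand them
 * back to the transport when the reply for that xid arrives, so the transport
 * reads the payload into them instead of allocating a second iobuf. */
#include <stdint.h>
#include <sys/uio.h>

struct iobref;  /* opaque here; the real definition lives in libglusterfs */

struct saved_frame_sketch {
        uint32_t       xid;               /* id the reply will carry   */
        struct iovec  *rsp_payload;       /* caller's buffers, or NULL */
        int            rsp_payload_count;
        struct iobref *rsp_iobref;
};

/* Submit path (e.g. from client3_3_readv): stash the buffers. */
static void
saved_frame_set_rsp (struct saved_frame_sketch *sf, struct iovec *payload,
                     int count, struct iobref *iobref)
{
        sf->rsp_payload       = payload;
        sf->rsp_payload_count = count;
        sf->rsp_iobref        = iobref;
}

/* Reply path (when the transport maps an incoming reply to its xid):
 * return the stashed buffers; NULL means the transport allocates as today. */
static struct iovec *
saved_frame_get_rsp (struct saved_frame_sketch *sf, int *count,
                     struct iobref **iobref)
{
        *count  = sf->rsp_payload_count;
        *iobref = sf->rsp_iobref;
        return sf->rsp_payload;
}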

Again, thanks for the effort :).

regards,
Raghavendra

> Which means
> the iobuf will be reallocated by the transport layer in function
> __socket_read_accepted_successful_reply.
> According to the comment of function rpc_clnt_submit:
> 1. Both @rsp_hdr and @rsp_payload are optional.
> 2. The user of rpc_clnt_submit, if it wants the response hdr and payload in
> its own buffers, has to populate @rsphdr and @rsp_payload.
> 
> Since @rsp_payload is optional, the transport layer should not reallocate
> the rsp buffers when it is populated. But in fact the readv operation
> allocates the rsp buffer twice.
> 
> Thanks
> Zhengping
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] [puzzle] readv operation allocate iobuf twice

2016-07-11 Thread Zhengping Zhou
Hi all:

It is a puzzle to me that we allocate rsp buffers for the response
content in function client3_3_readv, but these rsp parameters are never
saved to struct saved_frame in the submit procedure. Which means the
iobuf will be reallocated by the transport layer in function
__socket_read_accepted_successful_reply.
According to the comment of function rpc_clnt_submit:
1. Both @rsp_hdr and @rsp_payload are optional.
2. The user of rpc_clnt_submit, if it wants the response hdr and payload in
its own buffers, has to populate @rsphdr and @rsp_payload.

Since @rsp_payload is optional, the transport layer should not reallocate
the rsp buffers when it is populated. But in fact the readv operation
allocates the rsp buffer twice.

Thanks
Zhengping
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Question on merging zfs snapshot support into the mainline glusterfs

2016-07-11 Thread sriram
Hi Rajesh,
 
Could you let us know your thoughts on how to go about this?
 
Sriram
 
 
On Wed, Jul 6, 2016, at 03:18 PM, Pranith Kumar Karampuri wrote:
> I believe Rajesh already has something here. Maybe he can post an
> outline so that we can take it from there?
>
> On Tue, Jul 5, 2016 at 10:52 PM,  wrote:
>> __
>> Hi,
>>
>> I tried to go through the patch and find the reason behind the
>> question posted, but I couldn't get any concrete details about it.
>>
>> When going through the mail chain, there were mentions of a generic
>> snapshot interface. I'd be interested in doing the changes if you
>> guys could fill me in with some initial information. Thanks.
>>
>> Sriram
>>
>>
>> On Mon, Jul 4, 2016, at 01:59 PM, B.K.Raghuram wrote:
>>> Hi Rajesh,
>>> I did not want to respond to the question that you'd posed on the
>>> zfs snapshot code (about the volume backend backup) as I am not too
>>> familiar with the code and the person who coded it is not with us
>>> anymore. This was done in a bit of a hurry, so it could be that it was
>>> just kept for later.
>>>
>>> However, Sriram who is cc'd on this email, has been helping us by
>>> starting to look at the gluster code  and has expressed an interest
>>> in taking the zfs code changes on. So he can probably dig out an
>>> answer to your question. Sriram, Rajesh had a question on one of the
>>> zfs related patches -
>>> (https://github.com/fractalio/glusterfs/commit/39a163eca338b6da146f72f380237abd4c671db2#commitcomment-18109851)
>>>
>>> Sriram is also interested in contributing to the process of creating
>>> a generic snapshot interface in the gluster code which you and
>>> Pranith mentioned above. If this is ok with you all, could you fill
>>> him in on what your thoughts are on that and how he could get
>>> started?
>>> Thanks!
>>> -Ram
>>>
>>> On Wed, Jun 22, 2016 at 11:45 AM, Rajesh Joseph 
>>> wrote:


 On Tue, Jun 21, 2016 at 4:24 PM, Pranith Kumar Karampuri
  wrote:
> hi,
>   Is there a plan to come up with an interface for snapshot
>   functionality? For example, in handling different types of
>   sockets in gluster, all we need to do is specify which
>   interface we want to use, and ib, network-socket and unix-domain
>   sockets all implement the interface. The code doesn't have
>   to assume anything about the underlying socket type. Do you guys
>   think it is a worthwhile effort to separate out the logic of
>   the interface from the code which uses snapshots? I see quite a
>   few instances of if (strcmp ("zfs", fstype)) code which could all
>   be removed if we did this. Adding btrfs snapshots in the future
>   would be a breeze as well this way: all we need to do is implement
>   the snapshot interface using btrfs snapshot commands. I am not
>   talking about this patch per se; I just wanted to seek your inputs
>   about future plans for ease of maintaining the feature.


 As I said in my previous mail, this is in the plan and we will be doing
 it, but due to other priorities it has not been taken up yet.
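For illustration only, the kind of pluggable interface being discussed could
look roughly like this; it is a sketch, not the actual planned design, and
every name is hypothetical.

/* Backend-agnostic snapshot interface: each backend (lvm, zfs, btrfs, ...)
 * fills in one ops table, and the snapshot code calls through the table
 * instead of branching on strcmp ("zfs", fstype). */
#include <stddef.h>

struct snap_backend_ops {
        const char *name;  /* "lvm", "zfs", "btrfs", ... */
        int (*brick_supported) (const char *brick_path);
        int (*snapshot_create) (const char *brick_path, const char *snap_name);
        int (*snapshot_restore) (const char *brick_path, const char *snap_name);
        int (*snapshot_delete) (const char *brick_path, const char *snap_name);
};

/* Pick the first registered backend that claims the brick. */
static const struct snap_backend_ops *
snap_backend_lookup (const struct snap_backend_ops **backends, int n,
                     const char *brick_path)
{
        for (int i = 0; i < n; i++)
                if (backends[i]->brick_supported (brick_path))
                        return backends[i];
        return NULL;  /* no snapshot-capable backend for this brick */
}

A zfs implementation would then just wrap the zfs snapshot/rollback commands
behind these function pointers, and btrfs could be added the same way, with
no filesystem-specific branches left in the callers.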


>
>
> On Tue, Jun 21, 2016 at 11:46 AM, Atin Mukherjee
>  wrote:
>>
>>
>> On 06/21/2016 11:41 AM, Rajesh Joseph wrote:
>>  > What kind of locking issues do you see? If you can provide some
>>  > more information, I may be able to help you.
>>
>> That's related to the stale lock issues on GlusterD which are present
>> in 3.6.1, since the fixes landed in the branch after 3.6.1. I have
>> already provided the workaround/way to fix them [1].
>>
>>  
>> [1]http://www.gluster.org/pipermail/gluster-users/2016-June/thread.html#26995
>>
>>  ~Atin
>>
>> ___
>>  Gluster-devel mailing list Gluster-devel@gluster.org
>>  http://www.gluster.org/mailman/listinfo/gluster-devel
>>
>
>
>
> --
> Pranith
>

>>
>
>
>
> --
> Pranith
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Please review MAINTAINERS file

2016-07-11 Thread Nigel Babu
Yep. Open for at least another week. Go right ahead.

On Mon, Jul 11, 2016 at 8:02 PM, M S Vishwanath Bhat 
wrote:

>
>
> On 8 July 2016 at 00:06, Nigel Babu  wrote:
>
>> Hello folks,
>>
>> Please, if you get a minute today, review the MAINTAINERS file. Make sure
>> that the components owned by you actually have the right people listed as
>> maintainers. If you have new co-maintainers, please add their names. If
>> your component isn't listed there, please make sure to list it.
>
>
> I apologise for the delay in response (I was on holiday for half of last
> week). I need to add the test component and add my name and rtalur's to it.
> This is still open, right? I will send the patch tomorrow morning IST.
>
> Best Regards,
> Vishwanath
>
>
>
>
>>   I'm going to be
>> working off this data for bug 1350477.
>>
>> --
>> nigelb
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>
>
>


-- 
nigelb
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Please review MAINTAINERS file

2016-07-11 Thread M S Vishwanath Bhat
On 8 July 2016 at 00:06, Nigel Babu  wrote:

> Hello folks,
>
> Please, if you get a minute today, review the MAINTAINERS file. Make sure
> that the components owned by you actually have the right people listed as
> maintainers. If you have new co-maintainers, please add their names. If
> your component isn't listed there, please make sure to list it.


I apologise for the delay in response (I was on holiday for half of last
week). I need to add the test component and add my name and rtalur's to it.
This is still open, right? I will send the patch tomorrow morning IST.

Best Regards,
Vishwanath




>   I'm going to be
> working off this data for bug 1350477.
>
> --
> nigelb
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [bug] systemd complains about execution bit set on .service file

2016-07-11 Thread Niels de Vos
On Mon, Jul 11, 2016 at 01:41:03PM +0200, Niels de Vos wrote:
> On Mon, Jul 11, 2016 at 02:12:46PM +0300, Sergej Pupykin wrote:
> > 
> > Hello,
> > 
> > could someone replace $(INSTALL_PROGRAM) with a command that does
> > not set the execute permission on the installed file?
> > 
> > glusterfs-3.7.11/extras/systemd/Makefile.am
> > >> >$(INSTALL_PROGRAM) glusterd.service $(DESTDIR)$(SYSTEMD_DIR)/
> > 
> > See also: https://bugs.archlinux.org/task/50001
> 
> Thanks for reporting, I'll have a look at it.

The 3.7 bug for this is https://bugzilla.redhat.com/1354476

Once the patch for the master branch has been merged, I'll backport it
to 3.8 and 3.7:
  http://review.gluster.org/14892

Niels


signature.asc
Description: PGP signature
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] How to solve the FSYNC() ERR

2016-07-11 Thread Keiviw
Yes, I just used a text editor (vi) to test this out, but my real
application is a surveillance camera system. The video is stored in GlusterFS
via FUSE and Samba.
Because the surveillance camera system runs 24x7, it does not need to cache
the files, so I added the O_DIRECT flag in the GlusterFS client protocol
open() and create(). In my test, I wrote a video file into GlusterFS from
Windows and did not find any cache on the GlusterFS node; I then read a video
file from Windows and it also was not cached on the GlusterFS node, but some
frames at the end of the video file were lost, just like in my vi test.
The missing page alignment may be the cause of this problem. How can I solve
it? Or, in GlusterFS 3.6 or 3.7, is there an option to enable direct I/O mode
so the video is not cached on the GlusterFS node?







At 2016-07-11 18:51:02, "Krutika Dhananjay"  wrote:

What's the application you are running? Sounds like you're using a text editor 
like vi(m) to test this out?
Is the application opening the files with O_DIRECT?

Do you have the strace output of the running application that confirms that it 
is open()ing the file with O_DIRECT?

Also, what are the offsets and sizes of the writes on this file by this 
application in the strace output?


-Krutika





On Mon, Jul 11, 2016 at 2:44 PM, Keiviw  wrote:

I have checked the page alignment: the file was larger than one page; a part
of the file (one page in size) was saved successfully, and the rest (more than
one page but less than two pages) was lost.







At 2016-07-11 12:53:32, "Pranith Kumar Karampuri"  wrote:

Is it possible to share the test you are running? As per your volume, O_DIRECT
is not enabled on your volume, i.e. the file should not be opened with O_DIRECT,
but as per the logs it is returning Invalid argument, as if something is wrong
with the arguments when an O_DIRECT write is done with a wrong size. So I would
like to test out why exactly it is giving this problem. Please note that for an
O_DIRECT write to succeed, both the offset and the size should be page-aligned;
checking that they are multiples of 512 is one way to verify it.
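For reference, here is a minimal standalone example (plain POSIX, not GlusterFS
code) of a write that satisfies the usual O_DIRECT alignment rules; the
512-byte alignment and the /mnt/gluster path are assumptions for illustration.

/* Buffer address, write size and file offset must all be multiples of the
 * required alignment (assumed 512 here); an unaligned tail write is what
 * fails with EINVAL. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define ALIGN 512

int
main (void)
{
        void   *buf = NULL;
        size_t  len = 4096;  /* multiple of ALIGN */

        /* posix_memalign gives an ALIGN-aligned buffer, required by O_DIRECT */
        if (posix_memalign (&buf, ALIGN, len) != 0) {
                perror ("posix_memalign");
                return 1;
        }
        memset (buf, 'A', len);

        int fd = open ("/mnt/gluster/testfile",
                       O_CREAT | O_WRONLY | O_DIRECT, 0644);
        if (fd < 0) {
                perror ("open");
                return 1;
        }

        /* offset 0 and len are both multiples of ALIGN, so this should not
         * fail with EINVAL; a length like 4000 typically would. */
        if (pwrite (fd, buf, len, 0) < 0)
                perror ("pwrite");

        close (fd);
        free (buf);
        return 0;
}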



On Sun, Jul 10, 2016 at 5:19 PM, Keiviw  wrote:

My volume info:

Volume Name: test
Type: Distribute
Volume ID: 9294b122-d81e-4b12-9b5c-46e89ee0e40b
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: compute2:/home/brick1
Brick2: compute2:/home/brick2
Options Reconfigured:
performance.flush-behind: off
storage.linux-aio: off
My brick logs(I have cleaned up the history log):
[2016-07-10 11:42:50.577683] E [posix.c:2128:posix_writev] 0-test-posix: 
write failed: offset 0, Invalid argument
[2016-07-10 11:42:50.577735] I [server3_1-fops.c:1414:server_writev_cbk] 
0-test-server: 8569840: WRITEV 5 (526a3118-9994-429e-afc0-4aa063606bde) ==> -1 
(Invalid argument)
[2016-07-10 11:42:54.583038] E [posix.c:2128:posix_writev] 0-test-posix: 
write failed: offset 0, Invalid argument
[2016-07-10 11:42:54.583080] I [server3_1-fops.c:1414:server_writev_cbk] 
0-test-server: 8569870: WRITEV 5 (c3d28f34-8f43-446d-8d0b-80841ae8ec5b) ==> -1 
(Invalid argument)
My mnt-test-.logs:
[2016-07-10 11:42:50.577816] W [client3_1-fops.c:876:client3_1_writev_cbk] 
0-test-client-1: remote operation failed: Invalid argument
[2016-07-10 11:42:50.578508] W [fuse-bridge.c:968:fuse_err_cbk] 
0-glusterfs-fuse: 12398282: FSYNC() ERR => -1 (Invalid argument)
[2016-07-10 11:42:54.583156] W [client3_1-fops.c:876:client3_1_writev_cbk] 
0-test-client-1: remote operation failed: Invalid argument
[2016-07-10 11:42:54.583762] W [fuse-bridge.c:968:fuse_err_cbk] 
0-glusterfs-fuse: 12398317: FSYNC() ERR => -1 (Invalid argument)









On 2016-07-10 19:18:18, "Krutika Dhananjay" wrote:


To me it looks like a case of a flush triggering a write() that had been cached
by write-behind: because the write buffer did not meet the page alignment
requirement for an O_DIRECT write, it failed with EINVAL, and the triggering
fop, i.e. flush(), failed with the 'Invalid argument' error code.


Could you attach the brick logs as well, so that we can confirm the theory?



-Krutika


On Sat, Jul 9, 2016 at 9:31 PM, Atin Mukherjee  wrote:
Pranith/Krutika,


Your inputs please, IIRC we'd need to turn on some o_direct option here?



On Saturday 9 July 2016, Keiviw  wrote:

The errors also occurred in GlusterFS 3.6.7; I just added the O_DIRECT flag in
the client protocol open() and create()! How can the problem be explained and solved?


Sent from NetEase Mail Master
On 07/09/2016 17:58, Atin Mukherjee wrote:
Any specific reason for using 3.3, given that it's really quite old? We are at
the 3.6, 3.7 & 3.8 supportability matrix now.


On Saturday 9 July 2016, Keiviw  wrote:

hi,
I have installed GlusterFS 3.3.0, and now I get Fsync failures when saving 
files with 

Re: [Gluster-devel] [bug] systemd complains about execution bit set on .service file

2016-07-11 Thread Niels de Vos
On Mon, Jul 11, 2016 at 02:12:46PM +0300, Sergej Pupykin wrote:
> 
> Hello,
> 
> could someone replace $(INSTALL_PROGRAM) with a command that does
> not set the execute permission on the installed file?
> 
> glusterfs-3.7.11/extras/systemd/Makefile.am
> >> >$(INSTALL_PROGRAM) glusterd.service $(DESTDIR)$(SYSTEMD_DIR)/
> 
> See also: https://bugs.archlinux.org/task/50001

Thanks for reporting, I'll have a look at it.

Niels


signature.asc
Description: PGP signature
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] [bug] systemd complains about execution bit set on .service file

2016-07-11 Thread Sergej Pupykin

Hello,

could someone replace $(INSTALL_PROGRAM) with a command that does
not set the execute permission on the installed file?

glusterfs-3.7.11/extras/systemd/Makefile.am
>> >$(INSTALL_PROGRAM) glusterd.service $(DESTDIR)$(SYSTEMD_DIR)/

See also: https://bugs.archlinux.org/task/50001
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] How to solve the FSYNC() ERR

2016-07-11 Thread Krutika Dhananjay
What's the application you are running? Sounds like you're using a text
editor like vi(m) to test this out?
Is the application opening the files with O_DIRECT?
Do you have the strace output of the running application that confirms that
it is open()ing the file with O_DIRECT?
Also, what are the offsets and sizes of the writes on this file by this
application in the strace output?

-Krutika


On Mon, Jul 11, 2016 at 2:44 PM, Keiviw  wrote:

> I have checked the page-aligned, i.e. the file was larger than one page, a
> part of the file(one page size) was saved successfully, and the rest(more
> than one page but less than two pages) was lost.
>
>
>
>
>
>
> At 2016-07-11 12:53:32, "Pranith Kumar Karampuri" 
> wrote:
>
> Is it possible to share the test you are running? As per your volume,
> o-direct is not enabled on your volume, i.e. the file shouldn't be opened
> with o-direct but as per the logs it is giving Invalid Argument as if there
> is something wrong with the arguments when we do o-direct write with wrong
> size. so I would like to test out why exactly is it giving this problem.
> Please note that for o-direct write to succeed, both offset and size should
> be page-aligned, something like multiple of 512 is one way to check it.
>
> On Sun, Jul 10, 2016 at 5:19 PM, Keiviw  wrote:
>
>> My volume info:
>>
>> Volume Name: test
>> Type: Distribute
>> Volume ID: 9294b122-d81e-4b12-9b5c-46e89ee0e40b
>> Status: Started
>> Number of Bricks: 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: compute2:/home/brick1
>> Brick2: compute2:/home/brick2
>> Options Reconfigured:
>> performance.flush-behind: off
>> storage.linux-aio: off
>> My brick logs(I have cleaned up the history log):
>> [2016-07-10 11:42:50.577683] E [posix.c:2128:posix_writev]
>> 0-test-posix: write failed: offset 0, Invalid argument
>> [2016-07-10 11:42:50.577735] I
>> [server3_1-fops.c:1414:server_writev_cbk] 0-test-server: 8569840: WRITEV 5
>> (526a3118-9994-429e-afc0-4aa063606bde) ==> -1 (Invalid argument)
>> [2016-07-10 11:42:54.583038] E [posix.c:2128:posix_writev]
>> 0-test-posix: write failed: offset 0, Invalid argument
>> [2016-07-10 11:42:54.583080] I
>> [server3_1-fops.c:1414:server_writev_cbk] 0-test-server: 8569870: WRITEV 5
>> (c3d28f34-8f43-446d-8d0b-80841ae8ec5b) ==> -1 (Invalid argument)
>> My mnt-test-.logs:
>> [2016-07-10 11:42:50.577816] W
>> [client3_1-fops.c:876:client3_1_writev_cbk] 0-test-client-1: remote
>> operation failed: Invalid argument
>> [2016-07-10 11:42:50.578508] W [fuse-bridge.c:968:fuse_err_cbk]
>> 0-glusterfs-fuse: 12398282: FSYNC() ERR => -1 (Invalid argument)
>> [2016-07-10 11:42:54.583156] W
>> [client3_1-fops.c:876:client3_1_writev_cbk] 0-test-client-1: remote
>> operation failed: Invalid argument
>> [2016-07-10 11:42:54.583762] W [fuse-bridge.c:968:fuse_err_cbk]
>> 0-glusterfs-fuse: 12398317: FSYNC() ERR => -1 (Invalid argument)
>>
>>
>>
>>
>>
>>
>> On 2016-07-10 19:18:18, "Krutika Dhananjay" wrote:
>>
>>
>> To me it looks like a case of a flush triggering a write() that was
>> cached by write-behind and because the write buffer
>> did not meet the page alignment requirement with o-direct write, it was
>> failed with EINVAL and the trigger fop - i.e., flush() was failed with the
>> 'Invalid argument' error code.
>>
>> Could you attach the brick logs as well, so that we can confirm the
>> theory?
>>
>> -Krutika
>>
>> On Sat, Jul 9, 2016 at 9:31 PM, Atin Mukherjee 
>> wrote:
>>
>>> Pranith/Krutika,
>>>
>>> Your inputs please, IIRC we'd need to turn on some o_direct option here?
>>>
>>>
>>> On Saturday 9 July 2016, Keiviw  wrote:
>>>
 The errors also occured in GlusterFS 3.6.7,I just add the O_DIRECT flag
 in client protocol open() and create()! How to explain and solve the
 problem?

 Sent from NetEase Mail Master
 On 07/09/2016 17:58, Atin Mukherjee wrote:

 Any specific reason of using 3.3 given that its really quite old? We
 are at 3.6, 3.7 & 3.8 supportability matrix now.


 On Saturday 9 July 2016, Keiviw  wrote:

> hi,
> I have installed GlusterFS 3.3.0, and now I get Fsync failures
> when saving files with the O_DIRECT flag in open() and create().
> 1, I tried to save a file in vi and got this error:
> "test" E667: Fsync failed
> 2, I see this in the logs:
> [2016-07-07 14:20:10.325400] W
> [fuse-bridge.c:968:fuse_err_cbk] 0-glusterfs-fuse: 102: FSYNC() 
> ERR
> => -1 (Invalid argument)
> [2016-07-07 14:20:13.930384] W
> [fuse-bridge.c:968:fuse_err_cbk] 0-glusterfs-fuse: 137: FSYNC() 
> ERR
> => -1 (Invalid argument)
> [2016-07-07 14:20:51.199448] W
> [fuse-bridge.c:968:fuse_err_cbk] 0-glusterfs-fuse: 174: FLUSH() 

[Gluster-devel] Jenkins results for devrpms 404ing

2016-07-11 Thread Nigel Babu
Hello folks,

A lot of older devrpm jobs will 404 because I was only saving the 10 latest
jobs. I thought that since we didn't consume the RPMs, I could safely delete
them. It turns out we do want the logs. I've updated the job so we keep the
last 50 jobs. This won't fix the jobs that already 404, but at least it
shouldn't happen anymore.

--
nigelb
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] How to solve the FSYNC() ERR

2016-07-11 Thread Keiviw
I have checked the page alignment: the file was larger than one page; a part
of the file (one page in size) was saved successfully, and the rest (more than
one page but less than two pages) was lost.






At 2016-07-11 12:53:32, "Pranith Kumar Karampuri"  wrote:

Is it possible to share the test you are running? As per your volume, O_DIRECT
is not enabled on your volume, i.e. the file should not be opened with O_DIRECT,
but as per the logs it is returning Invalid argument, as if something is wrong
with the arguments when an O_DIRECT write is done with a wrong size. So I would
like to test out why exactly it is giving this problem. Please note that for an
O_DIRECT write to succeed, both the offset and the size should be page-aligned;
checking that they are multiples of 512 is one way to verify it.



On Sun, Jul 10, 2016 at 5:19 PM, Keiviw  wrote:

My volume info:

Volume Name: test
Type: Distribute
Volume ID: 9294b122-d81e-4b12-9b5c-46e89ee0e40b
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: compute2:/home/brick1
Brick2: compute2:/home/brick2
Options Reconfigured:
performance.flush-behind: off
storage.linux-aio: off
My brick logs(I have cleaned up the history log):
[2016-07-10 11:42:50.577683] E [posix.c:2128:posix_writev] 0-test-posix: 
write failed: offset 0, Invalid argument
[2016-07-10 11:42:50.577735] I [server3_1-fops.c:1414:server_writev_cbk] 
0-test-server: 8569840: WRITEV 5 (526a3118-9994-429e-afc0-4aa063606bde) ==> -1 
(Invalid argument)
[2016-07-10 11:42:54.583038] E [posix.c:2128:posix_writev] 0-test-posix: 
write failed: offset 0, Invalid argument
[2016-07-10 11:42:54.583080] I [server3_1-fops.c:1414:server_writev_cbk] 
0-test-server: 8569870: WRITEV 5 (c3d28f34-8f43-446d-8d0b-80841ae8ec5b) ==> -1 
(Invalid argument)
My mnt-test-.logs:
[2016-07-10 11:42:50.577816] W [client3_1-fops.c:876:client3_1_writev_cbk] 
0-test-client-1: remote operation failed: Invalid argument
[2016-07-10 11:42:50.578508] W [fuse-bridge.c:968:fuse_err_cbk] 
0-glusterfs-fuse: 12398282: FSYNC() ERR => -1 (Invalid argument)
[2016-07-10 11:42:54.583156] W [client3_1-fops.c:876:client3_1_writev_cbk] 
0-test-client-1: remote operation failed: Invalid argument
[2016-07-10 11:42:54.583762] W [fuse-bridge.c:968:fuse_err_cbk] 
0-glusterfs-fuse: 12398317: FSYNC() ERR => -1 (Invalid argument)









On 2016-07-10 19:18:18, "Krutika Dhananjay" wrote:


To me it looks like a case of a flush triggering a write() that had been cached
by write-behind: because the write buffer did not meet the page alignment
requirement for an O_DIRECT write, it failed with EINVAL, and the triggering
fop, i.e. flush(), failed with the 'Invalid argument' error code.


Could you attach the brick logs as well, so that we can confirm the theory?



-Krutika


On Sat, Jul 9, 2016 at 9:31 PM, Atin Mukherjee  wrote:
Pranith/Krutika,


Your inputs please, IIRC we'd need to turn on some o_direct option here?



On Saturday 9 July 2016, Keiviw  wrote:

The errors also occurred in GlusterFS 3.6.7; I just added the O_DIRECT flag in
the client protocol open() and create()! How can the problem be explained and solved?


Sent from NetEase Mail Master
On 07/09/2016 17:58, Atin Mukherjee wrote:
Any specific reason for using 3.3, given that it's really quite old? We are at
the 3.6, 3.7 & 3.8 supportability matrix now.


On Saturday 9 July 2016, Keiviw  wrote:

hi,
I have installed GlusterFS 3.3.0, and now I get Fsync failures when saving 
files with the O_DIRECT flag in open() and create().
1, I tried to save a file in vi and got this error:
"test" E667: Fsync failed
2, I see this in the logs:
[2016-07-07 14:20:10.325400] W [fuse-bridge.c:968:fuse_err_cbk] 
0-glusterfs-fuse: 102: FSYNC() ERR => -1 (Invalid argument)
[2016-07-07 14:20:13.930384] W [fuse-bridge.c:968:fuse_err_cbk] 
0-glusterfs-fuse: 137: FSYNC() ERR => -1 (Invalid argument)
[2016-07-07 14:20:51.199448] W [fuse-bridge.c:968:fuse_err_cbk] 
0-glusterfs-fuse: 174: FLUSH() ERR => -1 (Invalid argument)
[2016-07-07 14:21:32.804738] W [fuse-bridge.c:968:fuse_err_cbk] 
0-glusterfs-fuse: 206: FLUSH() ERR => -1 (Invalid argument)
[2016-07-07 14:21:43.702146] W [fuse-bridge.c:968:fuse_err_cbk] 
0-glusterfs-fuse: 276: FSYNC() ERR => -1 (Invalid argument)
[2016-07-07 14:21:51.296809] W [fuse-bridge.c:968:fuse_err_cbk] 
0-glusterfs-fuse: 314: FSYNC() ERR => -1 (Invalid argument)
[2016-07-07 14:21:54.062687] W [fuse-bridge.c:968:fuse_err_cbk] 
0-glusterfs-fuse: 349: FSYNC() ERR => -1 (Invalid argument)
[2016-07-07 14:22:54.678960] W [fuse-bridge.c:968:fuse_err_cbk] 
0-glusterfs-fuse: 429: FSYNC() ERR => -1 (Invalid argument)
[2016-07-07 14:24:35.546980] W [fuse-bridge.c:968:fuse_err_cbk] 

[Gluster-devel] tests/bugs/glusterd/bug-963541.t constantly failing on local setup

2016-07-11 Thread Atin Mukherjee
While looking at the regression failure of [1], I ran the test mentioned in
$Subj on my local setup and saw it fail 5 out of 5 times.

The tests at lines 24 & 25 try to start and stop the rebalance, but the
latter fails because by that time the rebalance process still hasn't come up,
since glusterd spawns it with nowait from the runner interface.

@Sakshi, as discussed, could you work on the fix for this? If we are seeing
too many regression failures because of this test, we can mark it as a bad
test in the interim.

~Atin


[1]
https://build.gluster.org/job/rackspace-regression-2GB-triggered/22090/console
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] Glusterfs-3.7.13 release plans

2016-07-11 Thread Raghavendra Gowdappa


- Original Message -
> From: "Oleksandr Natalenko" 
> To: "Kaushal M" 
> Cc: "Raghavendra Gowdappa" , maintain...@gluster.org, 
> "Gluster Devel"
> 
> Sent: Friday, July 8, 2016 6:31:57 PM
> Subject: Re: [Gluster-devel] [Gluster-Maintainers] Glusterfs-3.7.13 release 
> plans
> 
> Does this issue have a fix pending, or is there just a bug report?

We have an RCA that strongly points to the issue. I'll be working towards
testing the hypothesis and sending out a fix.

> 
> 08.07.2016 15:12, Kaushal M написав:
> > On Fri, Jul 8, 2016 at 2:22 PM, Raghavendra Gowdappa
> >  wrote:
> >> There seems to be a major inode leak in fuse-clients:
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1353856
> >> 
> >> We have found an RCA through code reading (and have high confidence
> >> in it). Do we want to include this in 3.7.13?
> > 
> > I'm not going to be delaying the release anymore. I'll be adding this
> > issue into the release-notes as a known-issue.
> > 
> >> 
> >> regards,
> >> Raghavendra.
> >> 
> >> - Original Message -
> >>> From: "Kaushal M" 
> >>> To: "Pranith Kumar Karampuri" 
> >>> Cc: maintain...@gluster.org, "Gluster Devel"
> >>> 
> >>> Sent: Friday, July 8, 2016 11:51:11 AM
> >>> Subject: Re: [Gluster-Maintainers] Glusterfs-3.7.13 release plans
> >>> 
> >>> On Fri, Jul 8, 2016 at 9:59 AM, Pranith Kumar Karampuri
> >>>  wrote:
> >>> > Could you take in http://review.gluster.org/#/c/14598/ as well? It is
> >>> > ready
> >>> > for merge.
> >>> >
> >>> > On Thu, Jul 7, 2016 at 3:02 PM, Atin Mukherjee 
> >>> > wrote:
> >>> >>
> >>> >> Can you take in http://review.gluster.org/#/c/14861 ?
> >>> 
> >>> Can you get one of the maintainers to give it a +2?
> >>> 
> >>> >>
> >>> >>
> >>> >> On Thursday 7 July 2016, Kaushal M  wrote:
> >>> >>>
> >>> >>> On Thu, Jun 30, 2016 at 11:08 AM, Kaushal M 
> >>> >>> wrote:
> >>> >>> > Hi all,
> >>> >>> >
> >>> >>> > I'm (or was) planning to do a 3.7.13 release on schedule today.
> >>> >>> > 3.7.12
> >>> >>> > has a huge issue with libgfapi, solved by [1].
> >>> >>> > I'm not sure if this fixes the other issues with libgfapi noticed
> >>> >>> > by
> >>> >>> > Lindsay on gluster-users.
> >>> >>> >
> >>> >>> > This patch has been included in the packages 3.7.12 built for
> >>> >>> > CentOS,
> >>> >>> > Fedora, Ubuntu, Debian and SUSE. I guess Lindsay is using one of
> >>> >>> > these
> >>> >>> > packages, so it might be that the issue seen is new. So I'd like to
> >>> >>> > do
> >>> >>> > a quick release once we have a fix.
> >>> >>> >
> >>> >>> > Maintainers can merge changes into release-3.7 that follow the
> >>> >>> > criteria given in [2]. Please make sure to add the bugs for patches
> >>> >>> > you are merging are added as dependencies for the 3.7.13 tracker
> >>> >>> > bug
> >>> >>> > [3].
> >>> >>> >
> >>> >>>
> >>> >>> I've just merged the fix for the gfapi breakage into release-3.7, and
> >>> >>> hope to tag 3.7.13 soon.
> >>> >>>
> >>> >>> The current head for release-3.7 is commit bddf6f8. 18 patches have
> >>> >>> been merged since 3.7.12 for the following components,
> >>> >>>  - gfapi
> >>> >>>  - nfs (includes ganesha related changes)
> >>> >>>  - glusterd/cli
> >>> >>>  - libglusterfs
> >>> >>>  - fuse
> >>> >>>  - build
> >>> >>>  - geo-rep
> >>> >>>  - afr
> >>> >>>
> >>> >>> I need an acknowledgement from the maintainers of the above
> >>> >>> components that they are ready.
> >>> >>> If any maintainers know of any other issues, please reply here. We'll
> >>> >>> decide how to address them for this release here.
> >>> >>>
> >>> >>> Also, please don't merge anymore changes into release-3.7. If you
> >>> >>> need
> >>> >>> to get something merged, please inform me.
> >>> >>>
> >>> >>> Thanks,
> >>> >>> Kaushal
> >>> >>>
> >>> >>> > Thanks,
> >>> >>> > Kaushal
> >>> >>> >
> >>> >>> > [1]: https://review.gluster.org/14822
> >>> >>> > [2]: https://public.pad.fsfe.org/p/glusterfs-release-process-201606
> >>> >>> > under the GlusterFS minor release heading
> >>> >>> > [3]: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.7.13
> >>> >>> ___
> >>> >>> maintainers mailing list
> >>> >>> maintain...@gluster.org
> >>> >>> http://www.gluster.org/mailman/listinfo/maintainers
> >>> >>
> >>> >>
> >>> >>
> >>> >> --
> >>> >> Atin
> >>> >> Sent from iPhone
> >>> >>
> >>> >> ___
> >>> >> maintainers mailing list
> >>> >> maintain...@gluster.org
> >>> >> http://www.gluster.org/mailman/listinfo/maintainers
> >>> >>
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> > Pranith
> >>> ___
> >>> maintainers mailing list
> >>> maintain...@gluster.org
> >>> 

Re: [Gluster-devel] Thank You!

2016-07-11 Thread Karthik Subrahmanya


- Original Message -
> From: "Raghavendra Gowdappa" 
> To: "Karthik Subrahmanya" 
> Cc: "Gluster Devel" , josephau...@gmail.com, 
> "vivek sb agarwal"
> , "Vijaikumar Mallikarjuna" 
> 
> Sent: Monday, July 11, 2016 11:34:41 AM
> Subject: Re: [Gluster-devel] Thank You!
> 
> 
> 
> - Original Message -
> > From: "Karthik Subrahmanya" 
> > To: "Gluster Devel" , josephau...@gmail.com,
> > "vivek sb agarwal"
> > , "Vijaikumar Mallikarjuna"
> > 
> > Sent: Saturday, July 9, 2016 9:05:13 PM
> > Subject: [Gluster-devel] Thank You!
> > 
> > Hi all,
> > 
> > I am an intern who joined on the 11th of January 2016 and worked on
> > the WORM/Retention feature for GlusterFS. It was released as an
> > experimental feature with GlusterFS v3.8. The blog post on the
> > feature is published on "Planet Gluster" [1] and "blog.gluster.org" [2].
> > 
> > On Monday, 11th July 2016, I am being converted to "Associate Software
> > Engineer".
> 
> Awesome!! Where is the treat :)? Congratulations and welcome. Hope you'll
> find enough ways to contribute constructively.
Thank you :)

~Karthik
> 
> > I would like to take this opportunity to thank all of you for all your
> > valuable guidance, support and help during this period. I hope you will
> > guide me in my future work, correct me when I am wrong, and help me to
> > learn more. Thank you all.
> > 
> > [1] http://planet.gluster.org/
> > [2]
> > https://blog.gluster.org/2016/07/worm-write-once-read-multiple-retention-and-compliance-2/
> > 
> > Regards,
> > Karthik
> > ___
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
> > 
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel