[Gluster-devel] Emergency downtime for review.gluster.org and build.gluster.org

2016-09-23 Thread Nigel Babu
Hello folks,

We need to urgently migrate review.gluster.org and build.gluster.org to a new
physical server. We will be having the following emergency maintenance window
on Saturday 24th September (tomorrow) and Sunday 25th September:

0900 to 1500 UTC
1430 to 2030 IST
0500 to 1100 EDT

If all goes well on Saturday, we will not use the window on Sunday.

Both these hosts reside on formicary.gluster.org. We're migrating along with
another project's server rack, hence the urgency. This weekend is the safest
time to do this migration: for the next two weekends, many of us will be
traveling to and from Berlin, and the later we do the migration, the closer
it puts us to the 3.9 release. After this move, all our servers will be in
their permanent home in the community cage. We hope there will be fewer
interruptions and unplanned issues going forward.

We will alert the lists when the migration starts and ends.

--
nigelb


Re: [Gluster-devel] Fixing setfsuid/gid problems in posix xlator

2016-09-23 Thread Pranith Kumar Karampuri
On Fri, Sep 23, 2016 at 6:12 PM, Jeff Darcy  wrote:

> > Jiffin found an interesting problem in posix xlator where we have never
> > been using setfsuid/gid ( http://review.gluster.org/#/c/15545/ ). What I
> > am seeing as a regression after this is: if the files are created by a
> > non-root user, then file creation fails because that user doesn't have
> > permissions to create the gfid-link. So it seems like the correct way
> > forward for this patch is to write wrappers around sys_ that do
> > setfsuid/gid, do the actual operation requested, and then set it back to
> > the old uid/gid before doing the internal operations. I am planning to
> > write posix_sys_() to do the same, maybe a macro?
>
> Kind of an aside, but I'd prefer to see a lot fewer macros in our code.
> They're not type-safe, and multi-line macros often mess up line numbers for
> debugging or error messages.  IMO it's better to use functions whenever
> possible, and usually to let the compiler worry about how/when to inline.
>
> > I need inputs from you guys to let me know if I am on the right path and
> if
> > you see any issues with this approach.
>
> I think there's a bit of an interface problem here.  The sys_xxx wrappers
> don't have arguments that point to the current frame, so how would they get
> the correct uid/gid?  We could add arguments to each function, but then
> we'd have to modify every call.  This includes internal calls which don't
> have a frame to pass, so I guess they'd have to pass NULL.  Alternatively,
> we could create a parallel set of functions with frame pointers.  Contrary
> to what I just said above, this might be a case where macros make sense:
>
>    int
>    sys_writev_fp (call_frame_t *frame, int fd, const struct iovec *iov, int count)
>    {
>        if (frame) { setfsuid(...) ... }
>        int ret = writev (fd, iov, count);
>        if (frame) { setfsuid(...) ... }
>        return ret;
>    }
>    #define sys_writev(fd,iov,count) sys_writev_fp (NULL, fd, iov, count)
>
> That way existing callers don't have to change, but posix can use the
> extended versions to get the right setfsuid behavior.
>
>
After trying to make these modifications to test things out, I am now
inclined to remove setfsuid/gid altogether and depend on posix-acl for
permission checks. The wrapper approach seems too cumbersome, as the
operations more often than not happen on files inside .glusterfs, where
non-root users/groups have no permissions at all.


-- 
Pranith

[Gluster-devel] Tendrl frontend tools selection/discussion

2016-09-23 Thread Soumya Deb
Hi,

As you may know, we're moving ahead with Tendrl's development (the upstream
of Red Hat Storage Console). As part of the Tendrl frontend dev team, I
wanted to share the tool stack we're discussing and deciding on to build the
web application around. I'm sharing it with the peer teams so that more eyes
are on the process and we can gather feedback as early as possible.

Here's the tracking issue that documents all the selections (being) made:
https://github.com/Tendrl/ui/issues/5

To follow Tendrl developer discussions (for those interested), please
subscribe to the upstream list:
https://www.redhat.com/mailman/listinfo/tendrl-devel

Thanks,
Deb

Re: [Gluster-devel] relative ordering of writes to same file from two different fds

2016-09-23 Thread Jeff Darcy
> > write-behind: implement causal ordering and other cleanup
> >
> > Rules of causal ordering implemented:
> >
> >  - If request A arrives after the acknowledgement (to the app,
> >    i.e, STACK_UNWIND) of another request B, then request B is
> >    said to have 'caused' request A.
>
> With the above principle, for two write requests (p1 and p2 in the example
> above) issued by _two different threads/processes_, there need _not always_
> be a 'causal' relationship (whether there is a causal relationship is purely
> based on the "chance" that write-behind chose to ack one/both of them and
> on their timing of arrival).

I think this is an issue of terminology.  While it's not *certain* that B
(or p1) caused A (or p2), it's *possible*.  Contrast with the case where
they overlap, which could not possibly happen if the application were
trying to ensure order.  In the distributed-system literature, this is
often referred to as a causal relationship even though it's really just
the possibility of one, because in most cases even the possibility means
that reordering would be unacceptable.

> So, the current write-behind is agnostic to the
> ordering of p1 and p2 (when done by two threads).
>
> However, if p1 and p2 are issued by the same thread, there is _always_ a
> causal relationship (p2 being caused by p1).

See above.  If we feel bound to respect causal relationships, we have to
be pessimistic and assume that wherever such a relationship *could* exist
it *does* exist.  However, as I explained in my previous message, I don't
think it's practical to provide such a guarantee across multiple clients,
and if we don't provide it across multiple clients then it's not worth
much to provide it on a single client.  Applications that require such
strict ordering shouldn't use write-behind, or should explicitly flush
between writes.  Otherwise they'll break unexpectedly when parts are
distributed across multiple nodes.  Assuming that everything runs on one
node is the same mistake POSIX makes.  The assumption was appropriate
for an earlier era, but it hasn't been for a decade or more.
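
As a concrete illustration of the "explicitly flush between writes" option
mentioned above, a minimal sketch (the path below is a made-up example, not
anything from this thread):

    /* Minimal sketch: an application that needs write A ordered before
     * write B flushes explicitly in between, instead of relying on the
     * caching layer to preserve ordering.  The path is a dummy example. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int
    main (void)
    {
        const char *a = "first record\n";
        const char *b = "second record\n";

        int fd = open ("/mnt/glustervol/data.log",
                       O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd < 0) {
            perror ("open");
            return 1;
        }

        if (write (fd, a, strlen (a)) < 0)
            perror ("write A");
        if (fsync (fd) < 0)        /* force A out before issuing B */
            perror ("fsync");
        if (write (fd, b, strlen (b)) < 0)
            perror ("write B");

        close (fd);
        return 0;
    }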

Re: [Gluster-devel] Fixing setfsuid/gid problems in posix xlator

2016-09-23 Thread Jeff Darcy
> Jiffin found an interesting problem in posix xlator where we have never been
> using setfsuid/gid ( http://review.gluster.org/#/c/15545/ ). What I am
> seeing as a regression after this is: if the files are created by a non-root
> user, then file creation fails because that user doesn't have permissions
> to create the gfid-link. So it seems like the correct way forward for this
> patch is to write wrappers around sys_ that do setfsuid/gid, do the
> actual operation requested, and then set it back to the old uid/gid before
> doing the internal operations. I am planning to write posix_sys_() to do
> the same, maybe a macro?

Kind of an aside, but I'd prefer to see a lot fewer macros in our code.
They're not type-safe, and multi-line macros often mess up line numbers for
debugging or error messages.  IMO it's better to use functions whenever
possible, and usually to let the compiler worry about how/when to inline.

> I need inputs from you guys to let me know if I am on the right path and if
> you see any issues with this approach.

I think there's a bit of an interface problem here.  The sys_xxx wrappers
don't have arguments that point to the current frame, so how would they get
the correct uid/gid?  We could add arguments to each function, but then
we'd have to modify every call.  This includes internal calls which don't
have a frame to pass, so I guess they'd have to pass NULL.  Alternatively,
we could create a parallel set of functions with frame pointers.  Contrary
to what I just said above, this might be a case where macros make sense:

    int
    sys_writev_fp (call_frame_t *frame, int fd, const struct iovec *iov, int count)
    {
        if (frame) { setfsuid(...) ... }
        int ret = writev (fd, iov, count);
        if (frame) { setfsuid(...) ... }
        return ret;
    }
    #define sys_writev(fd,iov,count) sys_writev_fp (NULL, fd, iov, count)

That way existing callers don't have to change, but posix can use the
extended versions to get the right setfsuid behavior.
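
For illustration, a hypothetical call site built on the sketch above
(posix_do_writev and its arguments are invented for this example; only
callers that actually have a frame would use the _fp variant):

    /* Hypothetical caller: the posix xlator passes its frame so the write
     * runs with the caller's fsuid/fsgid; internal callers keep using the
     * plain sys_writev() macro, i.e. no credential switch. */
    static int
    posix_do_writev (call_frame_t *frame, int fd,
                     const struct iovec *iov, int count)
    {
        return sys_writev_fp (frame, fd, iov, count);
    }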



Re: [Gluster-devel] relative ordering of writes to same file from two different fds

2016-09-23 Thread Raghavendra G
On Wed, Sep 21, 2016 at 10:58 PM, Raghavendra Talur 
wrote:

>
>
> On Wed, Sep 21, 2016 at 6:32 PM, Ric Wheeler  wrote:
>
>> On 09/21/2016 08:06 AM, Raghavendra Gowdappa wrote:
>>
>>> Hi all,
>>>
>>> This mail is to figure out the behavior of writes to the same file from two
>>> different fds. As Ryan notes in one of the comments,
>>>
>>>
>>> I think it's not safe in this case:
>>> 1. P1 writes to F1 using FD1.
>>> 2. After P1's write finishes, P2 writes to the same place using FD2.
>>> Since they do not conflict with each other now, the order in which the 2
>>> writes are sent to the underlying fs is not determined, so the final data
>>> may be P1's or P2's.
>>> These semantics are not the same as Linux buffered I/O: Linux buffered I/O
>>> will make the second write cover the first one, which is to say the final
>>> data is P2's.
>>> You can see this in Linux NFS (as we are all network filesystems):
>>> fs/nfs/file.c:nfs_write_begin() will flush an 'incompatible' request
>>> first before another write begins, and the way 2 requests are determined
>>> to be 'incompatible' is that they are from 2 different open fds.
>>> I think write-behind behaviour should stay the same as the Linux page
>>> cache.
>>>
>>>
>>
>> I think the way this would actually work is that both would be written
>> to the same page in the page cache (if using buffered IO), so as long
>> as they do not happen at the same time, you would get the second, P2 copy of
>> the data each time.
>>
>
> I apologize if my understanding is wrong, but IMO this is exactly what we
> do in write-behind too. The cache is inode-based and ensures that writes
> are ordered irrespective of the FD used for the write.
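
As an aside, the scenario being debated can be exercised with a small test
sketch (the mount path below is a made-up example): write the same region
through two descriptors in sequence and check that a later read returns the
second write.

    /* Sketch of the P1/P2 scenario above: two fds on one file, two
     * sequential writes to the same region; with per-inode ordering the
     * read must see the second write.  The path is a dummy example. */
    #include <assert.h>
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    int
    main (void)
    {
        const char *path = "/mnt/glustervol/testfile";
        char buf[5] = {0};

        int fd1 = open (path, O_RDWR | O_CREAT, 0644);
        int fd2 = open (path, O_RDWR);

        pwrite (fd1, "P1P1", 4, 0);   /* first write via FD1            */
        pwrite (fd2, "P2P2", 4, 0);   /* later write via FD2, same spot */

        pread (fd1, buf, 4, 0);
        assert (strcmp (buf, "P2P2") == 0);

        close (fd1);
        close (fd2);
        return 0;
    }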
>
>
> Here is the commit message which brought the change
>
> -
> write-behind: implement causal ordering and other cleanup
>
> Rules of causal ordering implemented:
>
>  - If request A arrives after the acknowledgement (to the app,
>    i.e, STACK_UNWIND) of another request B, then request B is
>    said to have 'caused' request A.
>

With the above principle, for two write requests (p1 and p2 in the example
above) issued by _two different threads/processes_, there need _not always_
be a 'causal' relationship (whether there is a causal relationship is purely
based on the "chance" that write-behind chose to ack one/both of them and
on their timing of arrival). So, the current write-behind is agnostic to the
ordering of p1 and p2 (when done by two threads).

However, if p1 and p2 are issued by the same thread, there is _always_ a
causal relationship (p2 being caused by p1).


>
>  - (corollary) Two requests, which at any point of time, are
>    unacknowledged simultaneously in the system can never 'cause'
>    each other (wb_inode->gen is based on this)
>
>  - If request A is caused by request B, AND request A's region
>    has an overlap with request B's region, then the fulfillment
>    of request A is guaranteed to happen after the fulfillment of B.
>
>  - FD of origin is not considered for the determination of causal
>    ordering.
>
>  - Append operation's region is considered the whole file.
>
>  Other cleanup:
>
>  - wb_file_t not required any more.
>
>  - wb_local_t not required any more.
>
>  - O_RDONLY fd's operations now go through the queue to make sure
>    writes in the requested region get fulfilled be
> ---
>
> Thanks,
> Raghavendra Talur
>
>
>>
>> Same story for using O_DIRECT - that write bypasses the page cache and
>> will update the data directly.
>>
>> What might happen in practice though is that your applications might use
>> higher level IO routines and they might buffer data internally. If that
>> happens, there is no ordering that is predictable.
>>
>> Regards,
>>
>> Ric
>>
>>
>>
>>> However, my understanding is that filesystems need not maintain the
>>> relative order of writes (as received from vfs/kernel) on two different
>>> fds. Also, if we have to maintain the order, it might come with increased
>>> latency, because "newer" writes would have to wait on "older" ones. This
>>> wait can fill up the write-behind buffer and can eventually result in a
>>> full write-behind cache, and hence an inability to "write back" newer
>>> writes.
>>>
>>> * What does POSIX say about it?
>>> * How do other filesystems behave in this scenario?
>>>
>>>
>>> Also, the current write-behind implementation has the concept of
>>> "generation numbers". To quote from the comment:
>>>
>>>      uint64_t gen;    /* Liability generation number. Represents
>>>                          the current 'state' of liability. Every
>>>                          new addition to the liability list bumps
>>>                          the generation number.
>>>
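
To make the quoted rules and the generation counter concrete, here is a toy
model (an illustration of the rules as stated above only, not the actual
write-behind code; all names are invented):

    /* Toy model of the 'liability generation number' idea.  A request can
     * have been 'caused' only by requests that were already acknowledged
     * (liable) when it arrived, and ordering matters only when such a
     * causal pair also overlaps. */
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <sys/types.h>

    struct toy_inode {
        uint64_t gen;               /* bumped on every new liability */
    };

    struct toy_req {
        uint64_t gen_at_arrival;    /* inode->gen when the request arrived   */
        uint64_t gen_when_liable;   /* gen assigned when it was acked/queued */
        off_t    offset;
        size_t   len;
    };

    static bool
    overlaps (const struct toy_req *a, const struct toy_req *b)
    {
        return a->offset < (off_t)(b->offset + b->len) &&
               b->offset < (off_t)(a->offset + a->len);
    }

    /* b may have 'caused' a only if b was already liable when a arrived. */
    static bool
    must_fulfil_after (const struct toy_req *a, const struct toy_req *b)
    {
        return b->gen_when_liable <= a->gen_at_arrival && overlaps (a, b);
    }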

Re: [Gluster-devel] Question on merging zfs snapshot support into the mainline glusterfs

2016-09-23 Thread sriram
Hi Avra,

Have submitted the patches for Modularizing snapshot,

https://bugzilla.redhat.com/show_bug.cgi?id=1377437

This is the patch set:

 http://review.gluster.org/15554 This patch follows the discussion from
 the gluster-devel mail chain of, ...
 http://review.gluster.org/1 Referring to bugID:1377437,
 Modularizing snapshot for plugin based modules.
 http://review.gluster.org/15556 - This is the third patch in the series for
 the bug=1377437
 http://review.gluster.org/15557 [BugId:1377437][Patch4]: Referring to
 the bug ID,
 http://review.gluster.org/15558 [BugId:1377437][Patch5]: Referring to
 the bug ID,
 http://review.gluster.org/15559 [BugId:1377437][Patch6]: Referring to
 the bug ID,
 http://review.gluster.org/15560 [BugId:1377437][Patch7]: Referring to
 the bug ID. * This patch has some minor ...
 http://review.gluster.org/15561 [BugId:1377437][Patch8]: Referring to
 the bug ID, this commit has minor fixes ...
 http://review.gluster.org/15562 [BugId:1377437][Patch9]: Referring to
 the bug ID, - Minor header file ...

Primarily, this is focused on moving the lvm-based implementation into
plugins. I have spread the commits across nine patches; most of them are
minor, except a couple which do the real work. I followed this method
since it makes review (accept/reject) easier. Let me know if there is
something off about the method compared to what is usually followed on
gluster-devel. Thanks

Sriram

On Mon, Sep 19, 2016, at 10:58 PM, Avra Sengupta wrote:
> Hi Sriram,
>
>  I have created a bug for this
>  (https://bugzilla.redhat.com/show_bug.cgi?id=1377437). The plan is
>  that for the first patch, as mentioned below, let's not meddle with
>  the zfs code at all. What we are looking at is segregating the lvm-based
>  code as it is today from the management infrastructure (which is
>  addressed in your patch), and creating a table-based pluggable
>  infra (refer to gd_svc_cli_actors[] in xlators/mgmt/glusterd/src/glusterd-
>  handler.c and other similar tables in the gluster code base to get an
>  understanding of what I am conveying), which can be used to call this
>  code and still achieve the same results as we do today.
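
To illustrate the kind of table-based, pluggable dispatch being described
here, a hypothetical sketch (none of these names exist in glusterd; the real
reference points are the actor tables mentioned above):

    /* Hypothetical snapshot-backend dispatch table: management code looks
     * up a backend by name and calls through function pointers, so an lvm
     * or zfs implementation can be plugged in without changing callers. */
    #include <stddef.h>
    #include <string.h>

    typedef int (*snap_create_fn) (const char *brick_path, const char *snap_name);
    typedef int (*snap_restore_fn) (const char *snap_name);

    struct snap_backend_ops {
        const char      *name;
        snap_create_fn   create;
        snap_restore_fn  restore;
    };

    /* Backends would be implemented in their own plugin files. */
    int lvm_snap_create (const char *brick_path, const char *snap_name);
    int lvm_snap_restore (const char *snap_name);
    int zfs_snap_create (const char *brick_path, const char *snap_name);
    int zfs_snap_restore (const char *snap_name);

    static struct snap_backend_ops snap_backends[] = {
        { "lvm", lvm_snap_create, lvm_snap_restore },
        { "zfs", zfs_snap_create, zfs_snap_restore },
        { NULL,  NULL,            NULL             },
    };

    static struct snap_backend_ops *
    snap_backend_lookup (const char *name)
    {
        for (int i = 0; snap_backends[i].name; i++)
            if (strcmp (snap_backends[i].name, name) == 0)
                return &snap_backends[i];
        return NULL;
    }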
>
>  Once this code is merged, we can use the same infra to start pushing
>  in the zfs code (rest of your current patch). Please let me know if
>  you have further queries regarding this. Thanks.
>
>  Regards,
>  Avra
>
>  On 09/19/2016 07:52 PM, sri...@marirs.net.in wrote:
>> Hi Avra,
>>
>> Do you have a bug id for this changes? Or may I raise a new one?
>>
>> Sriram
>>
>>
>> On Fri, Sep 16, 2016, at 11:37 AM, sri...@marirs.net.in wrote:
>>> Thanks Avra,
>>>
>>> I'll send this patch to gluster master in a while.
>>>
>>> Sriram
>>>
>>>
>>> On Wed, Sep 14, 2016, at 03:08 PM, Avra Sengupta wrote:
 Hi Sriram,

 Sorry for the delay in response. I started going through the
 commits in the github repo. I finished going through the first
 commit, where you create a plugin structure and move code.
 Following is the commit link:

 https://github.com/sriramster/glusterfs/commit/7bf157525539541ebf0aa36a380bbedb2cae5440

 First of all, the overall approach of using and maintaining plugins
 in the patch is in sync with what we had discussed. There are some
 gaps though: in the zfs functions the snap brick is mounted without
 updating labels, and in restore you perform a zfs rollback, which
 significantly changes the behavior between an lvm-based snapshot and
 a zfs-based snapshot.

 But before we get into these details, I would request you to kindly
 send this particular patch to the gluster master branch, as that is
 how we formally review patches, and I would say this particular
 patch in itself is ready for a formal review. Once we straighten
 out the quirks in this patch, we can start moving the other
 dependent patches to master and reviewing them. Thanks.

 Regards,
 Avra

 P.S : Adding gluster-devel

 On 09/13/2016 01:14 AM, sri...@marirs.net.in wrote:
> Hi Avra,
>
> Have you had time to look into the below request?
>
> Sriram
>
>
> On Thu, Sep 8, 2016, at 01:20 PM, sri...@marirs.net.in wrote:
>> Hi Avra,
>>
>> Thank you. Please let me know your feedback; it would help me
>> continue from there.
>>
>> Sriram
>>
>>
>> On Thu, Sep 8, 2016, at 01:18 PM, Avra Sengupta wrote:
>>> Hi Sriram,
>>>
>>> Rajesh is on a vacation, and will be available towards the end
>>> of next week. He will be sharing his feedback once he is back.
>>> Meanwhile I will have a look at the patch and share my feedback
>>> with you. But it will take me some time to go through it.
>>> Thanks.
>>>
>>> Regards,
>>> Avra
>>>
>>> On 09/08/2016 01:09 PM, sri...@marirs.net.in wrote:
 Hello Rajesh,

 Sorry to bother. Could you have a look at the below 

Re: [Gluster-devel] review request - Change the way client uuid is built

2016-09-23 Thread Soumya Koduri



On 09/23/2016 11:48 AM, Poornima Gurusiddaiah wrote:



- Original Message -

From: "Niels de Vos" 
To: "Raghavendra Gowdappa" 
Cc: "Gluster Devel" 
Sent: Wednesday, September 21, 2016 3:52:39 AM
Subject: Re: [Gluster-devel] review request - Change the way client uuid is 
built

On Wed, Sep 21, 2016 at 01:47:34AM -0400, Raghavendra Gowdappa wrote:

Hi all,

[1] might have implications across different components in the stack. Your
reviews are requested.



rpc : Change the way client uuid is built

Problem:
Today the main users of client uuid are protocol layers, locks, leases.
Protocol layers require each client uuid to be unique, even across
connects and disconnects. Locks and leases on the server side also use
the same client uuid, which changes across graph switches and across
file migrations. This makes graph switches and file migrations
tedious for locks and leases.
As of today, lock migration across a graph switch is client driven,
i.e. when a graph switches, the client reassociates all the locks (which
were associated with the old graph's client uuid) with the new graph's
client uuid. This means a flood of fops to get and set locks for each fd.
File migration across bricks also becomes even more difficult, as the
client uuid for the same client is different on the other brick.

The exact set of issues exists for leases as well.

Hence the solution:
Make the migration of locks and leases during graph switch and migration,
server driven instead of client driven. This can be achieved by changing
the format of client uuid.

Client uuid currently:
%s(ctx uuid)-%s(protocol client name)-%d(graph id)%s(setvolume
count/reconnect count)

Proposed Client uuid:
"CTX_ID:%s-GRAPH_ID:%d-PID:%d-HOST:%s-PC_NAME:%s-RECON_NO:%s"
-  CTX_ID: This will be constant per client.
-  GRAPH_ID, PID, HOST, PC_NAME (protocol client name), RECON_NO (setvolume
   count) remain the same.

With this, the first part of the client uuid, CTX_ID+GRAPH_ID remains
constant across file migration, thus the migration is made easier.

Locks and leases store only the first part CTX_ID+GRAPH_ID as their
client identification. This means, when the new graph connects,


Can we assume that CTX_ID+GRAPH_ID shall be unique across clients all
the time? If not, wouldn't we get into issues where clientB's locks/leases
do not conflict with clientA's locks/leases?



the locks and leases xlators should walk through their databases
to update the client id with the new GRAPH_ID. Thus the graph switch
is made server driven, which saves a lot of network traffic.
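
As an illustration, a small sketch of how an identifier in the proposed
format could be assembled (the format string is taken from the proposal
above; the field values are dummy placeholders):

    /* Illustration only: assemble an identifier in the proposed format.
     * The CTX_ID and other values below are made-up placeholders. */
    #include <stdio.h>
    #include <unistd.h>

    int
    main (void)
    {
        char        client_uid[256];
        const char *ctx_id   = "3f2a9c1e-0c4f-4a52-9d1e-000000000000";
        int         graph_id = 0;
        const char *host     = "client-host";
        const char *pc_name  = "myvol-client-0";
        const char *recon_no = "0";

        snprintf (client_uid, sizeof (client_uid),
                  "CTX_ID:%s-GRAPH_ID:%d-PID:%d-HOST:%s-PC_NAME:%s-RECON_NO:%s",
                  ctx_id, graph_id, getpid (), host, pc_name, recon_no);

        /* Locks/leases would key only on the CTX_ID + GRAPH_ID prefix. */
        printf ("%s\n", client_uid);
        return 0;
    }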


What is the plan to have the CTX_ID+GRAPH_ID shared over multiple gfapi
applications? This would be important for NFS-Ganesha failover where one
NFS-Ganesha process is stopped, and the NFS-Clients (by virtual-ip) move
to an other NFS-Ganesha server.


Sharing it across multiple gfapi applications is currently not supported.
Do you mean, setting the CTX_ID+GRAPH_ID at the init of the other client,
or during replay of locks during the failover?
If it's the former, we need an API in gfapi to take the CTX_ID+GRAPH_ID as
an argument, among other things.

Will there be a way to set CTX_ID(+GRAPH_ID?) through libgfapi? That
would allow us to add a configuration option to NFS-Ganesha and have the
whole NFS-Ganesha cluster use the same locking/leases.

Ah, ok. The whole cluster would have the same CTX_ID(+GRAPH_ID?), but then
the cleanup logic will not work, as the disconnect cleanup happens as soon as
one of the NFS-Ganesha servers disconnects?


Yes. If we have a uniform ID (CTX_ID+GRAPH_ID?) across clients, we should
keep locks/leases as long as even one client is connected, and not clean
them up as part of fd cleanup during disconnects.


Thanks,
Soumya



This patch doesn't eliminate the migration that is required during a graph
switch; it is still necessary, but it can be server driven instead of client
driven.


Thanks,
Niels




Change-Id: Ia81d57a9693207cd325d7b26aee4593fcbd6482c
BUG: 1369028
Signed-off-by: Poornima G 
Signed-off-by: Susant Palai 



[1] http://review.gluster.org/#/c/13901/10/

regards,
Raghavendra


Re: [Gluster-devel] Fixing setfsuid/gid problems in posix xlator

2016-09-23 Thread Pranith Kumar Karampuri
On Fri, Sep 23, 2016 at 12:30 PM, Soumya Koduri  wrote:

>
>
> On 09/23/2016 08:28 AM, Pranith Kumar Karampuri wrote:
>
>> hi,
>>    Jiffin found an interesting problem in posix xlator where we have
>> never been using setfsuid/gid (http://review.gluster.org/#/c/15545/).
>> What I am seeing as a regression after this is: if the files are created
>> by a non-root user, then file creation fails because that user
>> doesn't have permissions to create the gfid-link. So it seems like the
>> correct way forward for this patch is to write wrappers around
>> sys_ that do setfsuid/gid, do the actual operation requested, and
>> then set it back to the old uid/gid before doing the internal operations.
>> I am planning to write posix_sys_() to do the same, maybe a
>> macro?
>>
>
> Why not the other way around? That is, can we switch to superuser only when
> required, so that we know which internal operations need root access and
> avoid misusing it?
>

The thread should have frame->root->uid/gid only at the time of executing
the actual syscall (open/mkdir/creat, etc.) in the posix xlator; the rest
of the time it shouldn't. That is why I am doing it this way.
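
A minimal sketch of the pattern described above, assuming the caller's
credentials come from frame->root (Linux-specific setfsuid/setfsgid; purely
illustrative, not the actual posix xlator code):

    /* Sketch: switch fsuid/fsgid to the caller's credentials only around
     * the application-visible syscall, then restore them before any
     * internal (.glusterfs) work. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <sys/fsuid.h>
    #include <sys/types.h>

    static int
    create_as_caller (const char *path, mode_t mode, uid_t uid, gid_t gid)
    {
        uid_t old_uid = setfsuid (uid);   /* returns the previous fsuid */
        gid_t old_gid = setfsgid (gid);

        int fd = open (path, O_CREAT | O_EXCL | O_WRONLY, mode);

        setfsuid (old_uid);               /* back to the old ids for     */
        setfsgid (old_gid);               /* internal work, e.g. the     */
                                          /* gfid-link under .glusterfs  */
        return fd;
    }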


>
> Thanks,
> Soumya
>
> I need inputs from you guys to let me know if I am on the right path
>> and if you see any issues with this approach.
>>
>> --
>> Pranith
>>
>>


-- 
Pranith

[Gluster-devel] Spurious failure in ./tests/basic/rpc-coverage.t ?

2016-09-23 Thread Nithya Balachandran
https://build.gluster.org/job/centos6-regression/930/console

Can someone please take a look at this?

Thanks,
Nithya

Re: [Gluster-devel] Fixing setfsuid/gid problems in posix xlator

2016-09-23 Thread Soumya Koduri



On 09/23/2016 08:28 AM, Pranith Kumar Karampuri wrote:

hi,
   Jiffin found an interesting problem in posix xlator where we have
never been using setfsuid/gid (http://review.gluster.org/#/c/15545/).
What I am seeing as a regression after this is: if the files are created
by a non-root user, then file creation fails because that user
doesn't have permissions to create the gfid-link. So it seems like the
correct way forward for this patch is to write wrappers around
sys_ that do setfsuid/gid, do the actual operation requested, and
then set it back to the old uid/gid before doing the internal operations.
I am planning to write posix_sys_() to do the same, maybe a macro?


Why not the other way around? That is, can we switch to superuser only when
required, so that we know which internal operations need root access and
avoid misusing it?


Thanks,
Soumya


I need inputs from you guys to let me know if I am on the right path
and if you see any issues with this approach.

--
Pranith




Re: [Gluster-devel] review request - Change the way client uuid is built

2016-09-23 Thread Poornima Gurusiddaiah


- Original Message -
> From: "Niels de Vos" 
> To: "Raghavendra Gowdappa" 
> Cc: "Gluster Devel" 
> Sent: Wednesday, September 21, 2016 3:52:39 AM
> Subject: Re: [Gluster-devel] review request - Change the way client uuid is 
> built
> 
> On Wed, Sep 21, 2016 at 01:47:34AM -0400, Raghavendra Gowdappa wrote:
> > Hi all,
> > 
> > [1] might have implications across different components in the stack. Your
> > reviews are requested.
> > 
> > 
> > 
> > rpc : Change the way client uuid is built
> > 
> > Problem:
> > Today the main users of client uuid are protocol layers, locks, leases.
> > Protocol layers require each client uuid to be unique, even across
> > connects and disconnects. Locks and leases on the server side also use
> > the same client uuid, which changes across graph switches and across
> > file migrations. This makes graph switches and file migrations
> > tedious for locks and leases.
> > As of today, lock migration across a graph switch is client driven,
> > i.e. when a graph switches, the client reassociates all the locks (which
> > were associated with the old graph's client uuid) with the new graph's
> > client uuid. This means a flood of fops to get and set locks for each fd.
> > File migration across bricks also becomes even more difficult, as the
> > client uuid for the same client is different on the other brick.
> > 
> > The exact set of issues exists for leases as well.
> > 
> > Hence the solution:
> > Make the migration of locks and leases during graph switch and migration,
> > server driven instead of client driven. This can be achieved by changing
> > the format of client uuid.
> > 
> > Client uuid currently:
> > %s(ctx uuid)-%s(protocol client name)-%d(graph id)%s(setvolume
> > count/reconnect count)
> > 
> > Proposed Client uuid:
> > "CTX_ID:%s-GRAPH_ID:%d-PID:%d-HOST:%s-PC_NAME:%s-RECON_NO:%s"
> > -  CTX_ID: This will be constant per client.
> > -  GRAPH_ID, PID, HOST, PC_NAME (protocol client name), RECON_NO (setvolume
> >    count) remain the same.
> > 
> > With this, the first part of the client uuid, CTX_ID+GRAPH_ID remains
> > constant across file migration, thus the migration is made easier.
> > 
> > Locks and leases store only the first part CTX_ID+GRAPH_ID as their
> > client identification. This means, when the new graph connects,
> > the locks and leases xlators should walk through their databases
> > to update the client id with the new GRAPH_ID. Thus the graph switch
> > is made server driven, which saves a lot of network traffic.
> 
> What is the plan to have the CTX_ID+GRAPH_ID shared over multiple gfapi
> applications? This would be important for NFS-Ganesha failover where one
> NFS-Ganesha process is stopped, and the NFS-Clients (by virtual-ip) move
> to an other NFS-Ganesha server.
> 
Sharing it across multiple gfapi applications is currently not supported.
Do you mean, setting the CTX_ID+GRAPH_ID at the init of the other client,
or during replay of locks during the failover?
If it's the former, we need an API in gfapi to take the CTX_ID+GRAPH_ID as
an argument, among other things.
> Will there be a way to set CTX_ID(+GRAPH_ID?) through libgfapi? That
> would allow us to add a configuration option to NFS-Ganesha and have the
> whole NFS-Ganesha cluster use the same locking/leases.
Ah, ok. The whole cluster would have the same CTX_ID(+GRAPH_ID?), but then
the cleanup logic will not work, as the disconnect cleanup happens as soon as
one of the NFS-Ganesha servers disconnects?

This patch doesn't eliminate the migration that is required during a graph
switch; it is still necessary, but it can be server driven instead of client
driven.
> 
> Thanks,
> Niels
> 
> 
> > 
> > Change-Id: Ia81d57a9693207cd325d7b26aee4593fcbd6482c
> > BUG: 1369028
> > Signed-off-by: Poornima G 
> > Signed-off-by: Susant Palai 
> > 
> > 
> > 
> > [1] http://review.gluster.org/#/c/13901/10/
> > 
> > regards,
> > Raghavendra


Re: [Gluster-devel] [gluster-devel] Documentation Tooling Review

2016-09-23 Thread Rajesh Joseph
On Thu, Sep 22, 2016 at 10:05 PM, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> 2016-09-22 18:34 GMT+02:00 Amye Scavarda :
> > Nope! RHGS is the supported version and gluster.org is the open source
> > version. We'd like to keep the documentation reflecting that, but good
> > catch.
>
> Ok but features should be the same, right?
>

Yes, the features would be the same; in fact, upstream would be ahead of RHGS.
Similarly, our upstream documentation will be ahead of the Red Hat
documentation.

I should have been clearer. In the Red Hat documentation there are a lot of
references to version-specific changes which will not be applicable to
Gluster releases. There is also a branding difference: Red Hat likes to call
the product "Red Hat Gluster Storage" (RHGS), while we call it "Gluster".

-Rajesh