Re: [Gluster-devel] autodelete in snapshots
I agree as well. We shouldn't be deleting any data without the explicit consent of the user. The approach proposed by MS is better than the earlier approach. ~kaushal On Tue, Jun 3, 2014 at 1:02 AM, M S Vishwanath Bhat wrote: > > > > On 2 June 2014 20:22, Vijay Bellur wrote: >> >> On 04/23/2014 05:50 AM, Vijay Bellur wrote: >>> >>> On 04/20/2014 11:42 PM, Lalatendu Mohanty wrote: On 04/16/2014 11:39 AM, Avra Sengupta wrote: > > The whole purpose of introducing the soft-limit is, that at any point > of time the number of > snaps should not exceed the hard limit. If we trigger auto-delete on > hitting hard-limit, then > the purpose itself is lost, because at that point we would be taking a > snap, making the limit > hard-limit + 1, and then triggering auto-delete, which violates the > sanctity of the hard-limit. > Also what happens when we are at hard-limit + 1, and another snap is > issued, while auto-delete > is yet to process the first delete. At that point we end up at > hard-limit + 1. Also what happens > if for a particular snap the auto-delete fails. > > We should see the hard-limit, as something set by the admin keeping in > mind the resource consumption > and at no-point should we cross this limit, come what may. If we hit > this limit, the create command > should fail asking the user to delete snaps using the "snapshot > delete" command. > > The two options Raghavendra mentioned are applicable for the > soft-limit only, in which cases on > hitting the soft-limit > > 1. Trigger auto-delete > > or > > 2. Log a warning-message, for the user saying the number of snaps is > exceeding the snap-limit and > display the number of available snaps > > Now which of these should happen also depends on the user, because the > auto-delete option > is configurable. > > So if the auto-delete option is set as true, auto-delete should be > triggered and the above message > should also be logged. > > But if the option is set as false, only the message should be logged. 
> > This is the behaviour as designed. Adding Rahul, and Seema in the > mail, to reflect upon the > behaviour as well. > > Regards, > Avra This sounds correct. However we need to make sure that the usage or documentation around this should be good enough, so that users understand each of the limits correctly. >>> >>> It might be better to avoid the usage of the term "soft-limit". >>> soft-limit as used in quota and other places generally has an alerting >>> connotation. Something like "auto-deletion-limit" might be better. >>> >> >> I still see references to "soft-limit" and auto deletion seems to get >> triggered upon reaching soft-limit. >> >> Why is the ability to auto delete not configurable? It does seem pretty >> nasty to go about deleting snapshots without obtaining explicit consent from >> the user. > > > I agree with Vijay here. It's not good to delete a snap (even though it is > the oldest) without the explicit consent from the user. > > FYI It took me more than 2 weeks to figure out that my snaps were getting > autodeleted after reaching "soft-limit". For all I know I had not done > anything and my snap restores were failing. > > I propose to remove the terms "soft" and "hard" limit. I believe there > should be a limit (just "limit") after which all snapshot creates should > fail with proper error messages. And there can be a water-mark after which > the user should get warning messages. So below is my proposal. > > auto-delete + snap-limit: If the snap-limit is set to n, the next snap create > (n+1th) will succeed only if auto-delete is set to on/true/1, and the oldest > snap will get deleted automatically. If autodelete is set to off/false/0, > the (n+1)th snap create will fail with a proper error message from the gluster CLI > command. But again, by default autodelete should be off. > > snap-water-mark: This should come into the picture only if autodelete is turned > off. It should not have any meaning if auto-delete is turned ON. 
Basically > its usage is to give the user a warning that the limit is almost reached and > it is time for the admin to decide which snaps should be deleted (or which > should be kept) > > *my two cents* > > -MS > >> >> >> Cheers, >> >> Vijay >> >> ___ >> Gluster-devel mailing list >> Gluster-devel@gluster.org >> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
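For illustration, MS's proposal above (a single limit, an off-by-default auto-delete flag, and a water-mark warning) could be sketched roughly as below. All of the names and the policy function are hypothetical stand-ins, not gluster's actual snapshot API:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of the proposed policy; not gluster code. */
typedef struct {
    int  snap_count;   /* snapshots that currently exist              */
    int  snap_limit;   /* the single "limit" from the proposal        */
    int  water_mark;   /* warn past this point when auto-delete is off */
    bool auto_delete;  /* off by default in the proposal              */
} snap_policy;

/* Decide whether "snapshot create" may proceed; *delete_oldest tells the
 * caller to remove the oldest snapshot first to stay within the limit. */
bool snap_create_allowed(const snap_policy *p, bool *delete_oldest)
{
    *delete_oldest = false;
    if (p->snap_count < p->snap_limit) {
        if (!p->auto_delete && p->snap_count >= p->water_mark)
            fprintf(stderr, "warning: nearing snapshot limit (%d/%d)\n",
                    p->snap_count, p->snap_limit);
        return true;
    }
    if (p->auto_delete) {
        /* the (n+1)th create succeeds only because the oldest snap goes away */
        *delete_oldest = true;
        return true;
    }
    fprintf(stderr, "error: snapshot limit (%d) reached; "
            "delete snapshots and retry\n", p->snap_limit);
    return false;
}
```

Note that the count never exceeds the limit in this model, which is exactly the property the thread wants from a hard cap.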
Re: [Gluster-devel] inode lru limit
> >> Hi, > >> > >> But as of now the inode table is bound to bound_xl which is associated > >> with > >> the client_t object for the client being connected. As part of fops we can > >> get the bound_xl (thus the inode table) from the rpc request > >> (req->trans->xl_private). But in reconfigure we get just the xlator > >> pointer > >> of protocol/server and dict containing new options. > >> > >> So what I am planning is this. If the xprt_list (transport list > >> corresponding > >> to the clients mounted) is empty, then just set the private structure's > >> variable for lru limit (which will be used to create the inode table when > >> a > >> client mounts). If xprt_list of protocol/server's private structure is not > >> empty, then get one of the transports from that list and get the client_t > >> object corresponding to the transport, from which bound_xl is obtained > >> (all > >> the client_t objects share the same inode table). Then from bound_xl the > >> pointer to the inode table is obtained and its lru limit variable is also set > >> to > >> the value specified via the CLI, and inode_table_prune is called to purge the > >> extra inodes. > > In the above proposal if there are no active clients, lru limit of itable > > is not reconfigured. Here are two options to improve correctness of your > > proposal. > If there are no active clients, then there will not be any itable. > itable will be created when the 1st client connects to the brick. And while > creating the itable we use the inode_lru_limit variable present in > protocol/server's private structure, and the inode table that is created also > saves the same value. A zero current client count doesn't mean that itables are absent in bound_xl. There can be previous connections which resulted in itable creations. > > 1. On a successful handshake, you check whether the lru_limit of itable is > > equal to the configured value. If not equal, set it to the configured value > > and prune the itable. 
The cost is that you check the inode table's lru limit > > on every client connection. > On a successful handshake, for the 1st client the inode table will be created > with the lru_limit value saved in protocol/server's private. For further > handshakes, since the inode table is already there, new inode tables will not > be created. So instead of waiting for a new handshake to happen to set > the lru_limit and purge the inode table, I think it's better to do it as > part of reconfigure itself. > > > > 2. Traverse through the list of all xlators (since there is no easy way of > > finding potential candidates for bound_xl other than peeking into options > > specific to authentication) and if there is an itable associated with that > > xlator, set its lru limit and prune it. The cost here is traversing the > > list of xlators. However, the xlator list in a brick process is relatively > > small, so this shouldn't have much performance impact. > > > > Comments are welcome. > > Regards, > Raghavendra Bhat
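As a rough sketch of option 2 above (walk the xlator list and prune whatever inode tables exist), with toy stand-ins for xlator_t and the inode table rather than gluster's real structures:

```c
#include <stddef.h>

/* Toy stand-ins for gluster's xlator / inode-table types; illustrative only. */
struct itable {
    int lru_limit;
    int lru_size;            /* inodes currently held on the lru list */
};

struct xl {
    struct itable *itable;   /* NULL if no client ever triggered creation */
    struct xl     *next;
};

/* Stand-in for inode_table_prune(): drop inodes beyond the limit. */
static void itable_prune(struct itable *t)
{
    if (t->lru_size > t->lru_limit)
        t->lru_size = t->lru_limit;
}

/* Option 2 from the mail: walk every xlator; wherever an inode table
 * exists (even if all clients have since disconnected), apply the new
 * limit and prune the excess inodes. */
void reconfigure_lru_limit(struct xl *head, int new_limit)
{
    for (struct xl *x = head; x != NULL; x = x->next) {
        if (x->itable == NULL)
            continue;        /* this brick never had a client connect */
        x->itable->lru_limit = new_limit;
        itable_prune(x->itable);
    }
}
```

The design point being debated is visible here: the traversal catches tables created by clients that have already disconnected, which the xprt_list-based approach would miss.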
Re: [Gluster-devel] doubts in posix_handle_path and posix_handle_pump
Hi Pranith, I'm not Avati but I think I can explain the purpose of these functions. As I've interpreted it, posix_handle_path() should return the path to a "real" entry of the filesystem. It can be called for all kinds of files, but it seems that basename is != NULL only for entries where gfid refers to a directory. That seems good. If the specified gfid refers to anything other than a directory, since they are stored inside .glusterfs as hard links, the returned path should simply be /.glusterfs/xx/yy/. For gfid's that specify directories, since they are stored inside .glusterfs as symlinks, they need to be dereferenced before returning (to point to a real directory). That is the purpose of line 362: only if the lstat succeeded (it basically means that the entry exists), the path refers to a symbolic link, and it has exactly 1 hard link (to differentiate a symlink of a directory, with 1 hard link, from a "normal" symlink, with at least 2 hard links) then it tries to replace the symlink by its referenced real directory. This is done using the function posix_handle_pump(). This function reads the symlink and replaces it into the path:

If the original path were:
/cd/47/cd473c26-b6d6-42a6-a0a4-28e04de13915
And cd473c26-b6d6-42a6-a0a4-28e04de13915 points to:
../../0e/5d/0e5d04a1-f418-4388-90db-d1d7ebb8a071/dir1
The result is:
/0e/5d/0e5d04a1-f418-4388-90db-d1d7ebb8a071/dir1

The returned path points to a directory, so it should be ok. However since this path can be composed of other symlinks (in this case 0e5d04a1-f418-4388-90db-d1d7ebb8a071 will be another symlink), the system needs to resolve them every time it accesses the entry. If there are too many of them (for example an entry very deep in the directory hierarchy) the system calls will return ELOOP (even if there isn't any symlink loop, the system has a limit on the number of symlinks it will resolve for any request). 
This is what is tested in the lstat() call at line 374 and the following test at 375: while ELOOP is returned by lstat, posix_handle_pump() is called to remove another level of symlink indirection. This guarantees that the returned path points to a directory *and* it won't fail on system calls with ELOOP. It's very easy to check this:

# cd
# mkdir -p d/d/d/d/d/d/d/d/d/d/d/d/d/d/d/d
# cd
# getfattr -m. -e hex -d d/d/d/d/d/d/d/d/d/d/d/d/d/d/d/d
# file: d/d/d/d/d/d/d/d/d/d/d/d/d/d/d/d
trusted.gfid=0xf6e98d542c4f48259976861b7269a396
trusted.glusterfs.dht=0x0001
# ls .glusterfs/f6/e9/f6e98d54-2c4f-4825-9976-861b7269a396
ls: cannot access .glusterfs/f6/e9/f6e98d54-2c4f-4825-9976-861b7269a396: Too many levels of symbolic links

The loop on posix_handle_pump() avoids this problem. The possible problem I see is that in the comments it says that this function returns a path to an IA_IFDIR (it will return IA_IFDIR on an lstat), however if one of the symlinks is missing or anything else fails, it won't return an error *but* it will return a path to an existing file. An lstat on this path will return IA_IFLNK instead of IA_IFDIR. I don't know if this can be a problem in some places. Xavi On Monday 02 June 2014 02:00:30 Pranith Kumar Karampuri wrote: > hi Avati, > Could you please explain why posix_handle_pump fixes ELOOP. > > posix_handle_path seems to return gfid-path when the lstat on the base-gfid > path fails. I am not sure what the behaviour is supposed to be. > > I added the snippet of code for reference below. 
> >
> 346         base_len = (priv->base_path_length + SLEN(GF_HIDDEN_PATH) + 45);
> 347         base_str = alloca (base_len + 1);
> 348         base_len = snprintf (base_str, base_len + 1, "%s/%s/%02x/%02x/%s",
> 349                              priv->base_path, GF_HIDDEN_PATH, gfid[0], gfid[1],
> 350                              uuid_str);
> 351
> 352         pfx_len = priv->base_path_length + 1 + SLEN(GF_HIDDEN_PATH) + 1;
> 353
> 354         if (basename) {
> 355                 len = snprintf (buf, maxlen, "%s/%s", base_str, basename);
> 356         } else {
> 357                 len = snprintf (buf, maxlen, "%s", base_str);
> 358         }
> 359
> 360         ret = lstat (base_str, &stat);
> 361
> 362         if (!(ret == 0 && S_ISLNK(stat.st_mode) && stat.st_nlink == 1))   <<<-
> 363                 goto out;
>
> Pranith.
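The while-loop behaviour Xavi describes (pump one level of symlink indirection each time lstat() fails with ELOOP) can be modelled abstractly. In this sketch the path is reduced to a count of symlink levels, and fake_lstat() and the decrement are toy stand-ins for lstat() and posix_handle_pump(); none of this is gluster code:

```c
#include <errno.h>

#define SYMLINK_MAX 8   /* stand-in for the kernel's per-lookup symlink cap */

/* Toy lstat(): fails with ELOOP while the path crosses too many symlinks. */
static int fake_lstat(int symlink_levels)
{
    if (symlink_levels > SYMLINK_MAX) {
        errno = ELOOP;
        return -1;
    }
    return 0;
}

/* Mirrors the loop around lines 374/375: each ELOOP triggers one more
 * "pump", replacing a symlink with the real directory it points to. */
int resolve_levels(int symlink_levels)
{
    while (fake_lstat(symlink_levels) != 0) {
        if (errno != ELOOP)
            return -1;        /* a real error, not indirection depth */
        symlink_levels--;     /* posix_handle_pump(): one level removed */
    }
    return symlink_levels;    /* now safe for path-based syscalls */
}
```

The loop terminates because every pump strictly reduces the number of links the kernel must follow, which is why the deep d/d/d/... example above stops failing.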
Re: [Gluster-devel] op-version issues in the 3.5.1 beta release
On Tue, Jun 03, 2014 at 11:01:25AM +0530, Kaushal M wrote: > Niels, > This approach will work well when the cluster is uniform, i.e. of > the same (major) version. This could lead to problems in mixed > clusters when using volume set. Volume set compares the op-versions of > the options being set and will reject the set operation when the > op-versions are different. So, if a user were to run a mixed cluster > of gluster-3.5.1 and gluster-3.6.0, he wouldn't be able to set > server.manage-gids as its op-versions would be different. Thanks Kaushal, that is exactly my expectation. > But I don't expect anyone to be running such a cluster always. It > would mostly be during upgrades, during which users shouldn't be doing > volume operations. Yes, I agree. And it should be easy enough to diagnose 'volume set' issues when glusterfs versions are different. Would you like me to send a patch for the master branch that makes the changes mentioned below, or is that something you can do soon? I'll be waiting for the change to be merged before 3.5.1 can get the updated op-version for server.manage-gids. Thanks, Niels > > ~kaushal > > On Mon, Jun 2, 2014 at 8:28 PM, Niels de Vos wrote: > > Hi, > > > > today on IRC we had a discussion about the op-version for the current > > 3.5.1 beta. This beta includes a backport that introduces a new volume > > option (server.manage-gids) and needed to increase the op-version to > > prevent issues with systems that do not know about this new option. > > > > Currently, the op-version in 3.5.1 is (seems to be) hard-coded to '3': > > > > libglusterfs/src/globals.h:#define GD_OP_VERSION_MAX 3 > > > > Now, the new option required a glusterd with op-version=4. This worked > > fine when setting the option, and glusterd.info got updated too. > > Unfortunately, a restart of glusterd fails, because the op-version from > > the configuration is greater than the GD_OP_VERSION_MAX. 
> > > > Increasing GD_OP_VERSION_MAX is not really suitable, because > > op-version=4 would make other systems assume that the 3.5.1 release has > > all the op-version=4 features (incorrect, because the upcoming 3.6 has > > op-version=4). > > > > I see one option to fix this issue, that allows stable branches to > > include backports of volume options and similar, without conflicting > > with the development branch or newer versions: > > > > 1. define an op-version as multi-digit value, with gaps for stable > >releases > > 2. stable releases may only include backports of volume options that are > >in the development branch and newer versions > > 3. stable maintainers should pay extra care when new volume options are > >being backported > > > > The idea is the following: > > > > - update the hard-coded op-version in libglusterfs/src/globals.h in the > > master branch to 360 (based on the 3.6 release for easier matching) > > - update any options that have op-version >= 4 to 360 (master branch) > > - update the op-version in libglusterfs/src/globals.h in the release-3.5 > > branch to 351 > > - update the op-version of server.manage-gids option in 3.5.1 to 351 > > > > > > The only issue would be that current 3.6 packages in testing have > > a lower op-version than the new 3.5.1 packages. I hope it is not > > a common practise to have systems installed with packages from the > > master-branch in the same environment as 3.5.1 servers. > > > > Any ideas, suggestions or thoughts? > > > > If this can not be solved in a similar easy way, I will be forced to > > revert the 3.5.1 server.manage-gids option. Users were expecting this to > > be present so that deployments with many (ca. 93+) secondary groups have > > permissions working as expected. 
> > > > Thanks, > > Niels > > ___ > > Gluster-devel mailing list > > Gluster-devel@gluster.org > > http://supercolony.gluster.org/mailman/listinfo/gluster-devel
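The proposed numbering can be illustrated with two small checks: the op-version acceptance test that fails at glusterd restart today, and the volume-set comparison Kaushal describes for mixed clusters. The values 351/360 are the ones proposed in the mail; the helper names are made up for the sketch:

```c
/* Proposed maxima (from the mail): master carries 360, release-3.5
 * carries 351; a backported option gets its own branch's value. */
#define OP_VERSION_MASTER_MAX  360
#define OP_VERSION_35X_MAX     351

/* glusterd restart: the op-version stored in glusterd.info must not
 * exceed the binary's GD_OP_VERSION_MAX, or startup fails (the bug
 * described above, where stored 4 > hard-coded max 3). */
int glusterd_start_ok(int stored_op_version, int binary_max)
{
    return stored_op_version <= binary_max;
}

/* volume set in a mixed cluster: the same option carries a different
 * op-version on each branch (351 vs 360), so the set is rejected. */
int volume_set_ok(int option_opv_local, int option_opv_peer)
{
    return option_opv_local == option_opv_peer;
}
```

With gapped per-branch values, a 3.6 glusterd still accepts a store written by 3.5.1 (351 <= 360), while 3.5.1 correctly refuses a store written with master's features.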
Re: [Gluster-devel] autodelete in snapshots
- Original Message - From: "M S Vishwanath Bhat" To: "Vijay Bellur" Cc: "Seema Naik" , "Gluster Devel" Sent: Tuesday, June 3, 2014 1:02:08 AM Subject: Re: [Gluster-devel] autodelete in snapshots On 2 June 2014 20:22, Vijay Bellur < vbel...@redhat.com > wrote: On 04/23/2014 05:50 AM, Vijay Bellur wrote: On 04/20/2014 11:42 PM, Lalatendu Mohanty wrote: On 04/16/2014 11:39 AM, Avra Sengupta wrote: The whole purpose of introducing the soft-limit is, that at any point of time the number of snaps should not exceed the hard limit. If we trigger auto-delete on hitting hard-limit, then the purpose itself is lost, because at that point we would be taking a snap, making the limit hard-limit + 1, and then triggering auto-delete, which violates the sanctity of the hard-limit. Also what happens when we are at hard-limit + 1, and another snap is issued, while auto-delete is yet to process the first delete. At that point we end up at hard-limit + 1. Also what happens if for a particular snap the auto-delete fails. We should see the hard-limit, as something set by the admin keeping in mind the resource consumption and at no-point should we cross this limit, come what may. If we hit this limit, the create command should fail asking the user to delete snaps using the "snapshot delete" command. The two options Raghavendra mentioned are applicable for the soft-limit only, in which cases on hitting the soft-limit 1. Trigger auto-delete or 2. Log a warning-message, for the user saying the number of snaps is exceeding the snap-limit and display the number of available snaps Now which of these should happen also depends on the user, because the auto-delete option is configurable. So if the auto-delete option is set as true, auto-delete should be triggered and the above message should also be logged. But if the option is set as false, only the message should be logged. This is the behaviour as designed. Adding Rahul, and Seema in the mail, to reflect upon the behaviour as well. 
Regards, Avra This sounds correct. However we need to make sure that the usage or documentation around this should be good enough, so that users understand each of the limits correctly. It might be better to avoid the usage of the term "soft-limit". soft-limit as used in quota and other places generally has an alerting connotation. Something like "auto-deletion-limit" might be better. I still see references to "soft-limit" and auto deletion seems to get triggered upon reaching soft-limit. Why is the ability to auto delete not configurable? It does seem pretty nasty to go about deleting snapshots without obtaining explicit consent from the user. I agree with Vijay here. It's not good to delete a snap (even though it is the oldest) without the explicit consent from the user. FYI It took me more than 2 weeks to figure out that my snaps were getting autodeleted after reaching "soft-limit". For all I know I had not done anything and my snap restores were failing. I propose to remove the terms "soft" and "hard" limit. I believe there should be a limit (just "limit") after which all snapshot creates should fail with proper error messages. And there can be a water-mark after which the user should get warning messages. So below is my proposal. auto-delete + snap-limit: If the snap-limit is set to n, the next snap create (n+1th) will succeed only if auto-delete is set to on/true/1, and the oldest snap will get deleted automatically. If autodelete is set to off/false/0, the (n+1)th snap create will fail with a proper error message from the gluster CLI command. But again, by default autodelete should be off. snap-water-mark: This should come into the picture only if autodelete is turned off. It should not have any meaning if auto-delete is turned ON. 
Basically its usage is to give the user a warning that the limit is almost reached and it is time for the admin to decide which snaps should be deleted (or which should be kept) *my two cents* -MS The reason for having a hard-limit is to stop snapshot creation once we have reached this limit. This helps to have control over the resource consumption. Therefore if we only have this limit (as snap-limit) then there is no question of auto-delete. Auto-delete can only be triggered once the count crosses the limit. Therefore we introduced the concept of a soft-limit and a hard-limit. As the name suggests, once the hard-limit is reached no more snaps will be created. So the idea is to keep the number of snapshots always less than the hard-limit. To do so we introduced the concept of the soft-limit, wherein we allow snapshots even when this limit is crossed and once the snapshot is taken we delete the oldest snap. If you consider this definition then the names soft-limit and hard-limit look ok to me. In phase II we are planning to have the auto-delete feature configurable with different policies, e.g. delete oldest, delete with more space con
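For contrast with MS's proposal, the existing design Rajesh describes (creates allowed past the soft-limit with the oldest snap then deleted, and refused outright at the hard-limit) might be sketched as follows; the function and its return codes are illustrative only, not gluster's implementation:

```c
/* Outcome of a "snapshot create" under the current soft/hard-limit design. */
enum snap_result { SNAP_OK, SNAP_OK_DELETE_OLDEST, SNAP_REFUSED };

enum snap_result snap_create(int count, int soft_limit, int hard_limit)
{
    if (count >= hard_limit)
        return SNAP_REFUSED;           /* the hard-limit is never exceeded  */
    if (count >= soft_limit)
        return SNAP_OK_DELETE_OLDEST;  /* take the snap, then drop the oldest */
    return SNAP_OK;
}
```

The thread's disagreement is about the middle case: whether deleting the oldest snapshot there should ever happen without an explicit opt-in from the user.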
Re: [Gluster-devel] doubts in posix_handle_path and posix_handle_pump
On 06/03/2014 02:42 PM, Xavier Hernandez wrote: Hi Pranith, I'm not Avati but I think I can explain the purpose of these functions. As I've interpreted it, posix_handle_path() should return the path to a "real" entry of the filesystem. It can be called for all kind of files, but it seems that basename is != NULL only for entries where gfid refers to a directory. That seems good. If the specified gfid refers to anything other than a directory, since they are stored inside .glusterfs as hard links, the returned path should simply be /.glusterfs/xx/yy/. For gfid's that specify directories, since they are stored inside .glusterfs as symlinks, they need to be dereferenced before returning (to point to a real directory). That is the purpose of line 362: only if the lstat succeeded (it basically means that the entry exists), the path refers to a symbolic link, and it has more than 1 hard link (to differentiate a symlink of a directory, with 1 hard link, from a "normal" symlink, with at least 2 hard links) then it tries to replace the symlink by its referenced real directory. This is done using the function posix_handle_pump(). This function reads the symlink and replaces it into the path: If the original path were: /cd/47/cd473c26-b6d6-42a6-a0a4-28e04de13915 And cd473c26-b6d6-42a6-a0a4-28e04de13915 points to: ../../0e/5d/0e5d04a1-f418-4388-90db-d1d7ebb8a071/dir1 The result is: /0e/5d/0e5d04a1-f418-4388-90db-d1d7ebb8a071/dir1 The returned path points to a directory, so it should be ok. However since this path can be composed by other symlinks (in this case 0e5d04a1- f418-4388-90db-d1d7ebb8a071 will be another symlink), the system needs to resolve them every time it accesses the entry. If there are too many of them (for example an entry very deep in the directory hierarchy) the system calls will return ELOOP (even if there isn't any symlink loop, the system has a limit in the number of symlinks it will resolve for any request). 
This is what is tested in the lstat() call at line 374 and the following test at 375: while ELOOP is returned by lstat, posix_handle_pump() is called to remove another level of symlink indirection. This guarantees that the returned path points to a directory *and* it won't fail on system calls with ELOOP. It's very easy to check this: # cd # mkdir -p d/d/d/d/d/d/d/d/d/d/d/d/d/d/d/d # cd # getfattr -m. -e hex -d d/d/d/d/d/d/d/d/d/d/d/d/d/d/d/d # file: d/d/d/d/d/d/d/d/d/d/d/d/d/d/d/d trusted.gfid=0xf6e98d542c4f48259976861b7269a396 trusted.glusterfs.dht=0x0001 # ls .glusterfs/f6/e9/f6e98d54-2c4f-4825-9976-861b7269a396 ls: cannot access .glusterfs/f6/e9/f6e98d54-2c4f-4825-9976-861b7269a396: Too many levels of symbolic links The loop on posix_handle_pump() avoids this problem. Wonderful explanation! :-). Thanks!. Now I understand why there is that while loop. I will add this part of the mail as a comment for that function when I re-submit the code. The possible problem I see is that in the comments it says that this function returns a path to an IA_IFDIR (it will return IA_IFDIR on an lstat), however if one of the symlinks is missing or anything else fails, it won't return an error *but* it will return a path to an existing file. An lstat on this path will return IA_IFLNK instead of IA_IFDIR. I don't know if this can be a problem in some places. This is exactly what I was referring to, I don't see an easy way to find out if there is any failure in the function. One needs to do extra lstat or a 'path' based syscall like getxattr etc on the returned path to check if it returned a good path. So do you think the best thing is to ignore the return value of the function call but depend on an lstat or a path based syscall of the path? Xavi On Monday 02 June 2014 02:00:30 Pranith Kumar Karampuri wrote: hi Avati, Could you please explain why posix_handle_pump fixes ELOOP. posix_handle_path seems to return gfid-path when the lstat on the base-gfid path fails. 
I am not sure what the behaviour is supposed to be. I added the snippet of code for reference below.

346         base_len = (priv->base_path_length + SLEN(GF_HIDDEN_PATH) + 45);
347         base_str = alloca (base_len + 1);
348         base_len = snprintf (base_str, base_len + 1, "%s/%s/%02x/%02x/%s",
349                              priv->base_path, GF_HIDDEN_PATH, gfid[0], gfid[1],
350                              uuid_str);
351
352         pfx_len = priv->base_path_length + 1 + SLEN(GF_HIDDEN_PATH) + 1;
353
354         if (basename) {
355                 len = snprintf (buf, maxlen, "%s/%s", base_str, basename);
356         } else {
357                 len = snprintf (buf, maxlen, "%s", base_str);
358         }
359
360         ret = lstat (base_str, &stat);
361
362         if (!(ret == 0 && S_ISLNK(stat.st_mode) && stat.st_nlink == 1))   <<<-
363                 goto out;

Pranith.
Re: [Gluster-devel] doubts in posix_handle_path and posix_handle_pump
On Tuesday 03 June 2014 15:42:19 Pranith Kumar Karampuri wrote: > On 06/03/2014 02:42 PM, Xavier Hernandez wrote: > > The possible problem I see is that in the comments it says that this > > function returns a path to an IA_IFDIR (it will return IA_IFDIR on an > > lstat), however if one of the symlinks is missing or anything else fails, > > it won't return an error *but* it will return a path to an existing file. > > An lstat on this path will return IA_IFLNK instead of IA_IFDIR. I don't > > know if this can be a problem in some places. > > This is exactly what I was referring to, I don't see an easy way to find > out if there is any failure in the function. One needs to do an extra lstat > or a 'path' based syscall like getxattr etc on the returned path to > check if it returned a good path. So do you think the best thing is to > ignore the return value of the function call but depend on an lstat or a > path based syscall of the path? > The only point to consider is for gfid's representing directories. Other types of files do not have any problem (the returned path can be considered valid even if lstat() fails). For directories there are 3 places where things can fail:

At line 360: I think this is not a problem. If lstat() fails (basically because it does not exist), the returned path can be considered valid.

At line 367: If posix_handle_pump() fails, it could mean:
* The symlink is not a valid directory symlink:
  * it's a corrupted one: any operation on this file should be denied
  * it's a normal symlink that has lost one of the hard-links: though it's bad to have damaged gfid's, the returned path can be considered valid.
* readlink() failed: this would be very weird. Access to the file should be denied.

At line 374: If lstat() fails, probably it means that the symlink of one of the parents of the directory is missing. The returned path won't fail on lstat(), but it should, because it will return symlink information instead of directory information. 
I think that it's very hard to determine if something went wrong only by inspecting the returned path. I think the best approach would be that posix_handle_path() return -1 if posix_handle_pump() or lstat() at line 374 fail, and each caller decide what to do in case of failure. However I don't know all the details of the posix xlator, so maybe I'm wrong and this is not necessary. Let's see if there is someone else with more experience on it to see what he thinks. Xavi
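Until posix_handle_path() propagates errors, a caller can only detect the bad case by re-checking the returned path itself, e.g. verifying that it really is a directory and not a leftover symlink. A minimal sketch of such a caller-side check (plain POSIX, not gluster code; the function name is made up):

```c
#include <sys/stat.h>

/* Returns 1 only if 'path' is itself a directory; a leftover symlink
 * (the failure mode discussed above, where lstat() would report
 * IA_IFLNK instead of IA_IFDIR) or a missing entry returns 0. */
int path_is_directory(const char *path)
{
    struct stat st;

    if (lstat(path, &st) != 0)
        return 0;             /* entry missing or inaccessible */
    return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

The cost Pranith mentions is visible here: it is one extra lstat() per call, which is exactly why an error return from posix_handle_path() itself would be the cleaner fix.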
Re: [Gluster-devel] All builds are failing with BUILD ERROR
Guys, it's failing again with the same error:

Please proceed with configuring, compiling, and installing.
rm: cannot remove `/build/install/var/run/gluster/patchy': Device or resource busy
+ RET=1
+ '[' 1 '!=' 0 ']'
+ VERDICT='BUILD FAILURE'

Pranith On 06/02/2014 09:08 PM, Justin Clift wrote: On 02/06/2014, at 7:04 AM, Kaleb KEITHLEY wrote: someone cleaned the loopback devices. I deleted 500 unix domain sockets in /d/install/var/run and requeued the regressions. Interesting. The extra sockets problem is what prompted me to rewrite the cleanup function. The sockets are being created by glusterd during each test startup, but aren't removed by the existing cleanup function (so, substantial build-up over time). I'm not sure which of those two things was the solution. _Probably_ the loopback device thing. The extra sockets seem to be "messy" but (so far) I haven't seen them break anything. + Justin -- Open Source and Standards @ Red Hat twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] op-version issues in the 3.5.1 beta release
I've sent out a draft/rfc patch on the master branch http://review.gluster.org/7963. ~kaushal On Tue, Jun 3, 2014 at 3:20 PM, Niels de Vos wrote: > On Tue, Jun 03, 2014 at 11:01:25AM +0530, Kaushal M wrote: >> Niels, >> This approach will work for well when the cluster is uniform, ie. of >> the same (major) version. This could lead to problems in mixed >> clusters when using volume set. Volume set compares the op-versions of >> the options being set and will reject the set operation when the >> op-versions are different. So, if a user were to run a mixed cluster >> of gluster-3.5.1 and gluster-3.6.0, he wouldn't be able to set >> server.manage-gids as its op-versions would be different. > > Thanks Kaushal, that is exactly my expectation. > >> But I don't expect anyone to be running such a cluster always. It >> would mostly be during upgrades, during which users shouldn't be doing >> volume operations. > > Yes, I agree. And it should be easy enough to diagnose 'volume set' > issues when glusterfs versions are different. > > Would you like me to send a patch for the master branch that makes the > changes mentioned below, or is that something you can do soon? I'll be > waiting for the change to be merged before 3.5.1 can get the updated > op-version for server.manage-gids. > > Thanks, > Niels > >> >> ~kaushal >> >> On Mon, Jun 2, 2014 at 8:28 PM, Niels de Vos wrote: >> > Hi, >> > >> > today on IRC we has a discussion about the op-version for the current >> > 3.5.1 beta. This beta includes a backport that introduces a new volume >> > option (server.manage-gids) and needed to increase the op-version to >> > prevent issues with systems that do not know about this new option. >> > >> > Currently, the op-version in 3.5.1 is (seems to be) hard-coded to '3': >> > >> > libglusterfs/src/globals.h:#define GD_OP_VERSION_MAX 3 >> > >> > Now, the new option required a glusterd with op-version=4. This worked >> > fine when setting the option, and glusterd.info got updated too. 
>> > Unfortunately, a restart of glusterd fails, because the op-version from >> > the configuration is greater than the GD_OP_VERSION_MAX. >> > >> > Increasing GD_OP_VERSION_MAX is not really suitable, because >> > op-version=4 would make other systems assume that the 3.5.1 release has >> > all the op-version=4 features (incorrect, because the upcoming 3.6 has >> > op-version=4). >> > >> > I see one option to fix this issue, that allows stable branches to >> > include backports of volume options and similar, without conflicting >> > with the development branch or newer versions: >> > >> > 1. define an op-version as multi-digit value, with gaps for stable >> >releases >> > 2. stable releases may only include backports of volume options that are >> >in the development branch and newer versions >> > 3. stable maintainers should pay extra care when new volume options are >> >being backported >> > >> > The idea is the following: >> > >> > - update the hard-coded op-version in libglusterfs/src/globals.h in the >> > master branch to 360 (based on the 3.6 release for easier matching) >> > - update any options that have op-version >= 4 to 360 (master branch) >> > - update the op-version in libglusterfs/src/globals.h in the release-3.5 >> > branch to 351 >> > - update the op-version of server.manage-gids option in 3.5.1 to 351 >> > >> > >> > The only issue would be that current 3.6 packages in testing have >> > a lower op-version than the new 3.5.1 packages. I hope it is not >> > a common practise to have systems installed with packages from the >> > master-branch in the same environment as 3.5.1 servers. >> > >> > Any ideas, suggestions or thoughts? >> > >> > If this can not be solved in a similar easy way, I will be forced to >> > revert the 3.5.1 server.manage-gids option. Users were expecting this to >> > be present so that deployments with many (ca. 93+) secondary groups have >> > permissions working as expected. 
>> > >> > Thanks, >> > Niels >> > ___ >> > Gluster-devel mailing list >> > Gluster-devel@gluster.org >> > http://supercolony.gluster.org/mailman/listinfo/gluster-devel
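For clarity, the numbering Niels proposes can be condensed into a small sketch. The GD_OP_VERSION_MAX macro is real (libglusterfs/src/globals.h), but the suffixed macro names and the helper below are purely illustrative of the scheme discussed in this thread, using the 360/351 values suggested above:

```c
#include <assert.h>

#define GD_OP_VERSION_MAX_MASTER 360  /* proposed: master branch / 3.6 development */
#define GD_OP_VERSION_MAX_35X    351  /* proposed: release-3.5 branch after 3.5.1  */
#define OP_VERSION_MANAGE_GIDS   351  /* proposed op-version of the backported
                                         server.manage-gids option                 */

/* glusterd refuses to start when the op-version recorded in glusterd.info
 * exceeds its compiled-in maximum -- the restart failure Niels describes. */
static int glusterd_can_restart(int stored_op_version, int op_version_max)
{
    return stored_op_version <= op_version_max;
}
```

Under this scheme a 3.5.1 glusterd restarts cleanly (351 <= 351), whereas the original failure came from storing op-version 4 while GD_OP_VERSION_MAX was still 3.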
[Gluster-devel] Erasure coding doubts session
hi Xavier, Some of the developers are reading the code you submitted for erasure code. We want to know if you would be available on Friday IST so that we can have a discussion and doubt clarification session on IRC. Could you tell which time is good for you. Pranith
Re: [Gluster-devel] Erasure coding doubts session
Hi Pranith, On Tuesday 03 June 2014 17:04:12 Pranith Kumar Karampuri wrote: > hi Xavier, > Some of the developers are reading the code you submitted for > erasure code. We want to know if you would be available on Friday IST so > that we can have a discussion and doubt clarification session on IRC. > Could you tell which time is good for you. Sure. I can get some time between 12:30 PM IST and 6:30 PM IST. Tell me when do you prefer. Xavi
Re: [Gluster-devel] autodelete in snapshots
On 3 June 2014 15:21, Rajesh Joseph wrote: > > > - Original Message - > From: "M S Vishwanath Bhat" > To: "Vijay Bellur" > Cc: "Seema Naik" , "Gluster Devel" < > gluster-devel@gluster.org> > Sent: Tuesday, June 3, 2014 1:02:08 AM > Subject: Re: [Gluster-devel] autodelete in snapshots > > > > > On 2 June 2014 20:22, Vijay Bellur < vbel...@redhat.com > wrote: > > > > On 04/23/2014 05:50 AM, Vijay Bellur wrote: > > > On 04/20/2014 11:42 PM, Lalatendu Mohanty wrote: > > > On 04/16/2014 11:39 AM, Avra Sengupta wrote: > > > The whole purpose of introducing the soft-limit is, that at any point > of time the number of > snaps should not exceed the hard limit. If we trigger auto-delete on > hitting hard-limit, then > the purpose itself is lost, because at that point we would be taking a > snap, making the limit > hard-limit + 1, and then triggering auto-delete, which violates the > sanctity of the hard-limit. > Also what happens when we are at hard-limit + 1, and another snap is > issued, while auto-delete > is yet to process the first delete. At that point we end up at > hard-limit + 1. Also what happens > if for a particular snap the auto-delete fails. > > We should see the hard-limit, as something set by the admin keeping in > mind the resource consumption > and at no-point should we cross this limit, come what may. If we hit > this limit, the create command > should fail asking the user to delete snaps using the "snapshot > delete" command. > > The two options Raghavendra mentioned are applicable for the > soft-limit only, in which cases on > hitting the soft-limit > > 1. Trigger auto-delete > > or > > 2. Log a warning-message, for the user saying the number of snaps is > exceeding the snap-limit and > display the number of available snaps > > Now which of these should happen also depends on the user, because the > auto-delete option > is configurable. 
> > So if the auto-delete option is set as true, auto-delete should be > triggered and the above message > should also be logged. > > But if the option is set as false, only the message should be logged. > > This is the behaviour as designed. Adding Rahul, and Seema in the > mail, to reflect upon the > behaviour as well. > > Regards, > Avra > > This sounds correct. However we need to make sure that the usage or > documentation around this should be good enough , so that users > understand the each of the limits correctly. > > > It might be better to avoid the usage of the term "soft-limit". > soft-limit as used in quota and other places generally has an alerting > connotation. Something like "auto-deletion-limit" might be better. > > > I still see references to "soft-limit" and auto deletion seems to get > triggered upon reaching soft-limit. > > Why is the ability to auto delete not configurable? It does seem pretty > nasty to go about deleting snapshots without obtaining explicit consent > from the user. > > I agree with Vijay here. It's not good to delete a snap (even though it is > oldest) without the explicit consent from user. > > FYI It took me more than 2 weeks to figure out that my snaps were getting > autodeleted after reaching "soft-limit". For all I know I had not done > anything and my snap restore were failing. > > I propose to remove the terms "soft" and "hard" limit. I believe there > should be a limit (just "limit") after which all snapshot creates should > fail with proper error messages. And there can be a water-mark after which > user should get warning messages. So below is my proposal. > > auto-delete + snap-limit: If the snap-limit is set to n , next snap create > (n+1th) will succeed only if if auto-delete is set to on/true/1 and oldest > snap will get deleted automatically. If autodelete is set to off/false/0 , > (n+1)th snap create will fail with proper error message from gluster CLI > command. But again by default autodelete should be off. 
> > snap-water-mark : This should come in picture only if autodelete is turned > off. It should not have any meaning if auto-delete is turned ON. Basically > it's usage is to give the user warning that limit almost being reached and > it is time for admin to decide which snaps should be deleted (or which > should be kept) > > *my two cents* > > -MS > > > The reason for having a hard-limit is to stop snapshot creation once we > reached this limit. This helps to have a control over the resource > consumption. Therefore if we only have this limit (as snap-limit) then > there is no question of auto-delete. Auto-delete can only be triggered once > the count crosses the limit. Therefore we introduced the concept of > soft-limit and a hard-limit. As the name suggests once the hard-limit is > reached no more snaps will be created. > Perhaps I could have been clearer. The auto-delete value does come into the picture when the limit is reached. There is a limit 'n' (snap-limit), and when we reach this limit, what happens to the next snap create depends on the value of auto-delete (should be user configurable)
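The proposal above (a single snap-limit, an auto-delete toggle, and a water-mark) boils down to a small decision function. The sketch below is only illustrative of the proposed semantics; none of these identifiers exist in the GlusterFS sources:

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    SNAP_CREATE_OK,        /* create succeeds quietly                       */
    SNAP_CREATE_OK_WARN,   /* succeeds, water-mark warning logged for admin */
    SNAP_CREATE_OK_ROTATE, /* succeeds, oldest snap auto-deleted            */
    SNAP_CREATE_FAIL       /* fails with a proper CLI error message         */
} snap_verdict_t;

static snap_verdict_t snap_create_check(int snap_count, int snap_limit,
                                        bool auto_delete, int water_mark)
{
    if (snap_count >= snap_limit)
        /* at the limit: rotate the oldest snap out only when the admin
         * has explicitly opted in; default auto_delete is off */
        return auto_delete ? SNAP_CREATE_OK_ROTATE : SNAP_CREATE_FAIL;

    if (!auto_delete && snap_count >= water_mark)
        /* the water-mark is meaningful only with auto-delete off: warn
         * that it is time to decide which snaps to keep */
        return SNAP_CREATE_OK_WARN;

    return SNAP_CREATE_OK;
}
```

Nothing is ever deleted without consent: deletion happens only on the explicit auto_delete opt-in, otherwise the create fails and the user decides.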
Re: [Gluster-devel] All builds are failing with BUILD ERROR
On 03/06/2014, at 12:12 PM, Pranith Kumar Karampuri wrote: > Guys it's failing again with the same error: > > Please proceed with configuring, compiling, and installing. > rm: cannot remove `/build/install/var/run/gluster/patchy': Device or resource > busy > + RET=1 > + '[' 1 '!=' 0 ']' > + VERDICT='BUILD FAILURE' You have a login to build.gluster.org don't you? :) + Justin -- Open Source and Standards @ Red Hat twitter.com/realjustinclift
Re: [Gluster-devel] Erasure coding doubts session
On 06/03/2014 05:31 PM, Xavier Hernandez wrote: Hi Pranith, On Tuesday 03 June 2014 17:04:12 Pranith Kumar Karampuri wrote: hi Xavier, Some of the developers are reading the code you submitted for erasure code. We want to know if you would be available on Friday IST so that we can have a discussion and doubt clarification session on IRC. Could you tell which time is good for you. Sure. I can get some time between 12:30 PM IST and 6:30 PM IST. Tell me when do you prefer. Would 2:00 - 3:00 PM IST work for you? We could hang out in #gluster-meeting. -Vijay
Re: [Gluster-devel] Erasure coding doubts session
On Tuesday 03 June 2014 19:12:30 Vijay Bellur wrote: > > Sure. I can get some time between 12:30 PM IST and 6:30 PM IST. Tell me > > when do you prefer. > > Would 2:00 - 3:00 PM IST work for you? We could hang out in > #gluster-meeting. > Yes, no problem. I'll be there. Xavi
Re: [Gluster-devel] All builds are failing with BUILD ERROR
On 06/03/2014 07:00 PM, Justin Clift wrote: On 03/06/2014, at 12:12 PM, Pranith Kumar Karampuri wrote: Guys it's failing again with the same error: Please proceed with configuring, compiling, and installing. rm: cannot remove `/build/install/var/run/gluster/patchy': Device or resource busy + RET=1 + '[' 1 '!=' 0 ']' + VERDICT='BUILD FAILURE' You have a login to build.gluster.org don't you? :) I only have login to launch builds. No permission to inspect the machine :-(. Pranith + Justin -- Open Source and Standards @ Red Hat twitter.com/realjustinclift
Re: [Gluster-devel] All builds are failing with BUILD ERROR
On 03/06/2014, at 3:04 PM, Pranith Kumar Karampuri wrote: > On 06/03/2014 07:00 PM, Justin Clift wrote: >> On 03/06/2014, at 12:12 PM, Pranith Kumar Karampuri wrote: >>> Guys it's failing again with the same error: >>> >>> Please proceed with configuring, compiling, and installing. >>> rm: cannot remove `/build/install/var/run/gluster/patchy': Device or >>> resource busy >>> + RET=1 >>> + '[' 1 '!=' 0 ']' >>> + VERDICT='BUILD FAILURE' >> >> You have a login to build.gluster.org don't you? :) > I only have login to launch builds. No permission to inspect the machine :-(. Just checked, and you're definitely in the sudo setup, with the same permissions as me. eg you can "sudo su -" to root. If it asks for a password, it's your own password. If you don't know it, let me know, and I'll set an initial one for you. :) + Justin -- Open Source and Standards @ Red Hat twitter.com/realjustinclift
Re: [Gluster-devel] All builds are failing with BUILD ERROR
On 06/03/2014 07:45 PM, Justin Clift wrote: On 03/06/2014, at 3:04 PM, Pranith Kumar Karampuri wrote: On 06/03/2014 07:00 PM, Justin Clift wrote: On 03/06/2014, at 12:12 PM, Pranith Kumar Karampuri wrote: Guys it's failing again with the same error: Please proceed with configuring, compiling, and installing. rm: cannot remove `/build/install/var/run/gluster/patchy': Device or resource busy + RET=1 + '[' 1 '!=' 0 ']' + VERDICT='BUILD FAILURE' You have a login to build.gluster.org don't you? :) I only have login to launch builds. No permission to inspect the machine :-(. Just checked, and you're definitely in the sudo setup, with the same permissions as me. eg you can "sudo su -" to root. If it asks for a password, it's your own password. If you don't know it, let me know, and I'll set an initial one for you. :) Justin fixed the login for me. Feel free to reach out to me as well if there is any problem with the build machine that needs fixing. Pranith + Justin -- Open Source and Standards @ Red Hat twitter.com/realjustinclift
[Gluster-devel] Fwd: New Defects reported by Coverity Scan for GlusterFS
If you are interested to fix these Coverity issues , check the below link for guidelines: http://www.gluster.org/community/documentation/index.php/Fixing_Issues_Reported_By_Tools_For_Static_Code_Analysis#Coverity Thanks, Lala Original Message Subject:New Defects reported by Coverity Scan for GlusterFS Date: Tue, 03 Jun 2014 08:22:38 -0700 From: scan-ad...@coverity.com Hi, Please find the latest report on new defect(s) introduced to GlusterFS found with Coverity Scan. Defect(s) Reported-by: Coverity Scan Showing 14 of 14 defect(s) ** CID 1220068: Missing parentheses (CONSTANT_EXPRESSION_RESULT) /xlators/features/snapview-server/src/snapview-server.c: 1265 in svs_fgetxattr() ** CID 1220067: Missing parentheses (CONSTANT_EXPRESSION_RESULT) /xlators/features/snapview-server/src/snapview-server.c: 1158 in svs_getxattr() ** CID 1220066: Logically dead code (DEADCODE) /xlators/features/snapview-server/src/snapview-server.c: 1268 in svs_fgetxattr() ** CID 1220065: Logically dead code (DEADCODE) /xlators/features/snapview-server/src/snapview-server.c: 1160 in svs_getxattr() ** CID 1220064: Logically dead code (DEADCODE) /xlators/features/snapview-server/src/snapview-server.c: 594 in svs_lookup_entry_point() ** CID 1220060: Resource leak (RESOURCE_LEAK) /xlators/features/snapview-server/src/snapview-server.c: 1500 in svs_get_snapshot_list() ** CID 1220059: Resource leak (RESOURCE_LEAK) /xlators/features/snapview-server/src/snapview-server.c: 1500 in svs_get_snapshot_list() ** CID 1220058: Resource leak (RESOURCE_LEAK) /xlators/features/snapview-client/src/snapview-client.c: 1315 in svc_readdirp() ** CID 1220057: Resource leak (RESOURCE_LEAK) /xlators/features/snapview-server/src/snapview-server.c: 860 in svs_lookup_entry() ** CID 1220063: Improper use of negative value (NEGATIVE_RETURNS) /xlators/features/snapview-server/src/snapview-server.c: 1264 in svs_fgetxattr() ** CID 1220062: Improper use of negative value (NEGATIVE_RETURNS) 
/xlators/features/snapview-server/src/snapview-server.c: 1157 in svs_getxattr() ** CID 1220061: Array compared against 0 (NO_EFFECT) /xlators/features/snapview-server/src/snapview-server.c: 693 in svs_lookup_gfid() ** CID 1220056: Unused pointer value (UNUSED_VALUE) /xlators/features/snapview-server/src/snapview-server.c: 999 in svs_lookup() ** CID 1220055: Use after free (USE_AFTER_FREE) /xlators/features/snapview-server/src/snapview-server.c: 1319 in svs_fgetxattr() /xlators/features/snapview-server/src/snapview-server.c: 1319 in svs_fgetxattr() *** CID 1220068: Missing parentheses (CONSTANT_EXPRESSION_RESULT) /xlators/features/snapview-server/src/snapview-server.c: 1265 in svs_fgetxattr() 1259 op_errno = EINVAL; 1260 goto out; 1261 } 1262 1263 if (inode_ctx->type == SNAP_VIEW_VIRTUAL_INODE) { 1264 size = glfs_fgetxattr (glfd, name, NULL, 0); CID 1220068: Missing parentheses (CONSTANT_EXPRESSION_RESULT) "!size == -1" is always false regardless of the values of its operands. Did you intend to either negate the entire comparison expression, in which case parentheses would be required around the entire comparison expression to force that interpretation, or negate the sense of the comparison (that is, use '!=' rather than '==')? This occurs as the logical operand of if. 1265 if (!size == -1) { 1266 gf_log (this->name, GF_LOG_ERROR, "getxattr on %s " 1267 "failed (key: %s)", uuid_utoa (fd->inode->gfid), 1268 name); 1269 op_ret = -1; 1270 op_errno = errno; *** CID 1220067: Missing parentheses (CONSTANT_EXPRESSION_RESULT) /xlators/features/snapview-server/src/snapview-server.c: 1158 in svs_getxattr() 1152 op_errno = EINVAL; 1153 goto out; 1154 } else if (inode_ctx->type == SNAP_VIEW_VIRTUAL_INODE) { 1155 fs = inode_ctx->fs; 1156 object = inode_ctx->object; 1157 size = glfs_h_getxattrs (fs, object, name, NULL, 0); CID 1220067: Missing parentheses (CONSTANT_EXPRESSION_RESULT) "!size == -1" is always false regardless of the values of its operands. 
Did you intend to either negate the entire comparison expression, in which case parentheses would be required around the entire comparison expression to force that interpretation, or negate the sense of the comparison (that is, use '!=' rather than '==')? This occurs as the logical operand of if. 1158 if (!size == -1) { 1159
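The two CONSTANT_EXPRESSION_RESULT defects are a C precedence trap: `!` binds tighter than `==`, so `!size == -1` parses as `(!size) == -1`, which can never be true, leaving the error branch dead. A minimal illustration follows; the helper names are invented for the example, and the only assumption carried over from the report is that the glfs_* getxattr calls return -1 on failure:

```c
#include <assert.h>

/* Buggy form from snapview-server.c, written the way the compiler
 * actually parses it: (!size) is always 0 or 1, never -1, so this
 * returns 0 for every input and getxattr failures go unreported. */
static int error_detected_buggy(long size)
{
    return (!size) == -1;   /* how the compiler parses `!size == -1` */
}

/* Intended check: treat a -1 return as failure. */
static int error_detected_fixed(long size)
{
    return size == -1;
}
```

The one-character class of fix is either `size == -1` (drop the `!`) or `!(size == -1)` with the branch sense inverted, exactly as the Coverity message suggests.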
Re: [Gluster-devel] [Gluster-users] Need testers for GlusterFS 3.4.4
On 03/06/2014, at 9:05 PM, Ben Turner wrote: >> From: "Justin Clift" >> Sent: Thursday, May 29, 2014 6:12:40 PM >> Excellent Ben! Please send feedback to gluster-devel. :) > > So far so good on 3.4.4, sorry for the delay here. I had to fix my > downstream test suites to run outside of RHS / downstream gluster. I did > basic sanity testing on glusterfs mounts including: > > FSSANITY_TEST_LIST: arequal bonnie glusterfs_build compile_kernel dbench dd > ffsb fileop fsx fs_mark iozone locks ltp multiple_files posix_compliance > postmark read_large rpc syscallbench tiobench > > I am starting on NFS now, I'll have results tonight or tomorrow morning. > I'll look at updating the component scripts to work and run them as well. Thanks Ben. :) + Justin -- Open Source and Standards @ Red Hat twitter.com/realjustinclift
Re: [Gluster-devel] [Gluster-users] [RFC] GlusterFS Operations Guide
This is a really good initiative Lala. Anything that helps Operations folks always gets my vote :) I've added a few items to the etherpad. Cheers, PC - Original Message - > From: "Lalatendu Mohanty" > To: gluster-us...@gluster.org, gluster-devel@gluster.org > Sent: Friday, 30 May, 2014 11:33:35 PM > Subject: Re: [Gluster-users] [Gluster-devel] [RFC] GlusterFS Operations Guide > On 05/30/2014 04:50 PM, Lalatendu Mohanty wrote: > > I think it is time to create an operations/ops guide for GlusterFS. > > Operations guide should address issues which administrators face while > > running/maintaining GlusterFS storage nodes. Openstack project has an > > operations guide [2] which try to address similar issues and it is pretty > > cool. > > > IMO these are the typical/example questions which operations guide should > > try > > to address. > > > * Maintenance, Failures, and Debugging > > > > * What are steps for planned maintenance for GlusterFS node? > > > > > > * Steps for replacing a failed node? > > > > > > * Steps to decommission a brick? > > > > > * Logging and Monitoring > > > > * Where are the log files? > > > > > > * How to find-out if self-heal is working properly? > > > > > > * Which log files to monitor for detecting failures? > > > > > Operating guide needs good amount of work, hence we all need to come > > together > > for this. You can contribute for this by either of the following > > > * Let know others about the questions you want to get answered in the > > operating guide. ( I have set-up a etherpad for this [1]) > > > * Answer the questions/issues raised by others. > > > Comments, suggestions? > > > Should this be part of gluster code base i.e. /doc or somewhere else? 
> > > [1] http://titanpad.com/op-guide > > > [2] http://docs.openstack.org/ops/oreilly-openstack-ops-guide.pdf > > > Thanks, > > > Lala > > > #lalatenduM on freenode > > > ___ > > > Gluster-devel mailing list Gluster-devel@gluster.org > > http://supercolony.gluster.org/mailman/listinfo/gluster-devel > > Resending after fixing few typos. > ___ > Gluster-users mailing list > gluster-us...@gluster.org > http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-devel] All builds are failing with BUILD ERROR
On 06/03/2014 04:42 PM, Pranith Kumar Karampuri wrote: Guys it's failing again with the same error: Please proceed with configuring, compiling, and installing. rm: cannot remove `/build/install/var/run/gluster/patchy': Device or resource busy + RET=1 + '[' 1 '!=' 0 ']' + VERDICT='BUILD FAILURE' Has someone changed the way builds are cleaned up? Recent regressions are now failing with ... Running automake... Running autogen.sh in argp-standalone ... Please proceed with configuring, compiling, and installing. configure: error: source directory already configured; run "make distclean" there first + RET=1 + '[' 1 '!=' 0 ']' + VERDICT='BUILD FAILURE' ... I looked and found that /var/lib/jenkins/jobs/regression/workspace contained what looked like a _previous_ successful regression build. I manually cleaned the directory (moved it and created a new workspace dir actually) and now a regression is running. What's going on? !!! -- Kaleb
Re: [Gluster-devel] [Gluster-users] Need testers for GlusterFS 3.4.4
On 06/04/2014 01:35 AM, Ben Turner wrote: - Original Message - From: "Justin Clift" To: "Ben Turner" Cc: "James" , gluster-us...@gluster.org, "Gluster Devel" Sent: Thursday, May 29, 2014 6:12:40 PM Subject: Re: [Gluster-users] [Gluster-devel] Need testers for GlusterFS 3.4.4 On 29/05/2014, at 8:04 PM, Ben Turner wrote: From: "James" Sent: Wednesday, May 28, 2014 5:21:21 PM On Wed, May 28, 2014 at 5:02 PM, Justin Clift wrote: Hi all, Are there any Community members around who can test the GlusterFS 3.4.4 beta (rpms are available)? I've provided all the tools and how-to to do this yourself. Should probably take about ~20 min. Old example: https://ttboj.wordpress.com/2014/01/16/testing-glusterfs-during-glusterfest/ Same process should work, except base your testing on the latest vagrant article: https://ttboj.wordpress.com/2014/05/13/vagrant-on-fedora-with-libvirt-reprise/ If you haven't set it up already. I can help out here, I'll have a chance to run through some stuff this weekend. Where should I post feedback? Excellent Ben! Please send feedback to gluster-devel. :) So far so good on 3.4.4, sorry for the delay here. I had to fix my downstream test suites to run outside of RHS / downstream gluster. I did basic sanity testing on glusterfs mounts including: FSSANITY_TEST_LIST: arequal bonnie glusterfs_build compile_kernel dbench dd ffsb fileop fsx fs_mark iozone locks ltp multiple_files posix_compliance postmark read_large rpc syscallbench tiobench I am starting on NFS now, I'll have results tonight or tomorrow morning. I'll look updating the component scripts to work and run them as well. Thanks a lot for this ben. Justin, Ben, Do you think we can automate running of these scripts without a lot of human intervention? If yes, how can I help? We can use that just before making any release in future :-). 
Pranith -b + Justin -- Open Source and Standards @ Red Hat twitter.com/realjustinclift
Re: [Gluster-devel] autodelete in snapshots
- Original Message - > From: "M S Vishwanath Bhat" > To: "Rajesh Joseph" > Cc: "Vijay Bellur" , "Seema Naik" , > "Gluster Devel" > > Sent: Tuesday, June 3, 2014 5:55:27 PM > Subject: Re: [Gluster-devel] autodelete in snapshots > > On 3 June 2014 15:21, Rajesh Joseph wrote: > > > > > > > - Original Message - > > From: "M S Vishwanath Bhat" > > To: "Vijay Bellur" > > Cc: "Seema Naik" , "Gluster Devel" < > > gluster-devel@gluster.org> > > Sent: Tuesday, June 3, 2014 1:02:08 AM > > Subject: Re: [Gluster-devel] autodelete in snapshots > > > > > > > > > > On 2 June 2014 20:22, Vijay Bellur < vbel...@redhat.com > wrote: > > > > > > > > On 04/23/2014 05:50 AM, Vijay Bellur wrote: > > > > > > On 04/20/2014 11:42 PM, Lalatendu Mohanty wrote: > > > > > > On 04/16/2014 11:39 AM, Avra Sengupta wrote: > > > > > > The whole purpose of introducing the soft-limit is, that at any point > > of time the number of > > snaps should not exceed the hard limit. If we trigger auto-delete on > > hitting hard-limit, then > > the purpose itself is lost, because at that point we would be taking a > > snap, making the limit > > hard-limit + 1, and then triggering auto-delete, which violates the > > sanctity of the hard-limit. > > Also what happens when we are at hard-limit + 1, and another snap is > > issued, while auto-delete > > is yet to process the first delete. At that point we end up at > > hard-limit + 1. Also what happens > > if for a particular snap the auto-delete fails. > > > > We should see the hard-limit, as something set by the admin keeping in > > mind the resource consumption > > and at no-point should we cross this limit, come what may. If we hit > > this limit, the create command > > should fail asking the user to delete snaps using the "snapshot > > delete" command. > > > > The two options Raghavendra mentioned are applicable for the > > soft-limit only, in which cases on > > hitting the soft-limit > > > > 1. Trigger auto-delete > > > > or > > > > 2. 
Log a warning-message, for the user saying the number of snaps is > > exceeding the snap-limit and > > display the number of available snaps > > > > Now which of these should happen also depends on the user, because the > > auto-delete option > > is configurable. > > > > So if the auto-delete option is set as true, auto-delete should be > > triggered and the above message > > should also be logged. > > > > But if the option is set as false, only the message should be logged. > > > > This is the behaviour as designed. Adding Rahul, and Seema in the > > mail, to reflect upon the > > behaviour as well. > > > > Regards, > > Avra > > > > This sounds correct. However we need to make sure that the usage or > > documentation around this should be good enough , so that users > > understand the each of the limits correctly. > > > > > > It might be better to avoid the usage of the term "soft-limit". > > soft-limit as used in quota and other places generally has an alerting > > connotation. Something like "auto-deletion-limit" might be better. > > > > > > I still see references to "soft-limit" and auto deletion seems to get > > triggered upon reaching soft-limit. > > > > Why is the ability to auto delete not configurable? It does seem pretty > > nasty to go about deleting snapshots without obtaining explicit consent > > from the user. > > > > I agree with Vijay here. It's not good to delete a snap (even though it is > > oldest) without the explicit consent from user. > > > > FYI It took me more than 2 weeks to figure out that my snaps were getting > > autodeleted after reaching "soft-limit". For all I know I had not done > > anything and my snap restore were failing. > > > > I propose to remove the terms "soft" and "hard" limit. I believe there > > should be a limit (just "limit") after which all snapshot creates should > > fail with proper error messages. And there can be a water-mark after which > > user should get warning messages. So below is my proposal. 
> > > > auto-delete + snap-limit: If the snap-limit is set to n , next snap create > > (n+1th) will succeed only if if auto-delete is set to on/true/1 and oldest > > snap will get deleted automatically. If autodelete is set to off/false/0 , > > (n+1)th snap create will fail with proper error message from gluster CLI > > command. But again by default autodelete should be off. > > > > snap-water-mark : This should come in picture only if autodelete is turned > > off. It should not have any meaning if auto-delete is turned ON. Basically > > it's usage is to give the user warning that limit almost being reached and > > it is time for admin to decide which snaps should be deleted (or which > > should be kept) > > > > *my two cents* > > > > -MS > > > > > > The reason for having a hard-limit is to stop snapshot creation once we > > reached this limit. This helps to have a control over the resource > > consumption. Therefore if we only have this limit (as snap-limit) then > > there is no question of auto-delet
[Gluster-devel] Regarding doing away with refkeeper in locks xlator
Hi, Recently there was a crash in locks translator (BZ 1103347, BZ 1097102) with the following backtrace: (gdb) bt #0 uuid_unpack (in=0x8 , uu=0x7fffea6c6a60) at ../../contrib/uuid/unpack.c:44 #1 0x7feeba9e19d6 in uuid_unparse_x (uu=, out=0x2350fc0 "081bbc7a-7551-44ac-85c7-aad5e2633db9", fmt=0x7feebaa08e00 "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x") at ../../contrib/uuid/unparse.c:55 #2 0x7feeba9be837 in uuid_utoa (uuid=0x8 ) at common-utils.c:2138 #3 0x7feeb06e8a58 in pl_inodelk_log_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at inodelk.c:396 #4 pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at inodelk.c:428 #5 0x7feeb06ddf3a in pl_client_disconnect_cbk (this=0x230d910, client=) at posix.c:2550 #6 0x7feeba9fa2dd in gf_client_disconnect (client=0x27724a0) at client_t.c:368 #7 0x7feeab77ed48 in server_connection_cleanup (this=0x2316390, client=0x27724a0, flags=) at server-helpers.c:354 #8 0x7feeab77ae2c in server_rpc_notify (rpc=, xl=0x2316390, event=, data=0x2bf51c0) at server.c:527 #9 0x7feeba775155 in rpcsvc_handle_disconnect (svc=0x2325980, trans=0x2bf51c0) at rpcsvc.c:720 #10 0x7feeba776c30 in rpcsvc_notify (trans=0x2bf51c0, mydata=, event=, data=0x2bf51c0) at rpcsvc.c:758 #11 0x7feeba778638 in rpc_transport_notify (this=, event=, data=) at rpc-transport.c:512 #12 0x7feeb115e971 in socket_event_poll_err (fd=, idx=, data=0x2bf51c0, poll_in=, poll_out=0, poll_err=0) at socket.c:1071 #13 socket_event_handler (fd=, idx=, data=0x2bf51c0, poll_in=, poll_out=0, poll_err=0) at socket.c:2240 #14 0x7feeba9fc6a7 in event_dispatch_epoll_handler (event_pool=0x22e2d00) at event-epoll.c:384 #15 event_dispatch_epoll (event_pool=0x22e2d00) at event-epoll.c:445 #16 0x00407e93 in main (argc=19, argv=0x7fffea6c7f88) at glusterfsd.c:2023 (gdb) f 4 #4 pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at inodelk.c:428 428 pl_inodelk_log_cleanup (l); (gdb) p l->pl_inode->refkeeper $1 = (inode_t *) 0x0 (gdb) pl_inode->refkeeper was found to be NULL 
even when there were some blocked inodelks in a certain domain of the inode, which when dereferenced by the epoll thread in the cleanup codepath led to a crash. On inspecting the code (for want of a consistent reproducer), three things were found: 1. The function where the crash happens (pl_inodelk_log_cleanup()) makes an attempt to resolve the inode to a path as can be seen below. But the way inode_path() itself works is to first construct the path based on the given inode's ancestry and place it in the buffer provided. And if all else fails, the gfid of the inode is placed in a certain format (""). This eliminates the need for statements from line 4 through 7 below, thereby "preventing" dereferencing of pl_inode->refkeeper. Now, although this change prevents the crash altogether, it still does not fix the race that led to pl_inode->refkeeper becoming NULL, and comes at the cost of printing "(null)" in the log message on line 9 every time pl_inode->refkeeper is found to be NULL, rendering the logged messages somewhat useless.

0   pl_inode = lock->pl_inode;
1
2   inode_path (pl_inode->refkeeper, NULL, &path);
3
4   if (path)
5           file = path;
6   else
7           file = uuid_utoa (pl_inode->refkeeper->gfid);
8
9   gf_log (THIS->name, GF_LOG_WARNING,
10          "releasing lock on %s held by "
11          "{client=%p, pid=%"PRId64" lk-owner=%s}",
12          file, lock->client, (uint64_t) lock->client_pid,
13          lkowner_utoa (&lock->owner));

2. There is at least one codepath found that can lead to this crash: Imagine an inode on which an inodelk operation is attempted by a client and is successfully granted too. Now, between the time the lock was granted and pl_update_refkeeper() was called by this thread, the client could send a DISCONNECT event, causing the cleanup codepath to be executed, where the epoll thread crashes on dereferencing pl_inode->refkeeper which is STILL NULL at this point.
Besides, there are still places in the locks xlator where the refkeeper is NOT updated whenever the lists are modified - for instance in the cleanup codepath from a DISCONNECT. 3. Also, pl_update_refkeeper() does not seem to take into account blocked locks on the inode in the __pl_inode_is_empty() check, when it should, as there could still be cases where the granted list is empty but not the blocked list at the time of updating the refkeeper, in which case pl_inode must still take a ref on the inode. Proposed solution to 2/3: 1. Do away with refkeeper in pl_inode_t altogether. 2. Let every lock object (inodelk/entrylk/posix_lock) have an inode_t * member to act as a placeholder for the associated inode object on which it is locking. 3. Let each lock object hold a ref on the inode at the time of its creation and unref the inode during its destruction. Please let me know what you think of the above. -Krutika
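The proposed lifecycle - take a ref when a lock object is created, drop it when the lock is destroyed - can be sketched as below. inode_ref()/inode_unref() are real libglusterfs calls, but the structures here are deliberately simplified stand-ins, not the actual inode_t or inodelk/entrylk/posix_lock types:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the libglusterfs inode and its refcounting. */
typedef struct inode { int refcount; } inode_t;

static inode_t *inode_ref(inode_t *inode)   { inode->refcount++; return inode; }
static void     inode_unref(inode_t *inode) { inode->refcount--; }

/* Condensed stand-in for a locks-xlator lock object with the proposed
 * inode_t * placeholder member replacing pl_inode->refkeeper. */
typedef struct pl_lock {
    inode_t *inode;
} pl_lock_t;

static void pl_lock_init(pl_lock_t *lock, inode_t *inode)
{
    /* ref taken at creation: the inode stays pinned for as long as any
     * granted OR blocked lock exists, closing the refkeeper race. */
    lock->inode = inode_ref(inode);
}

static void pl_lock_destroy(pl_lock_t *lock)
{
    inode_unref(lock->inode);   /* unref at destruction */
    lock->inode = NULL;
}
```

Because each lock pins the inode for its whole lifetime, there is no window in which the DISCONNECT cleanup path can observe an unreferenced inode, and no shared refkeeper left to keep in sync with the granted and blocked lists.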
Re: [Gluster-devel] All builds are failing with BUILD ERROR
And since doing this, regression runs seem to be proceeding without issues.

On 06/04/2014 09:59 AM, Kaleb KEITHLEY wrote:

On 06/03/2014 04:42 PM, Pranith Kumar Karampuri wrote:

Guys, it's failing again with the same error:

Please proceed with configuring, compiling, and installing.
rm: cannot remove `/build/install/var/run/gluster/patchy': Device or resource busy
+ RET=1
+ '[' 1 '!=' 0 ']'
+ VERDICT='BUILD FAILURE'

Has someone changed the way builds are cleaned up?

Recent regressions are now failing with ...

Running automake...
Running autogen.sh in argp-standalone ...
Please proceed with configuring, compiling, and installing.
configure: error: source directory already configured; run "make distclean" there first
+ RET=1
+ '[' 1 '!=' 0 ']'
+ VERDICT='BUILD FAILURE'
...

I looked and found that /var/lib/jenkins/jobs/regression/workspace contained what looked like a _previous_ successful regression build. I manually cleaned the directory (moved it and created a new workspace dir, actually) and now a regression is running.

What's going on? !!!

--
Kaleb

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
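The manual fix Kaleb describes (moving the stale workspace aside and creating a fresh one) could be automated as a pre-build step. A rough sketch, assuming a POSIX shell and the workspace path from the mail above; clean_workspace is a hypothetical helper, not part of the actual Jenkins job:

```shell
#!/bin/sh
# Hypothetical pre-build cleanup for the regression job. Moves any
# leftover workspace aside rather than deleting it, so the state of a
# previous (possibly failed) run can still be inspected, then starts
# the build from an empty directory.
clean_workspace() {
        ws=$1
        if [ -d "$ws" ]; then
                mv "$ws" "$ws.prev.$$"
        fi
        mkdir -p "$ws"
}

# Example invocation with the path from Kaleb's mail:
# clean_workspace /var/lib/jenkins/jobs/regression/workspace
```

Moving rather than removing also sidesteps the "Device or resource busy" failure from Pranith's mail when something inside the old tree is still mounted or in use.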
Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator
On 06/04/2014 11:37 AM, Krutika Dhananjay wrote:

Hi,

Recently there was a crash in locks translator (BZ 1103347, BZ 1097102) with the following backtrace:

(gdb) bt
#0  uuid_unpack (in=0x8 <Address 0x8 out of bounds>, uu=0x7fffea6c6a60) at ../../contrib/uuid/unpack.c:44
#1  0x7feeba9e19d6 in uuid_unparse_x (uu=<value optimized out>, out=0x2350fc0 "081bbc7a-7551-44ac-85c7-aad5e2633db9", fmt=0x7feebaa08e00 "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x") at ../../contrib/uuid/unparse.c:55
#2  0x7feeba9be837 in uuid_utoa (uuid=0x8 <Address 0x8 out of bounds>) at common-utils.c:2138
#3  0x7feeb06e8a58 in pl_inodelk_log_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at inodelk.c:396
#4  pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at inodelk.c:428
#5  0x7feeb06ddf3a in pl_client_disconnect_cbk (this=0x230d910, client=<value optimized out>) at posix.c:2550
#6  0x7feeba9fa2dd in gf_client_disconnect (client=0x27724a0) at client_t.c:368
#7  0x7feeab77ed48 in server_connection_cleanup (this=0x2316390, client=0x27724a0, flags=<value optimized out>) at server-helpers.c:354
#8  0x7feeab77ae2c in server_rpc_notify (rpc=<value optimized out>, xl=0x2316390, event=<value optimized out>, data=0x2bf51c0) at server.c:527
#9  0x7feeba775155 in rpcsvc_handle_disconnect (svc=0x2325980, trans=0x2bf51c0) at rpcsvc.c:720
#10 0x7feeba776c30 in rpcsvc_notify (trans=0x2bf51c0, mydata=<value optimized out>, event=<value optimized out>, data=0x2bf51c0) at rpcsvc.c:758
#11 0x7feeba778638 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:512
#12 0x7feeb115e971 in socket_event_poll_err (fd=<value optimized out>, idx=<value optimized out>, data=0x2bf51c0, poll_in=<value optimized out>, poll_out=0, poll_err=0) at socket.c:1071
#13 socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x2bf51c0, poll_in=<value optimized out>, poll_out=0, poll_err=0) at socket.c:2240
#14 0x7feeba9fc6a7 in event_dispatch_epoll_handler (event_pool=0x22e2d00) at event-epoll.c:384
#15 event_dispatch_epoll (event_pool=0x22e2d00) at event-epoll.c:445
#16 0x00407e93 in main (argc=19, argv=0x7fffea6c7f88) at glusterfsd.c:2023
(gdb) f 4
#4  pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at inodelk.c:428
428             pl_inodelk_log_cleanup (l);
(gdb) p l->pl_inode->refkeeper
$1 = (inode_t *) 0x0
(gdb)

pl_inode->refkeeper was found to be NULL even when there were some blocked inodelks in a certain domain of the inode, which, when dereferenced by the epoll thread in the cleanup codepath, led to a crash.

On inspecting the code (for want of a consistent reproducer), three things were found:

1. The function where the crash happens, pl_inodelk_log_cleanup(), makes an attempt to resolve the inode to a path, as can be seen below. But the way inode_path() itself works is to first construct the path based on the given inode's ancestry and place it in the buffer provided; if all else fails, the gfid of the inode is placed there in a certain format (""). This eliminates the need for the statements from line 4 through 7 below, thereby "preventing" the dereferencing of pl_inode->refkeeper. Although this change prevents the crash altogether, it still does not fix the race that led to pl_inode->refkeeper becoming NULL, and it comes at the cost of printing "(null)" in the log message on line 9 every time pl_inode->refkeeper is found to be NULL, rendering the logged messages somewhat useless.

<code>
 0        pl_inode = lock->pl_inode;
 1
 2        inode_path (pl_inode->refkeeper, NULL, &path);
 3
 4        if (path)
 5                file = path;
 6        else
 7                file = uuid_utoa (pl_inode->refkeeper->gfid);
 8
 9        gf_log (THIS->name, GF_LOG_WARNING,
10                "releasing lock on %s held by "
11                "{client=%p, pid=%"PRId64" lk-owner=%s}",
12                file, lock->client, (uint64_t) lock->client_pid,
13                lkowner_utoa (&lock->owner));
</code>

I think this logging code is from the days when the gfid handle concept was not there, so inode_path() wasn't returning anything in cases where the path is not present in the dentries. I believe the else block can be deleted safely now.

Pranith

2. There is at least one codepath found that can lead to this crash: imagine an inode on which an inodelk operation is attempted by a client and is successfully granted too.
Now, between the time the lock was granted and pl_update_refkeeper() was called by this thread, the client could send a DISCONNECT event, causing the cleanup codepath to be executed, where the epoll thread crashes on dereferencing pl_inode->refkeeper, which is STILL NULL at this point.

Besides, there are still places in the locks xlator where the refkeeper is NOT updated whenever the lists are modified, for instance in the cleanup codepath from a DISCONNECT.

3. Also, pl_update_refkeeper() seems not to take blocked locks on the inode into account in the __pl_inode_is_empty() check, when it should, as there could still be cases where the granted list is empty but the blocked list is not at the time of updating the refkeeper, in which case pl_inode must still take a ref on the inode.
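Point 3 can be illustrated with a toy emptiness check. This is a sketch under simplified assumptions: struct dom and dom_is_empty() are hypothetical stand-ins for the real per-domain lock lists and __pl_inode_is_empty() in the locks xlator.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified model of a per-domain lock list: just counters instead of
 * the real granted/blocked list heads. Illustrative names only. */
struct dom {
        size_t granted_count;
        size_t blocked_count;
};

/* The inode is only truly "empty" (and hence safe to unref) when BOTH
 * the granted and the blocked lists are empty. Checking only the
 * granted list, as described in point 3, can drop the ref while
 * blocked locks still reference the inode. */
static bool
dom_is_empty (const struct dom *dom)
{
        return dom->granted_count == 0 && dom->blocked_count == 0;
}
```

With only the granted_count check, a domain holding a blocked lock but no granted ones would be misreported as empty, which is exactly the window in which the refkeeper can be dropped prematurely.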
Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator
On 06/04/2014 12:02 PM, Pranith Kumar Karampuri wrote:

On 06/04/2014 11:37 AM, Krutika Dhananjay wrote:

Hi,

Recently there was a crash in locks translator (BZ 1103347, BZ 1097102) with the following backtrace: <snip>