So... do we need to spin up new packages for existing releases? ----- Original Message ----- > hi Xavier, > > > The str_uuid pointer used can point to the uuid of another thread that > > could have been modified. I don't know very well the internals of > > syncops and synctask, but I think it's even possible that another task > > running on the same thread while the current one was sleeping could > > modify the contents of str_uuid. > > > > This case seems very obvious and useless but maybe other combinations > > are harder to detect. > > > > I think that currently they are only used by uuid_utoa() and > > lkowner_utoa() to write log messages, which seems safe, but we will have > > to be careful when using syncops to avoid possible future problems like > > this. > > Probably the reason why I never considered these functions as a danger is for > some reason > I always thought they are supposed to be used in logging functions. After > your > response I went through the commits and indeed they are introduced just for > logging. > > Here is the snippet from the bug > (https://bugzilla.redhat.com/show_bug.cgi?id=GLUSTER-2308) > as part of which this is introduced: > > "This function would take in a uuid_t and return a static char buffer. > This would be useful in logging the uuid_t. We already make use of > uuid_unparse() provided by libuuid to convert uuid_t to string but that > takes an additional out buffer parameter, does not return the out > buffer and hence cannot be used directly within logging functions like > gf_log (). This necessitates uuid_unparse () to be done even if we are > not going to log the uuid (due to the log message being at a lower log > level) and hence it does have a performance impact. In all such > instances, let us get rid of uuid_unparse() and use uuid_utoa() instead. > > Prototype :-> char * uuid_utoa (uuid_t uuid)" > > lkowner_utoa is implemented much later for similar usage. > > Both these functions have re-entrant variants, lkowner_utoa_r() and > uuid_utoa_r(). > > So while the case that you showed for uuid_utoa() exposes the bug, I am not > sure if it is supposed to be used in that way to begin with. > Even in gf_log() if we want to log more than 1 uuid we should not be using > this function because it ends up logging same uuid twice. > > Pranith. > > > Xavi > > > > El 11/12/13 12:02, Pranith Kumar Karampuri ha escrit: > > >> it's really a very interesting and hard to find bug, but I'm wondering > > >> how we can prevent this to happen in the general case. There might be > > >> other operations based on pointers to thread local storage that will > > >> suffer this problem. Probably 'errno' is one of the most dangerous, but > > >> TLS is also used to resolve 'THIS' in a very similar way to errno, and > > >> there are temporary uuid and lkowner values also stored into TLS. More > > >> things could appear in the future. These are also potential cases to > > >> have problems with syncops. > > > > > >> An access to THIS before and after a syncop might trigger this bug, > > >> right > > >> ? > > > for errno it happens because it has the attribute __attribute__((const)). > > > > > > # grep errno_location /usr/include/bits/errno.h > > > extern int *__errno_location (void) __THROW __attribute__ ((__const__)); > > > # define errno (*__errno_location ()) > > > > > > (The following article explains it best IMO: > > > http://lwn.net/Articles/285332) > > > > > > But it is not there for lkowner, uuid-buf, THIS. Just to confirm, I added > > > the following code > > > ------------------------------------------------------ > > > + xlator_t *tmp_this = NULL; > > > > > > gf_log (this->name, GF_LOG_INFO, "migrate data called on %s", > > > loc->path); > > > @@ -1258,8 +1259,10 @@ gf_defrag_migrate_data (xlator_t *this, > > > gf_defrag_info_t *defrag, loc_t *loc, > > > * and also that guarantees that file has to be > > > mostly > > > * migrated */ > > > > > > + tmp_this = THIS; > > > ret = syncop_getxattr (this, &entry_loc, &dict, > > > GF_XATTR_LINKINFO_KEY); > > > + tmp_this = THIS; > > > if (ret < 0) { > > > if (errno != ENODATA) { > > > loglevel = GF_LOG_ERROR; > > > @@ -1267,7 +1270,7 @@ gf_defrag_migrate_data (xlator_t *this, > > > gf_defrag_info_t *defrag, loc_t *loc, > > > } else { > > > loglevel = GF_LOG_TRACE; > > > } > > > - gf_log (this->name, loglevel, "%s: > > > failed > > > to " > > > + gf_log (tmp_this->name, loglevel, "%s: > > > failed to " > > > ----------------------------------------------------- > > > assembly code it generated: > > > ===================================================== > > > 1262 tmp_this = THIS; > > > 0x00007faf17120dbc <+1788>: xor %eax,%eax > > > 0x00007faf17120dbe <+1790>: callq 0x7faf171191d0 > > > <__glusterfs_this_location@plt> <<----------------first call to THIS > > > > > > 1263 ret = syncop_getxattr (this, &entry_loc, > > > &dict, > > > 0x00007faf17120dc3 <+1795>: mov 0x60(%rsp),%rdx > > > 0x00007faf17120dc8 <+1800>: mov 0x28(%rsp),%rsi > > > 0x00007faf17120dcd <+1805>: lea 0x2b550(%rip),%rcx # > > > 0x7faf1714c324 > > > 0x00007faf17120dd4 <+1812>: mov 0x18(%rsp),%rdi > > > 0x00007faf17120dd9 <+1817>: callq 0x7faf17118a30 > > > <syncop_getxattr@plt> > > > 0x00007faf17120dde <+1822>: mov %eax,%ebp > > > > > > 1264 > > > GF_XATTR_LINKINFO_KEY); > > > ---Type <return> to continue, or q <return> to quit--- > > > 1265 tmp_this = THIS; > > > 0x00007faf17120de0 <+1824>: xor %eax,%eax > > > 0x00007faf17120de2 <+1826>: callq 0x7faf171191d0 > > > <__glusterfs_this_location@plt> <<----------------second call to THIS > > > 0x00007faf17120de9 <+1833>: mov (%rax),%r15 > > > > > > 1266 if (ret < 0) { > > > 0x00007faf17120de7 <+1831>: test %ebp,%ebp > > > 0x00007faf17120dec <+1836>: js 0x7faf1712100e > > > <gf_defrag_migrate_data+2382> > > > > > > 1267 if (errno != ENODATA) { > > > 0x00007faf1712100e <+2382>: mov 0x50(%rsp),%rax > > > <<<---------------------------------------------check that errno does > > > not callq __errno_location > > > 0x00007faf17121013 <+2387>: mov $0x9,%ebp > > > 0x00007faf17121018 <+2392>: mov (%rax),%edi > > > 0x00007faf1712101a <+2394>: cmp $0x3d,%edi > > > > > > ================================================ > > > > > >> I think it's very difficult to track all these cases and handle them > > >> correctly. Another solution could be to tell the compiler to forget > > >> previous pointer values when a syncop is called, but I don't see any way > > >> to do so. Since it's storing pointers to values, any barrier or volatile > > >> seems useless. > > > We can do this by putting __attribute__((noinline)) for such functions. > > > But > > > since errno is not in our control we can't do that. > > > > > >> Xavi > > >> > > >> > > >> El 11/12/13 09:51, Pranith Kumar Karampuri ha escrit: > > >>> hi, > > >>> We found a day-1 bug when syncop_xxx() infra is used inside a > > >>> synctask > > >>> with compilation optimization (CFLAGS -O2). This bug has been > > >>> dormant > > >>> for at least 2 years. > > >>> There are around ~400(rebalance, replace-brick, bd, self-heal-daemon, > > >>> quota, fuse lock/fd migration) places where syncop is used in the code > > >>> base all of which are potential candidates which can take the hit. > > >>> > > >>> I sent first round of patch at http://review.gluster.com/6475 to catch > > >>> regressions upstream. > > >>> These are the files that are affected by the changes I introduced to > > >>> fix > > >>> this: > > >>> > > >>> api/src/glfs-fops.c | 36 > > >>> ++++++++++++++++++++++++++++++++++ > > >>> api/src/glfs-handleops.c | 15 ++++++++++++++ > > >>> api/src/glfs-internal.h | 7 +++++++ > > >>> api/src/glfs-resolve.c | 10 ++++++++++ > > >>> libglusterfs/src/syncop.c | 117 > > >>> > > >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------------------------- > > >>> xlators/cluster/afr/src/afr-self-heald.c | 45 > > >>> +++++++++++++++++++++++++++++------------- > > >>> xlators/cluster/afr/src/pump.c | 12 ++++++++++-- > > >>> xlators/cluster/dht/src/dht-helper.c | 24 > > >>> +++++++++++++++-------- > > >>> xlators/cluster/dht/src/dht-rebalance.c | 168 > > >>> > > >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------------------------------------------- > > >>> xlators/cluster/dht/src/dht-selfheal.c | 6 ++++-- > > >>> xlators/features/locks/src/posix.c | 3 ++- > > >>> xlators/features/qemu-block/src/bdrv-xlator.c | 15 ++++---------- > > >>> xlators/features/qemu-block/src/qb-coroutines.c | 14 ++++++++++---- > > >>> xlators/mount/fuse/src/fuse-bridge.c | 16 > > >>> ++++++++++----- > > >>> > > >>> Please review your respective component for these changes in gerrit. > > >>> > > >>> Thanks > > >>> Pranith. > > >>> > > >>> Detailed explanation of the Root cause: > > >>> We found the bug in 'gf_defrag_migrate_data' in rebalance operation: > > >>> > > >>> Lets look at interesting parts of the function: > > >>> > > >>> int > > >>> gf_defrag_migrate_data (xlator_t *this, gf_defrag_info_t *defrag, loc_t > > >>> *loc, > > >>> dict_t *migrate_data) > > >>> { > > >>> ..... > > >>> code section - [ Loop ] > > >>> while ((ret = syncop_readdirp (this, fd, 131072, offset, > > >>> NULL, > > >>> &entries)) != 0) { > > >>> ..... > > >>> code section - [ ERRNO-1 ] (errno of readdirp is stored in > > >>> readdir_operrno > > >>> by a thread) > > >>> /* Need to keep track of ENOENT errno, that means, > > >>> there > > >>> is no > > >>> need to send more readdirp() */ > > >>> readdir_operrno = errno; > > >>> ..... > > >>> code section - [ SYNCOP-1 ] (syncop_getxattr is called by a thread) > > >>> ret = syncop_getxattr (this, &entry_loc, > > >>> &dict, > > >>> GF_XATTR_LINKINFO_KEY); > > >>> code section - [ ERRNO-2] (checking for failures of > > >>> syncop_getxattr(). > > >>> This may not always be executed in same thread which executed > > >>> [SYNCOP-1]) > > >>> if (ret < 0) { > > >>> if (errno != ENODATA) { > > >>> loglevel = GF_LOG_ERROR; > > >>> defrag->total_failures += 1; > > >>> ..... > > >>> } > > >>> > > >>> the function above could be executed by thread(t1) till [SYNCOP-1] and > > >>> code > > >>> from [ERRNO-2] can be executed by a different thread(t2) because of the > > >>> way syncop-infra schedules the tasks. > > >>> > > >>> when the code is compiled with -O2 optimization this is the assembly > > >>> code > > >>> that is generated: > > >>> [ERRNO-1] > > >>> 1165 readdir_operrno = errno; <<---- errno gets > > >>> expanded as *(__errno_location()) > > >>> 0x00007fd149d48b60 <+496>: callq 0x7fd149d410c0 > > >>> <__errno_location@plt> > > >>> 0x00007fd149d48b72 <+514>: mov %rax,0x50(%rsp) <<------ > > >>> Address returned by __errno_location() is stored in a special > > >>> location > > >>> in stack for later use. > > >>> 0x00007fd149d48b77 <+519>: mov (%rax),%eax > > >>> 0x00007fd149d48b79 <+521>: mov %eax,0x78(%rsp) > > >>> .... > > >>> [ERRNO-2] > > >>> 1281 if (errno != ENODATA) { > > >>> 0x00007fd149d492ae <+2366>: mov 0x50(%rsp),%rax <<----- > > >>> Because it already stored the address returned by > > >>> __errno_location(), > > >>> it just dereferences the address to get the errno value. BUT THIS > > >>> CODE > > >>> NEED NOT BE EXECUTED BY SAME THREAD!!! > > >>> 0x00007fd149d492b3 <+2371>: mov $0x9,%ebp > > >>> 0x00007fd149d492b8 <+2376>: mov (%rax),%edi > > >>> 0x00007fd149d492ba <+2378>: cmp $0x3d,%edi > > >>> > > >>> The problem is that __errno_location() value of t1 and t2 are > > >>> different. > > >>> So > > >>> [ERRNO-2] ends up reading errno of t1 instead of errno of t2 even > > >>> though > > >>> t2 is executing [ERRNO-2] code section. > > >>> > > >>> When code is compiled without any optimization for [ERRNO-2]: > > >>> 1281 if (errno != ENODATA) { > > >>> 0x00007fd58e7a326f <+2237>: callq 0x7fd58e797300 > > >>> <__errno_location@plt><<--- As it is calling __errno_location() > > >>> again > > >>> it gets the location from t2 so it works as intended. > > >>> 0x00007fd58e7a3274 <+2242>: mov (%rax),%eax > > >>> 0x00007fd58e7a3276 <+2244>: cmp $0x3d,%eax > > >>> 0x00007fd58e7a3279 <+2247>: je 0x7fd58e7a32a1 > > >>> <gf_defrag_migrate_data+2287> > > >>> > > >>> Fix: > > >>> We decided to make syncop_xxx() return (-errno) value as the return > > >>> value > > >>> in case of errors and all the functions which make syncop_xxx() will > > >>> need > > >>> to use (-ret) to figure out the reason for failure in case of > > >>> syncop_xxx() > > >>> failures. > > >>> > > >>> Pranith > > >>> > > >>> _______________________________________________ > > >>> Gluster-devel mailing list > > >>> Gluster-devel@nongnu.org > > >>> https://lists.nongnu.org/mailman/listinfo/gluster-devel > > >> > > > _______________________________________________ > > > Gluster-devel mailing list > > > Gluster-devel@nongnu.org > > > https://lists.nongnu.org/mailman/listinfo/gluster-devel > > > > > > _______________________________________________ > Gluster-devel mailing list > Gluster-devel@nongnu.org > https://lists.nongnu.org/mailman/listinfo/gluster-devel >
_______________________________________________ Gluster-devel mailing list Gluster-devel@nongnu.org https://lists.nongnu.org/mailman/listinfo/gluster-devel