The decision of doing away with -O2 is beyond our control, and we shouldn't have code which depend on optimization to be disabled to behave properly. Representing -errno as return is the cleanest fix (that's how other projects which use setcontext/getcontext are behaving too.) If there are any new further issues which arise from setcontext/getcontext, I'm tempted to change the internal implementation to use a a vanilla pthread pool.
Avati On Wed, Dec 11, 2013 at 10:29 PM, Anand Subramanian <ansub...@redhat.com>wrote: > Is doing away with -O2 an option that was ever considered or is it that we > simply must have O2 on? > > (I understand that turning off O2 can open some so-far-unexposed can of > worms and a lot of soaking maybe required, and also that we may have had a > good set of perf related reasons to have settled on -O2 in the first place, > but wanted to understand nevertheless...) > > Anand > > > > On 12/11/2013 02:21 PM, Pranith Kumar Karampuri wrote: > >> hi, >> We found a day-1 bug when syncop_xxx() infra is used inside a >> synctask with compilation optimization (CFLAGS -O2). This bug has been >> dormant for at least 2 years. >> There are around ~400(rebalance, replace-brick, bd, self-heal-daemon, >> quota, fuse lock/fd migration) places where syncop is used in the code base >> all of which are potential candidates which can take the hit. >> >> I sent first round of patch at http://review.gluster.com/6475 to catch >> regressions upstream. >> These are the files that are affected by the changes I introduced to fix >> this: >> >> api/src/glfs-fops.c | 36 >> ++++++++++++++++++++++++++++++++++ >> api/src/glfs-handleops.c | 15 ++++++++++++++ >> api/src/glfs-internal.h | 7 +++++++ >> api/src/glfs-resolve.c | 10 ++++++++++ >> libglusterfs/src/syncop.c | 117 >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> ++++++++++++------------------------------------- >> xlators/cluster/afr/src/afr-self-heald.c | 45 >> +++++++++++++++++++++++++++++------------- >> xlators/cluster/afr/src/pump.c | 12 ++++++++++-- >> xlators/cluster/dht/src/dht-helper.c | 24 >> +++++++++++++++-------- >> xlators/cluster/dht/src/dht-rebalance.c | 168 >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> ++++++++++++++++++++++++++++++++++-------------------------- >> ------------------------------------- >> xlators/cluster/dht/src/dht-selfheal.c | 6 ++++-- >> xlators/features/locks/src/posix.c | 3 ++- >> xlators/features/qemu-block/src/bdrv-xlator.c | 15 ++++---------- >> xlators/features/qemu-block/src/qb-coroutines.c | 14 ++++++++++---- >> xlators/mount/fuse/src/fuse-bridge.c | 16 ++++++++++----- >> >> Please review your respective component for these changes in gerrit. >> >> Thanks >> Pranith. >> >> Detailed explanation of the Root cause: >> We found the bug in 'gf_defrag_migrate_data' in rebalance operation: >> >> Lets look at interesting parts of the function: >> >> int >> gf_defrag_migrate_data (xlator_t *this, gf_defrag_info_t *defrag, loc_t >> *loc, >> dict_t *migrate_data) >> { >> ..... >> code section - [ Loop ] >> while ((ret = syncop_readdirp (this, fd, 131072, offset, NULL, >> &entries)) != 0) { >> ..... >> code section - [ ERRNO-1 ] (errno of readdirp is stored in >> readdir_operrno by a thread) >> /* Need to keep track of ENOENT errno, that means, there >> is no >> need to send more readdirp() */ >> readdir_operrno = errno; >> ..... >> code section - [ SYNCOP-1 ] (syncop_getxattr is called by a thread) >> ret = syncop_getxattr (this, &entry_loc, &dict, >> GF_XATTR_LINKINFO_KEY); >> code section - [ ERRNO-2] (checking for failures of syncop_getxattr(). >> This may not always be executed in same thread which executed [SYNCOP-1]) >> if (ret < 0) { >> if (errno != ENODATA) { >> loglevel = GF_LOG_ERROR; >> defrag->total_failures += 1; >> ..... >> } >> >> the function above could be executed by thread(t1) till [SYNCOP-1] and >> code from [ERRNO-2] can be executed by a different thread(t2) because of >> the way syncop-infra schedules the tasks. >> >> when the code is compiled with -O2 optimization this is the assembly code >> that is generated: >> [ERRNO-1] >> 1165 readdir_operrno = errno; <<---- errno gets >> expanded as *(__errno_location()) >> 0x00007fd149d48b60 <+496>: callq 0x7fd149d410c0 >> <__errno_location@plt> >> 0x00007fd149d48b72 <+514>: mov %rax,0x50(%rsp) <<------ >> Address returned by __errno_location() is stored in a special location in >> stack for later use. >> 0x00007fd149d48b77 <+519>: mov (%rax),%eax >> 0x00007fd149d48b79 <+521>: mov %eax,0x78(%rsp) >> .... >> [ERRNO-2] >> 1281 if (errno != ENODATA) { >> 0x00007fd149d492ae <+2366>: mov 0x50(%rsp),%rax <<----- >> Because it already stored the address returned by __errno_location(), it >> just dereferences the address to get the errno value. BUT THIS CODE NEED >> NOT BE EXECUTED BY SAME THREAD!!! >> 0x00007fd149d492b3 <+2371>: mov $0x9,%ebp >> 0x00007fd149d492b8 <+2376>: mov (%rax),%edi >> 0x00007fd149d492ba <+2378>: cmp $0x3d,%edi >> >> The problem is that __errno_location() value of t1 and t2 are different. >> So [ERRNO-2] ends up reading errno of t1 instead of errno of t2 even though >> t2 is executing [ERRNO-2] code section. >> >> When code is compiled without any optimization for [ERRNO-2]: >> 1281 if (errno != ENODATA) { >> 0x00007fd58e7a326f <+2237>: callq 0x7fd58e797300 >> <__errno_location@plt><<--- As it is calling __errno_location() again it >> gets the location from t2 so it works as intended. >> 0x00007fd58e7a3274 <+2242>: mov (%rax),%eax >> 0x00007fd58e7a3276 <+2244>: cmp $0x3d,%eax >> 0x00007fd58e7a3279 <+2247>: je 0x7fd58e7a32a1 >> <gf_defrag_migrate_data+2287> >> >> Fix: >> We decided to make syncop_xxx() return (-errno) value as the return value >> in case of errors and all the functions which make syncop_xxx() will need >> to use (-ret) to figure out the reason for failure in case of syncop_xxx() >> failures. >> >> Pranith >> >> _______________________________________________ >> Gluster-devel mailing list >> Gluster-devel@nongnu.org >> https://lists.nongnu.org/mailman/listinfo/gluster-devel >> > > > _______________________________________________ > Gluster-devel mailing list > Gluster-devel@nongnu.org > https://lists.nongnu.org/mailman/listinfo/gluster-devel >
_______________________________________________ Gluster-devel mailing list Gluster-devel@nongnu.org https://lists.nongnu.org/mailman/listinfo/gluster-devel