Re: [Gluster-devel] glusterfsd crash when using quota without io-threads

Kinglong Mee Wed, 28 Jun 2017 05:40:28 -0700

Hi Sanoj,

Do you testing with changing the 16K to 2M (default stack size)?


If also overflowing, there must be an important issue exist;
not overflowing, only the issue of 16K is small for marker without io-threads?

thanks,
Kinglong Mee

On 6/28/2017 19:48, Sanoj Unnikrishnan wrote:
> Hi Kinglong,
> 
> I opened https://bugzilla.redhat.com/show_bug.cgi?id=1465861 to track this. 
> It seems to be an important issue that went undetected.
> Thanks for catching this!!
> 
> In my attempts to repro this , I found that on each run some random 
> structures where getting corrupted and running into segfault.
> In order to assert that the stack was indeed growing into all the allocated 
> space and beyond, I set a guard page in the end of the allocated stack space 
> (so that we hit a segfault before overusing the space).
> Below are the code changes.
> 
> @@ -443,6 +443,8 @@ synctask_create (struct syncenv *env, size_t stacksize, 
> synctask_fn_t fn,
>          struct synctask *newtask = NULL;
>          xlator_t        *this    = THIS;
>          int             destroymode = 0;
> +        int                     r=0;
> +        char                    *v;
>  
>          VALIDATE_OR_GOTO (env, err);
>          VALIDATE_OR_GOTO (fn, err);
> @@ -498,9 +500,15 @@ synctask_create (struct syncenv *env, size_t stacksize, 
> synctask_fn_t fn,
>                                              gf_common_mt_syncstack);
>                  newtask->ctx.uc_stack.ss_size = env->stacksize;
>          } else {
> -                newtask->stack = GF_CALLOC (1, stacksize,
> +               newtask->stack = GF_CALLOC (1, stacksize,
>                                              gf_common_mt_syncstack);
>                  newtask->ctx.uc_stack.ss_size = stacksize;
> +                if (stacksize == 16*1024) {
> +                        v = (unsigned long)((char *)(newtask->stack) + 4095) 
> & (~4095);
> +                        r = mprotect(v, 4096, PROT_NONE);
> +                       gf_msg ("syncop", GF_LOG_ERROR, errno,
> +                                LG_MSG_GETCONTEXT_FAILED, "SKU: using 16k 
> stack starting at %p, mprotect returned %d, guard page: %p", newtask->stack, 
> r, v);
> +               }
>          }
>  
> (gdb) where
> #0  0x00007f8a92c51204 in _dl_lookup_symbol_x () from 
> /lib64/ld-linux-x86-64.so.2
> #1  0x00007f8a92c561e3 in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
> #2  0x00007f8a92c5dd33 in _dl_runtime_resolve_avx () from 
> /lib64/ld-linux-x86-64.so.2
> #3  0x0000000000000000 in ?? ()
> 
> 
> (gdb) info reg
> 
> rdi            0x7f8a92946188    140233141412232
> rbp            0x7f8a800b4000    0x7f8a800b4000
> rsp            0x7f8a800b4000    0x7f8a800b4000
> r8             0x7f8a92e4ba60    140233146677856
> 
> (gdb) layout asm
> 
>   >│0x7f8a92c51204 <_dl_lookup_symbol_x+4>          push   %r15               
>     <== push on stack at the guarded page caused segfault
> 
> From the brick log we have,
> 
> [syncop.c:515:synctask_create] 0-syncop: SKU: using 16k stack starting at 
> 0x7f8a800b28f0, mprotect returned 0, guard page: 0x7f8a800b3000 [No data 
> available]
> 
> Stack grows downward from 0x7f8a800b68f0 to 0x7f8a800b28f0  and the page 
> 0x7f8a800b3000 - 0x7f8a800b4000 is guarded , which is where the segfault hit 
> as seen in gdb.
> 
> This confirms that the stack space is not sufficient and overflowing,
> I am not sure why we don't hit this in the presence of IO threads though, It 
> may just be that with io threads in graph, we may have some allocated and 
> unused memory which our stack freely grows into. It may just be a silent 
> undetected reuse of some memory.
> 
> Regards,
> Sanoj
> 
> 
> 
> On Tue, Jun 13, 2017 at 10:49 AM, Sanoj Unnikrishnan <[email protected] 
> <mailto:[email protected]>> wrote:
> 
>     Hi Kinglong
>     I was reading about makecontext/swapcontext as well,
>     I did find an article that suggested to use mprotect and force a segfault 
> to check if we have a stack space issue here.
>     here is the link. http://www.evanjones.ca/software/threading.html 
> <http://www.evanjones.ca/software/threading.html>.
> 
>     I don't think i can try this until tomorrow.
>     thanks and regards,
>     Sanoj
> 
>     On Tue, Jun 13, 2017 at 5:58 AM, Kinglong Mee <[email protected] 
> <mailto:[email protected]>> wrote:
> 
>         Hi Sanoj,
> 
>         What's your opinion about this problem?
> 
>         thanks,
>         Kinglong Mee
> 
>         On 6/9/2017 17:20, Kinglong Mee wrote:
>         > Hi Sanoj,
>         >
>         > On 6/9/2017 15:48, Sanoj Unnikrishnan wrote:
>         >> I have not used valgrind before, so I may be wrong here.
>         >>
>         >> I think the valgrind_stack_deregister should have been after 
> GF_FREE_STACK.
>         >> That may explain the instance of invalid_write during 
> stack_destroy calls in after.log.
>         >
>         > No. I move it, but the instance of invalid_write also exist.
>         >
>         >>
>         >> There seems to be numerous issues reported in before.log (I am 
> assuming, you did not have the valgrind_stack_register call in it),
>         >
>         > Yes, the before.log is test without any code change(but without 
> io-threads).
>         >
>         >> From http://valgrind.org/docs/manual/manual-core.html 
> <http://valgrind.org/docs/manual/manual-core.html> 
> <http://valgrind.org/docs/manual/manual-core.html 
> <http://valgrind.org/docs/manual/manual-core.html>>, looks like valgrind 
> detects client switching stack only If a memory of > 2MB change in Stack 
> pointer register.
>         >
>         > I test with a larger max-stackframe as,
>         > valgrind  --leak-check=full --max-stackframe=242293216
>         >
>         >> Is it possible that since marker is only using 16k, the stack 
> pointer could have been in less than 2MB offset from current Stack Pointer?
>         >
>         > Maybe.
>         > But with io-threads (with adding valgrind_stack_deregister), the 
> valgrind only show some
>         > "Invalid read/write" about __gf_mem_invalidate.
>         > The only reason here I think is the stack size (16K) of marker 
> using.
>         >
>         > I have not used makecontext/swapcontext before, am i right?
>         > 1. without swapconext, the stack maybe (just an example)
>         >    --> io_stats-> quota-> marker-> io-threads ->.....
>         >
>         > 2. with swapcontext,
>         >    --> io_stats-> quota-> marker
>         >                swithto new stack -> io-threads
>         >
>         > After switchto new stack, the new stack size is 16K,
>         > Does it enough without io-threads?
>         >
>         > I don't know what's the behave of io-threads, does it call all 
> sub-xlator using the 16K ? or others?
>         >
>         >> It seems unlikely to me since we are allocating the stack from 
> heap.
>         >>
>         >> Did you try a run with the valgrind instrumentation, without 
> changing stack size ?
>         >
>         > OK.
>         > The following valgrind-without-stack-change.log is test with adding 
> valgrind_stack_deregister
>         > (and without io-threads).
>         >
>         > thanks,
>         > Kinglong Mee
>         >
>         >> None of this explains the crash though.. We had seen a memory 
> overrun crash in same code path on netbsd earlier but did not follow up then.
>         >> Will look further into it.
>         >>
>         >>
>         >>
>         >> On Thu, Jun 8, 2017 at 4:51 PM, Kinglong Mee 
> <[email protected] <mailto:[email protected]> 
> <mailto:[email protected] <mailto:[email protected]>>> wrote:
>         >>
>         >>     Maybe it's my fault, I found valgrind can't parse context 
> switch(makecontext/swapcontext) by default.
>         >>     So, I test with the following patch (tells valgrind new stack 
> by VALGRIND_STACK_DEREGISTER).
>         >>     With it, only some "Invalid read/write" by 
> __gf_mem_invalidate, Does it right ??
>         >>     So, there is only one problem, if without io-threads, the 
> stack size is small for marker.
>         >>     Am I right?
>         >>
>         >>     Ps:
>         >>     valgrind-before.log is the log without the following patch, 
> the valgrind-after.log is with the patch.
>         >>
>         >>     ==35656== Invalid write of size 8
>         >>     ==35656==    at 0x4E8FFD4: __gf_mem_invalidate (mem-pool.c:278)
>         >>     ==35656==    by 0x4E90313: __gf_free (mem-pool.c:334)
>         >>     ==35656==    by 0x4EA4E5B: synctask_destroy (syncop.c:394)
>         >>     ==35656==    by 0x4EA4EDF: synctask_done (syncop.c:412)
>         >>     ==35656==    by 0x4EA58B3: synctask_switchto (syncop.c:673)
>         >>     ==35656==    by 0x4EA596B: syncenv_processor (syncop.c:704)
>         >>     ==35656==    by 0x60B2DC4: start_thread (in 
> /usr/lib64/libpthread-2.17.so <http://libpthread-2.17.so> 
> <http://libpthread-2.17.so>)
>         >>     ==35656==    by 0x67A873C: clone (in /usr/lib64/libc-2.17.so 
> <http://libc-2.17.so> <http://libc-2.17.so>)
>         >>     ==35656==  Address 0x1b104931 is 2,068,017 bytes inside a 
> block of size 2,097,224 alloc'd
>         >>     ==35656==    at 0x4C29975: calloc (vg_replace_malloc.c:711)
>         >>     ==35656==    by 0x4E8FA5E: __gf_calloc (mem-pool.c:117)
>         >>     ==35656==    by 0x4EA52F5: synctask_create (syncop.c:500)
>         >>     ==35656==    by 0x4EA55AE: synctask_new1 (syncop.c:576)
>         >>     ==35656==    by 0x143AE0D7: mq_synctask1 (marker-quota.c:1078)
>         >>     ==35656==    by 0x143AE199: mq_synctask (marker-quota.c:1097)
>         >>     ==35656==    by 0x143AE6F6: _mq_create_xattrs_txn 
> (marker-quota.c:1236)
>         >>     ==35656==    by 0x143AE82D: mq_create_xattrs_txn 
> (marker-quota.c:1253)
>         >>     ==35656==    by 0x143B0DCB: mq_inspect_directory_xattr 
> (marker-quota.c:2027)
>         >>     ==35656==    by 0x143B13A8: mq_xattr_state 
> (marker-quota.c:2117)
>         >>     ==35656==    by 0x143A6E80: marker_lookup_cbk (marker.c:2961)
>         >>     ==35656==    by 0x141811E0: up_lookup_cbk (upcall.c:753)
>         >>
>         >>     ----------------------- valgrind 
> ------------------------------------------
>         >>
>         >>     Don't forget install valgrind-devel.
>         >>
>         >>     diff --git a/libglusterfs/src/syncop.c 
> b/libglusterfs/src/syncop.c
>         >>     index 00a9b57..97b1de1 100644
>         >>     --- a/libglusterfs/src/syncop.c
>         >>     +++ b/libglusterfs/src/syncop.c
>         >>     @@ -10,6 +10,7 @@
>         >>
>         >>      #include "syncop.h"
>         >>      #include "libglusterfs-messages.h"
>         >>     +#include <valgrind/valgrind.h>
>         >>
>         >>      int
>         >>      syncopctx_setfsuid (void *uid)
>         >>     @@ -388,6 +389,8 @@ synctask_destroy (struct synctask *task)
>         >>              if (!task)
>         >>                      return;
>         >>
>         >>     +VALGRIND_STACK_DEREGISTER(task->valgrind_ret);
>         >>     +
>         >>              GF_FREE (task->stack);
>         >>
>         >>              if (task->opframe)
>         >>     @@ -509,6 +512,8 @@ synctask_create (struct syncenv *env, 
> size_t stacksize, sync
>         >>
>         >>              newtask->ctx.uc_stack.ss_sp   = newtask->stack;
>         >>
>         >>     +       newtask->valgrind_ret = 
> VALGRIND_STACK_REGISTER(newtask->stack, newtask-
>         >>     +
>         >>              makecontext (&newtask->ctx, (void (*)(void)) 
> synctask_wrap, 2, newtask)
>         >>
>         >>              newtask->state = SYNCTASK_INIT;
>         >>     diff --git a/libglusterfs/src/syncop.h 
> b/libglusterfs/src/syncop.h
>         >>     index c2387e6..247325b 100644
>         >>     --- a/libglusterfs/src/syncop.h
>         >>     +++ b/libglusterfs/src/syncop.h
>         >>     @@ -63,6 +63,7 @@ struct synctask {
>         >>              int                 woken;
>         >>              int                 slept;
>         >>              int                 ret;
>         >>     +       int                 valgrind_ret;
>         >>
>         >>              uid_t               uid;
>         >>              gid_t               gid;
>         >>     diff --git a/xlators/features/marker/src/marker-quota.c 
> b/xlators/features/marke
>         >>     index 902b8e5..f3d2507 100644
>         >>     --- a/xlators/features/marker/src/marker-quota.c
>         >>     +++ b/xlators/features/marker/src/marker-quota.c
>         >>     @@ -1075,7 +1075,7 @@ mq_synctask1 (xlator_t *this, 
> synctask_fn_t task, gf_boole
>         >>              }
>         >>
>         >>              if (spawn) {
>         >>     -                ret = synctask_new1 (this->ctx->env, 1024 * 
> 16, task,
>         >>     +                ret = synctask_new1 (this->ctx->env, 0, task,
>         >>                                            mq_synctask_cleanup, 
> NULL, args);
>         >>                      if (ret) {
>         >>                              gf_log (this->name, GF_LOG_ERROR, 
> "Failed to spawn "
>         >>
>         >>
>         >>     On 6/8/2017 19:02, Sanoj Unnikrishnan wrote:
>         >>     > I would still be worried about the Invalid read/write. IMO 
> whether an illegal access causes a crash depends on whether the page is 
> currently mapped.
>         >>     > So, it could so happen that there is a use after free / use 
> outside of bounds happening in the code  and  it turns out that this location 
> gets mapped in a different (unmapped) page when IO threads is not loaded.
>         >>     >
>         >>     > Could you please share the valgrind logs as well.
>         >>     >
>         >>     > On Wed, Jun 7, 2017 at 8:22 PM, Kinglong Mee 
> <[email protected] <mailto:[email protected]> 
> <mailto:[email protected] <mailto:[email protected]>> 
> <mailto:[email protected] <mailto:[email protected]> 
> <mailto:[email protected] <mailto:[email protected]>>>> wrote:
>         >>     >
>         >>     >     After deleting io-threads from the vols, quota operates 
> (list/set/modify) lets glusterfsd crash.
>         >>     >     I use it at CentOS 7 (CentOS Linux release 7.3.1611) 
> with glusterfs 3.8.12.
>         >>     >     It seems the stack corrupt, when testing with the 
> following diff, glusterfsd runs correctly.
>         >>     >
>         >>     >     There are two questions as,
>         >>     >     1. When using valgrind, it shows there are many "Invalid 
> read/write" when with io-threads.
>         >>     >        Why glusterfsd runs correctly with io-threads? but 
> crash without io-threads?
>         >>     >
>         >>     >     2. With the following diff, valgrind also shows many 
> "Invalid read/write" when without io-threads?
>         >>     >        but no any crash.
>         >>     >
>         >>     >     Any comments are welcome.
>         >>     >
>         >>     >     Revert http://review.gluster.org/11499 
> <http://review.gluster.org/11499> <http://review.gluster.org/11499 
> <http://review.gluster.org/11499>> <http://review.gluster.org/11499 
> <http://review.gluster.org/11499> <http://review.gluster.org/11499 
> <http://review.gluster.org/11499>>> seems better than the diff.
>         >>     >
>         >>     >     diff --git a/xlators/features/marker/src/marker-quota.c 
> b/xlators/features/marke
>         >>     >     index 902b8e5..f3d2507 100644
>         >>     >     --- a/xlators/features/marker/src/marker-quota.c
>         >>     >     +++ b/xlators/features/marker/src/marker-quota.c
>         >>     >     @@ -1075,7 +1075,7 @@ mq_synctask1 (xlator_t *this, 
> synctask_fn_t task, gf_boole
>         >>     >              }
>         >>     >
>         >>     >              if (spawn) {
>         >>     >     -                ret = synctask_new1 (this->ctx->env, 
> 1024 * 16, task,
>         >>     >     +                ret = synctask_new1 (this->ctx->env, 0, 
> task,
>         >>     >                                            
> mq_synctask_cleanup, NULL, args);
>         >>     >                      if (ret) {
>         >>     >                              gf_log (this->name, 
> GF_LOG_ERROR, "Failed to spawn "
>         >>     >
>         >>     >     -----------------------------------test steps 
> ----------------------------------
>         >>     >     1. gluster volume create gvtest node1:/test/ node2:/test/
>         >>     >     2. gluster volume start gvtest
>         >>     >     3. gluster volume quota enable gvtest
>         >>     >
>         >>     >     4. "deletes io-threads from all vols"
>         >>     >     5. reboot node1 and node2.
>         >>     >     6. sh quota-set.sh
>         >>     >
>         >>     >     # cat quota-set.sh
>         >>     >     gluster volume quota gvtest list
>         >>     >     gluster volume quota gvtest limit-usage / 10GB
>         >>     >     gluster volume quota gvtest limit-usage /1234 1GB
>         >>     >     gluster volume quota gvtest limit-usage /hello 1GB
>         >>     >     gluster volume quota gvtest limit-usage /test 1GB
>         >>     >     gluster volume quota gvtest limit-usage /xyz 1GB
>         >>     >     gluster volume quota gvtest list
>         >>     >     gluster volume quota gvtest remove /hello
>         >>     >     gluster volume quota gvtest remove /test
>         >>     >     gluster volume quota gvtest list
>         >>     >     gluster volume quota gvtest limit-usage /test 1GB
>         >>     >     gluster volume quota gvtest remove /xyz
>         >>     >     gluster volume quota gvtest list
>         >>     >
>         >>     >     -----------------------glusterfsd crash without the 
> diff--------------------------------
>         >>     >
>         >>     >     
> /usr/local/lib/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xf5)[0x7f6e1e950af1]
>         >>     >     
> /usr/local/lib/libglusterfs.so.0(gf_print_trace+0x21f)[0x7f6e1e956943]
>         >>     >     
> /usr/local/sbin/glusterfsd(glusterfsd_print_trace+0x1f)[0x409c83]
>         >>     >     /lib64/libc.so.6(+0x35250)[0x7f6e1d025250]
>         >>     >     /lib64/libc.so.6(gsignal+0x37)[0x7f6e1d0251d7]
>         >>     >     /lib64/libc.so.6(abort+0x148)[0x7f6e1d0268c8]
>         >>     >     /lib64/libc.so.6(+0x74f07)[0x7f6e1d064f07]
>         >>     >     /lib64/libc.so.6(+0x7baf5)[0x7f6e1d06baf5]
>         >>     >     /lib64/libc.so.6(+0x7c3e6)[0x7f6e1d06c3e6]
>         >>     >     
> /usr/local/lib/libglusterfs.so.0(__gf_free+0x311)[0x7f6e1e981327]
>         >>     >     
> /usr/local/lib/libglusterfs.so.0(synctask_destroy+0x82)[0x7f6e1e995c20]
>         >>     >     
> /usr/local/lib/libglusterfs.so.0(synctask_done+0x25)[0x7f6e1e995c47]
>         >>     >     
> /usr/local/lib/libglusterfs.so.0(synctask_switchto+0xcf)[0x7f6e1e996585]
>         >>     >     
> /usr/local/lib/libglusterfs.so.0(syncenv_processor+0x60)[0x7f6e1e99663d]
>         >>     >     /lib64/libpthread.so.0(+0x7dc5)[0x7f6e1d7a2dc5]
>         >>     >     /lib64/libc.so.6(clone+0x6d)[0x7f6e1d0e773d]
>         >>     >
>         >>     >     or
>         >>     >
>         >>     >     package-string: glusterfs 3.8.12
>         >>     >     
> /usr/local/lib/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xf5)[0x7fa15e623af1]
>         >>     >     
> /usr/local/lib/libglusterfs.so.0(gf_print_trace+0x21f)[0x7fa15e629943]
>         >>     >     
> /usr/local/sbin/glusterfsd(glusterfsd_print_trace+0x1f)[0x409c83]
>         >>     >     /lib64/libc.so.6(+0x35250)[0x7fa15ccf8250]
>         >>     >     /lib64/libc.so.6(gsignal+0x37)[0x7fa15ccf81d7]
>         >>     >     /lib64/libc.so.6(abort+0x148)[0x7fa15ccf98c8]
>         >>     >     /lib64/libc.so.6(+0x74f07)[0x7fa15cd37f07]
>         >>     >     /lib64/libc.so.6(+0x7dd4d)[0x7fa15cd40d4d]
>         >>     >     /lib64/libc.so.6(__libc_calloc+0xb4)[0x7fa15cd43a14]
>         >>     >     
> /usr/local/lib/libglusterfs.so.0(__gf_calloc+0xa7)[0x7fa15e653a5f]
>         >>     >     
> /usr/local/lib/libglusterfs.so.0(iobref_new+0x2b)[0x7fa15e65875a]
>         >>     >     
> /usr/local/lib/glusterfs/3.8.12/rpc-transport/socket.so(+0xa98c)[0x7fa153a8398c]
>         >>     >     
> /usr/local/lib/glusterfs/3.8.12/rpc-transport/socket.so(+0xacbc)[0x7fa153a83cbc]
>         >>     >     
> /usr/local/lib/glusterfs/3.8.12/rpc-transport/socket.so(+0xad10)[0x7fa153a83d10]
>         >>     >     
> /usr/local/lib/glusterfs/3.8.12/rpc-transport/socket.so(+0xb2a7)[0x7fa153a842a7]
>         >>     >     
> /usr/local/lib/libglusterfs.so.0(+0x97ea9)[0x7fa15e68eea9]
>         >>     >     
> /usr/local/lib/libglusterfs.so.0(+0x982c6)[0x7fa15e68f2c6]
>         >>     >     /lib64/libpthread.so.0(+0x7dc5)[0x7fa15d475dc5]
>         >>     >     /lib64/libc.so.6(clone+0x6d)[0x7fa15cdba73d]
>         >>     >
>         >>     >     _______________________________________________
>         >>     >     Gluster-devel mailing list
>         >>     >     [email protected] 
> <mailto:[email protected]> <mailto:[email protected] 
> <mailto:[email protected]>> <mailto:[email protected] 
> <mailto:[email protected]> <mailto:[email protected] 
> <mailto:[email protected]>>>
>         >>     >     http://lists.gluster.org/mailman/listinfo/gluster-devel 
> <http://lists.gluster.org/mailman/listinfo/gluster-devel> 
> <http://lists.gluster.org/mailman/listinfo/gluster-devel 
> <http://lists.gluster.org/mailman/listinfo/gluster-devel>> 
> <http://lists.gluster.org/mailman/listinfo/gluster-devel 
> <http://lists.gluster.org/mailman/listinfo/gluster-devel> 
> <http://lists.gluster.org/mailman/listinfo/gluster-devel 
> <http://lists.gluster.org/mailman/listinfo/gluster-devel>>>
>         >>     >
>         >>     >
>         >>
>         >>
>         >
> 
> 
> 
_______________________________________________
Gluster-devel mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] glusterfsd crash when using quota without io-threads

Reply via email to