Re: [Xenomai] Possible Xenomai fuse filesystem/registry queue status files issue?
On 04/01/2018 07:28 PM, Steve Freyder wrote:

Greetings again.

As I understand it, for each rt_queue there's supposed to be a "status file" located in the fuse filesystem underneath the "/run/xenomai/user/session/pid/alchemy/queues" directory, with the file name being the queue name. This used to contain very useful info about queue status, message counts, etc. I don't know when it broke or whether it's something I'm doing wrong, but I'm now getting a "memory exhausted" message on the console when I attempt to do a "cat" on the status file.

Here's a small C program that just creates a queue, and then does a pause to hold the accessor count non-zero.
The resulting output (logged in via the system console):

    # sh qtest.sh
    + sleep 1
    + ./qc --mem-pool-size=64M --session=mysession foo
    + find /run/xenomai
    /run/xenomai
    /run/xenomai/root
    /run/xenomai/root/mysession
    /run/xenomai/root/mysession/821
    /run/xenomai/root/mysession/821/alchemy
    /run/xenomai/root/mysession/821/alchemy/tasks
    /run/xenomai/root/mysession/821/alchemy/tasks/task@1[821]
    /run/xenomai/root/mysession/821/alchemy/queues
    /run/xenomai/root/mysession/821/alchemy/queues/foo
    /run/xenomai/root/mysession/system
    /run/xenomai/root/mysession/system/threads
    /run/xenomai/root/mysession/system/heaps
    /run/xenomai/root/mysession/system/version
    + qfile='/run/xenomai/*/*/*/alchemy/queues/foo'
    + cat /run/xenomai/root/mysession/821/alchemy/queues/foo
    memory exhausted

At this point, it hangs, although SIGINT usually terminates it. I've seen some cases where SIGINT won't terminate it, and a reboot is required to clean things up. I see this message appears to be logged in the obstack error handler. I don't think I'm running out of memory, which makes me think "heap corruption". Not much of an analysis! I did try varying queue sizes and max message counts - no change.

On 04/02/2018, Philippe Gerum wrote:

I can't reproduce this. I would suspect a rampant memory corruption too, although running the test code over valgrind (mercury build) did not reveal any issue.

- which Xenomai version are you using?
- cobalt / mercury?
- do you enable the shared heap when configuring? (--enable-pshared)

Steve Freyder replied:

I'm using Cobalt. uname -a reports:

    Linux sdftest 4.1.18_C01571-15S00-00.000.zimg+83fdace666 #2 SMP Fri Mar 9 11:07:52 CST 2018 armv7l GNU/Linux

Here is the config dump:

    CONFIG_XENO_PSHARED=1

Philippe Gerum:

Any chance you could have some leftover files in /dev/shm from aborted runs, which would steal RAM?

Steve Freyder:

I've been rebooting before each test run, but I'll keep that in mind for future testing. Sounds like I need to try rolling back to an older build; I have a 3.0.5 and a 3.0.3 build handy.
Philippe Gerum:

The standalone test should work with the shared heap disabled; could you check it against a build configured with --disable-pshared? Thanks,

Steve Freyder (04/09/2018):

Philippe,

Sorry for the delay - our vendor had been doing all of our kernel and SDK builds, so I had to do a lot of learning to get this all going.

With the --disable-pshared in effect:

    /.g3l # ./qc --dump-config | grep SHARED
    based on Xenomai/cobalt v3.0.6 -- #6e34bb5 (2018-04-01 10:50:59 +0200)
    CONFIG_XENO_PSHARED is OFF

    /.g3l # ./qc foo &
    /.g3l # find /run/xenomai/
    /run/xenomai/
    /run/xenomai/root
    /run/xenomai/root/opus
    /run/xenomai/root/opus/3477
    /run/xenomai/root/opus/3477/alchemy
    /run/xenomai/root/opus/3477/alchemy/tasks
    /run/xenomai/root/opus/3477/alchemy/tasks/qcreate3477
    /run/xenomai/root/opus/3477/alchemy/queues
    /run/xenomai/root/opus/3477/alchemy/queues/foo
    /run/xenomai/root/opus/system
    /run/xenomai/root/opus/system/threads
    /run/xenomai/root/opus/system/heaps
    /run/xenomai/root/opus/system/version
    root@ICB-G3L:/.g3l # cat run/xenomai/root/opus/3477/alchemy/queues/foo
    [TYPE]  [TOTALMEM]  [USEDMEM]  [QLIMIT]  [MCOUNT]
    FIFO          5344        3248        10        0

Perfect! What's the next step?

Philippe Gerum (04/12/2018):

I can reproduce this issue. I'm on it.

Philippe Gerum (04/12/2018, later):

The patch below should solve the problem for the registry; however, this may have uncovered a bug in the "tlsf" allocator (once again), which should not have failed allocating memory. Two separate issues, then.

    diff --git a/include/copperplate/registry-obstack.h b/include/copperplate/registry-obstack.h
    index fe192faf7..48e453bc3 100644
    --- a/include/copperplate/registry-obstack.h
    +++ b/include/copperplate/registry-obstack.h
    @@ -29,11 +29,12 @@ struct threadobj;
     struct syncobj;

     /*
    - * Assume we may want fast allocation of private memory from real-time
    - * mode when growing the obstack.
    + * Obstacks are grown from handlers called by the fusefs server
    + * thread, which has no real-time requirement: malloc/free is fine for
    + * memory management.
      */
    -#define obstack_chunk_allocp
>>> I've seen some cases where SIGINT won't terminate it, and a reboot is
>>> required to clean things up. I see this message appears to be logged
>>> in the obstack error handler. I don't think I'm running out of memory,
>>> which makes me think "heap corruption". Not much of an analysis! I did
>>> try varying queue sizes and max message counts - no change.
>>>
>> I can't reproduce this. I would suspect a rampant memory corruption too,
>> although running the test code over valgrind (mercury build) did not
>> reveal any issue.
>>
>> - which Xenomai version are you using?
>> - cobalt / mercury ?
>> - do you enable the shared heap when configuring ? (--enable-pshared)
>>
> I'm using Cobalt. uname -a reports:
>
> Linux sdftest 4.1.18_C01571-15S00-00.000.zimg+83fdace666 #2 SMP Fri Mar
> 9 11:07:52 CST 2018 armv7l GNU/Linux
>
> Here is the config dump:
>
> CONFIG_XENO_PSHARED=1

Any chance you could have some leftover files in /dev/shm from aborted
runs, which would steal RAM?

>>> I've been rebooting before each test run, but I'll keep that in mind for
>>> future testing.
>>>
>>> Sounds like I need to try rolling back to an older build, I have a 3.0.5
>>> and a 3.0.3 build handy.
>>
>> The standalone test should work with the shared heap disabled, could you
>> check it against a build configured with --disable-pshared? Thanks,
>>
> Philippe,
>
> Sorry for the delay - our vendor had been doing all of our kernel and SDK
> builds so I had to do a lot of learning to get this all going.
>
> With the --disable-pshared in effect:
>
> /.g3l # ./qc --dump-config | grep SHARED
> based on Xenomai/cobalt v3.0.6 -- #6e34bb5 (2018-04-01 10:50:59 +0200)
> CONFIG_XENO_PSHARED is OFF
>
> /.g3l # ./qc foo &
> /.g3l # find /run/xenomai/
> /run/xenomai/
> /run/xenomai/root
> /run/xenomai/root/opus
> /run/xenomai/root/opus/3477
> /run/xenomai/root/opus/3477/alchemy
> /run/xenomai/root/opus/3477/alchemy/tasks
> /run/xenomai/root/opus/3477/alchemy/tasks/qcreate3477
> /run/xenomai/root/opus/3477/alchemy/queues
> /run/xenomai/root/opus/3477/alchemy/queues/foo
> /run/xenomai/root/opus/system
> /run/xenomai/root/opus/system/threads
> /run/xenomai/root/opus/system/heaps
> /run/xenomai/root/opus/system/version
> root@ICB-G3L:/.g3l # cat run/xenomai/root/opus/3477/alchemy/queues/foo
> [TYPE] [TOTALMEM] [USEDMEM] [QLIMIT] [MCOUNT]
> FIFO   5344       3248       10       0
>
> Perfect!
>
> What's the next step?

We need to get to the bottom of this issue, because we just can't
release 3.0.7 with a bug in the pshared allocator. I could not reproduce
this bug last time I tried using the test snippet, but I did not have
your full config settings then, so I need to redo the whole test using
the same configuration.
Re: [Xenomai] Possible Xenomai fuse filesystem/registry queue status files issue?
On 4/2/2018 11:51 AM, Philippe Gerum wrote: On 04/02/2018 06:11 PM, Steve Freyder wrote: On 4/2/2018 10:20 AM, Philippe Gerum wrote: On 04/02/2018 04:54 PM, Steve Freyder wrote: On 4/2/2018 8:41 AM, Philippe Gerum wrote: On 04/01/2018 07:28 PM, Steve Freyder wrote: Greetings again. As I understand it, for each rt_queue there's supposed to be a "status file" located in the fuse filesystem underneath the "/run/xenomai/user/session/pid/alchemy/queues" directory, with the file name being the queue name. This used to contain very useful info about queue status, message counts, etc. I don't know when it broke or whether it's something I'm doing wrong but I'm now getting a "memory exhausted" message on the console when I attempt to do a "cat" on the status file. Here's a small C program that just creates a queue, and then does a pause to hold the accessor count non-zero. The resulting output (logged in via the system console): # sh qtest.sh + sleep 1 + ./qc --mem-pool-size=64M --session=mysession foo + find /run/xenomai /run/xenomai /run/xenomai/root /run/xenomai/root/mysession /run/xenomai/root/mysession/821 /run/xenomai/root/mysession/821/alchemy /run/xenomai/root/mysession/821/alchemy/tasks /run/xenomai/root/mysession/821/alchemy/tasks/task@1[821] /run/xenomai/root/mysession/821/alchemy/queues /run/xenomai/root/mysession/821/alchemy/queues/foo /run/xenomai/root/mysession/system /run/xenomai/root/mysession/system/threads /run/xenomai/root/mysession/system/heaps /run/xenomai/root/mysession/system/version + qfile='/run/xenomai/*/*/*/alchemy/queues/foo' + cat /run/xenomai/root/mysession/821/alchemy/queues/foo memory exhausted At this point, it hangs, although SIGINT usually terminates it. I've seen some cases where SIGINT won't terminate it, and a reboot is required to clean things up. I see this message appears to be logged in the obstack error handler. I don't think I'm running out of memory, which makes me think "heap corruption". Not much of an analysis! 
I did try varying queue sizes and max message counts - no change.

I can't reproduce this. I would suspect a rampant memory corruption too, although running the test code over valgrind (mercury build) did not reveal any issue.

- which Xenomai version are you using?
- cobalt / mercury ?
- do you enable the shared heap when configuring ? (--enable-pshared)

I'm using Cobalt. uname -a reports:

Linux sdftest 4.1.18_C01571-15S00-00.000.zimg+83fdace666 #2 SMP Fri Mar 9 11:07:52 CST 2018 armv7l GNU/Linux

Here is the config dump:

CONFIG_XENO_PSHARED=1

Any chance you could have some leftover files in /dev/shm from aborted runs, which would steal RAM?

I've been rebooting before each test run, but I'll keep that in mind for future testing.

Sounds like I need to try rolling back to an older build, I have a 3.0.5 and a 3.0.3 build handy.

The standalone test should work with the shared heap disabled, could you check it against a build configured with --disable-pshared? Thanks,

Philippe,

Sorry for the delay - our vendor had been doing all of our kernel and SDK builds so I had to do a lot of learning to get this all going.

With the --disable-pshared in effect:

/.g3l # ./qc --dump-config | grep SHARED
based on Xenomai/cobalt v3.0.6 -- #6e34bb5 (2018-04-01 10:50:59 +0200)
CONFIG_XENO_PSHARED is OFF

/.g3l # ./qc foo &
/.g3l # find /run/xenomai/
/run/xenomai/
/run/xenomai/root
/run/xenomai/root/opus
/run/xenomai/root/opus/3477
/run/xenomai/root/opus/3477/alchemy
/run/xenomai/root/opus/3477/alchemy/tasks
/run/xenomai/root/opus/3477/alchemy/tasks/qcreate3477
/run/xenomai/root/opus/3477/alchemy/queues
/run/xenomai/root/opus/3477/alchemy/queues/foo
/run/xenomai/root/opus/system
/run/xenomai/root/opus/system/threads
/run/xenomai/root/opus/system/heaps
/run/xenomai/root/opus/system/version
root@ICB-G3L:/.g3l # cat run/xenomai/root/opus/3477/alchemy/queues/foo
[TYPE] [TOTALMEM] [USEDMEM] [QLIMIT] [MCOUNT]
FIFO   5344       3248       10       0

Perfect!

What's the next step?
Best,
Steve
Re: [Xenomai] Possible Xenomai fuse filesystem/registry queue status files issue?
On 04/02/2018 06:11 PM, Steve Freyder wrote: > On 4/2/2018 10:20 AM, Philippe Gerum wrote: >> On 04/02/2018 04:54 PM, Steve Freyder wrote: >>> On 4/2/2018 8:41 AM, Philippe Gerum wrote: On 04/01/2018 07:28 PM, Steve Freyder wrote: > Greetings again. > > As I understand it, for each rt_queue there's supposed to be a > "status file" located in the fuse filesystem underneath the > "/run/xenomai/user/session/pid/alchemy/queues" directory, with > the file name being the queue name. This used to contain very > useful info about queue status, message counts, etc. I don't know > when it broke or whether it's something I'm doing wrong but I'm > now getting a "memory exhausted" message on the console when I > attempt to do a "cat" on the status file. > > Here's a small C program that just creates a queue, and then does > a pause to hold the accessor count non-zero. > > The resulting output (logged in via the system console): > > # sh qtest.sh > + sleep 1 > + ./qc --mem-pool-size=64M --session=mysession foo > + find /run/xenomai > /run/xenomai > /run/xenomai/root > /run/xenomai/root/mysession > /run/xenomai/root/mysession/821 > /run/xenomai/root/mysession/821/alchemy > /run/xenomai/root/mysession/821/alchemy/tasks > /run/xenomai/root/mysession/821/alchemy/tasks/task@1[821] > /run/xenomai/root/mysession/821/alchemy/queues > /run/xenomai/root/mysession/821/alchemy/queues/foo > /run/xenomai/root/mysession/system > /run/xenomai/root/mysession/system/threads > /run/xenomai/root/mysession/system/heaps > /run/xenomai/root/mysession/system/version > + qfile='/run/xenomai/*/*/*/alchemy/queues/foo' > + cat /run/xenomai/root/mysession/821/alchemy/queues/foo > memory exhausted > > At this point, it hangs, although SIGINT usually terminates it. > > I've seen some cases where SIGINT won't terminate it, and a reboot is > required to clean things up. I see this message appears to be logged > in the obstack error handler. 
> I don't think I'm running out of memory,
> which makes me think "heap corruption". Not much of an analysis! I did
> try varying queue sizes and max message counts - no change.
>

I can't reproduce this. I would suspect a rampant memory corruption too,
although running the test code over valgrind (mercury build) did not
reveal any issue.

- which Xenomai version are you using?
- cobalt / mercury ?
- do you enable the shared heap when configuring ? (--enable-pshared)

>>> I'm using Cobalt. uname -a reports:
>>>
>>> Linux sdftest 4.1.18_C01571-15S00-00.000.zimg+83fdace666 #2 SMP Fri Mar
>>> 9 11:07:52 CST 2018 armv7l GNU/Linux
>>>
>>> Here is the config dump:
>>>
>>> CONFIG_XENO_PSHARED=1
>> Any chance you could have some leftover files in /dev/shm from aborted
>> runs, which would steal RAM?
>>
> I've been rebooting before each test run, but I'll keep that in mind for
> future testing.
>
> Sounds like I need to try rolling back to an older build, I have a 3.0.5
> and a 3.0.3 build handy.

The standalone test should work with the shared heap disabled, could you
check it against a build configured with --disable-pshared? Thanks,

--
Philippe.
Re: [Xenomai] Possible Xenomai fuse filesystem/registry queue status files issue?
On 4/2/2018 10:20 AM, Philippe Gerum wrote: On 04/02/2018 04:54 PM, Steve Freyder wrote: On 4/2/2018 8:41 AM, Philippe Gerum wrote: On 04/01/2018 07:28 PM, Steve Freyder wrote: Greetings again. As I understand it, for each rt_queue there's supposed to be a "status file" located in the fuse filesystem underneath the "/run/xenomai/user/session/pid/alchemy/queues" directory, with the file name being the queue name. This used to contain very useful info about queue status, message counts, etc. I don't know when it broke or whether it's something I'm doing wrong but I'm now getting a "memory exhausted" message on the console when I attempt to do a "cat" on the status file. Here's a small C program that just creates a queue, and then does a pause to hold the accessor count non-zero. The resulting output (logged in via the system console): # sh qtest.sh + sleep 1 + ./qc --mem-pool-size=64M --session=mysession foo + find /run/xenomai /run/xenomai /run/xenomai/root /run/xenomai/root/mysession /run/xenomai/root/mysession/821 /run/xenomai/root/mysession/821/alchemy /run/xenomai/root/mysession/821/alchemy/tasks /run/xenomai/root/mysession/821/alchemy/tasks/task@1[821] /run/xenomai/root/mysession/821/alchemy/queues /run/xenomai/root/mysession/821/alchemy/queues/foo /run/xenomai/root/mysession/system /run/xenomai/root/mysession/system/threads /run/xenomai/root/mysession/system/heaps /run/xenomai/root/mysession/system/version + qfile='/run/xenomai/*/*/*/alchemy/queues/foo' + cat /run/xenomai/root/mysession/821/alchemy/queues/foo memory exhausted At this point, it hangs, although SIGINT usually terminates it. I've seen some cases where SIGINT won't terminate it, and a reboot is required to clean things up. I see this message appears to be logged in the obstack error handler. I don't think I'm running out of memory, which makes me think "heap corruption". Not much of an analysis! I did try varying queue sizes and max message counts - no change. I can't reproduce this. 
I would suspect a rampant memory corruption too, although running the test code over valgrind (mercury build) did not reveal any issue. - which Xenomai version are you using? - cobalt / mercury ? - do you enable the shared heap when configuring ? (--enable-pshared) I'm using Cobalt. uname -a reports: Linux sdftest 4.1.18_C01571-15S00-00.000.zimg+83fdace666 #2 SMP Fri Mar 9 11:07:52 CST 2018 armv7l GNU/Linux Here is the config dump: CONFIG_XENO_PSHARED=1 Any chance you could have some leftover files in /dev/shm from aborted runs, which would steal RAM? I've been rebooting before each test run, but I'll keep that in mind for future testing. Sounds like I need to try rolling back to an older build, I have a 3.0.5 and a 3.0.3 build handy.
Re: [Xenomai] Possible Xenomai fuse filesystem/registry queue status files issue?
On 04/02/2018 04:54 PM, Steve Freyder wrote: > On 4/2/2018 8:41 AM, Philippe Gerum wrote: >> On 04/01/2018 07:28 PM, Steve Freyder wrote: >>> Greetings again. >>> >>> As I understand it, for each rt_queue there's supposed to be a >>> "status file" located in the fuse filesystem underneath the >>> "/run/xenomai/user/session/pid/alchemy/queues" directory, with >>> the file name being the queue name. This used to contain very >>> useful info about queue status, message counts, etc. I don't know >>> when it broke or whether it's something I'm doing wrong but I'm >>> now getting a "memory exhausted" message on the console when I >>> attempt to do a "cat" on the status file. >>> >>> Here's a small C program that just creates a queue, and then does >>> a pause to hold the accessor count non-zero. >>> >> >> >>> The resulting output (logged in via the system console): >>> >>> # sh qtest.sh >>> + sleep 1 >>> + ./qc --mem-pool-size=64M --session=mysession foo >>> + find /run/xenomai >>> /run/xenomai >>> /run/xenomai/root >>> /run/xenomai/root/mysession >>> /run/xenomai/root/mysession/821 >>> /run/xenomai/root/mysession/821/alchemy >>> /run/xenomai/root/mysession/821/alchemy/tasks >>> /run/xenomai/root/mysession/821/alchemy/tasks/task@1[821] >>> /run/xenomai/root/mysession/821/alchemy/queues >>> /run/xenomai/root/mysession/821/alchemy/queues/foo >>> /run/xenomai/root/mysession/system >>> /run/xenomai/root/mysession/system/threads >>> /run/xenomai/root/mysession/system/heaps >>> /run/xenomai/root/mysession/system/version >>> + qfile='/run/xenomai/*/*/*/alchemy/queues/foo' >>> + cat /run/xenomai/root/mysession/821/alchemy/queues/foo >>> memory exhausted >>> >>> At this point, it hangs, although SIGINT usually terminates it. >>> >>> I've seen some cases where SIGINT won't terminate it, and a reboot is >>> required to clean things up. I see this message appears to be logged >>> in the obstack error handler. 
I don't think I'm running out of memory,
>>> which makes me think "heap corruption". Not much of an analysis! I did
>>> try varying queue sizes and max message counts - no change.
>>>
>> I can't reproduce this. I would suspect a rampant memory corruption too,
>> although running the test code over valgrind (mercury build) did not
>> reveal any issue.
>>
>> - which Xenomai version are you using?
>> - cobalt / mercury ?
>> - do you enable the shared heap when configuring ? (--enable-pshared)
>>
> I'm using Cobalt. uname -a reports:
>
> Linux sdftest 4.1.18_C01571-15S00-00.000.zimg+83fdace666 #2 SMP Fri Mar
> 9 11:07:52 CST 2018 armv7l GNU/Linux
>
> Here is the config dump:
>
> CONFIG_XENO_PSHARED=1

Any chance you could have some leftover files in /dev/shm from aborted
runs, which would steal RAM?

--
Philippe.
Re: [Xenomai] Possible Xenomai fuse filesystem/registry queue status files issue?
On 4/2/2018 8:41 AM, Philippe Gerum wrote: On 04/01/2018 07:28 PM, Steve Freyder wrote: Greetings again. As I understand it, for each rt_queue there's supposed to be a "status file" located in the fuse filesystem underneath the "/run/xenomai/user/session/pid/alchemy/queues" directory, with the file name being the queue name. This used to contain very useful info about queue status, message counts, etc. I don't know when it broke or whether it's something I'm doing wrong but I'm now getting a "memory exhausted" message on the console when I attempt to do a "cat" on the status file. Here's a small C program that just creates a queue, and then does a pause to hold the accessor count non-zero. The resulting output (logged in via the system console): # sh qtest.sh + sleep 1 + ./qc --mem-pool-size=64M --session=mysession foo + find /run/xenomai /run/xenomai /run/xenomai/root /run/xenomai/root/mysession /run/xenomai/root/mysession/821 /run/xenomai/root/mysession/821/alchemy /run/xenomai/root/mysession/821/alchemy/tasks /run/xenomai/root/mysession/821/alchemy/tasks/task@1[821] /run/xenomai/root/mysession/821/alchemy/queues /run/xenomai/root/mysession/821/alchemy/queues/foo /run/xenomai/root/mysession/system /run/xenomai/root/mysession/system/threads /run/xenomai/root/mysession/system/heaps /run/xenomai/root/mysession/system/version + qfile='/run/xenomai/*/*/*/alchemy/queues/foo' + cat /run/xenomai/root/mysession/821/alchemy/queues/foo memory exhausted At this point, it hangs, although SIGINT usually terminates it. I've seen some cases where SIGINT won't terminate it, and a reboot is required to clean things up. I see this message appears to be logged in the obstack error handler. I don't think I'm running out of memory, which makes me think "heap corruption". Not much of an analysis! I did try varying queue sizes and max message counts - no change. I can't reproduce this. 
I would suspect a rampant memory corruption too, although running the test code over valgrind (mercury build) did not reveal any issue.

- which Xenomai version are you using?
- cobalt / mercury ?
- do you enable the shared heap when configuring ? (--enable-pshared)

I'm using Cobalt. uname -a reports:

Linux sdftest 4.1.18_C01571-15S00-00.000.zimg+83fdace666 #2 SMP Fri Mar 9 11:07:52 CST 2018 armv7l GNU/Linux

Here is the config dump:

CONFIG_MMU=1
CONFIG_SMP=1
CONFIG_XENO_BUILD_ARGS=" '--build=x86_64-linux' '--host=arm-emac-linux-gnueabi' '--target=arm-emac-linux-gnueabi' '--with-core=cobalt' '--enable-pshared' '--enable-smp' '--prefix=/usr' '--exec_prefix=/usr' '--includedir=/usr/include/xenomai' '--enable-registry' 'build_alias=x86_64-linux' 'host_alias=arm-emac-linux-gnueabi' 'target_alias=arm-emac-linux-gnueabi' 'CC=arm-emac-linux-gnueabi-gcc -march=armv7-a -mfpu=neon -mfloat-abi=softfp --sysroot=/home/developer/oe/build_c01571-15/tmp/sysroots/c01571-15' 'CFLAGS=-march=armv7-a' 'LDFLAGS=-D_FILE_OFFSET_BITS=64 -I/home/developer/oe/build_c01571-15/tmp/sysroots/c01571-15/usr/include/fuse -lfuse -pthread' 'CPPFLAGS=' 'CPP=arm-emac-linux-gnueabi-gcc -E --sysroot=/home/developer/oe/build_c01571-15/tmp/sysroots/c01571-15 -march=armv7-a -mfpu=neon -mfloat-abi=softfp' 'PKG_CONFIG_PATH=/home/developer/oe/build_c01571-15/tmp/sysroots/c01571-15/usr/lib/pkgconfig:/home/developer/oe/build_c01571-15/tmp/sysroots/c01571-15/usr/share/pkgconfig' 'PKG_CONFIG_LIBDIR=/home/developer/oe/build_c01571-15/tmp/sysroots/c01571-15/usr/lib/pkgconfig'"
CONFIG_XENO_BUILD_STRING="x86_64-pc-linux-gnu"
CONFIG_XENO_COBALT=1
CONFIG_XENO_COMPILER="gcc version 5.3.0 (GCC) "
CONFIG_XENO_DEFAULT_PERIOD=100
CONFIG_XENO_FORTIFY=1
CONFIG_XENO_HOST_STRING="arm-emac-linux-gnueabi"
CONFIG_XENO_LORES_CLOCK_DISABLED=1
CONFIG_XENO_PREFIX="/usr"
CONFIG_XENO_PSHARED=1
CONFIG_XENO_RAW_CLOCK_ENABLED=1
CONFIG_XENO_REGISTRY=1
CONFIG_XENO_REGISTRY_ROOT="/var/run/xenomai"
CONFIG_XENO_REVISION_LEVEL=6
CONFIG_XENO_SANITY=1
CONFIG_XENO_TLSF=1
CONFIG_XENO_TLS_MODEL="initial-exec"
CONFIG_XENO_UAPI_LEVEL=14
CONFIG_XENO_VERSION_MAJOR=3
CONFIG_XENO_VERSION_MINOR=0
CONFIG_XENO_VERSION_NAME="Stellar Parallax"
CONFIG_XENO_VERSION_STRING="3.0.6"
---
CONFIG_XENO_ASYNC_CANCEL is OFF
CONFIG_XENO_COPPERPLATE_CLOCK_RESTRICTED is OFF
CONFIG_XENO_DEBUG is OFF
CONFIG_XENO_DEBUG_FULL is OFF
CONFIG_XENO_LIBS_DLOPEN is OFF
CONFIG_XENO_MERCURY is OFF
CONFIG_XENO_VALGRIND_API is OFF
CONFIG_XENO_WORKAROUND_CONDVAR_PI is OFF
CONFIG_XENO_X86_VSYSCALL is OFF
---
PTHREAD_STACK_DEFAULT=65536
AUTOMATIC_BOOTSTRAP=1

Best regards,
Steve
Re: [Xenomai] Possible Xenomai fuse filesystem/registry queue status files issue?
On 04/01/2018 07:28 PM, Steve Freyder wrote: > Greetings again. > > As I understand it, for each rt_queue there's supposed to be a > "status file" located in the fuse filesystem underneath the > "/run/xenomai/user/session/pid/alchemy/queues" directory, with > the file name being the queue name. This used to contain very > useful info about queue status, message counts, etc. I don't know > when it broke or whether it's something I'm doing wrong but I'm > now getting a "memory exhausted" message on the console when I > attempt to do a "cat" on the status file. > > Here's a small C program that just creates a queue, and then does > a pause to hold the accessor count non-zero. > > The resulting output (logged in via the system console): > > # sh qtest.sh > + sleep 1 > + ./qc --mem-pool-size=64M --session=mysession foo > + find /run/xenomai > /run/xenomai > /run/xenomai/root > /run/xenomai/root/mysession > /run/xenomai/root/mysession/821 > /run/xenomai/root/mysession/821/alchemy > /run/xenomai/root/mysession/821/alchemy/tasks > /run/xenomai/root/mysession/821/alchemy/tasks/task@1[821] > /run/xenomai/root/mysession/821/alchemy/queues > /run/xenomai/root/mysession/821/alchemy/queues/foo > /run/xenomai/root/mysession/system > /run/xenomai/root/mysession/system/threads > /run/xenomai/root/mysession/system/heaps > /run/xenomai/root/mysession/system/version > + qfile='/run/xenomai/*/*/*/alchemy/queues/foo' > + cat /run/xenomai/root/mysession/821/alchemy/queues/foo > memory exhausted > > At this point, it hangs, although SIGINT usually terminates it. > > I've seen some cases where SIGINT won't terminate it, and a reboot is > required to clean things up. I see this message appears to be logged > in the obstack error handler. I don't think I'm running out of memory, > which makes me think "heap corruption". Not much of an analysis! I did > try varying queue sizes and max message counts - no change. > I can't reproduce this. 
I would suspect a rampant memory corruption too, although running the test code over valgrind (mercury build) did not reveal any issue.

- which Xenomai version are you using?
- cobalt / mercury ?
- do you enable the shared heap when configuring ? (--enable-pshared)

--
Philippe.