Hi Harvey,

Thanks for the response. I think the biggest question in my mind is this: OK, so perhaps I have a synchronization problem that rears its head once in a while. But is that really serious enough to cause both processes to stop?
An occasional sample that does not display because it is malformed does not bother me. The processes stopping does. I cannot see how a synchronization problem could be causing the processes to stop. However . . . I honestly do not know one way or the other.

On Sun, Aug 23, 2015 at 8:43 AM, Harvey White <[email protected]> wrote:

> On Sun, 23 Aug 2015 08:25:02 -0700, you wrote:
>
> >HI Przemek,
> >
> >*Since this involves two processes that as you say stop simultaneously,*
> >> * I'd suspect a latent synchronization bug. You don't say how you*
> >> * interlock your shared memory, but one possibility is that your reader*
> >> * code gets stuck because you overwrite the data while it's reading it.*
> >> * Debugging this type of thing is tricky, but maybe write a state*
> >> * machine that lights some LEDs that show the phases of your*
> >> * synchronization process, and wait to see where it's stuck.*
> >
> >
> >Currently, I have no synchronization. At one point I was using a byte in
> >shared memory as a binary stopgap, but after a while it was not working
> >predictably. Now, I'm re-reading documentation on POSIX semaphores, and
> >creating a semaphore in shared memory, instead of using a system wide
> >resource.
>
> Then you have two things that happen with no predictable time
> relationship to each other at all.
>
> You could be writing part of a multibyte message when trying to read
> that message with another process.
>
> A binary semaphore controls access to the shared (message) resource.
> Checking the binary semaphore generally involves turning off
> interrupts so that the other process can't grab control during the
> check code. If you have two separate processors, you still need to
> deal with the same thing, not so much interrupts, but permission to
> access. The semaphore read/write must be atomic, and the access must
> be negotiated between the two processors (generally happens in
> hardware for two processors, happens in software for two processes
> running on the same processor).
>
>
> >*I'd definitely look at this malformation---it could be the smoke from*
> >> * the real fire. Or not. In any case, this one should be easier to*
> >> * find---just wait for the message, inspect the data in firebug, and*
> >> * write a checker routine, inspecting your outgoing data, that watches*
> >> * for this type of distortion. *
> >
> >
> >The first thing that comes to mind here, which I forgot to add to my post
> >last night is that I am not zeroing out the shared memory file before
> >usage. I know this is bad . . .but am not convinced this is what the
> >problem is. However since it is / can be a one line of code fix. I will do
> >so. The odd thing here is that I get maybe 1-2 notifications an hour - If
> >that. Then it is inside the actual json object ( string pointer - e.g. char
> >*buffer ) - not outside.
> >
> >What does all this mean to me. The first impression that I get out of this
> >is that it is a synchronization issue. I'm still not convinced though . . .
> >
>
> analyze the code to see what happens if one process is writing while
> the other is reading.
>
> The error rate may be just a measure of how frequently this happens.
>
> Harvey
>
> >Also, for what it's worth. I'm using mmap() and not file open(), read(),
> >write(). So the code is very fast.
> >
> >On Sun, Aug 23, 2015 at 6:40 AM, Przemek Klosowski <
> >[email protected]> wrote:
> >
> >> On Sun, Aug 23, 2015 at 1:31 AM, William Hermans <[email protected]>
> >> wrote:
> >> > So I have a problem with some code I've been working on for the last few
> >> > months. The code, which is compiled into two separate processes suddenly
> >> > stops working. No error, nothing in dmesg, nothing in any file in /var/log
> >> > period. It did however occur to me that since rsyslog is likely or possible
> >> > disabled.
> >> >
> >> > What my code does is read from the CAN peripheral. Form extended packets out
> >> > of the CAN frames( NMEA 2000 fastpackets ), and then writes the data into a
> >> > POSIX shared memory file ( /dev/shm/file ).
> >>
> >> Since this involves two processes that as you say stop simultaneously,
> >> I'd suspect a latent synchronization bug. You don't say how you
> >> interlock your shared memory, but one possibility is that your reader
> >> code gets stuck because you overwrite the data while it's reading it.
> >> Debugging this type of thing is tricky, but maybe write a state
> >> machine that lights some LEDs that show the phases of your
> >> synchronization process, and wait to see where it's stuck.
> >>
> >> > The second process simply reads
> >> > from the file, and shuffles the data out over a websocket in json / human
> >> > readable form. The data on the webside of things is tested accurate,
> >> > although I do occasionally get a malformed json object warning from firefox
> >> > firebug.
> >>
> >> I'd definitely look at this malformation---it could be the smoke from
> >> the real fire. Or not. In any case, this one should be easier to
> >> find---just wait for the message, inspect the data in firebug, and
> >> write a checker routine, inspecting your outgoing data, that watches
> >> for this type of distortion.
> >>
> >> --
> >> For more options, visit http://beagleboard.org/discuss
> >> ---
> >> You received this message because you are subscribed to the Google Groups
> >> "BeagleBoard" group.
> >> To unsubscribe from this group and stop receiving emails from it, send an
> >> email to [email protected].
> >> For more options, visit https://groups.google.com/d/optout.
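
For reference, here is roughly what the process-shared POSIX semaphore discussed in this thread might look like. This is only a sketch under assumptions, not the actual code from either process: the names shared_region, region_create, region_write, region_read, demo_roundtrip and MSG_MAX are made up, and so is the JSON payload. To keep it self-contained it uses an anonymous shared mapping plus fork() rather than the /dev/shm file; with a named file the same struct would presumably be mapped after shm_open() and ftruncate(). It shows the writer replacing a whole message under the lock and the reader copying it out under the same lock, so the reader never sees a half-written buffer. Whether that has anything to do with the processes stopping is not something this sketch can answer.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* MAP_ANONYMOUS on some libcs */
#endif
#include <assert.h>
#include <semaphore.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MSG_MAX 256          /* hypothetical buffer size */

/* Hypothetical layout of the shared region: lock first, then payload. */
struct shared_region {
    sem_t  lock;             /* binary semaphore guarding len and buf */
    size_t len;              /* number of valid bytes in buf */
    char   buf[MSG_MAX];     /* e.g. one JSON message */
};

/* Map a shared region (anonymous mappings start zero-filled) and
 * initialise the semaphore that lives inside it. */
static struct shared_region *region_create(void)
{
    struct shared_region *r = mmap(NULL, sizeof *r, PROT_READ | PROT_WRITE,
                                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (r == MAP_FAILED)
        return NULL;
    /* pshared = 1: shared between processes; value 1: initially unlocked */
    if (sem_init(&r->lock, 1, 1) != 0) {
        munmap(r, sizeof *r);
        return NULL;
    }
    return r;
}

/* Writer: replace the whole message while holding the lock. */
static int region_write(struct shared_region *r, const char *msg)
{
    assert(r != NULL && msg != NULL);
    size_t n = strlen(msg);
    if (n >= MSG_MAX)
        return -1;
    if (sem_wait(&r->lock) != 0)   /* may fail with EINTR; not retried here */
        return -1;
    memcpy(r->buf, msg, n + 1);
    r->len = n;
    return sem_post(&r->lock);
}

/* Reader: copy the message out under the lock, then work on the copy.
 * Returns the message length, or -1 on error. */
static int region_read(struct shared_region *r, char *out, size_t outsz)
{
    if (sem_wait(&r->lock) != 0)
        return -1;
    size_t n = r->len;
    int ok = n < outsz;
    if (ok) {
        memcpy(out, r->buf, n);
        out[n] = '\0';
    }
    sem_post(&r->lock);
    return ok ? (int)n : -1;
}

/* Demo: a child process writes, the parent reads once the child has
 * exited. Returns 0 if the parent sees exactly what the child wrote. */
static int demo_roundtrip(void)
{
    const char *msg = "{\"id\":1,\"value\":42}";   /* made-up payload */
    struct shared_region *r = region_create();
    if (r == NULL)
        return -1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(region_write(r, msg) == 0 ? 0 : 1);
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0)
        return -1;
    char out[MSG_MAX];
    int n = region_read(r, out, sizeof out);
    int ok = n == (int)strlen(msg) && strcmp(out, msg) == 0;
    sem_destroy(&r->lock);
    munmap(r, sizeof *r);
    return ok ? 0 : -1;
}
```

Calling demo_roundtrip() from main() should return 0 on Linux. Older glibc versions may need -pthread at link time. In the real two-process setup the reader would call region_read() each time before building its websocket output, and only one side should call sem_init().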
