Hi Harvey,

Thanks for the response. I think the biggest question in my mind is: OK,
so perhaps I have a synchronization problem that rears its head once in a
while. But is it really serious enough to cause both processes to stop?

A sample here and there that does not display because it is malformed
does not bother me. The processes stopping does. I cannot see how this
could be causing the processes to stop. However, I honestly do not know
one way or the other.
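
A minimal sketch of the "semaphore in shared memory" idea discussed in the
quoted text below, so the question is concrete. It is not the actual code
from either process. The struct layout, the MSG_MAX bound, and the names
region_map(), region_write() and region_read() are all assumptions made up
for illustration. The semaphore lives inside the mmap()ed file, the process
that passes create=1 zeroes the region before use, and both sides take the
semaphore before touching the buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <semaphore.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define MSG_MAX 512 /* assumed upper bound on one JSON message */

/* Hypothetical layout of the shared file: lock first, then the payload. */
struct shared_region {
    sem_t  lock;            /* binary semaphore shared between processes */
    size_t len;             /* number of valid bytes in buffer */
    char   buffer[MSG_MAX];
};

/* Map the file at path. With create != 0, also size it, zero it and
 * initialise the semaphore; only one process (the writer) should do that. */
struct shared_region *region_map(const char *path, int create)
{
    int fd = open(path, create ? (O_RDWR | O_CREAT) : O_RDWR, 0600);
    if (fd < 0)
        return NULL;
    if (create && ftruncate(fd, (off_t)sizeof(struct shared_region)) != 0) {
        close(fd);
        return NULL;
    }
    struct shared_region *r = mmap(NULL, sizeof *r, PROT_READ | PROT_WRITE,
                                   MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (r == MAP_FAILED)
        return NULL;
    if (create) {
        memset(r, 0, sizeof *r);             /* clear stale data from old runs */
        if (sem_init(&r->lock, 1, 1) != 0) { /* pshared = 1, initial value = 1 */
            munmap(r, sizeof *r);
            return NULL;
        }
    }
    return r;
}

/* Take the semaphore, retrying if a signal interrupts the wait. */
static int region_lock(struct shared_region *r)
{
    while (sem_wait(&r->lock) != 0)
        if (errno != EINTR)
            return -1;
    return 0;
}

/* Writer side: copy one complete message in while holding the lock. */
int region_write(struct shared_region *r, const char *msg, size_t len)
{
    if (len > MSG_MAX || region_lock(r) != 0)
        return -1;
    memcpy(r->buffer, msg, len);
    r->len = len;
    sem_post(&r->lock);
    return 0;
}

/* Reader side: copy the message out while holding the lock.
 * Returns the message length, or -1 on error or if out is too small. */
long region_read(struct shared_region *r, char *out, size_t cap)
{
    if (region_lock(r) != 0)
        return -1;
    size_t len = r->len;
    assert(len <= MSG_MAX); /* writer never stores more than MSG_MAX */
    if (len >= cap) {
        sem_post(&r->lock);
        return -1;
    }
    memcpy(out, r->buffer, len);
    sem_post(&r->lock);
    out[len] = '\0';
    return (long)len;
}
```

In this sketch the writer would call region_map(path, 1) once and then
region_write() per message, and the reader would call region_map(path, 0)
and then region_read(). One caveat with a plain semaphore: if either process
dies between sem_wait() and sem_post(), the other blocks forever on the next
sem_wait(), so the lock itself can become a way for both sides to stall.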

On Sun, Aug 23, 2015 at 8:43 AM, Harvey White <[email protected]>
wrote:

> On Sun, 23 Aug 2015 08:25:02 -0700, you wrote:
>
> >HI Przemek,
> >
> >*Since this involves two processes that as you say stop simultaneously,*
> >> * I'd suspect a latent synchronization bug. You don't say how you*
> >> * interlock your shared memory,  but one possibility is that your reader*
> >> * code gets stuck because you overwrite the data while it's reading it.*
> >> * Debugging this type of thing is tricky, but maybe write a state*
> >> * machine that lights some LEDs that show the phases of your*
> >> * synchronization process, and wait to see where it's stuck.*
> >
> >
> >Currently, I have no synchronization. At one point I was using a byte in
> >shared memory as a binary stopgap, but after a while it was not working
> >predictably. Now, I'm re-reading documentation on POSIX semaphores, and
> >creating a semaphore in shared memory, instead of using a system wide
> >resource.
>
> Then you have two things that happen with no predictable time
> relationship to each other at all.
>
> You could be writing part of a multibyte message when trying to read
> that message with another process.
>
> A binary semaphore controls access to the shared (message) resource.
> Checking the binary semaphore generally involves turning off
> interrupts so that the other process can't grab control during the
> check code.  If you have two separate processors, you still need to
> deal with the same thing, not so much interrupts, but permission to
> access.  The semaphore read/write must be atomic, and the access must
> be negotiated between the two processors (generally happens in
> hardware for two processors, happens in software for two processes
> running on the same processor).
> >
> >*I'd definitely look at this malformation---it could be the smoke from*
> >> * the real fire. Or not. In any case, this one should be easier to*
> >> * find---just wait for the message, inspect the data in firebug, and*
> >> * write a checker routine, inspecting your outgoing data, that watches*
> >> * for this type of distortion. *
> >
> >
> >The first thing that comes to mind here, which I forgot to add to my post
> >last night is that I am not zeroing out the shared memory file before
> >usage. I know this is bad . . .but am not convinced this is what the
> >problem is. However since it is / can be a one line of code fix. I will do
> >so. The odd thing here is that I get maybe 1-2 notifications an hour - If
> >that. Then it is inside the actual json object ( string pointer - e.g. char
> >*buffer ) - not outside.
> >
> >What does all this mean to me. The first impression that I get out of this
> >is that it is a synchronization issue. I'm still not convinced though . . .
> >
>
> analyze the code to see what happens if one process is writing while
> the other is reading.
>
> The error rate may be just a measure of how frequently this happens.
>
> Harvey
>
>
> >Also, for what it's worth. I'm using mmap() and not file open(), read(),
> >write(). So the code is very fast.
> >
> >On Sun, Aug 23, 2015 at 6:40 AM, Przemek Klosowski <
> >[email protected]> wrote:
> >
> >> On Sun, Aug 23, 2015 at 1:31 AM, William Hermans <[email protected]>
> >> wrote:
> >> > So I have a problem with some code I've been working on for the last
> few
> >> > months. The code, which is compiled into two separate processes
> suddenly
> >> > stops working. No error, nothing in dmesg, nothing in any file in
> >> /var/log
> >> > period. It did however occur to me that since rsyslog is likely or
> >> possible
> >> > disabled.
> >> >
> >> > What my code does is read from the CAN peripheral. Form extended
> packets
> >> out
> >> > of the CAN frames( NMEA 2000 fastpackets ), and then writes the data
> >> into a
> >> > POSIX shared memory file ( /dev/shm/file ).
> >>
> >> Since this involves two processes that as you say stop simultaneously,
> >> I'd suspect a latent synchronization bug. You don't say how you
> >> interlock your shared memory,  but one possibility is that your reader
> >> code gets stuck because you overwrite the data while it's reading it.
> >> Debugging this type of thing is tricky, but maybe write a state
> >> machine that lights some LEDs that show the phases of your
> >> synchronization process, and wait to see where it's stuck.
> >>
> >> > The second process simply reads
> >> > from the file, and shuffles the data out over a websocket in json /
> human
> >> > readable form. The data on the webside of things is tested accurate,
> >> > although I do occasionally get a malformed json object warning from
> >> firefox
> >> > firebug.
> >>
> >> I'd definitely look at this malformation---it could be the smoke from
> >> the real fire. Or not. In any case, this one should be easier to
> >> find---just wait for the message, inspect the data in firebug, and
> >> write a checker routine, inspecting your outgoing data, that watches
> >> for this type of distortion.
> >>
> >> --
> >> For more options, visit http://beagleboard.org/discuss
> >> ---
> >> You received this message because you are subscribed to the Google
> Groups
> >> "BeagleBoard" group.
> >> To unsubscribe from this group and stop receiving emails from it, send
> an
> >> email to [email protected].
> >> For more options, visit https://groups.google.com/d/optout.
> >>
>
