On Sun, 23 Aug 2015 08:52:53 -0700, you wrote:

>Hi Harvey,
>
>Thanks for the response. I think the biggest question in my mind is: OK,
>so perhaps I have a synchronization problem that rears its head once in a
>while. But is this really enough of a problem to cause both
>processes to stop?
>
>A sample here and there that does not display because it
>is malformed does not bother me. The processes stopping does. I cannot
>see how this could be causing the processes to stop. However, I
>honestly do not know one way or the other.

Process A: while process B is busy, wait; then read from process B.

Process B: while process A is busy, wait; then read from process A.

Classic deadlock: if each sees the other as busy at the same moment,
both wait forever and neither ever finishes.
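
A deadlock like this is silent: both processes just sit there, with
nothing in dmesg or the logs. One way to make it visible, as a minimal
sketch and not anything taken from the original code, is to wait on the
semaphore with a timeout and report when the timeout expires. The helper
name lock_with_timeout and its millisecond parameter are made up for
illustration; sem_timedwait() and clock_gettime() are the standard POSIX
calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <semaphore.h>
#include <time.h>

/* Hypothetical helper: wait at most timeout_ms for the semaphore.
 * Returns 0 if acquired, -1 with errno set (ETIMEDOUT on timeout). */
int lock_with_timeout(sem_t *sem, long timeout_ms)
{
    struct timespec deadline;
    int rc;

    /* sem_timedwait() takes an absolute CLOCK_REALTIME deadline. */
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0)
        return -1;
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    do {
        rc = sem_timedwait(sem, &deadline);
    } while (rc != 0 && errno == EINTR);  /* retry if a signal interrupted us */

    return rc;
}
```

If this returns -1 with errno == ETIMEDOUT, the other side has held the
lock (or never released it) for longer than expected, and that can be
logged instead of hanging forever.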

Process A: wait for permission to read the special area, read, then wait
outside that permission area.  No restrictions on process B except
when accessing the special area (which happens infrequently).

Process B: wait for permission to read the special area, read, then wait
outside that permission area.  No restrictions on process A except
when accessing the special area (which happens infrequently).

Since the waiting is outside that special area, and the processes are
not allowed to hog the special area (and block the other process),
neither process can block the other except for a very brief time.

The implication is that checking for permission and accessing the
special area takes very little time, while the wait/do-something-else
part takes much longer.
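
The scheme above can be sketched roughly as follows, assuming an unnamed
POSIX semaphore placed inside the shared region itself. The struct
layout, MSG_MAX, and the area_* function names are assumptions for
illustration, not the actual code under discussion. The point is that
the lock is held only for a memcpy(); all formatting, socket I/O, and
sleeping happen outside it.

```c
#include <errno.h>
#include <semaphore.h>
#include <stddef.h>
#include <string.h>

#define MSG_MAX 256   /* assumed maximum message size */

/* Assumed layout of the shared region (e.g. the mmap()ed /dev/shm file). */
struct shared_area {
    sem_t  lock;            /* binary semaphore guarding len and data */
    size_t len;
    char   data[MSG_MAX];
};

/* Call once, in one process only, before either side uses the area. */
int area_init(struct shared_area *a)
{
    a->len = 0;
    return sem_init(&a->lock, 1, 1);   /* pshared = 1, initially free */
}

static int area_lock(struct shared_area *a)
{
    int rc;
    do {
        rc = sem_wait(&a->lock);
    } while (rc != 0 && errno == EINTR);  /* retry if a signal interrupted us */
    return rc;
}

/* Writer: hold the lock only for the copy. */
int area_write(struct shared_area *a, const char *msg, size_t len)
{
    if (len > MSG_MAX || area_lock(a) != 0)
        return -1;
    memcpy(a->data, msg, len);
    a->len = len;
    return sem_post(&a->lock);
}

/* Reader: copy out under the lock, then work from the private copy. */
long area_read(struct shared_area *a, char *out, size_t cap)
{
    size_t n;

    if (area_lock(a) != 0)
        return -1;
    n = a->len < cap ? a->len : cap;
    memcpy(out, a->data, n);
    if (sem_post(&a->lock) != 0)
        return -1;
    return (long)n;
}
```

The reader never sees a half-written message, because the writer cannot
be in the middle of its memcpy() while the reader holds the lock, and
neither side can be held up for longer than one copy.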

Harvey

>On Sun, Aug 23, 2015 at 8:43 AM, Harvey White <[email protected]>
>wrote:
>
>> On Sun, 23 Aug 2015 08:25:02 -0700, you wrote:
>>
>> >HI Przemek,
>> >
>> >> Since this involves two processes that as you say stop simultaneously,
>> >> I'd suspect a latent synchronization bug. You don't say how you
>> >> interlock your shared memory, but one possibility is that your reader
>> >> code gets stuck because you overwrite the data while it's reading it.
>> >> Debugging this type of thing is tricky, but maybe write a state
>> >> machine that lights some LEDs that show the phases of your
>> >> synchronization process, and wait to see where it's stuck.
>> >
>> >
>> >Currently, I have no synchronization. At one point I was using a byte in
>> >shared memory as a binary stopgap, but after a while it was not working
>> >predictably. Now, I'm re-reading documentation on POSIX semaphores, and
>> >creating a semaphore in shared memory, instead of using a system wide
>> >resource.
>>
>> Then you have two things that happen with no predictable time
>> relationship to each other at all.
>>
>> You could be writing part of a multibyte message when trying to read
>> that message with another process.
>>
>> A binary semaphore controls access to the shared (message) resource.
>> Checking the binary semaphore generally involves turning off
>> interrupts so that the other process can't grab control during the
>> check code.  If you have two separate processors, you still need to
>> deal with the same thing, not so much interrupts, but permission to
>> access.  The semaphore read/write must be atomic, and the access must
>> be negotiated between the two processors (generally happens in
>> hardware for two processors, happens in software for two processes
>> running on the same processor).
>> >
>> >> I'd definitely look at this malformation---it could be the smoke from
>> >> the real fire. Or not. In any case, this one should be easier to
>> >> find---just wait for the message, inspect the data in firebug, and
>> >> write a checker routine, inspecting your outgoing data, that watches
>> >> for this type of distortion.
>> >
>> >
>> >The first thing that comes to mind here, which I forgot to add to my post
>> >last night, is that I am not zeroing out the shared memory file before
>> >use. I know this is bad, but I am not convinced this is what the
>> >problem is. However, since it can be a one-line fix, I will do
>> >so. The odd thing here is that I get maybe 1-2 notifications an hour, if
>> >that. And the malformation is inside the actual json object (string
>> >pointer, e.g. char *buffer), not outside.
>> >
>> >What does all this mean to me? The first impression that I get out of this
>> >is that it is a synchronization issue. I'm still not convinced though . . .
>> >
>>
>> Analyze the code to see what happens if one process is writing while
>> the other is reading.
>>
>> The error rate may be just a measure of how frequently this happens.
>>
>> Harvey
>>
>>
>> >Also, for what it's worth, I'm using mmap() and not file open(), read(),
>> >write(). So the code is very fast.
>> >
>> >On Sun, Aug 23, 2015 at 6:40 AM, Przemek Klosowski <
>> >[email protected]> wrote:
>> >
>> >> On Sun, Aug 23, 2015 at 1:31 AM, William Hermans <[email protected]>
>> >> wrote:
>> >> > So I have a problem with some code I've been working on for the last few
>> >> > months. The code, which is compiled into two separate processes, suddenly
>> >> > stops working. No error, nothing in dmesg, nothing in any file in /var/log,
>> >> > period. It did however occur to me that rsyslog is likely or possibly
>> >> > disabled.
>> >> >
>> >> > What my code does is read from the CAN peripheral, form extended packets
>> >> > out of the CAN frames (NMEA 2000 fastpackets), and then write the data into
>> >> > a POSIX shared memory file ( /dev/shm/file ).
>> >>
>> >> Since this involves two processes that as you say stop simultaneously,
>> >> I'd suspect a latent synchronization bug. You don't say how you
>> >> interlock your shared memory,  but one possibility is that your reader
>> >> code gets stuck because you overwrite the data while it's reading it.
>> >> Debugging this type of thing is tricky, but maybe write a state
>> >> machine that lights some LEDs that show the phases of your
>> >> synchronization process, and wait to see where it's stuck.
>> >>
>> >> > The second process simply reads
>> >> > from the file, and shuffles the data out over a websocket in json / human
>> >> > readable form. The data on the web side of things is tested accurate,
>> >> > although I do occasionally get a malformed json object warning from
>> >> > firefox firebug.
>> >>
>> >> I'd definitely look at this malformation---it could be the smoke from
>> >> the real fire. Or not. In any case, this one should be easier to
>> >> find---just wait for the message, inspect the data in firebug, and
>> >> write a checker routine, inspecting your outgoing data, that watches
>> >> for this type of distortion.
>> >>
>> >> --
>> >> For more options, visit http://beagleboard.org/discuss
>> >> ---
>> >> You received this message because you are subscribed to the Google
>> Groups
>> >> "BeagleBoard" group.
>> >> To unsubscribe from this group and stop receiving emails from it, send
>> an
>> >> email to [email protected].
>> >> For more options, visit https://groups.google.com/d/optout.
>> >>
>>
