On Sun, 23 Aug 2015 11:44:13 -0700, you wrote:

>Ok. In my case however -
>
>Process A writes to shared memory only.
>Process B Reads from shared memory only.

Ok, so that eliminates one form of data corruption.
>
>As it stands, Process B starts off with a variable set to 0x00, then
>compares this to the value at a byte position in the file. When Process B
>first starts, this comparison will always fail. Process B then copies the
>contents of the file, sets the variable to the value at the byte position,
>and sends the data out over a websocket.

Ok:
1) What stops process A from writing to the shared buffer while process B
is reading it?

2) What keeps B from getting an incomplete or inaccurate value from
process A for the byte position?  Is it a byte variable or is it an
integer?  Does the processor write this as an integer in one
uninterruptible operation?

3) If both A and B access Internet devices (over the same interface,
I'd guess), what stops the data collision between process A and
process B?  What protects that Internet resource?  What is the result
if both A and B read a status register at the same time (in the
hardware)?
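
For a single writer and a single reader, one possible answer to
questions 1 and 2 is a sequence counter kept in the shared region
itself. What follows is only a sketch under assumptions not stated in
this thread: the region layout, PAYLOAD_MAX, and the function names
region_publish and region_try_read are all hypothetical, and the
compiler is assumed to provide C11 atomics with a lock-free atomic_uint.
The writer makes the counter odd before touching the data and even again
afterwards; the reader throws away any copy taken while the counter was
odd or while it changed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAYLOAD_MAX 256              /* hypothetical maximum message size */

/* Hypothetical layout of the mmap()ed region, zero-filled at creation. */
struct shared_region {
    atomic_uint seq;                 /* odd while a write is in progress */
    size_t      len;                 /* number of valid bytes in payload */
    char        payload[PAYLOAD_MAX];
};

/* Process A (the only writer): publish one complete message. */
void region_publish(struct shared_region *r, const char *data, size_t len)
{
    assert(len <= PAYLOAD_MAX);
    unsigned s = atomic_load_explicit(&r->seq, memory_order_relaxed);

    atomic_store_explicit(&r->seq, s + 1, memory_order_relaxed); /* odd */
    atomic_thread_fence(memory_order_release);

    r->len = len;
    memcpy(r->payload, data, len);

    atomic_store_explicit(&r->seq, s + 2, memory_order_release); /* even */
}

/*
 * Process B (the only reader): copy the message into out (which must hold
 * PAYLOAD_MAX bytes). Returns true only if a new, complete message was
 * copied; *last_seq remembers which message was seen last.
 * Caveat: the plain memcpy can overlap a write; the copy is discarded in
 * that case, but strict C11 still calls the overlap a data race.
 */
bool region_try_read(struct shared_region *r, char *out, size_t *out_len,
                     unsigned *last_seq)
{
    unsigned s1 = atomic_load_explicit(&r->seq, memory_order_acquire);
    if (s1 & 1u)
        return false;                /* writer is in the middle of a write */
    if (s1 == *last_seq)
        return false;                /* nothing new since the last read */

    size_t len = r->len;
    if (len > PAYLOAD_MAX)
        return false;                /* implausible length, try again later */
    memcpy(out, r->payload, len);

    atomic_thread_fence(memory_order_acquire);
    unsigned s2 = atomic_load_explicit(&r->seq, memory_order_relaxed);
    if (s1 != s2)
        return false;                /* writer interfered, discard the copy */

    *out_len = len;
    *last_seq = s1;
    return true;
}
```

Process B would call region_try_read() once per loop iteration and send
the copy out only when it returns true. A false return just means "try
again on the next pass", so neither process ever waits on the other.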

Harvey



>
>On the next iteration of the loop, Process B reads this value again and
>makes the comparison, which will likely succeed. The loop then continues
>until this comparison fails again, at which point the logic repeats.
>It's pretty simple - or so I thought.
>
>The reasoning for this development model is simple: code segregation. Code
>in process B does not play well with the code in process A. They're both
>accessing network devices, and when that happens simultaneously, data gets
>lost, which happens more often than not.
>
>On Sun, Aug 23, 2015 at 9:39 AM, Harvey White <[email protected]>
>wrote:
>
>> On Sun, 23 Aug 2015 08:52:53 -0700, you wrote:
>>
>> >Hi Harvey,
>> >
>> >Thanks for the response. I think the biggest question in my mind is: OK,
>> >so perhaps I have a synchronization problem that rears its head once in a
>> >while. But is this really enough of a problem to cause both
>> >processes to stop?
>> >
>> >A sample here and there that does not display because it
>> >is malformed does not bother me. The processes stopping does. I cannot
>> >see how this could be causing the processes to stop. However, I
>> >honestly do not know one way or the other.
>>
>> Process A: while process B is busy, wait, then read from process B
>>
>> Process B: while process A is busy, wait, then read from process A
>>
>> Classic deadlock.
>>
>> Process A: wait for permission to read special area, read, then wait
>> outside that permission area.  No restrictions on process B except
>> when accessing special area (which happens infrequently).
>>
>> Process B: wait for permission to read special area, read, then wait
>> outside that permission area.  No restrictions on process A except
>> when accessing special area (which happens infrequently).
>>
>> Since the waiting is outside that special area, and the processes are
>> not allowed to hog the special area (and block the other process),
>> then neither process can block the other except for a very brief time.
>>
>> The implication is that checking for and accessing the special area
>> takes a very small time, while the wait/do-something-else part takes a
>> longer time.
>>
>> Harvey
>>
>> >On Sun, Aug 23, 2015 at 8:43 AM, Harvey White <[email protected]>
>> >wrote:
>> >
>> >> On Sun, 23 Aug 2015 08:25:02 -0700, you wrote:
>> >>
>> >> >HI Przemek,
>> >> >
>> >> >> Since this involves two processes that as you say stop simultaneously,
>> >> >> I'd suspect a latent synchronization bug. You don't say how you
>> >> >> interlock your shared memory, but one possibility is that your reader
>> >> >> code gets stuck because you overwrite the data while it's reading it.
>> >> >> Debugging this type of thing is tricky, but maybe write a state
>> >> >> machine that lights some LEDs that show the phases of your
>> >> >> synchronization process, and wait to see where it's stuck.
>> >> >
>> >> >
>> >> >Currently, I have no synchronization. At one point I was using a byte in
>> >> >shared memory as a binary stopgap, but after a while it was not working
>> >> >predictably. Now, I'm re-reading documentation on POSIX semaphores, and
>> >> >creating a semaphore in shared memory, instead of using a system wide
>> >> >resource.
>> >>
>> >> Then you have two things that happen with no predictable time
>> >> relationship to each other at all.
>> >>
>> >> You could be writing part of a multibyte message when trying to read
>> >> that message with another process.
>> >>
>> >> A binary semaphore controls access to the shared (message) resource.
>> >> Checking the binary semaphore generally involves turning off
>> >> interrupts so that the other process can't grab control during the
>> >> check code.  If you have two separate processors, you still need to
>> >> deal with the same thing, not so much interrupts, but permission to
>> >> access.  The semaphore read/write must be atomic, and the access must
>> >> be negotiated between the two processors (generally happens in
>> >> hardware for two processors, happens in software for two processes
>> >> running on the same processor).
>> >> >
>> >> >> I'd definitely look at this malformation---it could be the smoke from
>> >> >> the real fire. Or not. In any case, this one should be easier to
>> >> >> find---just wait for the message, inspect the data in firebug, and
>> >> >> write a checker routine, inspecting your outgoing data, that watches
>> >> >> for this type of distortion.
>> >> >
>> >> >
>> >> >The first thing that comes to mind here, which I forgot to add to my post
>> >> >last night, is that I am not zeroing out the shared memory file before
>> >> >usage. I know this is bad . . . but I am not convinced this is what the
>> >> >problem is. However, since it can be a one-line fix, I will do
>> >> >so. The odd thing here is that I get maybe 1-2 notifications an hour, if
>> >> >that. Then it is inside the actual json object (string pointer, e.g.
>> >> >char *buffer), not outside.
>> >> >
>> >> >What does all this mean to me? The first impression that I get out of this
>> >> >is that it is a synchronization issue. I'm still not convinced though . . .
>> >> >
>> >>
>> >> Analyze the code to see what happens if one process is writing while
>> >> the other is reading.
>> >>
>> >> The error rate may be just a measure of how frequently this happens.
>> >>
>> >> Harvey
>> >>
>> >>
>> >> >Also, for what it's worth, I'm using mmap() and not file open(), read(),
>> >> >write(). So the code is very fast.
>> >> >
>> >> >On Sun, Aug 23, 2015 at 6:40 AM, Przemek Klosowski <
>> >> >[email protected]> wrote:
>> >> >
>> >> >> On Sun, Aug 23, 2015 at 1:31 AM, William Hermans <[email protected]>
>> >> >> wrote:
>> >> >> > So I have a problem with some code I've been working on for the last
>> >> >> > few months. The code, which is compiled into two separate processes,
>> >> >> > suddenly stops working. No error, nothing in dmesg, nothing in any
>> >> >> > file in /var/log, period. It did, however, occur to me that rsyslog
>> >> >> > is likely, or possibly, disabled.
>> >> >> >
>> >> >> > What my code does is read from the CAN peripheral, form extended
>> >> >> > packets out of the CAN frames (NMEA 2000 fastpackets), and then write
>> >> >> > the data into a POSIX shared memory file ( /dev/shm/file ).
>> >> >>
>> >> >> Since this involves two processes that as you say stop simultaneously,
>> >> >> I'd suspect a latent synchronization bug. You don't say how you
>> >> >> interlock your shared memory,  but one possibility is that your reader
>> >> >> code gets stuck because you overwrite the data while it's reading it.
>> >> >> Debugging this type of thing is tricky, but maybe write a state
>> >> >> machine that lights some LEDs that show the phases of your
>> >> >> synchronization process, and wait to see where it's stuck.
>> >> >>
>> >> >> > The second process simply reads from the file, and shuffles the data
>> >> >> > out over a websocket in json / human readable form. The data on the
>> >> >> > webside of things is tested accurate, although I do occasionally get
>> >> >> > a malformed json object warning from firefox firebug.
>> >> >>
>> >> >> I'd definitely look at this malformation---it could be the smoke from
>> >> >> the real fire. Or not. In any case, this one should be easier to
>> >> >> find---just wait for the message, inspect the data in firebug, and
>> >> >> write a checker routine, inspecting your outgoing data, that watches
>> >> >> for this type of distortion.
>> >> >>
>> >> >> --
>> >> >> For more options, visit http://beagleboard.org/discuss
>> >> >> ---
>> >> >> You received this message because you are subscribed to the Google
>> >> Groups
>> >> >> "BeagleBoard" group.
>> >> >> To unsubscribe from this group and stop receiving emails from it,
>> send
>> >> an
>> >> >> email to [email protected].
>> >> >> For more options, visit https://groups.google.com/d/optout.
>> >> >>
>> >>
>>
