On Sun, 23 Aug 2015 13:42:29 -0700, you wrote:

>>
>> 1) what stops process A from writing to the shared buffer if process B
>> is reading it?
>
>
>Nothing. I assume that writes are slower than, or at most as fast as, reads.
>Both reads and writes are done using an mmap'd pointer.

Murphy says that you cannot guarantee this.
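
For what it's worth, one way to stop depending on relative speed is a
sequence counter (often called a seqlock). This is only a sketch, not
your code: the names seq_slot, slot_write and slot_try_read and the
64-byte SLOT_MAX are made up for illustration, and I'm assuming a single
writer and lock-free C11 atomics on the target. The writer makes the
counter odd while it updates the buffer, and the reader discards any
copy taken while the counter was odd or changed underneath it.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SLOT_MAX 64   /* hypothetical payload size */

/* Hypothetical layout; this struct would live in the mmap'd region. */
struct seq_slot {
    atomic_uint seq;                       /* odd while a write is in progress */
    _Atomic unsigned char data[SLOT_MAX];  /* relaxed atomics avoid a formal data race */
};

/* Single writer only: bump to odd, copy, bump to even. */
void slot_write(struct seq_slot *s, const unsigned char *src, size_t len)
{
    unsigned v = atomic_load_explicit(&s->seq, memory_order_relaxed);
    atomic_store_explicit(&s->seq, v + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    for (size_t i = 0; i < len && i < SLOT_MAX; i++)
        atomic_store_explicit(&s->data[i], src[i], memory_order_relaxed);
    atomic_store_explicit(&s->seq, v + 2, memory_order_release);
}

/* Returns true only if the copy in dst was not disturbed by a write. */
bool slot_try_read(struct seq_slot *s, unsigned char *dst, size_t len)
{
    unsigned before = atomic_load_explicit(&s->seq, memory_order_acquire);
    if (before & 1u)
        return false;                      /* writer is mid-update */
    for (size_t i = 0; i < len && i < SLOT_MAX; i++)
        dst[i] = atomic_load_explicit(&s->data[i], memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    unsigned after = atomic_load_explicit(&s->seq, memory_order_relaxed);
    return before == after;                /* unchanged means a consistent copy */
}
```

The reader never blocks the writer; it just retries (or skips the
sample) whenever slot_try_read() returns false.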
>
>> 2) what keeps B from getting an incomplete or inaccurate value from
>> process A for the byte position?  is it a byte variable or is it an
>> integer?  Does the processor write this as an integer in one
>> uninterruptible process?
>>
>
>Nothing, aside from the fact that the byte position I'm testing here is a
>source ID for one of two different devices. They do come in, in order, one
>after the other. This is not permanent, however. When I start tracking more
>data, this will still work for one set of data, but not for other sets.
>The write / read type is char. There is no real way to get this wrong, since
>gcc -Wall would warn, and I get no errors or warnings when compiling.
>

OK, so the assumption is that they are sequential.  That depends on the
process switching time and the phase of the moon.  My paranoid
assumption is that they are not necessarily sequential and can occur
at any time in relation to each other.

A char is OK: a single byte cannot be half-written, so you don't get
corrupted values, but you may get the "last" value rather than the
current one unless you interlock the two tasks.
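
Interlocking can be as simple as a binary semaphore that lives in the
shared region itself. A minimal sketch, assuming Linux; the layout and
the names shared_record, record_create, record_write, record_read and
PAYLOAD_MAX are hypothetical, and I use an anonymous shared mapping
(which only works between a parent and its forked child) where your
code would map its /dev/shm file:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std= */
#endif
#include <errno.h>
#include <semaphore.h>
#include <string.h>
#include <sys/mman.h>

#define PAYLOAD_MAX 256          /* hypothetical maximum message size */

/* Hypothetical layout of the shared region. */
struct shared_record {
    sem_t lock;                  /* binary semaphore shared between processes */
    unsigned char source_id;     /* the byte the reader compares against */
    size_t len;
    char payload[PAYLOAD_MAX];
};

/* Map and zero the region, then initialise the semaphore to 1 (unlocked). */
struct shared_record *record_create(void)
{
    struct shared_record *r = mmap(NULL, sizeof *r, PROT_READ | PROT_WRITE,
                                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (r == MAP_FAILED)
        return NULL;
    memset(r, 0, sizeof *r);
    if (sem_init(&r->lock, 1, 1) != 0) {   /* pshared = 1 */
        munmap(r, sizeof *r);
        return NULL;
    }
    return r;
}

/* sem_wait() can be interrupted by a signal; retry in that case. */
static int lock_record(struct shared_record *r)
{
    int rc;
    do {
        rc = sem_wait(&r->lock);
    } while (rc != 0 && errno == EINTR);
    return rc;
}

/* Writer side: hold the lock only while the record is being updated. */
int record_write(struct shared_record *r, unsigned char id,
                 const char *data, size_t len)
{
    if (len > PAYLOAD_MAX || lock_record(r) != 0)
        return -1;
    r->source_id = id;
    r->len = len;
    memcpy(r->payload, data, len);
    sem_post(&r->lock);
    return 0;
}

/* Reader side: copy the record out under the lock, then work on the copy. */
int record_read(struct shared_record *r, unsigned char *id,
                char *out, size_t cap, size_t *len)
{
    int rc = -1;
    if (lock_record(r) != 0)
        return -1;
    if (r->len <= cap) {
        *id = r->source_id;
        *len = r->len;
        memcpy(out, r->payload, r->len);
        rc = 0;
    }
    sem_post(&r->lock);
    return rc;
}
```

Each side holds the semaphore only for the duration of the copy, so
neither can block the other for long. Older glibc versions need
-pthread to link this.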



>> 3) if both A and B access Internet devices (over the same interface
>> I'd guess), what stops the data collision between process A and
>> process B?  What protects that Internet resource?  What is the result
>> if both A and B read a status register at the same time (in the
>> hardware)?
>
>No. More correctly, I guess, they are socket devices, both using Linux
>network sockets: SocketCAN for the CAN bus, and standard Linux sockets for
>Ethernet. I did not write the web library; it's libmongoose.
>

Ok, do you know if these functions are thread safe?

I think that's what's giving you problems: the code is not
thread aware or thread safe.

Harvey

>On Sun, Aug 23, 2015 at 1:06 PM, Harvey White <[email protected]>
>wrote:
>
>> On Sun, 23 Aug 2015 11:44:13 -0700, you wrote:
>>
>> >Ok. In my case however -
>> >
>> >Process A writes to shared memory only.
>> >Process B Reads from shared memory only.
>>
>> Ok, so that eliminates one form of data corruption.
>> >
>> >As it stands, Process B starts off with a variable set to 0x00, then
>> >compares this to a byte position in the file. When Process B first starts,
>> >this comparison will always fail. Process B then copies the contents of
>> >the file, sets the variable to the value at the byte position,
>> >then sends the data out over a websocket.
>>
>> Ok:
>> 1) what stops process A from writing to the shared buffer if process B
>> is reading it?
>>
>> 2) what keeps B from getting an incomplete or inaccurate value from
>> process A for the byte position?  is it a byte variable or is it an
>> integer?  Does the processor write this as an integer in one
>> uninterruptible process?
>>
>> 3) if both A and B access Internet devices (over the same interface
>> I'd guess), what stops the data collision between process A and
>> process B?  What protects that Internet resource?  What is the result
>> if both A and B read a status register at the same time (in the
>> hardware)?
>>
>> Harvey
>>
>>
>>
>> >
>> >On the next iteration of the loop cycle, Process B reads this value
>> >again and makes the comparison, which will likely succeed. The loop cycle
>> >then continues until this comparison fails again, at which point the logic
>> >repeats. It's pretty simple - or so I thought.
>> >
>> >The reasoning for this development model is simple: code segregation. Code
>> >in process B does not play well with the code in process A. They're both
>> >accessing network devices, and when that happens simultaneously, data gets
>> >lost, which happens more often than not.
>> >
>> >On Sun, Aug 23, 2015 at 9:39 AM, Harvey White <[email protected]>
>> >wrote:
>> >
>> >> On Sun, 23 Aug 2015 08:52:53 -0700, you wrote:
>> >>
>> >> >Hi Harvey,
>> >> >
>> >> >Thanks for the response. I think the biggest question in my mind is:
>> >> >OK, so perhaps I have a synchronization problem that rears its head once
>> >> >in a while. But is this really so much of a problem that it may cause
>> >> >both processes to stop?
>> >> >
>> >> >A sample here and there that does not display because it is malformed
>> >> >does not bother me. The processes stopping does. I cannot see how this
>> >> >could be causing the processes to stop. However . . . I honestly do not
>> >> >know one way or the other.
>> >>
>> >> Process A: while process B is busy, wait, then read from process B
>> >>
>> >> Process B: while process A is busy, wait, then read from process A
>> >>
>> >> Classic deadlock.
>> >>
>> >> Process A: wait for permission to read special area, read, then wait
>> >> outside that permission area.  No restrictions on process B except
>> >> when accessing special area (which happens infrequently) .
>> >>
>> >> Process B: wait for permission to read special area, read, then wait
>> >> outside that permission area.  No restrictions on process A except
>> >> when accessing special area (which happens infrequently) .
>> >>
>> >> Since the waiting is outside that special area, and the processes are
>> >> not allowed to hog the special area (and block the other process),
>> >> then neither process can block the other except for a very brief time.
>> >>
>> >> The implication is that the process check and access special area
>> >> takes a very small time, and the wait/do something else part takes a
>> >> longer time.
>> >>
>> >> Harvey
>> >>
>> >> >On Sun, Aug 23, 2015 at 8:43 AM, Harvey White <[email protected]>
>> >> >wrote:
>> >> >
>> >> >> On Sun, 23 Aug 2015 08:25:02 -0700, you wrote:
>> >> >>
>> >> >> >HI Przemek,
>> >> >> >
>> >> >> >> Since this involves two processes that as you say stop simultaneously,
>> >> >> >> I'd suspect a latent synchronization bug. You don't say how you
>> >> >> >> interlock your shared memory,  but one possibility is that your reader
>> >> >> >> code gets stuck because you overwrite the data while it's reading it.
>> >> >> >> Debugging this type of thing is tricky, but maybe write a state
>> >> >> >> machine that lights some LEDs that show the phases of your
>> >> >> >> synchronization process, and wait to see where it's stuck.
>> >> >> >
>> >> >> >
>> >> >> >Currently, I have no synchronization. At one point I was using a byte
>> >> >> >in shared memory as a binary stopgap, but after a while it was not
>> >> >> >working predictably. Now, I'm re-reading documentation on POSIX
>> >> >> >semaphores, and creating a semaphore in shared memory, instead of
>> >> >> >using a system wide resource.
>> >> >>
>> >> >> Then you have two things that happen with no predictable time
>> >> >> relationship to each other at all.
>> >> >>
>> >> >> You could be writing part of a multibyte message when trying to read
>> >> >> that message with another process.
>> >> >>
>> >> >> A binary semaphore controls access to the shared (message) resource.
>> >> >> Checking the binary semaphore generally involves turning off
>> >> >> interrupts so that the other process can't grab control during the
>> >> >> check code.  If you have two separate processors, you still need to
>> >> >> deal with the same thing, not so much interrupts, but permission to
>> >> >> access.  The semaphore read/write must be atomic, and the access must
>> >> >> be negotiated between the two processors (generally happens in
>> >> >> hardware for two processors, happens in software for two processes
>> >> >> running on the same processor).
>> >> >> >
>> >> >> >> I'd definitely look at this malformation---it could be the smoke from
>> >> >> >> the real fire. Or not. In any case, this one should be easier to
>> >> >> >> find---just wait for the message, inspect the data in firebug, and
>> >> >> >> write a checker routine, inspecting your outgoing data, that watches
>> >> >> >> for this type of distortion.
>> >> >> >
>> >> >> >
>> >> >> >The first thing that comes to mind here, which I forgot to add to my
>> >> >> >post last night, is that I am not zeroing out the shared memory file
>> >> >> >before usage. I know this is bad . . . but I am not convinced this is
>> >> >> >the problem. However, since it is (or can be) a one-line fix, I will
>> >> >> >do so. The odd thing here is that I get maybe 1-2 notifications an
>> >> >> >hour, if that, and the error is inside the actual json object (string
>> >> >> >pointer, e.g. char *buffer), not outside.
>> >> >> >
>> >> >> >What does all this mean to me? The first impression that I get out of
>> >> >> >this is that it is a synchronization issue. I'm still not convinced
>> >> >> >though . . .
>> >> >> >
>> >> >>
>> >> >> Analyze the code to see what happens if one process is writing while
>> >> >> the other is reading.
>> >> >>
>> >> >> The error rate may be just a measure of how frequently this happens.
>> >> >>
>> >> >> Harvey
>> >> >>
>> >> >>
>> >> >> >Also, for what it's worth, I'm using mmap() and not file open(),
>> >> >> >read(), and write(), so the code is very fast.
>> >> >> >
>> >> >> >On Sun, Aug 23, 2015 at 6:40 AM, Przemek Klosowski <[email protected]> wrote:
>> >> >> >
>> >> >> >> On Sun, Aug 23, 2015 at 1:31 AM, William Hermans <[email protected]> wrote:
>> >> >> >> > So I have a problem with some code I've been working on for the last
>> >> >> >> > few months. The code, which is compiled into two separate processes,
>> >> >> >> > suddenly stops working. No error, nothing in dmesg, nothing in any
>> >> >> >> > file in /var/log, period. It did however occur to me that rsyslog is
>> >> >> >> > likely, or possibly, disabled.
>> >> >> >> >
>> >> >> >> > What my code does is read from the CAN peripheral, form extended
>> >> >> >> > packets out of the CAN frames (NMEA 2000 fastpackets), and then write
>> >> >> >> > the data into a POSIX shared memory file ( /dev/shm/file ).
>> >> >> >>
>> >> >> >> Since this involves two processes that as you say stop simultaneously,
>> >> >> >> I'd suspect a latent synchronization bug. You don't say how you
>> >> >> >> interlock your shared memory,  but one possibility is that your reader
>> >> >> >> code gets stuck because you overwrite the data while it's reading it.
>> >> >> >> Debugging this type of thing is tricky, but maybe write a state
>> >> >> >> machine that lights some LEDs that show the phases of your
>> >> >> >> synchronization process, and wait to see where it's stuck.
>> >> >> >>
>> >> >> >> > The second process simply reads
>> >> >> >> > from the file, and shuffles the data out over a websocket in json /
>> >> >> >> > human readable form. The data on the web side of things is tested
>> >> >> >> > accurate, although I do occasionally get a malformed json object
>> >> >> >> > warning from firefox firebug.
>> >> >> >>
>> >> >> >> I'd definitely look at this malformation---it could be the smoke from
>> >> >> >> the real fire. Or not. In any case, this one should be easier to
>> >> >> >> find---just wait for the message, inspect the data in firebug, and
>> >> >> >> write a checker routine, inspecting your outgoing data, that watches
>> >> >> >> for this type of distortion.
>> >> >> >>
>> >> >> >> --
>> >> >> >> For more options, visit http://beagleboard.org/discuss
>> >> >> >> ---
>> >> >> >> You received this message because you are subscribed to the Google
>> >> >> >> Groups "BeagleBoard" group.
>> >> >> >> To unsubscribe from this group and stop receiving emails from it,
>> >> >> >> send an email to [email protected].
>> >> >> >> For more options, visit https://groups.google.com/d/optout.
>> >> >> >>
>> >> >>
>> >>
>>
