On Sun, 23 Aug 2015 08:25:02 -0700, you wrote:

>Hi Przemek,
>
>> Since this involves two processes that as you say stop simultaneously,
>> I'd suspect a latent synchronization bug. You don't say how you
>> interlock your shared memory,  but one possibility is that your reader
>> code gets stuck because you overwrite the data while it's reading it.
>> Debugging this type of thing is tricky, but maybe write a state
>> machine that lights some LEDs that show the phases of your
>> synchronization process, and wait to see where it's stuck.
>
>
>Currently, I have no synchronization. At one point I was using a byte in
>shared memory as a binary stopgap, but after a while it was not working
>predictably. Now, I'm re-reading documentation on POSIX semaphores, and
>creating a semaphore in shared memory, instead of using a system wide
>resource.

Then you have two things that happen with no predictable time
relationship to each other at all.

One process could be partway through writing a multibyte message at
the moment the other process reads it.

A binary semaphore controls access to the shared (message) resource.
Checking the binary semaphore generally involves turning off
interrupts so that the other process can't grab control during the
check code (in user space on Linux, sem_wait() does that atomic
test for you).  If you have two separate processors, you still need to
deal with the same thing: not so much interrupts as permission to
access.  The semaphore read/write must be atomic, and the access must
be negotiated between the two processors (this generally happens in
hardware for two processors, and in software for two processes
running on the same processor).
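
For what it's worth, here is a rough sketch in C of a binary semaphore
that lives inside the shared region itself. The struct layout, the
function names and MSG_MAX are made up for illustration, and it maps
anonymous shared memory only to stay self-contained; your code would
mmap() the /dev/shm file instead and let exactly one process call
sem_init().

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <semaphore.h>
#include <string.h>
#include <sys/mman.h>

#define MSG_MAX 256                /* hypothetical message size limit */

/* Hypothetical layout: the lock sits next to the data it protects. */
struct shared_msg {
    sem_t  lock;                   /* binary semaphore, initial value 1 */
    size_t len;
    char   buf[MSG_MAX];
};

/* Map a shared region and initialise the semaphore. Returns NULL on failure. */
struct shared_msg *shared_msg_create(void)
{
    struct shared_msg *m = mmap(NULL, sizeof *m, PROT_READ | PROT_WRITE,
                                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return NULL;
    if (sem_init(&m->lock, 1, 1) != 0) {   /* pshared = 1: shared between processes */
        munmap(m, sizeof *m);
        return NULL;
    }
    m->len = 0;
    return m;
}

/* Writer: hold the lock for the whole multibyte update. Returns 0 or -1. */
int shared_msg_write(struct shared_msg *m, const char *data, size_t len)
{
    assert(m != NULL);
    if (len > MSG_MAX)
        return -1;
    if (sem_wait(&m->lock) != 0)   /* may fail with EINTR; a caller could retry */
        return -1;
    memcpy(m->buf, data, len);
    m->len = len;
    return sem_post(&m->lock);
}

/* Reader: copy the message out under the lock. Returns bytes copied or -1. */
long shared_msg_read(struct shared_msg *m, char *out, size_t cap)
{
    long n = -1;
    assert(m != NULL);
    if (sem_wait(&m->lock) != 0)
        return -1;
    if (m->len <= cap) {
        memcpy(out, m->buf, m->len);
        n = (long)m->len;
    }
    if (sem_post(&m->lock) != 0)
        return -1;
    return n;
}
```

The point is only that both sides take the same lock around the whole
message, so the reader never sees half of an update. On older glibc
you may need to link with -pthread.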
>
>> I'd definitely look at this malformation---it could be the smoke from
>> the real fire. Or not. In any case, this one should be easier to
>> find---just wait for the message, inspect the data in firebug, and
>> write a checker routine, inspecting your outgoing data, that watches
>> for this type of distortion.
>
>
>The first thing that comes to mind here, which I forgot to add to my post
>last night, is that I am not zeroing out the shared memory file before
>use. I know this is bad . . . but I am not convinced this is what the
>problem is. However, since it can be a one-line fix, I will do
>so. The odd thing here is that I get maybe 1-2 notifications an hour, if
>that. When it happens, it is inside the actual json object ( string
>pointer - e.g. char *buffer ) - not outside.
>
>What does all this mean to me? The first impression that I get out of this
>is that it is a synchronization issue. I'm still not convinced though . . .
>

Analyze the code to see what happens if one process is writing while
the other is reading.

The error rate may be just a measure of how frequently this happens.
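
If you want to measure that rather than guess, one option is a
sequence counter around the message: the writer bumps it before and
after each update, and the reader rejects any copy where the counter
was odd or changed underneath it. The sketch below is only that, a
sketch. The names (seq_record, rec_write, rec_read, REC_MAX) are
invented, it assumes a single writer, and it assumes C11 atomics are
lock-free on your target so they work across processes in shared
memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define REC_MAX 256                       /* hypothetical payload limit */

/* Hypothetical record; an all-zero region is a valid empty record. */
struct seq_record {
    atomic_uint   seq;                    /* odd while a write is in progress */
    atomic_size_t len;
    _Atomic unsigned char buf[REC_MAX];
};

/* Mark the record as "being written" (counter becomes odd). */
void rec_write_begin(struct seq_record *r)
{
    unsigned s = atomic_load_explicit(&r->seq, memory_order_relaxed);
    atomic_store_explicit(&r->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
}

/* Mark the record as stable again (counter becomes even). */
void rec_write_end(struct seq_record *r)
{
    unsigned s = atomic_load_explicit(&r->seq, memory_order_relaxed);
    atomic_store_explicit(&r->seq, s + 1, memory_order_release);
}

/* Single writer: publish one message. */
void rec_write(struct seq_record *r, const unsigned char *data, size_t len)
{
    assert(len <= REC_MAX);
    rec_write_begin(r);
    for (size_t i = 0; i < len; i++)
        atomic_store_explicit(&r->buf[i], data[i], memory_order_relaxed);
    atomic_store_explicit(&r->len, len, memory_order_relaxed);
    rec_write_end(r);
}

/* Reader: returns the byte count, or -1 if the copy overlapped a write. */
long rec_read(struct seq_record *r, unsigned char *out, size_t cap)
{
    assert(cap >= REC_MAX);
    unsigned s1 = atomic_load_explicit(&r->seq, memory_order_acquire);
    if (s1 & 1u)
        return -1;                        /* writer is mid-update */
    size_t n = atomic_load_explicit(&r->len, memory_order_relaxed);
    for (size_t i = 0; i < n; i++)
        out[i] = atomic_load_explicit(&r->buf[i], memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    unsigned s2 = atomic_load_explicit(&r->seq, memory_order_relaxed);
    return (s1 == s2) ? (long)n : -1;     /* counter moved: discard the copy */
}
```

Count how often rec_read() returns -1 in the reader. If that count
tracks the rate of malformed json warnings, you have your answer.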

Harvey


>Also, for what it's worth. I'm using mmap() and not file open(), read(),
>write(). So the code is very fast.
>
>On Sun, Aug 23, 2015 at 6:40 AM, Przemek Klosowski <
>[email protected]> wrote:
>
>> On Sun, Aug 23, 2015 at 1:31 AM, William Hermans <[email protected]>
>> wrote:
>> > So I have a problem with some code I've been working on for the last few
>> > months. The code, which is compiled into two separate processes suddenly
>> > stops working. No error, nothing in dmesg, nothing in any file in /var/log
>> > period. It did however occur to me that since rsyslog is likely or possible
>> > disabled.
>> >
>> > What my code does is read from the CAN peripheral. Form extended packets out
>> > of the CAN frames( NMEA 2000 fastpackets ), and then writes the data into a
>> > POSIX shared memory file ( /dev/shm/file ).
>>
>> Since this involves two processes that as you say stop simultaneously,
>> I'd suspect a latent synchronization bug. You don't say how you
>> interlock your shared memory,  but one possibility is that your reader
>> code gets stuck because you overwrite the data while it's reading it.
>> Debugging this type of thing is tricky, but maybe write a state
>> machine that lights some LEDs that show the phases of your
>> synchronization process, and wait to see where it's stuck.
>>
>> > The second process simply reads
>> > from the file, and shuffles the data out over a websocket in json / human
>> > readable form. The data on the webside of things is tested accurate,
>> > although I do occasionally get a malformed json object warning from firefox
>> > firebug.
>>
>> I'd definitely look at this malformation---it could be the smoke from
>> the real fire. Or not. In any case, this one should be easier to
>> find---just wait for the message, inspect the data in firebug, and
>> write a checker routine, inspecting your outgoing data, that watches
>> for this type of distortion.
>>
>> --
>> For more options, visit http://beagleboard.org/discuss
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "BeagleBoard" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>

