On 01.08.2013 at 11:14, Frank Tkalcevic <[email protected]> wrote:
> Isn't this just the fact that the buffers can't be shared? If you want a
> command buffer don't you need to edit the linuxcnc.nml file and allocate a
> buffer for your process with a unique name?

I think there are in fact two separate issues at work:

- the buffer may get overwritten before the command is consumed
- the serial number mechanism is more than dubious; afaict the sender has
  no guarantee that the echoed serial was in fact originated by that very
  sender, and - if I read the scenario correctly - it might get lost
  altogether, leading to a command timeout

Making the buffer a queue might help with the first issue, but likely not
with the second; meaning - you make it a queue, you might see less of the
error, but it doesn't necessarily get fixed.

Sascha's proposed solution to the uniqueness requirement will work in the
local (everything on one CPU) case, but it breaks as soon as senders are not
in the same memory space; to retain the option of a distributed setup, a
unique originator id is needed, in which case the absolute value of the
serial becomes meaningless, as the unique id of a message is the tuple
(originator id, serial).

I rule out an RPC server for retrieving a globally unique serial; this
doesn't scale.

I think the next step is to isolate this behavior into a test case and
verify the sequence where the described scenario happens; everything else
goes from there.

-m

>
> I had a problem where I was using Axis + Linuxcncrsh. After running for a
> couple of hours my job stopped working properly. It looked like it was
> executing one line of gcode every 2 seconds. Axis and linuxcncrsh use the
> same named buffer. I've never had the problem running axis and keystick at
> the same time - they use different named buffers (although I think that
> changed recently).
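To make the (originator id, serial) tuple from my reply above a bit more concrete, here is a minimal sketch in Python. It is not NML code and not taken from Sascha's patch; `MessageId`, `Sender` and `is_my_ack` are hypothetical names, and the sketch simply assumes that originator ids are unique. Each sender keeps a purely local counter, yet cannot mistake another sender's echo for its own.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class MessageId:
    """Hypothetical message id: unique only as the whole tuple."""
    originator: str  # assumed to be unique per sender
    serial: int      # local counter, meaningless without the originator


class Sender:
    """Hypothetical command sender with a purely local serial counter."""

    def __init__(self, originator: str):
        self.originator = originator
        self._serials = count(1)

    def next_id(self) -> MessageId:
        return MessageId(self.originator, next(self._serials))

    def is_my_ack(self, sent: MessageId, echoed: MessageId) -> bool:
        # Compare the whole tuple, not just the serial.
        return sent == echoed


axis = Sender("axis")
halui = Sender("halui")

a1 = axis.next_id()
h1 = halui.next_id()

# Both local serials are 1, so a bare serial comparison cannot tell them apart.
serials_collide = a1.serial == h1.serial
# The tuple comparison can.
axis_accepts_halui_echo = axis.is_my_ack(a1, h1)
axis_accepts_own_echo = axis.is_my_ack(a1, MessageId("axis", 1))
```

No shared counter and no serial server is involved here, which is why this approach would carry over to senders in different memory spaces, provided the originator ids really are unique.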
>
> -----Original Message-----
> From: Michael Haberler [mailto:[email protected]]
> Sent: Thursday, 1 August 2013 5:56 PM
> To: EMC developers
> Subject: [Emc-developers] Fwd: [emc:bugs] #328 Problem using AXIS together
> with HALUI
>
> for some reason tracker entries aren't forwarded to emc-developers anymore?
>
> this is a rather colossal one
>
> -m
>
> Begin forwarded message:
>
>> From: "Michael Haberler" <[email protected]>
>> Subject: [emc:bugs] #328 Problem using AXIS together with HALUI
>> Date: 1 August 2013 08:54:50 CEST
>> To: "[emc:bugs] " <[email protected]>
>> Reply-To: "[emc:bugs] " <[email protected]>
>>
>> Sascha,
>>
>> if your analysis is correct (and I do not doubt it is) your work
>> identifies and removes a fundamental design defect from intertask
>> communication in LinuxCNC.
>>
>> I did run into the problem before in a different context: I tried to
>> implement a remote procedure call over NML using the existing (fishy)
>> interaction pattern, and that failed for the reasons you describe; at the
>> time I did not drill down like you did - congratulations for the stamina
>> in identifying and fixing this!
>>
>> To verify, is the scenario as follows?:
>>
>> - task A writes a command+local serial into the command buffer
>> - the command consumer is late in processing the command
>> - task B goes ahead and also writes a command to the buffer with its
>>   own serial
>> - B's command gets executed and 'acknowledged'
>> - the acknowledgement for A's command necessarily fails due to serial
>>   mismatch, or lack of an ack altogether
>>
>> If this is the case, then I would see more than one option to address
>> the issue:
>>
>> - as you have done, mutate the buffer into a queue *)
>> - stay with the single command buffer, but make producers aware of a
>>   command being present but not yet finished so the write does not
>>   happen before completion
>> - the globally unique serial surely helps, but afaict alone it is not
>>   sufficient to protect against the problem
>>
>> I am positive you are onto something fundamental here. Before adopting
>> the patch my suggestion would be:
>>
>> - we need to create a test case which identifies the current problem,
>>   even if only for regression tests. This might be a bit involved given
>>   all the moving parts. It might be easier to replicate the code
>>   sequence into small test programs and verify that the overwrite
>>   happens.
>>
>> - the underlying problem is an unprotected write to a shared resource,
>>   and there might be more than one way to fix it (see above); it might
>>   make sense to adopt a method which is less of a wholesale change (I
>>   don't suggest it exists; just saying - it might make sense to
>>   investigate if there is a single fix to the issue at a different
>>   level)
>>
>> - it might make sense to reconsider the usage of serials altogether.
>>   What they are for is correlating origin and command for replies;
>>   hence a tuple (local serial, originator id) would do the same trick,
>>   provided the originator ID is known to be unique.
>>   The reason why I am raising this is - a globally unique serial
>>   generator either needs shared memory, or - in the distributed case -
>>   an RPC-type service. Note full distribution of components is a goal
>>   for the future replacement of NML by zmq and protobuf which I'm
>>   working on (or rather would like to work on once I have the RTOS work
>>   out the door).
>>
>> Interested how you see this.
>>
>> best regards, and congratulations
>>
>> Michael
>>
>> *) I would think the queue would protect against overwriting of
>> commands alone; NB the possible clash in non-unique serial numbers
>> still exists: meaning the combination of single command buffer, write
>> protect if command present, and serial/unique originator tuples might
>> do the trick too. As a matter of style, I think your patch - making the
>> command input a queue proper - is more to the point though.
>>
>> [bugs:#328] Problem using AXIS together with HALUI
>>
>> Status: open
>> Created: Wed Jul 31, 2013 10:19 PM UTC by Sascha Ittner
>> Last Updated: Wed Jul 31, 2013 10:19 PM UTC
>> Owner: nobody
>>
>> I've found a problem in using AXIS together with HALUI, but it seems to
>> be a general problem when two or more user interfaces are used at the
>> same time.
>>
>> If I use HALUI excessively I've found that axis is blocked sometimes.
>> Even more problematic for me is the fact that in some cases the HALUI
>> action gets lost. After investigating the involved code I've found the
>> reason. I'll try to describe my thoughts:
>>
>> The communication between UI and EMCTASK is done by shared memory
>> buffers (cmd + status). If a command is issued, the command object is
>> copied to the cmd buffer (including a command serial number). Then the
>> sender (the UI) waits for the reception and optionally for the
>> completion of the command. The wait is realized by waiting for the
>> appearance of the issued serial number in echo_serial_number of the
>> status buffer.
>> The serial number generator is local to the UI process and no sync
>> method (except for a semaphore on the buffer read/write functions) is
>> involved. This causes two problems:
>>
>> - the first issued command (by AXIS for example) can get overwritten by
>>   the second one (e.g. HALUI) before EMCTASK has processed it
>> - even if both commands are processed correctly, one of the wait
>>   functions will time out because the serial number does not match.
>>
>> As I needed a quick (hopefully not (so) dirty) fix I've created a patch
>> that allows generating the serial number in an atomic way on the cms
>> write and also configures emcCommand as a queue instead of a buffer.
>>
>> I will enjoy your feedback
>>
>> Sascha
>>
>> Sent from sourceforge.net because you indicated interest in
>> https://sourceforge.net/p/emc/bugs/328/
>>
>> To unsubscribe from further messages, please visit
>> https://sourceforge.net/auth/subscriptions/
>
> _______________________________________________
> Emc-developers mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/emc-developers
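As a starting point for the small test programs suggested in the thread, the overwrite scenario can be replayed deterministically without any of the LinuxCNC moving parts. The following Python sketch is only a model under stated assumptions: `SingleSlotBuffer`, `CommandQueue` and `run_scenario` are hypothetical names, not NML or emcCommand code, and the "late consumer" is modelled simply by doing both writes before the first read.

```python
from collections import deque


class SingleSlotBuffer:
    """One command slot: a second write replaces an unconsumed command."""

    def __init__(self):
        self._slot = None

    def write(self, cmd):
        self._slot = cmd

    def read(self):
        cmd, self._slot = self._slot, None
        return cmd


class CommandQueue:
    """FIFO: every written command stays until it is consumed."""

    def __init__(self):
        self._items = deque()

    def write(self, cmd):
        self._items.append(cmd)

    def read(self):
        return self._items.popleft() if self._items else None


def run_scenario(channel):
    # Task A and task B each use a local serial, and both happen to be 1.
    channel.write(("A", 1))
    channel.write(("B", 1))  # the consumer has not run yet
    processed = []
    while (cmd := channel.read()) is not None:
        processed.append(cmd)
    return processed


lost = run_scenario(SingleSlotBuffer())   # A's command is overwritten
kept = run_scenario(CommandQueue())       # both commands survive
echoed_serials = [serial for _, serial in kept]
```

In this model the single slot delivers only B's command, while the queue delivers both. The echoed serials in the queue case are still identical, though, so a sender that looks at the serial alone cannot tell whose command was acknowledged; that is the second issue discussed above.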
