Hi Neil,

On 24 May 2017 at 7:42 PM, "Lisa Nguyen" <lisa.ngu...@linaro.org> wrote:
On 24 May 2017 at 17:02, Neil Williams <codeh...@debian.org> wrote:
> On Fri, 19 May 2017 17:02:14 +0100
> Neil Williams <codeh...@debian.org> wrote:
>
>> On Fri, 19 May 2017 16:48:11 +0100
>> Steve McIntyre <steve.mcint...@linaro.org> wrote:
>>
>> > Hi folks!
>> >
>> > On Wed, May 17, 2017 at 03:05:41PM +0100, Neil Williams wrote:
>> > >On Thu, 27 Apr 2017 08:19:19 +0100
>> > >Neil Williams <codeh...@debian.org> wrote:
>> > >
>> >
>> > I've just run a local test with an AEP inside lxc on my local
>> > machine. As far as I can see, there's nothing particularly magic
>> > going on here. The only problem I see is Lisa's config file
>> > pointing at the wrong device file. arm-probe needs a ttyACM-style
>> > device to talk to. Using:
>> >
>> > # lxc-device -n lxc-aep-test-174524 add /dev/ttyACM0
>> >
>> > I create that device in my container. I build libwebsockets and the
>> > arm-probe software in the container, then specify /dev/ttyACM0 in
>> > the AEP config file. I can run it just fine:
>> >
>> > root@lxc-aep-test-174524:/arm-probe# ./arm-probe/arm-probe -C
>> > panda-aep.cfg -l10 -x # configuration: panda-aep.cfg
>> > # config_name: pandaboard
>> > # trigger: 0.400000V (hyst 0.200000V) 0.000000W (hyst 0.200000W)
>> > 400us Configuration: pandaboard
>> > # date: Fri, 19 May 2017 16:29:50 +0100
>> > # host: lxc-aep-test-174524
>> > #
>> > + /dev/ttyACM0
>> > Starting...
>> > sending start to 0
>> > # VDD_ALL VDD ROOT #ff0000 SoC
>> > #
>> > #
>> > time VDD(V) VDD(A) VDD(W)
>> > 0.000500 5.11 0.0474 0.24196
>> > 0.000600 5.11 0.0364 0.18572
>> > 0.000700 5.11 0.0314 0.16012
>> > 0.000800 5.10 0.0544 0.27734
>> > 0.000900 5.10 0.0234 0.11923
>> > 0.001000 5.11 0.0304 0.15505
>> > ...
>> >
>> > I don't have any problems running things and getting output here.
>> >
>> > I *have* seen two real bugs here while trying to get things running,
>> > though:
>> >
>> > 1. If the device specified in the config file doesn't exist, or is
>> >    the wrong type of device, or (maybe) there is any other kind of
>> >    problem with it, you get *no* useful feedback to say there's a
>> >    problem. Running things under strace will show the background
>> >    libarmep process attempt to use the device specified, but
>> >    there's no error handling. :-(
>> >
>> > 2. The "-x" option says that the arm-probe program is meant to exit
>> >    when you've done capturing, but it just sits there forever when
>> >    I'm testing. I've wrapped it using the "timeout" command to work
>> >    around that for now.
>> >
>> > If I knew where to file those bugs, I would, but it's really not
>> > obvious. They're really easy to reproduce, I hope...
>> >
>> > In terms of the /dev/ttyACM0 creation, the lxc-device man page says
>> > that it creates devices based on their existing entries on the
>> > host. Double-check that the host (dispatcher) has an appropriate
>> > /dev/ttyACM0 if you're still seeing problems?
>>
>> Steve was using staging-panda03 with the ARM Energy Probe which I'd
>> been using for the tests of the new code to ensure that /dev/ttyACM0
>> can be attached to the LXC.
>>
>> That panda and AEP will shortly return to staging and then the changes
>> to LAVA and the required changes to the test definition can be
>> available for the 2017.6 release.
>
> OK. staging-panda03 is back and has been running tests. This is what
> we've learnt so far:
>
> 0: This does not appear to be an LXC issue. Running the commands
> manually on the worker with the same LXC on the same worker does return
> data from the probe.
>
> 1: Running the same commands in "headless" mode shows that the probe
> software starts successfully but something within the protocol parser
> or sampler fails to retrieve data.

What do you mean by headless mode?

>
> 2: The websockets dependency is completely unnecessary and has been
> disabled in the build I've been testing:
> https://git.linaro.org/lava-team/arm-probe.git/

Yes. I do the same. aepd is only useful for the web interface.

>
> 3: We've added a *lot* of debug to the arm-probe code
> (https://staging.validation.linaro.org/scheduler/job/174969 which was
> run using
> https://git.linaro.org/lava-team/arm-probe.git/commit/?id=9b2958e3045da77d7db25a7cfe48359211aa4cf1)
> but are not much closer to identifying the precise problem with the
> code. However, I am satisfied that this is a problem in the arm-probe
> software when being run in automation.

Can you give details about "this is a problem in arm probe software when being run in automation"? Do you mean workload automation?

>
> 4: the arm-probe code is appallingly difficult to read and debug. It
> also seems unnecessarily complex.
>
> 5: I plan to remove a lot of the debug from the cloned arm-probe
> repository (which has also had a few fixes to compile with gcc6) but
> I'm running out of time to work on the arm-probe software myself.
>
> Someone needs to update the arm-probe software:
>
> a) to remove websockets as a compile-time option as this only bloats
> the build in automation where a web based UI is impossible anyway. I've
> done this by brute force in my cloned repo, I just patched out the
> dependency.
>
> b) improve the code to have comments and output about what is happening
> and why when verbose mode is used.
>
> c) Identify what is preventing the software from receiving data from
> the probe when run in automation.
>
> d) the config file still needs fixes to allow for changes in the device
> node name from one probe to another.
>
> --

CC'ing Vincent, so he can read Neil's and Steve's comments above and respond (if he has anything to say) while I'm on holiday until early June.

>
>
> Neil Williams
> =============
> http://www.linux.codehelp.co.uk/
>
>
> _______________________________________________
> linaro-validation mailing list
> linaro-validation@lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/linaro-validation
>
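
For reference, the two bugs Steve describes in the quoted text (no feedback when the configured device node is missing or of the wrong type, and arm-probe not exiting with "-x") can be worked around from outside the tool. What follows is only a minimal sketch under stated assumptions: it is not part of arm-probe, libarmep or LAVA, and the helper names check_tty_device and run_with_timeout are hypothetical. It checks that the device node exists and is a character device before starting, then runs a command under a hard time limit, much like wrapping it with the "timeout" command.

```python
import os
import stat
import subprocess


def check_tty_device(path):
    """Return None if `path` looks usable, otherwise a readable error string.

    Hypothetical pre-flight check: arm-probe itself reports nothing when the
    device in its config file is missing or is the wrong kind of file.
    """
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return f"{path}: device node does not exist"
    except OSError as exc:
        return f"{path}: cannot stat device node ({exc.strerror})"
    if not stat.S_ISCHR(mode):
        return f"{path}: not a character device"
    if not os.access(path, os.R_OK | os.W_OK):
        return f"{path}: no read/write permission"
    return None


def run_with_timeout(cmd, seconds):
    """Run `cmd` (a list of arguments) and stop it after `seconds`.

    Returns (timed_out, stdout_text). Output captured before the timeout is
    kept, so samples already printed by the command are not lost.
    """
    try:
        done = subprocess.run(
            cmd, stdout=subprocess.PIPE, text=True, timeout=seconds
        )
        return False, done.stdout
    except subprocess.TimeoutExpired as exc:
        out = exc.stdout or ""
        # Partial output may be bytes even when text=True was requested.
        if isinstance(out, bytes):
            out = out.decode(errors="replace")
        return True, out
```

With the command line from Steve's test, usage might look like calling check_tty_device("/dev/ttyACM0") first and, if it returns None, run_with_timeout(["./arm-probe/arm-probe", "-C", "panda-aep.cfg", "-l10", "-x"], 30). The 30 second limit is an arbitrary value chosen for illustration.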