On 6 June 2017 at 14:03, Neil Williams <neil.willi...@linaro.org> wrote:
> On 6 June 2017 at 12:53, Vincent Guittot <vincent.guit...@linaro.org> wrote:
>> On 6 June 2017 at 13:38, Neil Williams <neil.willi...@linaro.org> wrote:
>>> This problem has been resolved inside the arm-probe configuration, it
>>> is not a fault within LAVA. There was a concern that the probe was not
>>> showing data output because of a theoretical problem of running
>>> daemonized instead of with a controlling terminal. The actual problem
>>> was that the probe software is running more slowly than expected and
>>> extending the runtime of the utility allows the probe to output data.
>>> https://staging.validation.linaro.org/scheduler/job/175033#L2038
>>> https://git.linaro.org/lava-team/refactoring.git/commit/?id=7916e6c3db5188e2c2e96da0b666a36ab3e8ffeb
>>
>> ok so the 2seconds for timeout was your problem
>
> That and the problem with the config file.

ok

>
>>> (The verbose option was later dropped to output only the interesting data.)
>>>
>>> The configuration file in the git repo needs to be modified.
>>>
>>> https://git.linaro.org/lava-team/refactoring.git/tree/testdefs/aep-config?id=e08f0bed2c3561421bc2f430ab2e38f1b659e2fd
>>
>> can you point out the modification you did that has been needed ? I
>> can't see any obvious difference except using /dev/ttyACM0 instead of
>> /dev/serial/by-id/usb-NXP_SEMICOND_ARM_Energy_Probe_S_NO44440001-if00.
>> Is it the difference ?
>
> Yes, because inside the LXC, /dev/serial/by-id does not get created
> (there is no udev support for that inside containers).
>
>> What about using 2 AEPs ?
>
> That would have to be fixed either in the test shell definitions (e.g.
> using parameters passed through the test job) or within the arm-probe
> code itself. I have no idea at this stage whether the arm-probe
> software can cope with multiple probes - in LAVA that would likely

arm-probe supports multiple AEPs, and we are using it with multiple AEPs
on the mtk8173 evb. arm-probe just relies on the config file to get the
path of each AEP.
I have put the content of the config file below:

# arm-probe configuration file
#
# setup name
mt8173-evb
# <device path>
/dev/serial/by-id/usb-NXP_SEMICOND_ARM_Energy_Probe_S_NO81730001-if00
VDD_CA57_0 0.500000 1 -0.179000 13.363000 -0.000000 0.163300 0 SoC/A57/Cache A57_CACHE #ff0000 SoC
VDD_CA57_1 0.100000 2 -0.179000 13.363000 -0.000000 0.163300 0 SoC/A57/Core0 A57_CORE #ff0000 SoC
VDD_CA57_2 0.100000 3 -0.179000 13.363000 -0.000000 0.163300 0 SoC/A57/Core1 A57_CORE #ff0000 SoC
/dev/serial/by-id/usb-NXP_SEMICOND_ARM_Energy_Probe_S_NO81730000-if00
VDD_CA53_0 0.500000 1 -0.179000 13.363000 -0.000000 0.163300 0 SoC/A53/Cache A53_CACHE #ff0000 SoC
VDD_CA53_1 0.100000 2 -0.179000 13.363000 -0.000000 0.163300 0 SoC/A53/Core0 A53_CORE #ff0000 SoC
VDD_CA53_2 0.100000 3 -0.179000 13.363000 -0.000000 0.163300 0 SoC/A53/Core1 A53_CORE #ff0000 SoC

> need secondary connections and MultiNode to separate the output.

Is it something that Lisa can do by herself or does it need some
changes from your side ?

Regards,
Vincent

>
> The syntax of the arm-probe configuration file does not make this easy
> but that section could be patched to use a more sane structure. That
> isn't related to the LAVA support though.
>
>>>
>>>
>>> On 29 May 2017 at 16:45, Vincent Guittot <vincent.guit...@linaro.org> wrote:
>>>> On 25 May 2017 at 10:03, Neil Williams <codeh...@debian.org> wrote:
>>>>> On Wed, 24 May 2017 21:07:45 +0200
>>>>> Vincent Guittot <vincent.guit...@linaro.org> wrote:
>>>>>
>>>>>> Hi Neil,
>>>>>>
>>>>>> On 24 May 2017 7:42 PM, "Lisa Nguyen" <lisa.ngu...@linaro.org>
>>>>>> wrote:
>>>>>>
>>>>>> On 24 May 2017 at 17:02, Neil Williams <codeh...@debian.org> wrote:
>>>>>> > On Fri, 19 May 2017 17:02:14 +0100
>>>>>> > Neil Williams <codeh...@debian.org> wrote:
>>>>>> >
>>>>>> >> On Fri, 19 May 2017 16:48:11 +0100
>>>>>> >> Steve McIntyre <steve.mcint...@linaro.org> wrote:
>>>>>> >>
>>>>>> >> > Hi folks!
>>>>>> >> >
>>>>>> >> > On Wed, May 17, 2017 at 03:05:41PM +0100, Neil Williams wrote:
>>>>>> >> > >On Thu, 27 Apr 2017 08:19:19 +0100
>>>>>> >> > >Neil Williams <codeh...@debian.org> wrote:
>>>>>> >> > >
>>>>>> >> >
>>>>>> >> > I've just run a local test with an AEP inside lxc on my local
>>>>>> >> > machine. As far as I can see, there's nothing particularly magic
>>>>>> >> > going on here. The only problem I see is Lisa's config file
>>>>>> >> > pointing at the wrong device file. arm-probe needs a ttyACM-style
>>>>>> >> > device to talk to. Using:
>>>>>> >> >
>>>>>> >> > # lxc-device -n lxc-aep-test-174524 add /dev/ttyACM0
>>>>>> >> >
>>>>>> >> > I create that device in my container. I build libwebsockets and
>>>>>> >> > the arm-probe software in the container, then
>>>>>> >> > specify /dev/ttyACM0 in the AEP config file. I can run it just
>>>>>> >> > fine:
>>>>>> >> >
>>>>>> >> > root@lxc-aep-test-174524:/arm-probe# ./arm-probe/arm-probe -C
>>>>>> >> > panda-aep.cfg -l10 -x # configuration: panda-aep.cfg
>>>>>> >> > # config_name: pandaboard
>>>>>> >> > # trigger: 0.400000V (hyst 0.200000V) 0.000000W (hyst 0.200000W)
>>>>>> >> > 400us Configuration: pandaboard
>>>>>> >> > # date: Fri, 19 May 2017 16:29:50 +0100
>>>>>> >> > # host: lxc-aep-test-174524
>>>>>> >> > #
>>>>>> >> > + /dev/ttyACM0
>>>>>> >> > Starting...
>>>>>> >> > sending start to 0
>>>>>> >> > # VDD_ALL VDD ROOT #ff0000 SoC
>>>>>> >> > #
>>>>>> >> > #
>>>>>> >> > time VDD(V) VDD(A) VDD(W)
>>>>>> >> > 0.000500 5.11 0.0474 0.24196
>>>>>> >> > 0.000600 5.11 0.0364 0.18572
>>>>>> >> > 0.000700 5.11 0.0314 0.16012
>>>>>> >> > 0.000800 5.10 0.0544 0.27734
>>>>>> >> > 0.000900 5.10 0.0234 0.11923
>>>>>> >> > 0.001000 5.11 0.0304 0.15505
>>>>>> >> > ...
>>>>>> >> >
>>>>>> >> > I don't have any problems running things and getting output here.
>>>>>> >> >
>>>>>> >> > I *have* seen two real bugs here while trying to get things
>>>>>> >> > running, though:
>>>>>> >> >
>>>>>> >> > 1. If the device specified in the config file doesn't exist, or
>>>>>> >> > is the wrong type of device, or (maybe) there is any other kind
>>>>>> >> > of problem with it, you get *no* useful feedback to say there's a
>>>>>> >> > problem. Running things under strace will show the background
>>>>>> >> > libarmep process attempt to use the device specified, but
>>>>>> >> > there's no error handling. :-(
>>>>>> >> >
>>>>>> >> > 2. The "-x" option says that the arm-probe program is meant to
>>>>>> >> > exit when you've done capturing, but it just sits there forever
>>>>>> >> > when I'm testing. I've wrapped it using the "timeout" command to
>>>>>> >> > work around that for now.
>>>>>> >> >
>>>>>> >> > If I knew where to file those bugs, I would, but it's really not
>>>>>> >> > obvious. They're really easy to reproduce, I hope...
>>>>>> >> >
>>>>>> >> > In terms of the /dev/ttyACM0 creation, the lxc-device man page
>>>>>> >> > says that it creates devices based on their existing entries on
>>>>>> >> > the host. Double-check that the host (dispatcher) has an
>>>>>> >> > appropriate /dev/ttyACM0 if you're still seeing problems?
>>>>>> >>
>>>>>> >> Steve was using staging-panda03 with the ARM Energy Probe which I'd
>>>>>> >> been using for the tests of the new code to ensure
>>>>>> >> that /dev/ttyACM0 can be attached to the LXC.
>>>>>> >>
>>>>>> >> That panda and AEP will shortly return to staging and then the
>>>>>> >> changes to LAVA and the required changes to the test definition
>>>>>> >> can be available for the 2017.6 release.
>>>>>> >
>>>>>> > OK. staging-panda03 is back and has been running tests. This is what
>>>>>> > we've learnt so far:
>>>>>> >
>>>>>> > 0: This does not appear to be an LXC issue. Running the commands
>>>>>> > manually on the worker with the same LXC on the same worker does
>>>>>> > return data from the probe.
>>>>>> >
>>>>>> > 1: Running the same commands in "headless" mode shows that the probe
>>>>>> > software starts successfully but something within the protocol
>>>>>> > parser or sampler fails to retrieve data.
>>>>>>
>>>>>>
>>>>>> What do you mean by headless mode?
>>>>>
>>>>> With no controlling terminal.
>>>>>
>>>>> LAVA runs as a daemon and forks processes to run the tests. This does
>>>>> not usually cause issues and is fundamental to automation. When I run
>>>>> the same commands in an LXC as a user logged into the machine, I get
>>>>> output. When I run the commands from a daemon, the output is not seen.
>>>>
>>>> even when you redirect the output to a file ?
>>>>
>>>> On workload automation, arm_probe is called in a dedicated process
>>>> with subprocess.Popen and we are able to get data in the file.
>>>> Just wonder what could be the difference in lava case
>>>>
>>>>>
>>>>>> >
>>>>>> > 2: The websockets dependency is completely unnecessary and has been
>>>>>> > disabled in the build I've been testing:
>>>>>> > https://git.linaro.org/lava-team/arm-probe.git/
>>>>>>
>>>>>>
>>>>>> Yes. I do the same. aepd is only useful for the web interface.
>>>>>>
>>>>>>
>>>>>> >
>>>>>> > 3: We've added a *lot* of debug to the arm-probe code
>>>>>> > (https://staging.validation.linaro.org/scheduler/job/174969 which
>>>>>> > was run using
>>>>>> > https://git.linaro.org/lava-team/arm-probe.git/commit/?id=9b2958e3045da77d7db25a7cfe48359211aa4cf1)
>>>>>> > but are not much closer to identifying the precise problem with the
>>>>>> > code. However, I am satisfied that this is a problem in the
>>>>>> > arm-probe software when being run in automation.
>>>>>>
>>>>>>
>>>>>> Can you give details about "this is a problem in arm probe software
>>>>>> when being run in automation"? Do you mean workload automation?
>>>>>
>>>>> No. Not workload automation - that is a specific test framework which
>>>>> can use LAVA. I'm talking about the process of running tests on behalf
>>>>> of users without the users being logged in or interacting with the
>>>>> shell.
>>>>
>>>> ok. Just to be sure about the context
>>>>
>>>>>
>>>>>> >
>>>>>> > 4: the arm-probe code is appallingly difficult to read and debug. It
>>>>>> > also seems unnecessarily complex.
>>>>>> >
>>>>>> > 5: I plan to remove a lot of the debug from the cloned arm-probe
>>>>>> > repository (which has also had a few fixes to compile with gcc6) but
>>>>>> > I'm running out of time to work on the arm-probe software myself.
>>>>>> >
>>>>>> > Someone needs to update the arm-probe software:
>>>>>> >
>>>>>> > a) to remove websockets as a compile-time option as this only bloats
>>>>>> > the build in automation where a web based UI is impossible anyway.
>>>>>> > I've done this by brute force in my cloned repo, I just patched out
>>>>>> > the dependency.
>>>>>> >
>>>>>> > b) improve the code to have comments and output about what is
>>>>>> > happening and why when verbose mode is used.
>>>>>> >
>>>>>> > c) Identify what is preventing the software from receiving data from
>>>>>> > the probe when run in automation.
>>>>>> >
>>>>>> > d) the config file still needs fixes to allow for changes in the
>>>>>> > device node name from one probe to another.
>>>>>> >
>>>>>> > --
>>>>>>
>>>>>> CC'ing Vincent, so he can read Neil's and Steve's comments above and
>>>>>> respond (if he has anything to say) while I'm on holiday until early
>>>>>> June.
>>>>>
>>>>> Steve & I are also on annual leave next week.
>>>>>
>>>>> --
>>>>>
>>>>>
>>>>> Neil Williams
>>>>> =============
>>>>> http://www.linux.codehelp.co.uk/

_______________________________________________
linaro-validation mailing list
linaro-validation@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/linaro-validation
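
For reference, the headless run discussed in this thread (no controlling terminal, output redirected to a file, runtime bounded because "-x" did not make arm-probe exit) could be sketched roughly as below. This is only an illustration using subprocess.Popen, which the thread mentions for workload automation; it is not how LAVA or workload automation actually invoke the probe. The helper name run_headless and its parameters are hypothetical.

```python
import subprocess


def run_headless(cmd, output_path, timeout_s):
    """Run cmd without a terminal, writing stdout/stderr to output_path.

    Hypothetical helper: returns (returncode, timed_out). The process is
    killed if it is still running after timeout_s seconds.
    """
    with open(output_path, "wb") as out:
        proc = subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,   # nothing to read from, as under a daemon
            stdout=out,                 # capture the data in a file
            stderr=subprocess.STDOUT,   # keep diagnostics in the same file
            start_new_session=True,     # POSIX: new session, no controlling terminal
        )
        try:
            proc.wait(timeout=timeout_s)
            timed_out = False
        except subprocess.TimeoutExpired:
            # The command did not exit by itself; stop it and reap it.
            proc.kill()
            proc.wait()
            timed_out = True
    return proc.returncode, timed_out
```

With the command line quoted earlier in the thread, a call might look like run_headless(["./arm-probe/arm-probe", "-C", "panda-aep.cfg", "-l10", "-x"], "aep.log", timeout_s=30). The 30 second value is an assumption; the thread only establishes that 2 seconds was too short for the probe to produce data.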