On 6 June 2017 at 14:03, Neil Williams <neil.willi...@linaro.org> wrote:
> On 6 June 2017 at 12:53, Vincent Guittot <vincent.guit...@linaro.org> wrote:
>> On 6 June 2017 at 13:38, Neil Williams <neil.willi...@linaro.org> wrote:
>>> This problem has been resolved inside the arm-probe configuration; it
>>> is not a fault within LAVA. There was a concern that the probe was not
>>> showing data output because of a theoretical problem of running
>>> daemonized instead of with a controlling terminal. The actual problem
>>> was that the probe software is running more slowly than expected and
>>> extending the runtime of the utility allows the probe to output data.
>>> https://staging.validation.linaro.org/scheduler/job/175033#L2038
>>> https://git.linaro.org/lava-team/refactoring.git/commit/?id=7916e6c3db5188e2c2e96da0b666a36ab3e8ffeb
>>
>> OK, so the 2-second timeout was your problem.
>
> That and the problem with the config file.

ok

>
>>> (The verbose option was later dropped to output only the interesting data.)
>>>
>>> The configuration file in the git repo needs to be modified.
>>>
>>> https://git.linaro.org/lava-team/refactoring.git/tree/testdefs/aep-config?id=e08f0bed2c3561421bc2f430ab2e38f1b659e2fd
>>
>> Can you point out which modification was needed? I can't see any
>> obvious difference except the use of /dev/ttyACM0 instead of
>> /dev/serial/by-id/usb-NXP_SEMICOND_ARM_Energy_Probe_S_NO44440001-if00.
>> Is that the difference?
>
> Yes, because inside the LXC, /dev/serial/by-id does not get created
> (there is no udev support for that inside containers).
>
>> What about using 2 AEPs ?
>
> That would have to be fixed either in the test shell definitions (e.g.
> using parameters passed through the test job) or within the arm-probe
> code itself. I have no idea at this stage whether the arm-probe
> software can cope with multiple probes - in LAVA that would likely

arm-probe supports multiple AEPs, and we are using it with multiple AEPs
on the mtk8173 evb.
arm-probe just relies on the config file to get the path of each AEP. I
have put the content of the config file below:

# arm-probe configuration file
#
# setup name
mt8173-evb

# <device path>
/dev/serial/by-id/usb-NXP_SEMICOND_ARM_Energy_Probe_S_NO81730001-if00
 VDD_CA57_0 0.500000 1 -0.179000 13.363000 -0.000000 0.163300 0
SoC/A57/Cache A57_CACHE #ff0000 SoC
 VDD_CA57_1 0.100000 2 -0.179000 13.363000 -0.000000 0.163300 0
SoC/A57/Core0 A57_CORE #ff0000 SoC
 VDD_CA57_2 0.100000 3 -0.179000 13.363000 -0.000000 0.163300 0
SoC/A57/Core1 A57_CORE #ff0000 SoC

/dev/serial/by-id/usb-NXP_SEMICOND_ARM_Energy_Probe_S_NO81730000-if00
 VDD_CA53_0 0.500000 1 -0.179000 13.363000 -0.000000 0.163300 0
SoC/A53/Cache A53_CACHE #ff0000 SoC
 VDD_CA53_1 0.100000 2 -0.179000 13.363000 -0.000000 0.163300 0
SoC/A53/Core0 A53_CORE #ff0000 SoC
 VDD_CA53_2 0.100000 3 -0.179000 13.363000 -0.000000 0.163300 0
SoC/A53/Core1 A53_CORE #ff0000 SoC
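
As an illustration only: since /dev/serial/by-id is not created inside the
LXC, one possible approach for a multi-probe config like the one above is to
resolve each by-id path to its real device node on the host before the file
is used in the container. The sketch below is hypothetical and not part of
arm-probe or LAVA. It assumes that any line starting with "/" is a device
path and that every other line can be copied unchanged; the function name
rewrite_device_paths is invented for this example.

```python
import os


def rewrite_device_paths(config_text, resolve=os.path.realpath):
    """Return config_text with each device-path line passed through resolve().

    Assumption: a device-path line is any line whose first character is '/'.
    Comments, the setup name and channel lines are copied unchanged.
    """
    out = []
    for line in config_text.splitlines():
        if line.startswith("/"):
            # e.g. /dev/serial/by-id/... -> /dev/ttyACM0 when run on the host
            out.append(resolve(line.strip()))
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

Run on the host (dispatcher), os.path.realpath follows the by-id symlink to
the ttyACM node that can then be added to the container with lxc-device.
Note that the ttyACM numbering is not guaranteed to stay the same across
reboots or replugs, so the rewrite would have to happen per job.
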

> need secondary connections and MultiNode to separate the output.

Is that something Lisa can do by herself, or does it need some
changes on your side?

Regards,
Vincent

>
> The syntax of the arm-probe configuration file does not make this easy
> but that section could be patched to use a more sane structure. That
> isn't related to the LAVA support though.
>
>>>
>>>
>>> On 29 May 2017 at 16:45, Vincent Guittot <vincent.guit...@linaro.org> wrote:
>>>> On 25 May 2017 at 10:03, Neil Williams <codeh...@debian.org> wrote:
>>>>> On Wed, 24 May 2017 21:07:45 +0200
>>>>> Vincent Guittot <vincent.guit...@linaro.org> wrote:
>>>>>
>>>>>> Hi Neil,
>>>>>>
>>>>>> On 24 May 2017 at 7:42 PM, "Lisa Nguyen" <lisa.ngu...@linaro.org>
>>>>>> wrote:
>>>>>>
>>>>>> On 24 May 2017 at 17:02, Neil Williams <codeh...@debian.org> wrote:
>>>>>> > On Fri, 19 May 2017 17:02:14 +0100
>>>>>> > Neil Williams <codeh...@debian.org> wrote:
>>>>>> >
>>>>>> >> On Fri, 19 May 2017 16:48:11 +0100
>>>>>> >> Steve McIntyre <steve.mcint...@linaro.org> wrote:
>>>>>> >>
>>>>>> >> > Hi folks!
>>>>>> >> >
>>>>>> >> > On Wed, May 17, 2017 at 03:05:41PM +0100, Neil Williams wrote:
>>>>>> >> > >On Thu, 27 Apr 2017 08:19:19 +0100
>>>>>> >> > >Neil Williams <codeh...@debian.org> wrote:
>>>>>> >> > >
>>>>>> >> >
>>>>>> >> > I've just run a local test with an AEP inside lxc on my local
>>>>>> >> > machine. As far as I can see, there's nothing particularly magic
>>>>>> >> > going on here. The only problem I see is Lisa's config file
>>>>>> >> > pointing at the wrong device file. arm-probe needs a ttyACM-style
>>>>>> >> > device to talk to. Using:
>>>>>> >> >
>>>>>> >> > # lxc-device -n lxc-aep-test-174524 add /dev/ttyACM0
>>>>>> >> >
>>>>>> >> > I create that device in my container. I build libwebsockets and
>>>>>> >> > the arm-probe software in the container, then
>>>>>> >> > specify /dev/ttyACM0 in the AEP config file. I can run it just
>>>>>> >> > fine:
>>>>>> >> >
>>>>>> >> > root@lxc-aep-test-174524:/arm-probe# ./arm-probe/arm-probe -C panda-aep.cfg -l10 -x
>>>>>> >> > # configuration: panda-aep.cfg
>>>>>> >> > # config_name: pandaboard
>>>>>> >> > # trigger: 0.400000V (hyst 0.200000V) 0.000000W (hyst 0.200000W) 400us
>>>>>> >> > Configuration: pandaboard
>>>>>> >> > # date: Fri, 19 May 2017 16:29:50 +0100
>>>>>> >> > # host: lxc-aep-test-174524
>>>>>> >> > #
>>>>>> >> > + /dev/ttyACM0
>>>>>> >> > Starting...
>>>>>> >> > sending start to 0
>>>>>> >> > # VDD_ALL       VDD     ROOT    #ff0000 SoC
>>>>>> >> > #
>>>>>> >> > #
>>>>>> >> > time  VDD(V) VDD(A) VDD(W)
>>>>>> >> > 0.000500  5.11 0.0474 0.24196
>>>>>> >> > 0.000600  5.11 0.0364 0.18572
>>>>>> >> > 0.000700  5.11 0.0314 0.16012
>>>>>> >> > 0.000800  5.10 0.0544 0.27734
>>>>>> >> > 0.000900  5.10 0.0234 0.11923
>>>>>> >> > 0.001000  5.11 0.0304 0.15505
>>>>>> >> > ...
>>>>>> >> >
>>>>>> >> > I don't have any problems running things and getting output here.
>>>>>> >> >
>>>>>> >> > I *have* seen two real bugs here while trying to get things
>>>>>> >> > running, though:
>>>>>> >> >
>>>>>> >> >  1. If the device specified in the config file doesn't exist, or
>>>>>> >> >     is the wrong type of device, or (maybe) there is any other
>>>>>> >> >     kind of problem with it, you get *no* useful feedback to say
>>>>>> >> >     there's a problem. Running things under strace will show the
>>>>>> >> >     background libarmep process attempt to use the device
>>>>>> >> >     specified, but there's no error handling. :-(
>>>>>> >> >
>>>>>> >> >  2. The "-x" option says that the arm-probe program is meant to
>>>>>> >> >     exit when you've done capturing, but it just sits there
>>>>>> >> >     forever when I'm testing. I've wrapped it using the "timeout"
>>>>>> >> >     command to work around that for now.
>>>>>> >> >
>>>>>> >> > If I knew where to file those bugs, I would, but it's really not
>>>>>> >> > obvious. They're really easy to reproduce, I hope...
>>>>>> >> >
>>>>>> >> > In terms of the /dev/ttyACM0 creation, the lxc-device man page
>>>>>> >> > says that it creates devices based on their existing entries on
>>>>>> >> > the host. Double-check that the host (dispatcher) has an
>>>>>> >> > appropriate /dev/ttyACM0 if you're still seeing problems?
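
As an illustration only, the two workarounds described above (checking the
device node up front because arm-probe gives no feedback, and bounding the
runtime because "-x" does not exit) could be sketched in a small wrapper.
This is a hypothetical helper, not part of arm-probe or LAVA; the function
names and the choice of timeout are assumptions made for the example.

```python
import os
import stat
import subprocess


def check_probe_device(path):
    """Raise a descriptive error unless path is an existing character device."""
    try:
        mode = os.stat(path).st_mode
    except OSError as exc:
        raise RuntimeError(f"probe device {path!r} is not usable: {exc}") from exc
    if not stat.S_ISCHR(mode):
        raise RuntimeError(f"probe device {path!r} is not a character device")


def run_with_timeout(cmd, seconds):
    """Run cmd, killing it after `seconds`.

    Returns (captured output as bytes, timed_out flag).
    """
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=seconds,
        )
        return done.stdout, False
    except subprocess.TimeoutExpired as exc:
        # Output collected before the kill may be None on some platforms.
        return exc.output or b"", True
```

A caller might run check_probe_device("/dev/ttyACM0") first and then
run_with_timeout(["./arm-probe/arm-probe", "-C", "panda-aep.cfg", "-l10",
"-x"], 30), where the 30 seconds is an arbitrary value chosen for the
example and would need to be long enough for the probe to produce data.
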
>>>>>> >>
>>>>>> >> Steve was using staging-panda03 with the ARM Energy Probe which I'd
>>>>>> >> been using for the tests of the new code to ensure
>>>>>> >> that /dev/ttyACM0 can be attached to the LXC.
>>>>>> >>
>>>>>> >> That panda and AEP will shortly return to staging and then the
>>>>>> >> changes to LAVA and the required changes to the test definition
>>>>>> >> can be available for the 2017.6 release.
>>>>>> >
>>>>>> > OK. staging-panda03 is back and has been running tests. This is what
>>>>>> > we've learnt so far:
>>>>>> >
>>>>>> > 0: This does not appear to be an LXC issue. Running the commands
>>>>>> > manually on the worker with the same LXC on the same worker does
>>>>>> > return data from the probe.
>>>>>> >
>>>>>> > 1: Running the same commands in "headless" mode shows that the probe
>>>>>> > software starts successfully but something within the protocol
>>>>>> > parser or sampler fails to retrieve data.
>>>>>>
>>>>>>
>>>>>> What do you mean by headless mode?
>>>>>
>>>>> With no controlling terminal.
>>>>>
>>>>> LAVA runs as a daemon and forks processes to run the tests. This does
>>>>> not usually cause issues and is fundamental to automation. When I run
>>>>> the same commands in an LXC as a user logged into the machine, I get
>>>>> output. When I run the commands from a daemon, the output is not seen.
>>>>
>>>> Even when you redirect the output to a file?
>>>>
>>>> In workload automation, arm_probe is called in a dedicated process
>>>> with subprocess.Popen and we are able to get data in the file.
>>>> I just wonder what the difference could be in the LAVA case.
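
As an illustration only, one way to compare the two situations outside LAVA
is to start the command with subprocess.Popen, redirect its output to a
file, and detach it from any controlling terminal. The sketch below is
hypothetical: it is not the code used by workload automation or by LAVA, the
function name capture_to_file is invented, and start_new_session=True
(POSIX only) is used here as an approximation of running from a daemon.

```python
import subprocess


def capture_to_file(cmd, out_path, seconds):
    """Run cmd without a controlling terminal, writing its output to out_path.

    The process is terminated if it is still running after `seconds`.
    Returns the exit status of the process.
    """
    with open(out_path, "wb") as out:
        proc = subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=out,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # calls setsid(): no controlling terminal
        )
        try:
            proc.wait(timeout=seconds)
        except subprocess.TimeoutExpired:
            proc.terminate()
            proc.wait()
    return proc.returncode
```

If the output file stays empty only when the runtime is short, that would
point at the capture duration rather than at the missing terminal.
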
>>>>
>>>>>
>>>>>> >
>>>>>> > 2: The websockets dependency is completely unnecessary and has been
>>>>>> > disabled in the build I've been testing:
>>>>>> > https://git.linaro.org/lava-team/arm-probe.git/
>>>>>>
>>>>>>
>>>>>> Yes. I do the same. aepd is only useful for the web interface.
>>>>>>
>>>>>>
>>>>>> >
>>>>>> > 3: We've added a *lot* of debug to the arm-probe code
>>>>>> > (https://staging.validation.linaro.org/scheduler/job/174969 which
>>>>>> > was run using
>>>>>> > https://git.linaro.org/lava-team/arm-probe.git/commit/?id=9b2958e3045da77d7db25a7cfe48359211aa4cf1)
>>>>>> > but are not much closer to identifying the precise problem with the
>>>>>> > code. However, I am satisfied that this is a problem in the
>>>>>> > arm-probe software when being run in automation.
>>>>>>
>>>>>>
>>>>>> Can you give details about "this is a problem in arm probe software
>>>>>> when being run in automation"? Do you mean workload automation?
>>>>>
>>>>> No. Not workload automation - that is a specific test framework which
>>>>> can use LAVA. I'm talking about the process of running tests on behalf
>>>>> of users without the users being logged in or interacting with the
>>>>> shell.
>>>>
>>>> ok. Just to be sure about the context
>>>>
>>>>>
>>>>>> >
>>>>>> > 4: the arm-probe code is appallingly difficult to read and debug. It
>>>>>> > also seems unnecessarily complex.
>>>>>> >
>>>>>> > 5: I plan to remove a lot of the debug from the cloned arm-probe
>>>>>> > repository (which has also had a few fixes to compile with gcc6) but
>>>>>> > I'm running out of time to work on the arm-probe software myself.
>>>>>> >
>>>>>> > Someone needs to update the arm-probe software:
>>>>>> >
>>>>>> > a) to remove websockets as a compile-time option as this only bloats
>>>>>> > the build in automation where a web based UI is impossible anyway.
>>>>>> > I've done this by brute force in my cloned repo, I just patched out
>>>>>> > the dependency.
>>>>>> >
>>>>>> > b) improve the code to have comments and output about what is
>>>>>> > happening and why when verbose mode is used.
>>>>>> >
>>>>>> > c) Identify what is preventing the software from receiving data from
>>>>>> > the probe when run in automation.
>>>>>> >
>>>>>> > d) the config file still needs fixes to allow for changes in the
>>>>>> > device node name from one probe to another.
>>>>>> >
>>>>>> > --
>>>>>>
>>>>>> CC'ing Vincent, so he can read Neil's and Steve's comments above and
>>>>>> respond (if he has anything to say) while I'm on holiday until early
>>>>>> June.
>>>>>
>>>>> Steve & I are also on annual leave next week.
>>>>>
>>>>> --
>>>>>
>>>>>
>>>>> Neil Williams
>>>>> =============
>>>>> http://www.linux.codehelp.co.uk/
>>>>>
>>>
>>>
>>>
>>> --
>>>
>>> Neil Williams
>>> =============
>>> neil.willi...@linaro.org
>>> http://www.linux.codehelp.co.uk/
>
>
>
> --
>
> Neil Williams
> =============
> neil.willi...@linaro.org
> http://www.linux.codehelp.co.uk/
_______________________________________________
linaro-validation mailing list
linaro-validation@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/linaro-validation
