Hi Dan,
Thanks for your comments.
> -----Original Message-----
> From: Dan Williams [mailto:[email protected]]
> Sent: Sunday, July 8, 2018 3:56 AM
> To: Qi, Fuli/斉 福利 <[email protected]>
> Cc: linux-nvdimm <[email protected]>
> Subject: Re: [ndctl PATCH v3] ndctl, test: add a new unit test for monitor
>
> On Fri, Jul 6, 2018 at 10:35 PM, QI Fuli <[email protected]> wrote:
> > Add a new unit test to test the following options of the monitor command.
> > --dimm
> > --bus
> > --region
> > --namespace
> > --logfile
> > --config-file
> >
> > Based-on-patch-by: Yasunori Goto <[email protected]>
> > Signed-off-by: QI Fuli <[email protected]>
> > ---
> > v2 -> v3:
> > - Add filter_obj instead of hard-coded values
> > v1 -> v2:
> > - Add init()
> > - Add get_filter_dimm() to get the filter dimms by ndctl list command
> > instead of hard-cording
> > - Add sleep to call_notify()
> >
> > test/Makefile.am | 3 +-
> > test/monitor.sh | 134
> > +++++++++++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 136 insertions(+), 1 deletion(-) create mode 100755
> > test/monitor.sh
>
> This test is still failing for me, here is the tail of the
> test/test-suite.log output:
>
> + sync
> + sleep 3
> + check_result
> ++ cat /tmp/tmp.dT3TJ4l5IE
> + jlog='nmem0: no smart support
>
My idea of [--dimm option] unit test is try to get the first dimm on bus
nfit_test.0, then let
monitor to run [-d] option with it, the same as:
ndctl monitor -b nfit_test.0 -d nmem0
but in your environment, since the "nmem0" didn't support smart, then the
monitor outputted the following error message and stopped.
> no dimms to monitor'
> ++ jq .dimm.dev
> ++ sort
> ++ uniq
> ++ sed -e ':loop; N; $!b loop; s/\n/:/g'
> ++ sed 's/\"//g'
> parse error: Invalid literal at line 1, column 6
> + notify_dimms=
> + [[ '' == '' ]]
> + stop_monitor
> + kill 39414
> ./monitor.sh: line 48: kill: (39414) - No such process
> ++ err 48
> +++ basename ./monitor.sh
> ++ echo test/monitor.sh: failed at line 48
> test/monitor.sh: failed at line 48
> ++ '[' -n '' ']'
> ++ exit 1
> FAIL monitor.sh (exit status: 1)
>
> I notice errors around get_filter_dimm(). Why is that function needed, it
> seems
> redundant now that $filter_obj is being used?
>
I am sorry that the "filter_dimm" is not a suitable name, maybe changing it to
"monitor_dimm" will be better to understand.
This function is used to get dimms to be monitored from "ndctl list -D -b
nfit_test.0 $1".
After "./smart-notify nfit_test.0", get "notify dimms" from output
notifications.
Then compare the "monitor dimms" with the "notify dimms", if same that means
the unit test option works well.
I will add the a filter to make sure that the "monitor dimms" support smart
event in next version.
> Hmm, if I just delete get_filter_dimm() I get this;
>
> ++ jq .dimm.dev
> ++ sort
> ++ uniq
> ++ sed -e ':loop; N; $!b loop; s/\n/:/g'
> ++ sed 's/\"//g'
> + notify_dimms=nmem3
> + [[ '' == nmem3 ]]
> ++ err 34
> +++ basename ./monitor.sh
> ++ echo test/monitor.sh: failed at line 34
> test/monitor.sh: failed at line 34
> ++ '[' -n '' ']'
> ++ exit 1
> FAIL monitor.sh (exit status: 1)
>
> I assume this test is passing for you? Can you try running it inside a
> virtual machine
> that has a virtual NVDIMM so you can debug the collisions between the
> nfit_test.0
> bus objects and the ACPI.NFIT ones?
>
I passed this unit test on a virtual machine, which has a virtual NVDIMM.
Here is the buses and dimms list of my test environment.
$ sudo ndctl list -BD
[
{
"provider":"nfit_test.0",
"dev":"ndbus1",
"scrub_state":"idle",
"dimms":[
{
"dev":"nmem1",
"id":"cdab-0a-07e0-feffffff",
"handle":1,
"phys_id":1
},
{
"dev":"nmem3",
"id":"cdab-0a-07e0-fefeffff",
"handle":257,
"phys_id":3
},
{
"dev":"nmem0",
"id":"cdab-0a-07e0-ffffffff",
"handle":0,
"phys_id":0
},
{
"dev":"nmem2",
"id":"cdab-0a-07e0-fffeffff",
"handle":256,
"phys_id":2
}
]
},
{
"provider":"e820",
"dev":"ndbus0"
},
{
"provider":"nfit_test.1",
"dev":"ndbus2",
"scrub_state":"idle",
"dimms":[
{
"dev":"nmem5",
"id":"cdab-0a-07e0-fefffeff",
"handle":65537,
"phys_id":0,
"flag_failed_map":true
},
{
"dev":"nmem4",
"id":"cdab-0a-07e0-fffffeff",
"handle":65536,
"phys_id":0,
"flag_failed_save":true,
"flag_failed_arm":true,
"flag_failed_restore":true,
"flag_failed_flush":true,
"flag_smart_event":true
}
]
}
]
> Here is my test environment where this unit test is failing:
>
> https://gist.github.com/djbw/144dc5ddaf5e935ad58bd66a702a5ea8
>
> >
> > diff --git a/test/Makefile.am b/test/Makefile.am index
> > cd451e9..8c76462 100644
> > --- a/test/Makefile.am
> > +++ b/test/Makefile.am
> > @@ -21,7 +21,8 @@ TESTS =\
> > btt-pad-compat.sh \
> > firmware-update.sh \
> > ack-shutdown-count-set \
> > - rescan-partitions.sh
> > + rescan-partitions.sh \
> > + monitor.sh
> >
> > check_PROGRAMS =\
> > libndctl \
> > diff --git a/test/monitor.sh b/test/monitor.sh new file mode 100755
> > index 0000000..4a7d733
> > --- /dev/null
> > +++ b/test/monitor.sh
> > @@ -0,0 +1,134 @@
> > +#!/bin/bash -Ex
> > +
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright(c) 2018, FUJITSU LIMITED. All rights reserved.
> > +
> > +rc=77
> > +logfile=""
> > +conf_file=""
> > +filter_dimms=""
> > +filter_obj=""
> > +monitor_pid=65536
> > +
> > +. ./common
> > +
> > +trap 'err $LINENO' ERR
> > +
> > +check_min_kver "4.15" || do_skip "kernel $KVER may not support monitor
> > service"
> > +
> > +start_monitor()
> > +{
> > + logfile=$(mktemp)
> > + $NDCTL monitor -l $logfile $1 &
> > + monitor_pid=$!
> > + truncate --size 0 $logfile #remove startup log }
> > +
> > +get_filter_dimm()
> > +{
> > + jlist=$($NDCTL list -D -b $NFIT_TEST_BUS0 $1)
> > + filter_dimms=$(jq '.[]."dev"?, ."dev"?' <<<$jlist | sort |
> > +uniq | sed -e ':loop; N; $!b loop; s/\n/:/g' | sed 's/\"//g') }
>
> Can we just remove this?
As I mentioned above, I don't think so.
>
> > +
> > +call_notify()
> > +{
> > + ./smart-notify $NFIT_TEST_BUS0
> > + sync; sleep 3
> > +}
> > +
> > +check_result()
> > +{
> > + jlog=$(cat $logfile)
> > + notify_dimms=$(jq ."dimm"."dev" <<<$jlog | sort | uniq | sed
> > +-e ':loop; N; $!b loop; s/\n/:/g' | sed 's/\"//g')
>
> Can you add a comment here about that sed script is trying to do? If it's
> just filtering
> json I'd rather it just jq directly.
These sed scripts are used to format "monitor dimms" and "notify dimms".
The first one is task to replace "\n" with ":", and the second one is used to
remove ' " '.
After these processes, the format will become "nmem0:nmem1:nmem2".
_______________________________________________
Linux-nvdimm mailing list
[email protected]
https://lists.01.org/mailman/listinfo/linux-nvdimm