Hi Dan,
Thanks for your comments.

> -----Original Message-----
> From: Dan Williams [mailto:[email protected]]
> Sent: Sunday, July 8, 2018 3:56 AM
> To: Qi, Fuli/斉 福利 <[email protected]>
> Cc: linux-nvdimm <[email protected]>
> Subject: Re: [ndctl PATCH v3] ndctl, test: add a new unit test for monitor
> 
> On Fri, Jul 6, 2018 at 10:35 PM, QI Fuli <[email protected]> wrote:
> > Add a new unit test to test the following options of the monitor command.
> >    --dimm
> >    --bus
> >    --region
> >    --namespace
> >    --logfile
> >    --config-file
> >
> > Based-on-patch-by: Yasunori Goto <[email protected]>
> > Signed-off-by: QI Fuli <[email protected]>
> > ---
> > v2 -> v3:
> >  - Add filter_obj instead of hard-coded values
> > v1 -> v2:
> >  - Add init()
> >  - Add get_filter_dimm() to get the filter dimms by ndctl list command
> >    instead of hard-cording
> >  - Add sleep to call_notify()
> >
> >  test/Makefile.am |   3 +-
> >  test/monitor.sh  | 134
> > +++++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 136 insertions(+), 1 deletion(-)  create mode 100755
> > test/monitor.sh
> 
> This test is still failing for me, here is the tail of the 
> test/test-suite.log output:
> 
> + sync
> + sleep 3
> + check_result
> ++ cat /tmp/tmp.dT3TJ4l5IE
> + jlog='nmem0: no smart support
> 

My idea of [--dimm option] unit test is try to get the first dimm on bus 
nfit_test.0, then let
monitor to run [-d] option with it, the same as:
        ndctl monitor -b nfit_test.0 -d nmem0
but in your environment, since the "nmem0" didn't support smart, then the 
monitor outputted the following error message and stopped.

> no dimms to monitor'

> ++ jq .dimm.dev
> ++ sort
> ++ uniq
> ++ sed -e ':loop; N; $!b loop; s/\n/:/g'
> ++ sed 's/\"//g'
> parse error: Invalid literal at line 1, column 6
> + notify_dimms=
> + [[ '' == '' ]]
> + stop_monitor
> + kill 39414
> ./monitor.sh: line 48: kill: (39414) - No such process
> ++ err 48
> +++ basename ./monitor.sh
> ++ echo test/monitor.sh: failed at line 48
> test/monitor.sh: failed at line 48
> ++ '[' -n '' ']'
> ++ exit 1
> FAIL monitor.sh (exit status: 1)
> 
> I notice errors around get_filter_dimm(). Why is that function needed, it 
> seems
> redundant now that $filter_obj is being used?
> 

I am sorry that the "filter_dimm" is not a suitable name, maybe changing it to 
"monitor_dimm" will be better to understand.
This function is used to get dimms to be monitored from "ndctl list -D -b 
nfit_test.0 $1".
After "./smart-notify nfit_test.0", get "notify dimms" from output 
notifications.
Then compare the "monitor dimms" with the "notify dimms", if same that means 
the unit test option works well.
I will add the a filter to make sure that the "monitor dimms" support smart 
event in next version.

> Hmm, if I just delete get_filter_dimm() I get this;
> 
> ++ jq .dimm.dev
> ++ sort
> ++ uniq
> ++ sed -e ':loop; N; $!b loop; s/\n/:/g'
> ++ sed 's/\"//g'
> + notify_dimms=nmem3
> + [[ '' == nmem3 ]]
> ++ err 34
> +++ basename ./monitor.sh
> ++ echo test/monitor.sh: failed at line 34
> test/monitor.sh: failed at line 34
> ++ '[' -n '' ']'
> ++ exit 1
> FAIL monitor.sh (exit status: 1)
> 
> I assume this test is passing for you? Can you try running it inside a 
> virtual machine
> that has a virtual NVDIMM so you can debug the collisions between the 
> nfit_test.0
> bus objects and the ACPI.NFIT ones?
> 

I passed this unit test on a virtual machine, which has a virtual NVDIMM.
Here is the buses and dimms list of my test environment.

$ sudo ndctl list -BD
[
  {
    "provider":"nfit_test.0",
    "dev":"ndbus1",
    "scrub_state":"idle",
    "dimms":[
      {
        "dev":"nmem1",
        "id":"cdab-0a-07e0-feffffff",
        "handle":1,
        "phys_id":1
      },
      {
        "dev":"nmem3",
        "id":"cdab-0a-07e0-fefeffff",
        "handle":257,
        "phys_id":3
      },
      {
        "dev":"nmem0",
        "id":"cdab-0a-07e0-ffffffff",
        "handle":0,
        "phys_id":0
      },
      {
        "dev":"nmem2",
        "id":"cdab-0a-07e0-fffeffff",
        "handle":256,
        "phys_id":2
      }
    ]
  },
  {
    "provider":"e820",
    "dev":"ndbus0"
  },
  {
    "provider":"nfit_test.1",
    "dev":"ndbus2",
    "scrub_state":"idle",
    "dimms":[
      {
        "dev":"nmem5",
        "id":"cdab-0a-07e0-fefffeff",
        "handle":65537,
        "phys_id":0,
        "flag_failed_map":true
      },
      {
        "dev":"nmem4",
        "id":"cdab-0a-07e0-fffffeff",
        "handle":65536,
        "phys_id":0,
        "flag_failed_save":true,
        "flag_failed_arm":true,
        "flag_failed_restore":true,
        "flag_failed_flush":true,
        "flag_smart_event":true
      }
    ]
  }
]

> Here is my test environment where this unit test is failing:
> 
> https://gist.github.com/djbw/144dc5ddaf5e935ad58bd66a702a5ea8
> 
> >
> > diff --git a/test/Makefile.am b/test/Makefile.am index
> > cd451e9..8c76462 100644
> > --- a/test/Makefile.am
> > +++ b/test/Makefile.am
> > @@ -21,7 +21,8 @@ TESTS =\
> >         btt-pad-compat.sh \
> >         firmware-update.sh \
> >         ack-shutdown-count-set \
> > -       rescan-partitions.sh
> > +       rescan-partitions.sh \
> > +       monitor.sh
> >
> >  check_PROGRAMS =\
> >         libndctl \
> > diff --git a/test/monitor.sh b/test/monitor.sh new file mode 100755
> > index 0000000..4a7d733
> > --- /dev/null
> > +++ b/test/monitor.sh
> > @@ -0,0 +1,134 @@
> > +#!/bin/bash -Ex
> > +
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright(c) 2018, FUJITSU LIMITED. All rights reserved.
> > +
> > +rc=77
> > +logfile=""
> > +conf_file=""
> > +filter_dimms=""
> > +filter_obj=""
> > +monitor_pid=65536
> > +
> > +. ./common
> > +
> > +trap 'err $LINENO' ERR
> > +
> > +check_min_kver "4.15" || do_skip "kernel $KVER may not support monitor 
> > service"
> > +
> > +start_monitor()
> > +{
> > +       logfile=$(mktemp)
> > +       $NDCTL monitor -l $logfile $1 &
> > +       monitor_pid=$!
> > +       truncate --size 0 $logfile #remove startup log }
> > +
> > +get_filter_dimm()
> > +{
> > +       jlist=$($NDCTL list -D -b $NFIT_TEST_BUS0 $1)
> > +       filter_dimms=$(jq '.[]."dev"?, ."dev"?' <<<$jlist | sort |
> > +uniq | sed -e ':loop; N; $!b loop; s/\n/:/g' | sed 's/\"//g') }
> 
> Can we just remove this?

As I mentioned above, I don't think so.

> 
> > +
> > +call_notify()
> > +{
> > +       ./smart-notify $NFIT_TEST_BUS0
> > +       sync; sleep 3
> > +}
> > +
> > +check_result()
> > +{
> > +       jlog=$(cat $logfile)
> > +       notify_dimms=$(jq ."dimm"."dev" <<<$jlog | sort | uniq | sed
> > +-e ':loop; N; $!b loop; s/\n/:/g' | sed 's/\"//g')
> 
> Can you add a comment here about that sed script is trying to do? If it's 
> just filtering
> json I'd rather it just jq directly.

These sed scripts are used to format "monitor dimms" and "notify dimms".
The first one is task to replace "\n" with ":", and the second one is used to 
remove ' " '.
After these processes, the format will become "nmem0:nmem1:nmem2".
_______________________________________________
Linux-nvdimm mailing list
[email protected]
https://lists.01.org/mailman/listinfo/linux-nvdimm

Reply via email to