[TS CUBIC CARD] Usage Confirmation

2021-04-30 Thread TS CUBIC CARD, ENEOS Card
Dear ENEOS Card and TS CUBIC CARD customers,
Thank you for your continued patronage.
We have identified a transaction that we need to confirm was made by you, and have therefore taken the liberty of partially restricting use of your card and of contacting you.
Accordingly, please access the address below and help us confirm your card usage.
Please access the dedicated URL below:
https://www.my.ts3catb.com.bhdhh.com/
We sincerely apologize for any inconvenience or concern this may cause.
We ask for your kind understanding.

┏━┓
 ■ This email was sent from a send-only address; we are unable to
   respond to replies sent to this address.
   If you do not recognize this email, we apologize for the trouble,
   but please contact the inquiry desk below by telephone.
 
 ■ Issued by: TS CUBIC CARD
   https://tscubic.com/
  Toyota Finance Corporation
  〒451-6014 6-1 Ushijima-cho, Nishi-ku, Nagoya, Aichi
 ■ Inquiries about this email:
 ● Holders of a TS CUBIC CARD or TS CUBIC VIEW CARD from TOYOTA,
   DAIHATSU, James, Toyota Rent-a-Car, or FDC:
   Information Desk
   [Tokyo]  03-5617-2511
   [Nagoya] 052-239-2511
   (9:00-17:30, open year-round except the New Year holidays)
 ● All other cardholders: please call the card inquiry number printed
   on the back of your card.
 
┗━┛


[bug report] system panic at nfit_get_smbios_id+0x6e/0xf0 [nfit] during boot

2021-04-30 Thread Yi Zhang
Hi,

With the latest Linux tree, my DCPMM server fails to boot with the
panic log below. Please help check it, and let me know if you need any
testing for it.

[   15.882889] BUG: unable to handle page fault for address: ffa8
[   15.889761] #PF: supervisor read access in kernel mode
[   15.894900] #PF: error_code(0x) - not-present page
[   15.900039] PGD fc2813067 P4D fc2813067 PUD fc2815067 PMD 0
[   15.905697] Oops:  [#1] SMP NOPTI
[   15.909364] CPU: 22 PID: 1024 Comm: systemd-udevd Tainted: G
  I   5.12.0+ #1
[   15.917448] Hardware name: Dell Inc. PowerEdge R640/06NR82, BIOS
2.10.0 11/12/2020
[   15.925013] RIP: 0010:nfit_get_smbios_id+0x6e/0xf0 [nfit]
[   15.930413] Code: b1 f3 49 8b 84 24 c0 00 00 00 49 8d 8c 24 c0 00
00 00 48 8d 50 a0 48 39 c1 75 0f eb 49 48 8b 42 60 48 8d 50 a0 48 39
c1 74 3c <48> 8b 42 08 48 85 c0 75 04 48 8b 42 10 39 58 04 75 e1 0f b7
50 2c
[   15.949160] RSP: 0018:9c28c284bb10 EFLAGS: 00010286
[   15.954383] RAX:  RBX: 0020 RCX: 897b832d8cd8
[   15.961507] RDX: ffa0 RSI: 9c28c284bb46 RDI: 897b832d8c98
[   15.968631] RBP: 9c28c284bb46 R08: c08c982c R09: 9c28c284bb6c
[   15.975763] R10: 0058 R11: 897b4bfb0aee R12: 897b832d8c18
[   15.982888] R13: 897b832d8c98 R14: 897b4bfb0038 R15: 897b4bfb1800
[   15.990021] FS:  7fa1960ab180() GS:898a7ff8()
knlGS:
[   15.998107] CS:  0010 DS:  ES:  CR0: 80050033
[   16.003854] CR2: ffa8 CR3: 0001086ac004 CR4: 007706e0
[   16.010984] DR0:  DR1:  DR2: 
[   16.018119] DR3:  DR6: fffe0ff0 DR7: 0400
[   16.025249] PKRU: 5554
[   16.027954] Call Trace:
[   16.030407]  skx_get_nvdimm_info+0x56/0x130 [skx_edac]
[   16.035546]  skx_get_dimm_config+0x1f5/0x213 [skx_edac]
[   16.040770]  skx_register_mci+0x132/0x1c0 [skx_edac]
[   16.045737]  ? skx_show_retry_rd_err_log+0x190/0x190 [skx_edac]
[   16.051657]  skx_init+0x344/0xe87 [skx_edac]
[   16.055930]  ? skx_adxl_get+0x179/0x179 [skx_edac]
[   16.060722]  do_one_initcall+0x41/0x1d0
[   16.064560]  ? __cond_resched+0x15/0x30
[   16.068399]  ? kmem_cache_alloc_trace+0x3d/0x420
[   16.073019]  do_init_module+0x5a/0x240
[   16.076771]  load_module+0x1b5f/0x1c40
[   16.080525]  ? __kernel_read+0x14a/0x2c0
[   16.084450]  ? __do_sys_finit_module+0xad/0x110
[   16.088982]  __do_sys_finit_module+0xad/0x110
[   16.093343]  do_syscall_64+0x39/0x80
[   16.096921]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   16.101975] RIP: 0033:0x7fa194c8852d
[   16.105554] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e
fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24
08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2b 79 2c 00 f7 d8 64 89
01 48
[   16.124298] RSP: 002b:7fff5daec098 EFLAGS: 0246 ORIG_RAX:
0139
[   16.131864] RAX: ffda RBX: 55ccd2937dc0 RCX: 7fa194c8852d
[   16.138996] RDX:  RSI: 7fa1957fc86d RDI: 001c
[   16.146129] RBP: 7fa1957fc86d R08:  R09: 7fff5daec1c0
[   16.153260] R10: 001c R11: 0246 R12: 
[   16.160392] R13: 55ccd2838c30 R14: 0002 R15: 
[   16.167527] Modules linked in: skx_edac(+) x86_pkg_temp_thermal
intel_powerclamp coretemp kvm_intel ipmi_ssif mgag200 i2c_algo_bit kvm
drm_kms_helper iTCO_wdt iTCO_vendor_support syscopyarea sysfillrect
sysimgblt fb_sys_fops drm irqbypass crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel acpi_ipmi rapl ipmi_si mei_me intel_cstate
intel_uncore mei i2c_i801 pcspkr wmi_bmof ipmi_devintf
intel_pch_thermal i2c_smbus lpc_ich ipmi_msghandler acpi_power_meter
ip_tables xfs libcrc32c sd_mod t10_pi sg ahci libahci libata tg3
megaraid_sas crc32c_intel nfit wmi libnvdimm dm_mirror dm_region_hash
dm_log dm_mod
[   16.220349] CR2: ffa8
[   16.223674] ---[ end trace 3e1fbf6e28c10643 ]---
[   16.231424] RIP: 0010:nfit_get_smbios_id+0x6e/0xf0 [nfit]
[   16.236822] Code: b1 f3 49 8b 84 24 c0 00 00 00 49 8d 8c 24 c0 00
00 00 48 8d 50 a0 48 39 c1 75 0f eb 49 48 8b 42 60 48 8d 50 a0 48 39
c1 74 3c <48> 8b 42 08 48 85 c0 75 04 48 8b 42 10 39 58 04 75 e1 0f b7
50 2c
[   16.255568] RSP: 0018:9c28c284bb10 EFLAGS: 00010286
[   16.260794] RAX:  RBX: 0020 RCX: 897b832d8cd8
[   16.267925] RDX: ffa0 RSI: 9c28c284bb46 RDI: 897b832d8c98
[   16.275057] RBP: 9c28c284bb46 R08: c08c982c R09: 9c28c284bb6c
[   16.282189] R10: 0058 R11: 897b4bfb0aee R12: 897b832d8c18
[   16.289313] R13: 897b832d8c98 R14: 897b4bfb0038 R15: 897b4bfb1800
[   16.296440] FS:  7fa1960ab180() GS:898a7ff8()
knlGS:
[   16.304525] CS:  0010 DS:  ES:  CR0: 80050033
[   16.310271] CR2: ffa8 
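
For readers decoding the oops (note the archive has truncated the leading f's in the hex values): the faulting instructions ("mov rax,[rdx+0x60]; lea rdx,[rax-0x60]" followed by the faulting "mov rax,[rdx+0x8]") together with RDX/CR2 ending in ...ffa0/...ffa8 are consistent with a list walk that followed a NULL next pointer, where list_entry()/container_of() turns NULL into a small negative address. A minimal userspace illustration -- the struct layout is invented; only the 0x60 member offset and the +0x8 field read are taken from the disassembly above:

#include <stddef.h>
#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical entry type with its list_head at offset 0x60, matching
 * the "lea rdx,[rax-0x60]" in the faulting code. */
struct entry_sketch {
	char pad[0x60];
	struct list_head node;	/* offsetof(struct entry_sketch, node) == 0x60 */
};

#define list_entry_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

int main(void)
{
	/*
	 * If a node's next pointer is still NULL (list head never
	 * initialized, or the element never linked in), list_entry()
	 * computes (char *)0 - 0x60 = 0xffffffffffffffa0; reading a
	 * field 8 bytes into that bogus struct then faults at
	 * 0xffffffffffffffa8 -- the RDX/CR2 pattern in the oops.
	 */
	struct list_head *next = NULL;
	struct entry_sketch *bogus =
		list_entry_sketch(next, struct entry_sketch, node);

	printf("bogus entry computed at %p\n", (void *)bogus);
	/* dereferencing bogus here would reproduce the page fault */
	return 0;
}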

Urgent PO

2021-04-30 Thread Accountant Assistant
Dear linux-nvdimm,


Re: [PATCH v4 0/3] nvdimm: Enable sync-dax property for nvdimm

2021-04-30 Thread Dan Williams
Some corrections to terminology confusion below...


On Wed, Apr 28, 2021 at 8:49 PM Shivaprasad G Bhat  wrote:
>
> The nvdimm devices are expected to ensure write persistence during power
> failure kind of scenarios.

No, QEMU is not expected to make that guarantee. QEMU is free to lie
to the guest about the persistence guarantees of the guest PMEM
ranges. It's more accurate to say that QEMU nvdimm devices can emulate
persistent memory and optionally pass through host power-fail
persistence guarantees to the guest. The power-fail persistence domain
can be one of "cpu_cache", or "memory_controller" if the persistent
memory region is "synchronous". If the persistent range is not
synchronous, it really isn't "persistent memory"; it's memory mapped
storage that needs I/O commands to flush.
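
As a rough illustration of the pass-through case (option spellings as of the QEMU 5.x series; the paths and sizes here are placeholders, not an example from the thread):

qemu-system-x86_64 \
  -machine pc,nvdimm=on \
  -m 4G,slots=2,maxmem=8G \
  -object memory-backend-file,id=mem1,share=on,pmem=on,mem-path=/dev/dax0.0,size=2G,align=2M \
  -device nvdimm,id=nv1,memdev=mem1

With pmem=on, QEMU is expected to reject backing stores that cannot honor persistence; without it, the guest sees an "nvdimm" whose power-fail behavior is only as good as the host page cache.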

> The libpmem has architecture specific instructions like dcbf on POWER

Which "libpmem" is this? PMDK is a reference library not a PMEM
interface... maybe I'm missing what libpmem has to do with QEMU?
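
For context on what libpmem actually provides: pmem_persist() issues the user-space cache-flush instructions (clwb/sfence on x86, roughly dcbf plus a sync on POWER) when the mapping is genuine persistent memory, and pmem_msync() falls back to msync(). A minimal sketch against the PMDK libpmem API; the path is illustrative:

#include <libpmem.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	size_t mapped_len;
	int is_pmem;
	/* pmem_map_file() mmap()s the file and reports via is_pmem
	 * whether CPU cache flushes alone reach the persistence domain */
	char *addr = pmem_map_file("/mnt1/newfile", 4096, PMEM_FILE_CREATE,
				   0644, &mapped_len, &is_pmem);
	if (addr == NULL) {
		perror("pmem_map_file");
		return 1;
	}
	strcpy(addr, "hello");
	if (is_pmem)
		pmem_persist(addr, 6);	/* user-space cache flushes only */
	else
		pmem_msync(addr, 6);	/* falls back to msync()/fsync() */
	pmem_unmap(addr, mapped_len);
	return 0;
}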

> to flush the cache data to backend nvdimm device during normal writes
> followed by explicit flushes if the backend devices are not synchronous
> DAX capable.
>
> Qemu - virtual nvdimm devices are memory mapped. The dcbf in the guest
> and the subsequent flush doesn't traslate to actual flush to the backend

s/traslate/translate/

> file on the host in case of file backed v-nvdimms. This is addressed by
> virtio-pmem in case of x86_64 by making explicit flushes translating to
> fsync at qemu.

Note that virtio-pmem was a proposal for a specific optimization of
allowing guests to share page cache. The virtio-pmem approach is not
to be confused with actual persistent memory.

> On SPAPR, the issue is addressed by adding a new hcall to
> request for an explicit flush from the guest ndctl driver when the backend

What is an "ndctl" driver? ndctl is userspace tooling, do you mean the
guest pmem driver?

> nvdimm cannot ensure write persistence with dcbf alone. So, the approach
> here is to convey when the hcall flush is required in a device tree
> property. The guest makes the hcall when the property is found, instead
> of relying on dcbf.
>
> A new device property sync-dax is added to the nvdimm device. When the
> sync-dax is 'writeback'(default for PPC), device property
> "hcall-flush-required" is set, and the guest makes hcall H_SCM_FLUSH
> requesting for an explicit flush.

I'm not sure "sync-dax" is a suitable name for the property of the
guest persistent memory. There is no requirement that the
memory-backend file for a guest be a dax-capable file. It's also
implementation specific what hypercall needs to be invoked for a given
occurrence of "sync-dax". What does that map to on non-PPC platforms
for example? It seems to me that an "nvdimm" device presents the
synchronous usage model and a whole other device type implements an
async-hypercall setup that the guest happens to service with its
nvdimm stack, but it's not an "nvdimm" anymore at that point.

> sync-dax is "unsafe" on all other platforms(x86, ARM) and old pseries machines
> prior to 5.2 on PPC. sync-dax="writeback" on ARM and x86_64 is prevented
> now as the flush semantics are unimplemented.

"sync-dax" has no meaning on its own, I think this needs an explicit
mechanism to convey both the "not-sync" property *and* the callback
method, it shouldn't be inferred by arch type.
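
For concreteness, the conveyance the series describes is a device-tree marker; a hypothetical fragment (node layout and property spellings are assumptions pieced together from the cover letter, not quoted from the patches):

pmem@44100000 {
	compatible = "ibm,pmemory";
	ibm,my-drc-index = <0x44100000>;
	/* tells the guest to issue H_SCM_FLUSH rather than rely on dcbf */
	ibm,hcall-flush-required;
};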

> When the backend file is actually synchronous DAX capable and no explicit
> flushes are required, the sync-dax mode 'direct' is to be used.
>
> The below demonstration shows the map_sync behavior with sync-dax writeback &
> direct.
> (https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/ndctl.py.data/map_sync.c)
>
> The pmem0 is from an nvdimm with sync-dax=direct, and pmem1 is from an
> nvdimm with sync-dax=writeback, mounted as
> /dev/pmem0 on /mnt1 type xfs 
> (rw,relatime,attr2,dax=always,inode64,logbufs=8,logbsize=32k,noquota)
> /dev/pmem1 on /mnt2 type xfs 
> (rw,relatime,attr2,dax=always,inode64,logbufs=8,logbsize=32k,noquota)
>
> [root@atest-guest ~]# ./mapsync /mnt1/newfile   (when sync-dax=unsafe/direct)
> [root@atest-guest ~]# ./mapsync /mnt2/newfile   (when sync-dax=writeback)
> Failed to mmap  with Operation not supported
>
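
The mapsync probe linked above reduces to a MAP_SYNC validity check; a self-contained sketch (the MAP_* constants come from linux/mman.h and are defined inline in case older libc headers lack them):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

int main(int argc, char **argv)
{
	if (argc < 2) {
		fprintf(stderr, "usage: %s <file-on-dax-fs>\n", argv[0]);
		return 1;
	}
	int fd = open(argv[1], O_RDWR | O_CREAT, 0644);
	if (fd < 0 || ftruncate(fd, 4096) < 0) {
		perror(argv[1]);
		return 1;
	}
	/*
	 * MAP_SYNC is only honored together with MAP_SHARED_VALIDATE; the
	 * kernel rejects it with EOPNOTSUPP ("Operation not supported",
	 * as in the output above) when it cannot promise that a CPU cache
	 * flush alone makes an mmap write durable.
	 */
	void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
		       MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	printf("MAP_SYNC mmap succeeded\n");
	munmap(p, 4096);
	close(fd);
	return 0;
}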
> The first patch does the header file cleanup necessary for the
> subsequent ones. Second patch implements the hcall, adds the necessary
> vmstate properties to spapr machine structure for carrying the hcall
> status during save-restore. The nature of the hcall being asynchronous,
> the patch uses aio utilities to offload the flush. The third patch adds
> the 'sync-dax' device property and enables the device tree property
> for the guest to utilise the hcall.
>
> The kernel changes to exploit this hcall are at
> https://github.com/linuxppc/linux/commit/75b7c05ebf9026.patch
>
> ---
> v3 - https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg07916.html
> Changes from v3:
>   - Fixed the forward declaration coding 

Re: [PATCH 2/4] test: Don't skip tests if nfit modules are missing

2021-04-30 Thread Verma, Vishal L
On Sun, 2021-03-28 at 07:39 +0530, Santosh Sivaraj wrote:
> For NFIT to be available ACPI is a must, so don't fail when nfit modules
> are missing on a platform that doesn't support ACPI.
> 
> Signed-off-by: Santosh Sivaraj 
> ---
>  test.h |  2 +-
>  test/ack-shutdown-count-set.c |  2 +-
>  test/blk_namespaces.c |  2 +-
>  test/core.c   | 30 ++++++++++++++++++++++++++++--
>  test/dpa-alloc.c  |  2 +-
>  test/dsm-fail.c   |  2 +-
>  test/libndctl.c   |  2 +-
>  test/multi-pmem.c |  2 +-
>  test/parent-uuid.c|  2 +-
>  test/pmem_namespaces.c|  2 +-
>  10 files changed, 37 insertions(+), 11 deletions(-)
> 

I haven't looked deeper, but this seems to fail the blk-ns test with:

  ACPI.NFIT unavailable falling back to nfit_test
  test/init: ndctl_test_init: Cannot determine NVDIMM family
  __ndctl_test_skip: explicit skip test_blk_namespaces:235
  nfit_test unavailable skipping tests
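
One likely culprit is in the core.c hunk below: access(2) returns 0 on success and only sets errno on failure, so the inner "if (errno == ENOENT)" tests a stale errno and NVDIMM_FAMILY_INTEL is rarely, if ever, assigned, leaving family == -1 even on ACPI systems. A sketch of the presumably intended check:

	/* access() == 0 already proves /sys/bus/acpi exists; errno is
	 * only meaningful after a failed call */
	if (access("/sys/bus/acpi", F_OK) == 0)
		family = NVDIMM_FAMILY_INTEL;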

> diff --git a/test.h b/test.h
> index cba8d41..7de13fe 100644
> --- a/test.h
> +++ b/test.h
> @@ -20,7 +20,7 @@ void builtin_xaction_namespace_reset(void);
>  
> 
>  struct kmod_ctx;
>  struct kmod_module;
> -int nfit_test_init(struct kmod_ctx **ctx, struct kmod_module **mod,
> +int ndctl_test_init(struct kmod_ctx **ctx, struct kmod_module **mod,
>   struct ndctl_ctx *nd_ctx, int log_level,
>   struct ndctl_test *test);
>  
> 
> diff --git a/test/ack-shutdown-count-set.c b/test/ack-shutdown-count-set.c
> index fb1d82b..c561ff3 100644
> --- a/test/ack-shutdown-count-set.c
> +++ b/test/ack-shutdown-count-set.c
> @@ -99,7 +99,7 @@ static int test_ack_shutdown_count_set(int loglevel, struct ndctl_test *test,
>   int result = EXIT_FAILURE, err;
>  
> 
>   ndctl_set_log_priority(ctx, loglevel);
> - err = nfit_test_init(&kmod_ctx, &mod, NULL, loglevel, test);
> + err = ndctl_test_init(&kmod_ctx, &mod, NULL, loglevel, test);
>   if (err < 0) {
>   result = 77;
>   ndctl_test_skip(test);
> diff --git a/test/blk_namespaces.c b/test/blk_namespaces.c
> index d7f00cb..f076e85 100644
> --- a/test/blk_namespaces.c
> +++ b/test/blk_namespaces.c
> @@ -228,7 +228,7 @@ int test_blk_namespaces(int log_level, struct ndctl_test *test,
>  
> 
>   if (!bus) {
>   fprintf(stderr, "ACPI.NFIT unavailable falling back to nfit_test\n");
> - rc = nfit_test_init(&kmod_ctx, &mod, NULL, log_level, test);
> + rc = ndctl_test_init(&kmod_ctx, &mod, NULL, log_level, test);
>   ndctl_invalidate(ctx);
>   bus = ndctl_bus_get_by_provider(ctx, "nfit_test.0");
>   if (rc < 0 || !bus) {
> diff --git a/test/core.c b/test/core.c
> index cc7d8d9..44cb277 100644
> --- a/test/core.c
> +++ b/test/core.c
> @@ -11,6 +11,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
> 
>  #define KVER_STRLEN 20
> @@ -106,11 +107,11 @@ int ndctl_test_get_skipped(struct ndctl_test *test)
>   return test->skip;
>  }
>  
> 
> -int nfit_test_init(struct kmod_ctx **ctx, struct kmod_module **mod,
> +int ndctl_test_init(struct kmod_ctx **ctx, struct kmod_module **mod,
>   struct ndctl_ctx *nd_ctx, int log_level,
>   struct ndctl_test *test)
>  {
> - int rc;
> + int rc, family = -1;
>   unsigned int i;
>   const char *name;
>   struct ndctl_bus *bus;
> @@ -127,10 +128,30 @@ int nfit_test_init(struct kmod_ctx **ctx, struct kmod_module **mod,
>   "nd_e820",
>   "nd_pmem",
>   };
> + char *test_env;
>  
> 
>   log_init(&log_ctx, "test/init", "NDCTL_TEST");
>   log_ctx.log_priority = log_level;
>  
> 
> + /*
> +  * The following two checks determine the platform family. For
> +  * Intel/platforms which support ACPI, check sysfs; for other platforms
> +  * determine from the environment variable NVDIMM_TEST_FAMILY
> +  */
> + if (access("/sys/bus/acpi", F_OK) == 0) {
> + if (errno == ENOENT)
> + family = NVDIMM_FAMILY_INTEL;
> + }
> +
> + test_env = getenv("NDCTL_TEST_FAMILY");
> + if (test_env && strcmp(test_env, "PAPR") == 0)
> + family = NVDIMM_FAMILY_PAPR;
> +
> + if (family == -1) {
> + log_err(&log_ctx, "Cannot determine NVDIMM family\n");
> + return -ENOTSUP;
> + }
> +
>   *ctx = kmod_new(NULL, NULL);
>   if (!*ctx)
>   return -ENXIO;
> @@ -185,6 +206,11 @@ retry:
>  
> 
>   path = kmod_module_get_path(*mod);
>   if (!path) {
> + if (family != NVDIMM_FAMILY_INTEL &&
> + (strcmp(name, "nfit") == 0 ||
> +  strcmp(name, "nd_e820") == 0))
> + continue;
> +
>   log_err(&log_ctx, "%s.ko: failed to get path\n", name);
>   break;
>   }
> diff --git a/test/dpa-alloc.c 

Re: [PATCH v4 0/3] nvdimm: Enable sync-dax property for nvdimm

2021-04-30 Thread Stefan Hajnoczi
On Fri, Apr 30, 2021 at 02:27:18PM +1000, David Gibson wrote:
> On Thu, Apr 29, 2021 at 10:02:23PM +0530, Aneesh Kumar K.V wrote:
> > On 4/29/21 9:25 PM, Stefan Hajnoczi wrote:
> > > On Wed, Apr 28, 2021 at 11:48:21PM -0400, Shivaprasad G Bhat wrote:
> > > > The nvdimm devices are expected to ensure write persistence during power
> > > > failure kind of scenarios.
> > > > 
> > > > The libpmem has architecture specific instructions like dcbf on POWER
> > > > to flush the cache data to backend nvdimm device during normal writes
> > > > followed by explicit flushes if the backend devices are not synchronous
> > > > DAX capable.
> > > > 
> > > > Qemu - virtual nvdimm devices are memory mapped. The dcbf in the guest
> > > > and the subsequent flush doesn't traslate to actual flush to the backend
> > > > file on the host in case of file backed v-nvdimms. This is addressed by
> > > > virtio-pmem in case of x86_64 by making explicit flushes translating to
> > > > fsync at qemu.
> > > > 
> > > > On SPAPR, the issue is addressed by adding a new hcall to
> > > > request for an explicit flush from the guest ndctl driver when the 
> > > > backend
> > > > nvdimm cannot ensure write persistence with dcbf alone. So, the approach
> > > > here is to convey when the hcall flush is required in a device tree
> > > > property. The guest makes the hcall when the property is found, instead
> > > > of relying on dcbf.
> > > 
> > > Sorry, I'm not very familiar with SPAPR. Why add a hypercall when the
> > > virtio-nvdimm device already exists?
> > > 
> > 
> > On virtualized ppc64 platforms, guests use the papr_scm.ko kernel driver for
> > persistent memory support. This was done such that we can use one kernel
> > driver to support persistent memory with multiple hypervisors. To avoid
> > supporting multiple drivers in the guest, -device nvdimm Qemu command-line
> > results in Qemu using PAPR SCM backend. What this patch series does is to
> > make sure we expose the correct synchronous fault support, when we back such
> > nvdimm device with a file.
> > 
> > The existing PAPR SCM backend enables persistent memory support with the
> > help of multiple hypercalls.
> > 
> > #define H_SCM_READ_METADATA  0x3E4
> > #define H_SCM_WRITE_METADATA 0x3E8
> > #define H_SCM_BIND_MEM       0x3EC
> > #define H_SCM_UNBIND_MEM     0x3F0
> > #define H_SCM_UNBIND_ALL     0x3FC
> > 
> > Most of them are already implemented in Qemu. This patch series implements
> > H_SCM_FLUSH hypercall.
> 
> The overall point here is that we didn't define the hypercall.  It was
> defined in order to support NVDIMM/pmem devices under PowerVM.  For
> uniformity between PowerVM and KVM guests, we want to support the same
> hypercall interface on KVM/qemu as well.

Okay, that's fine. Now Linux and QEMU have multiple ways of doing this,
but it's fair enough if it's an existing platform hypercall.

Stefan
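
For reference, the guest-side flow being discussed looks roughly like this (modeled on the papr_scm patch linked from the cover letter; the opcode value and helper shape are assumptions, not quotes from the series):

/* H_SCM_FLUSH may return H_BUSY with a continuation token, so the
 * guest driver retries until the hypervisor reports completion.
 * Real code would also handle the H_LONG_BUSY_* return codes. */
#define H_SCM_FLUSH	0x44C	/* assumed opcode */

static int scm_flush_sketch(u32 drc_index)
{
	unsigned long ret_buf[PLPAR_HCALL_BUFSIZE];
	u64 token = 0;
	long rc;

	do {
		rc = plpar_hcall(H_SCM_FLUSH, ret_buf, drc_index, token);
		token = ret_buf[0];
		if (rc == H_BUSY)
			cond_resched();
	} while (rc == H_BUSY);

	return rc == H_SUCCESS ? 0 : -EIO;
}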




Re: BUG_ON(!mapping_empty(&inode->i_data))

2021-04-30 Thread Hugh Dickins
On Thu, 29 Apr 2021, Andrew Morton wrote:
> 
> I'm not sure this ever was resolved?

It was not resolved: Matthew had prospective fixes for one way in which
it could happen, but they did not help the case which still hits my
testing (well, I replace the BUG_ON by a WARN_ON, so not hit badly).

> 
> Is it the case that the series "Remove nrexceptional tracking v2" at
> least exposed this bug?

Yes: makes a BUG out of a long-standing issue not noticed before.

> 
> IOW, what the heck should I do with
> 
> mm-introduce-and-use-mapping_empty.patch
> mm-stop-accounting-shadow-entries.patch
> dax-account-dax-entries-as-nrpages.patch
> mm-remove-nrexceptional-from-inode.patch

If Matthew doesn't have a proper fix yet (and it's a bit late for more
than an obvious fix), I think those should go in, with this addition:

[PATCH] mm: remove nrexceptional from inode: remove BUG_ON

clear_inode()'s BUG_ON(!mapping_empty(&inode->i_data)) is unsafe: we know
of two ways in which nodes can and do (on rare occasions) get left behind.
Until those are fixed, do not BUG_ON() nor even WARN_ON(). Yes, this will
then leak those nodes (or the next user of the struct inode may use them);
but this has been happening for years, and the new BUG_ON(!mapping_empty)
was only guilty of revealing that. A proper fix will follow, but no hurry.

Signed-off-by: Hugh Dickins 
---

 fs/inode.c |9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

--- mmotm/fs/inode.c	2021-04-22 18:30:46.285908982 -0700
+++ linux/fs/inode.c	2021-04-29 22:13:54.096530691 -0700
@@ -529,7 +529,14 @@ void clear_inode(struct inode *inode)
 */
	xa_lock_irq(&inode->i_data.i_pages);
	BUG_ON(inode->i_data.nrpages);
-	BUG_ON(!mapping_empty(&inode->i_data));
+   /*
+* Almost always, mapping_empty(>i_data) here; but there are
+* two known and long-standing ways in which nodes may get left behind
+* (when deep radix-tree node allocation failed partway; or when THP
+* collapse_file() failed). Until those two known cases are cleaned up,
+* or a cleanup function is called here, do not BUG_ON(!mapping_empty),
+* nor even WARN_ON(!mapping_empty).
+*/
	xa_unlock_irq(&inode->i_data.i_pages);
	BUG_ON(!list_empty(&inode->i_data.private_list));
BUG_ON(!(inode->i_state & I_FREEING));


Re: [PATCH v4 0/3] nvdimm: Enable sync-dax property for nvdimm

2021-04-30 Thread David Gibson
On Thu, Apr 29, 2021 at 10:02:23PM +0530, Aneesh Kumar K.V wrote:
> On 4/29/21 9:25 PM, Stefan Hajnoczi wrote:
> > On Wed, Apr 28, 2021 at 11:48:21PM -0400, Shivaprasad G Bhat wrote:
> > > The nvdimm devices are expected to ensure write persistence during power
> > > failure kind of scenarios.
> > > 
> > > The libpmem has architecture specific instructions like dcbf on POWER
> > > to flush the cache data to backend nvdimm device during normal writes
> > > followed by explicit flushes if the backend devices are not synchronous
> > > DAX capable.
> > > 
> > > Qemu - virtual nvdimm devices are memory mapped. The dcbf in the guest
> > > and the subsequent flush doesn't traslate to actual flush to the backend
> > > file on the host in case of file backed v-nvdimms. This is addressed by
> > > virtio-pmem in case of x86_64 by making explicit flushes translating to
> > > fsync at qemu.
> > > 
> > > On SPAPR, the issue is addressed by adding a new hcall to
> > > request for an explicit flush from the guest ndctl driver when the backend
> > > nvdimm cannot ensure write persistence with dcbf alone. So, the approach
> > > here is to convey when the hcall flush is required in a device tree
> > > property. The guest makes the hcall when the property is found, instead
> > > of relying on dcbf.
> > 
> > Sorry, I'm not very familiar with SPAPR. Why add a hypercall when the
> > virtio-nvdimm device already exists?
> > 
> 
> On virtualized ppc64 platforms, guests use the papr_scm.ko kernel driver for
> persistent memory support. This was done such that we can use one kernel
> driver to support persistent memory with multiple hypervisors. To avoid
> supporting multiple drivers in the guest, -device nvdimm Qemu command-line
> results in Qemu using PAPR SCM backend. What this patch series does is to
> make sure we expose the correct synchronous fault support, when we back such
> nvdimm device with a file.
> 
> The existing PAPR SCM backend enables persistent memory support with the
> help of multiple hypercalls.
> 
> #define H_SCM_READ_METADATA  0x3E4
> #define H_SCM_WRITE_METADATA 0x3E8
> #define H_SCM_BIND_MEM       0x3EC
> #define H_SCM_UNBIND_MEM     0x3F0
> #define H_SCM_UNBIND_ALL     0x3FC
> 
> Most of them are already implemented in Qemu. This patch series implements
> H_SCM_FLUSH hypercall.

The overall point here is that we didn't define the hypercall.  It was
defined in order to support NVDIMM/pmem devices under PowerVM.  For
uniformity between PowerVM and KVM guests, we want to support the same
hypercall interface on KVM/qemu as well.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

