Re: [PATCH] powerpc/papr_scm: Mark nvdimm as unarmed if needed during probe

2021-04-29 Thread Michael Ellerman
On Mon, 29 Mar 2021 17:01:03 +0530, Vaibhav Jain wrote:
> In case an nvdimm is found to be unarmed during probe then set its
> NDD_UNARMED flag before nvdimm_create(). This would enforce a
> read-only access to the ndimm region. Presently even if an nvdimm is
> unarmed its not marked as read-only on ppc64 guests.
> 
> The patch updates papr_scm_nvdimm_init() to force query of nvdimm
> health via __drc_pmem_query_health() and if nvdimm is found to be
> unarmed then set the nvdimm flag ND_UNARMED for nvdimm_create().

Applied to powerpc/next.

[1/1] powerpc/papr_scm: Mark nvdimm as unarmed if needed during probe
  https://git.kernel.org/powerpc/c/adb68c38d8d49a3d60805479c558649bb2182473

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v2] powerpc/mm: Add cond_resched() while removing hpte mappings

2021-04-18 Thread Michael Ellerman
On Sun, 4 Apr 2021 22:01:48 +0530, Vaibhav Jain wrote:
> While removing large number of mappings from hash page tables for
> large memory systems as soft-lockup is reported because of the time
> spent inside htap_remove_mapping() like one below:
> 
>  watchdog: BUG: soft lockup - CPU#8 stuck for 23s!
>  
>  NIP plpar_hcall+0x38/0x58
>  LR  pSeries_lpar_hpte_invalidate+0x68/0xb0
>  Call Trace:
>   0x1fff000 (unreliable)
>   pSeries_lpar_hpte_removebolted+0x9c/0x230
>   hash__remove_section_mapping+0xec/0x1c0
>   remove_section_mapping+0x28/0x3c
>   arch_remove_memory+0xfc/0x150
>   devm_memremap_pages_release+0x180/0x2f0
>   devm_action_release+0x30/0x50
>   release_nodes+0x28c/0x300
>   device_release_driver_internal+0x16c/0x280
>   unbind_store+0x124/0x170
>   drv_attr_store+0x44/0x60
>   sysfs_kf_write+0x64/0x90
>   kernfs_fop_write+0x1b0/0x290
>   __vfs_write+0x3c/0x70
>   vfs_write+0xd4/0x270
>   ksys_write+0xdc/0x130
>   system_call+0x5c/0x70
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/mm: Add cond_resched() while removing hpte mappings
  https://git.kernel.org/powerpc/c/a5d6a3e73acbd619dd5b7b831762b755f9e2db80

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v3] powerpc/papr_scm: Implement support for H_SCM_FLUSH hcall

2021-04-18 Thread Michael Ellerman
On Mon, 29 Mar 2021 13:36:43 -0400, Shivaprasad G Bhat wrote:
> Add support for ND_REGION_ASYNC capability if the device tree
> indicates 'ibm,hcall-flush-required' property in the NVDIMM node.
> Flush is done by issuing H_SCM_FLUSH hcall to the hypervisor.
> 
> If the flush request failed, the hypervisor is expected to
> to reflect the problem in the subsequent nvdimm H_SCM_HEALTH call.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/papr_scm: Implement support for H_SCM_FLUSH hcall
  https://git.kernel.org/powerpc/c/75b7c05ebf902632f7f540c3eb0a8945c2d74aab

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v3] powerpc/papr_scm: Implement support for H_SCM_FLUSH hcall

2021-03-31 Thread Michael Ellerman
"Aneesh Kumar K.V"  writes:
> Shivaprasad G Bhat  writes:
>
>> Add support for ND_REGION_ASYNC capability if the device tree
>> indicates 'ibm,hcall-flush-required' property in the NVDIMM node.
>> Flush is done by issuing H_SCM_FLUSH hcall to the hypervisor.
>>
>> If the flush request failed, the hypervisor is expected to
>> to reflect the problem in the subsequent nvdimm H_SCM_HEALTH call.
>>
>> This patch prevents mmap of namespaces with MAP_SYNC flag if the
>> nvdimm requires an explicit flush[1].
>>
>> References:
>> [1] 
>> https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/ndctl.py.data/map_sync.c
>
>
> Reviewed-by: Aneesh Kumar K.V 

Do we need an ack from nvdimm folks on this?

Or is it entirely powerpc internal (seems like it from the diffstat)?

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH] powerpc/papr_scm: Add PAPR command family to pass-through command-set

2020-10-09 Thread Michael Ellerman
On Mon, 14 Sep 2020 02:49:04 +0530, Vaibhav Jain wrote:
> Add NVDIMM_FAMILY_PAPR to the list of valid 'dimm_family_mask'
> acceptable by papr_scm. This is needed as since commit
> 92fe2aa859f5 ("libnvdimm: Validate command family indices") libnvdimm
> performs a validation of 'nd_cmd_pkg.nd_family' received as part of
> ND_CMD_CALL processing to ensure only known command families can use
> the general ND_CMD_CALL pass-through functionality.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/papr_scm: Add PAPR command family to pass-through command-set
  https://git.kernel.org/powerpc/c/13135b461cf205941308984bd3271ec7d403dc40

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v2] powerpc/papr_scm: Fix warning triggered by perf_stats_show()

2020-09-17 Thread Michael Ellerman
On Sat, 12 Sep 2020 13:44:51 +0530, Vaibhav Jain wrote:
> A warning is reported by the kernel in case perf_stats_show() returns
> an error code. The warning is of the form below:
> 
>  papr_scm ibm,persistent-memory:ibm,pmemory@4411:
> Failed to query performance stats, Err:-10
>  dev_attr_show: perf_stats_show+0x0/0x1c0 [papr_scm] returned bad count
>  fill_read_buffer: dev_attr_show+0x0/0xb0 returned bad count
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/papr_scm: Fix warning triggered by perf_stats_show()
  https://git.kernel.org/powerpc/c/ca78ef2f08ccfa29b711d644964cdf9d7ace15e5

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v2] powerpc/papr_scm: Fix warning triggered by perf_stats_show()

2020-09-15 Thread Michael Ellerman
Vaibhav Jain  writes:
> A warning is reported by the kernel in case perf_stats_show() returns
> an error code. The warning is of the form below:
>
>  papr_scm ibm,persistent-memory:ibm,pmemory@4411:
> Failed to query performance stats, Err:-10
>  dev_attr_show: perf_stats_show+0x0/0x1c0 [papr_scm] returned bad count
>  fill_read_buffer: dev_attr_show+0x0/0xb0 returned bad count
>
> On investigation it looks like that the compiler is silently truncating the
> return value of drc_pmem_query_stats() from 'long' to 'int', since the
> variable used to store the return code 'rc' is an 'int'. This
> truncated value is then returned back as a 'ssize_t' back from
> perf_stats_show() to 'dev_attr_show()' which thinks of it as a large
> unsigned number and triggers this warning..
>
> To fix this we update the type of variable 'rc' from 'int' to
> 'ssize_t' that prevents the compiler from truncating the return value
> of drc_pmem_query_stats() and returning correct signed value back from
> perf_stats_show().
>
> Fixes: 2d02bf835e573 ('powerpc/papr_scm: Fetch nvdimm performance
>stats from PHYP')

Please don't word wrap the Fixes tag it breaks b4.

I've fixed it up this time.

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v2] powerpc/papr_scm: Limit the readability of 'perf_stats' sysfs attribute

2020-09-10 Thread Michael Ellerman
On Mon, 7 Sep 2020 16:35:40 +0530, Vaibhav Jain wrote:
> The newly introduced 'perf_stats' attribute uses the default access
> mode of 0444 letting non-root users access performance stats of an
> nvdimm and potentially force the kernel into issuing large number of
> expensive HCALLs. Since the information exposed by this attribute
> cannot be cached hence its better to ward of access to this attribute
> from users who don't need to access these performance statistics.
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/papr_scm: Limit the readability of 'perf_stats' sysfs attribute
  https://git.kernel.org/powerpc/c/0460534b532e5518c657c7d6492b9337d975eaa3

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v2 3/7] mm/memory_hotplug: prepare passing flags to add_memory() and friends

2020-09-09 Thread Michael Ellerman
David Hildenbrand  writes:
> On 09.09.20 09:17, Greg Kroah-Hartman wrote:
>> On Tue, Sep 08, 2020 at 10:10:08PM +0200, David Hildenbrand wrote:
>>> We soon want to pass flags, e.g., to mark added System RAM resources.
>>> mergeable. Prepare for that.
>> 
>> What are these random "flags", and how do we know what should be passed
>> to them?
>> 
>> Why not make this an enumerated type so that we know it all works
>> properly, like the GPF_* flags are?  Passing around a random unsigned
>> long feels very odd/broken...
>
> Agreed, an enum (mhp_flags) seems to give a better hint what can
> actually be passed. Thanks!

You probably know this but ...

Just using a C enum doesn't get you any type safety.

You can get some checking via sparse by using __bitwise, which is what
gfp_t does. You don't actually have to use an enum for that, it works
with #defines also.

Or you can wrap the flag in a struct, the way atomic_t does, and then
the compiler will prevent passing plain integers in place of your custom
type.

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH] powerpc/papr_scm: Limit the readability of 'perf_stats' sysfs attribute

2020-08-13 Thread Michael Ellerman
"Aneesh Kumar K.V"  writes:
> On 8/13/20 10:04 AM, Vaibhav Jain wrote:
>> The newly introduced 'perf_stats' attribute uses the default access
>> mode of 0444 letting non-root users access performance stats of an
>> nvdimm and potentially force the kernel into issuing large number of
>> expensive HCALLs. Since the information exposed by this attribute
>> cannot be cached hence its better to ward of access to this attribute
>> from users who don't need to access these performance statistics.
>> 
>> Hence this patch adds check in perf_stats_show() to only let users
>> that are 'perfmon_capable()' to read the nvdimm performance
>> statistics.
>> 
>> Fixes: 2d02bf835e573 ('powerpc/papr_scm: Fetch nvdimm performance stats from 
>> PHYP')
>> Reported-by: Aneesh Kumar K.V 
>> Signed-off-by: Vaibhav Jain 
>> ---
>>   arch/powerpc/platforms/pseries/papr_scm.c | 4 
>>   1 file changed, 4 insertions(+)
>> 
>> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c 
>> b/arch/powerpc/platforms/pseries/papr_scm.c
>> index f439f0dfea7d1..36c51bf8af9a8 100644
>> --- a/arch/powerpc/platforms/pseries/papr_scm.c
>> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
>> @@ -792,6 +792,10 @@ static ssize_t perf_stats_show(struct device *dev,
>>  struct nvdimm *dimm = to_nvdimm(dev);
>>  struct papr_scm_priv *p = nvdimm_provider_data(dimm);
>>   
>> +/* Allow access only to perfmon capable users */
>> +if (!perfmon_capable())
>> +return -EACCES;
>> +
>
> An access check is usually done in open(). This is the read callback IIUC.

Yes. Otherwise an unprivileged user can open the file, and then trick a
suid program into reading from it.

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH] powerpc/papr_scm: Make access mode of 'perf_stats' attribute file to '0400'

2020-08-10 Thread Michael Ellerman
Vaibhav Jain  writes:
> The newly introduced 'perf_stats' attribute uses the default access
> mode of 0444 letting non-root users access performance stats of an
> nvdimm and potentially force the kernel into issuing large number of
> expensive HCALLs. Since the information exposed by this attribute
> cannot be cached hence its better to ward of access to this attribute
> from non-root users.
>
> Hence this patch updates the access-mode of 'perf_stats' sysfs
> attribute file to 0400 to make it only readable to root-users.

Or should we ratelimit it?

Fixes: ??

> Reported-by: Aneesh Kumar K.V 
> Signed-off-by: Vaibhav Jain 

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v4 0/2] powerpc/papr_scm: add support for reporting NVDIMM 'life_used_percentage' metric

2020-08-02 Thread Michael Ellerman
On Fri, 31 Jul 2020 12:11:51 +0530, Vaibhav Jain wrote:
> Changes since v3[1]:
> 
> * Fixed a rebase issue pointed out by Aneesh in first patch in the series.
> 
> [1] 
> https://lore.kernel.org/linux-nvdimm/20200730121303.134230-1-vaib...@linux.ibm.com

Applied to powerpc/next.

[1/2] powerpc/papr_scm: Fetch nvdimm performance stats from PHYP
  https://git.kernel.org/powerpc/c/2d02bf835e5731de632c8a13567905fa7c0da01c
[2/2] powerpc/papr_scm: Add support for fetching nvdimm 'fuel-gauge' metric
  https://git.kernel.org/powerpc/c/af0870c4e75655b1931d0a5ffde2f448a2794362

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v7 0/7] Support new pmem flush and sync instructions for POWER

2020-07-16 Thread Michael Ellerman
On Wed, 1 Jul 2020 12:52:28 +0530, Aneesh Kumar K.V wrote:
> This patch series enables the usage os new pmem flush and sync instructions 
> on POWER
> architecture. POWER10 introduces two new variants of dcbf instructions 
> (dcbstps and dcbfps)
> that can be used to write modified locations back to persistent storage. 
> Additionally,
> POWER10 also introduce phwsync and plwsync which can be used to establish 
> order of these
> writes to persistent storage.
> 
> This series exposes these instructions to the rest of the kernel. The existing
> dcbf and hwsync instructions in P8 and P9 are adequate to enable appropriate
> synchronization with OpenCAPI-hosted persistent storage. Hence the new 
> instructions
> are added as a variant of the old ones that old hardware won't differentiate.
> 
> [...]

Applied to powerpc/next.

[1/7] powerpc/pmem: Restrict papr_scm to P8 and above.
  https://git.kernel.org/powerpc/c/c83040192f3763b243ece26073d61a895b4a230f
[2/7] powerpc/pmem: Add new instructions for persistent storage and sync
  https://git.kernel.org/powerpc/c/32db09d992ddc7d145595cff49cccfe14e018266
[3/7] powerpc/pmem: Add flush routines using new pmem store and sync instruction
  https://git.kernel.org/powerpc/c/d358042793183a57094dac45a44116e1165ac593
[4/7] libnvdimm/nvdimm/flush: Allow architecture to override the flush barrier
  https://git.kernel.org/powerpc/c/3e79f082ebfc130360bcee23e4dd74729dcafdf4
[5/7] powerpc/pmem: Update ppc64 to use the new barrier instruction.
  https://git.kernel.org/powerpc/c/76e6c73f33d4e1cc4de4f25c0bf66d59e42113c4
[6/7] powerpc/pmem: Avoid the barrier in flush routines
  https://git.kernel.org/powerpc/c/436499ab868f1a9e497cfdbf641affe8a122c571
[7/7] powerpc/pmem: Initialize pmem device on newer hardware
  https://git.kernel.org/powerpc/c/8c26ab72663b4affc31e47cdf77d61d0172d1033

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [GIT PULL] libnvdimm for v5.8-rc2

2020-06-21 Thread Michael Ellerman
Dan Williams  writes:
> Hi Linus, please pull from:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
> tags/libnvdimm-for-5.8-rc2
>
> ...to receive a feature (papr_scm health retrieval) and a fix (sysfs
> attribute visibility) for v5.8.
>
> Vaibhav explains in the merge commit below why missing v5.8 would be
> painful and I agreed to try a -rc2 pull because only cosmetics kept
> this out of -rc1 and his initial versions were posted in more than
> enough time for v5.8 consideration.
>
> ===
> These patches are tied to specific features that were committed to
> customers in upcoming distros releases (RHEL and SLES) whose time-lines
> are tied to 5.8 kernel release.
>
> Being able to track the health of an nvdimm is critical for our
> customers that are running workloads leveraging papr-scm nvdimms.
> Missing the 5.8 kernel would mean missing the distro timelines and
> shifting forward the availability of this feature in distro kernels by
> at least 6 months.
> ===
>
> I notice that these do not have an ack from Michael, but I had been
> assuming that he was deferring this to a libnvdimm subsystem decision
> ever since v7 back at the end of May where he said "I don't have
> strong opinions about the user API, it's really up to the nvdimm
> folks." [1]

Yeah, sorry for not providing an actual ack, I didn't realise you were
planning to send it for 5.8.

The arch parts of that series are pretty boring plumbing of hypervisor
calls, so the important details were all the libnvdimm related issues
IMO.

So please consider this a belated ack and thanks for getting it merged.

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [RESEND PATCH v7 4/5] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods

2020-05-26 Thread Michael Ellerman
Vaibhav Jain  writes:
> Hi Ira, Mpe and Aneesh,
>
> Vaibhav Jain  writes:
>
>> Michael Ellerman  writes:
>>
>>> Ira Weiny  writes:
>>>> On Wed, May 20, 2020 at 12:30:57AM +0530, Vaibhav Jain wrote:
>>>>> Introduce support for Papr nvDimm Specific Methods (PDSM) in papr_scm
>>>>> modules and add the command family to the white list of NVDIMM command
>>>>> sets. Also advertise support for ND_CMD_CALL for the dimm
>>>>> command mask and implement necessary scaffolding in the module to
>>>>> handle ND_CMD_CALL ioctl and PDSM requests that we receive.
>>> ...
>>>>> + *
>>>>> + * Payload Version:
>>>>> + *
>>>>> + * A 'payload_version' field is present in PDSM header that indicates a 
>>>>> specific
>>>>> + * version of the structure present in PDSM Payload for a given PDSM 
>>>>> command.
>>>>> + * This provides backward compatibility in case the PDSM Payload 
>>>>> structure
>>>>> + * evolves and different structures are supported by 'papr_scm' and 
>>>>> 'libndctl'.
>>>>> + *
>>>>> + * When sending a PDSM Payload to 'papr_scm', 'libndctl' should send the 
>>>>> version
>>>>> + * of the payload struct it supports via 'payload_version' field. The 
>>>>> 'papr_scm'
>>>>> + * module when servicing the PDSM envelope checks the 'payload_version' 
>>>>> and then
>>>>> + * uses 'payload struct version' == MIN('payload_version field',
>>>>> + * 'max payload-struct-version supported by papr_scm') to service the 
>>>>> PDSM.
>>>>> + * After servicing the PDSM, 'papr_scm' put the negotiated version of 
>>>>> payload
>>>>> + * struct in returned 'payload_version' field.
>>>>
>>>> FWIW many people believe using a size rather than version is more 
>>>> sustainable.
>>>> It is expected that new payload structures are larger (more features) than 
>>>> the
>>>> previous payload structure.
>>>>
>>>> I can't find references at the moment through.
>>>
>>> I think clone_args is a good modern example:
>>>
>>>   
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/sched.h#n88
>>
>> Thank Ira and Mpe for pointing this out. I looked into how clone3 sycall
>> handles clone_args and few differences came out:
>>
>> * Unlike clone_args that are always transferred in one direction from
>>   user-space to kernel, payload contents of pdsms are transferred in both
>>   directions. Having a single version number makes it easier for
>>   user-space and kernel to determine what data will be exchanged.
>>
>> * For PDSMs, the version number is negotiated between libndctl and
>>   kernel. For example in case kernel only supports an older version of
>>   a structure, its free to send a lower version number back to
>>   libndctl. Such negotiations doesnt happen with clone3 syscall.
>
> If you are ok with the explaination above please let me know. I will
> quickly spin off a v8 addressing your review comments.

I don't have strong opinions about the user API, it's really up to the
nvdimm folks.

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v4 1/2] x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()

2020-05-23 Thread Michael Ellerman
Hi Dan,

Sorry one minor nit below.

Dan Williams  writes:
> diff --git a/tools/testing/selftests/powerpc/copyloops/.gitignore 
> b/tools/testing/selftests/powerpc/copyloops/.gitignore
> index ddaf140b8255..1152bcc819fe 100644
> --- a/tools/testing/selftests/powerpc/copyloops/.gitignore
> +++ b/tools/testing/selftests/powerpc/copyloops/.gitignore
> @@ -12,4 +12,4 @@ memcpy_p7_t1
>  copyuser_64_exc_t0
>  copyuser_64_exc_t1
>  copyuser_64_exc_t2
> -memcpy_mcsafe_64
> +copy_mc_to_user

Should be:

+copy_mc_64

Otherwise the powerpc bits look good to me.

Acked-by: Michael Ellerman  (powerpc)

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [RESEND PATCH v7 3/5] powerpc/papr_scm: Fetch nvdimm health information from PHYP

2020-05-21 Thread Michael Ellerman
Vaibhav Jain  writes:
> Thanks for reviewing this this patch Ira. My responses below:
> Ira Weiny  writes:
>> On Wed, May 20, 2020 at 12:30:56AM +0530, Vaibhav Jain wrote:
>>> Implement support for fetching nvdimm health information via
>>> H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
>>> of 64-bit big-endian integers, bitwise-and of which is then stored in
>>> 'struct papr_scm_priv' and subsequently partially exposed to
>>> user-space via newly introduced dimm specific attribute
>>> 'papr/flags'. Since the hcall is costly, the health information is
>>> cached and only re-queried, 60s after the previous successful hcall.
...
>>> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c 
>>> b/arch/powerpc/platforms/pseries/papr_scm.c
>>> index f35592423380..142636e1a59f 100644
>>> --- a/arch/powerpc/platforms/pseries/papr_scm.c
>>> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
>>> @@ -39,6 +78,15 @@ struct papr_scm_priv {
>>> struct resource res;
>>> struct nd_region *region;
>>> struct nd_interleave_set nd_set;
>>> +
>>> +   /* Protect dimm health data from concurrent read/writes */
>>> +   struct mutex health_mutex;
>>> +
>>> +   /* Last time the health information of the dimm was updated */
>>> +   unsigned long lasthealth_jiffies;
>>> +
>>> +   /* Health information for the dimm */
>>> +   u64 health_bitmap;
>>
>> I wonder if this should be typed big endian as you mention that it is in the
>> commit message?
> This was discussed in an earlier review of the patch series at
> https://lore.kernel.org/linux-nvdimm/878sjetcis@mpe.ellerman.id.au
>
> Even though health bitmap is returned in big endian format (For ex
> value 0xC00 indicates bits 0,1 set), its value is never
> used. Instead only test for specific bits being set in the register is
> done.

This has already caused a lot of confusion, so let me try and clear it
up. I will probably fail :)

The value is not big endian.

It's returned in a GPR (a register), from the hypervisor. The ordering
of bytes in a register is not dependent on what endian we're executing
in.

It's true that the hypervisor will have been running big endian, and
when it returns to us we will now be running little endian. But the
value is unchanged, it was 0xC00 in the GPR while the HV was
running and it's still 0xC00 when we return to Linux. You
can see this in mambo, see below for an example.


_However_, the specification of the bits in the bitmap value uses MSB 0
ordering, as is traditional for IBM documentation. That means the most
significant bit, aka. the left most bit, is called "bit 0".

See: https://en.wikipedia.org/wiki/Bit_numbering#MSB_0_bit_numbering

That is the opposite numbering from what most people use, and in
particular what most code in Linux uses, which is that bit 0 is the
least significant bit.

Which is where the confusion comes in. It's not that the bytes are
returned in a different order, it's that the bits are numbered
differently in the IBM documentation.

The way to fix this kind of thing is to read the docs, and convert all
the bits into correct numbering (LSB=0), and then throw away the docs ;)

cheers



In mambo we can set a breakpoint just before the kernel enters skiboot,
towards the end of __opal_call. The kernel is running LE and skiboot
runs BE.

  systemsim-p9 [~/skiboot/skiboot/external/mambo] b 0xc00c1744
  breakpoint set at [0:0:0]: 0xc00c1744 (0x000C1744) 
Enc:0x2402004C : hrfid

Then run:

  systemsim-p9 [~/skiboot/skiboot/external/mambo] c
  [0:0:0]: 0xC00C1744 (0x000C1744) Enc:0x2402004C : hrfid
  INFO: 121671618: (121671618): ** Execution stopped: user (tcl),  **
  121671618: ** finished running 121671618 instructions **

And we stop there, on an hrfid that we haven't executed yet.
We can print r0, to see the OPAL token:

  systemsim-p9 [~/skiboot/skiboot/external/mambo] p r0
  0x0019

ie. we're calling OPAL_CONSOLE_WRITE_BUFFER_SPACE (25).

And we can print the MSR:

  systemsim-p9 [~/skiboot/skiboot/external/mambo] p msr
  0x92001033
  
 64-bit mode (SF): 0x1 [64-bit mode]
Hypervisor State (HV): 0x1
   Vector Available (VEC): 0x1
  Machine Check Interrupt Enable (ME): 0x1
Instruction Relocate (IR): 0x1
   Data Relocate (DR): 0x1
   Recoverable Interrupt (RI): 0x1
  Little-Endian Mode (LE): 0x1 [little-endian]

ie. we're little endian.

We then step one instruction:

  systemsim-p9 [~/skiboot/skiboot/external/mambo] s
  [0:0:0]: 0x30002BF0 (0x30002BF0) Enc:0x7D9FFAA6 : mfspr   
r12,PIR

Now we're in skiboot. Print the MSR again:

  systemsim-p9 [~/skiboot/skiboot/external/mambo] p msr
  0x92001002
  
 64-bit mode (SF): 0x1 [64-bit mode]
Hypervisor State (HV): 0x1
   Vector Available (VEC): 0x1
  Machine Check Interrupt Enable (ME): 

Re: [RESEND PATCH v7 4/5] ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods

2020-05-21 Thread Michael Ellerman
Ira Weiny  writes:
> On Wed, May 20, 2020 at 12:30:57AM +0530, Vaibhav Jain wrote:
>> Introduce support for Papr nvDimm Specific Methods (PDSM) in papr_scm
>> modules and add the command family to the white list of NVDIMM command
>> sets. Also advertise support for ND_CMD_CALL for the dimm
>> command mask and implement necessary scaffolding in the module to
>> handle ND_CMD_CALL ioctl and PDSM requests that we receive.
...
>> + *
>> + * Payload Version:
>> + *
>> + * A 'payload_version' field is present in PDSM header that indicates a 
>> specific
>> + * version of the structure present in PDSM Payload for a given PDSM 
>> command.
>> + * This provides backward compatibility in case the PDSM Payload structure
>> + * evolves and different structures are supported by 'papr_scm' and 
>> 'libndctl'.
>> + *
>> + * When sending a PDSM Payload to 'papr_scm', 'libndctl' should send the 
>> version
>> + * of the payload struct it supports via 'payload_version' field. The 
>> 'papr_scm'
>> + * module when servicing the PDSM envelope checks the 'payload_version' and 
>> then
>> + * uses 'payload struct version' == MIN('payload_version field',
>> + * 'max payload-struct-version supported by papr_scm') to service the PDSM.
>> + * After servicing the PDSM, 'papr_scm' put the negotiated version of 
>> payload
>> + * struct in returned 'payload_version' field.
>
> FWIW many people believe using a size rather than version is more sustainable.
> It is expected that new payload structures are larger (more features) than the
> previous payload structure.
>
> I can't find references at the moment through.

I think clone_args is a good modern example:

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/sched.h#n88

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v3 1/2] x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()

2020-05-20 Thread Michael Ellerman
Hi Dan,

Just a couple of minor things ...

Dan Williams  writes:
> In reaction to a proposal to introduce a memcpy_mcsafe_fast()
> implementation Linus points out that memcpy_mcsafe() is poorly named
> relative to communicating the scope of the interface. Specifically what
> addresses are valid to pass as source, destination, and what faults /
> exceptions are handled. Of particular concern is that even though x86
> might be able to handle the semantics of copy_mc_to_user() with its
> common copy_user_generic() implementation other archs likely need / want
> an explicit path for this case:
...

> diff --git a/arch/powerpc/include/asm/uaccess.h 
> b/arch/powerpc/include/asm/uaccess.h
> index 0969285996cb..dcbbcbf3552c 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -348,6 +348,32 @@ do { 
> \
>  extern unsigned long __copy_tofrom_user(void __user *to,
>   const void __user *from, unsigned long size);
>  
> +#ifdef CONFIG_ARCH_HAS_COPY_MC
> +extern unsigned long __must_check

We try not to add extern in headers anymore.

> +copy_mc_generic(void *to, const void *from, unsigned long size);
> +
> +static inline unsigned long __must_check
> +copy_mc_to_kernel(void *to, const void *from, unsigned long size)
> +{
> + return copy_mc_generic(to, from, size);
> +}
> +#define copy_mc_to_kernel copy_mc_to_kernel
> +
> +static inline unsigned long __must_check
> +copy_mc_to_user(void __user *to, const void *from, unsigned long n)
> +{
> + if (likely(check_copy_size(from, n, true))) {
> + if (access_ok(to, n)) {
> + allow_write_to_user(to, n);
> + n = copy_mc_generic((void *)to, from, n);
> + prevent_write_to_user(to, n);
> + }
> + }
> +
> + return n;
> +}
> +#endif

Otherwise that looks fine.

...

> diff --git a/tools/testing/selftests/powerpc/copyloops/Makefile 
> b/tools/testing/selftests/powerpc/copyloops/Makefile
> index 0917983a1c78..959817e7567c 100644
> --- a/tools/testing/selftests/powerpc/copyloops/Makefile
> +++ b/tools/testing/selftests/powerpc/copyloops/Makefile
> @@ -45,9 +45,9 @@ $(OUTPUT)/memcpy_p7_t%: memcpy_power7.S $(EXTRA_SOURCES)
>   -D SELFTEST_CASE=$(subst memcpy_p7_t,,$(notdir $@)) \
>   -o $@ $^
>  
> -$(OUTPUT)/memcpy_mcsafe_64: memcpy_mcsafe_64.S $(EXTRA_SOURCES)
> +$(OUTPUT)/copy_mc: copy_mc.S $(EXTRA_SOURCES)
>   $(CC) $(CPPFLAGS) $(CFLAGS) \
> - -D COPY_LOOP=test_memcpy_mcsafe \
> + -D COPY_LOOP=test_copy_mc \

This needs a fixup:

diff --git a/tools/testing/selftests/powerpc/copyloops/Makefile 
b/tools/testing/selftests/powerpc/copyloops/Makefile
index 959817e7567c..b4eb5c4c6858 100644
--- a/tools/testing/selftests/powerpc/copyloops/Makefile
+++ b/tools/testing/selftests/powerpc/copyloops/Makefile
@@ -47,7 +47,7 @@ $(OUTPUT)/memcpy_p7_t%:   memcpy_power7.S $(EXTRA_SOURCES)
 
 $(OUTPUT)/copy_mc: copy_mc.S $(EXTRA_SOURCES)
$(CC) $(CPPFLAGS) $(CFLAGS) \
-   -D COPY_LOOP=test_copy_mc \
+   -D COPY_LOOP=test_copy_mc_generic \
-o $@ $^
 
 $(OUTPUT)/copyuser_64_exc_t%: copyuser_64.S exc_validate.c ../harness.c \


>   -o $@ $^
>  
>  $(OUTPUT)/copyuser_64_exc_t%: copyuser_64.S exc_validate.c ../harness.c \
> diff --git a/tools/testing/selftests/powerpc/copyloops/memcpy_mcsafe_64.S 
> b/tools/testing/selftests/powerpc/copyloops/copy_mc.S
> similarity index 100%
> rename from tools/testing/selftests/powerpc/copyloops/memcpy_mcsafe_64.S
> rename to tools/testing/selftests/powerpc/copyloops/copy_mc.S

This file is a symlink to the file in arch/powerpc/lib, so the name of
the link needs updating, as well as the target.

Also is there a reason you dropped the "_64"? It would make most sense
to keep it I think, as then the file in selftests and the file in arch/
have the same name.

If you want to keep the copy_mc.S name it needs the diff below. Though
as I said I think it would be better to use copy_mc_64.S.

cheers


diff --git a/tools/testing/selftests/powerpc/copyloops/copy_mc.S 
b/tools/testing/selftests/powerpc/copyloops/copy_mc.S
new file mode 12
index ..dcbe06d500fb
--- /dev/null
+++ b/tools/testing/selftests/powerpc/copyloops/copy_mc.S
@@ -0,0 +1 @@
+../../../../../arch/powerpc/lib/copy_mc_64.S
\ No newline at end of file
diff --git a/tools/testing/selftests/powerpc/copyloops/memcpy_mcsafe_64.S 
b/tools/testing/selftests/powerpc/copyloops/memcpy_mcsafe_64.S
deleted file mode 12
index f0feef3062f6..
--- a/tools/testing/selftests/powerpc/copyloops/memcpy_mcsafe_64.S
+++ /dev/null
@@ -1 +0,0 @@
-../../../../../arch/powerpc/lib/memcpy_mcsafe_64.S
\ No newline at end of file
-- 
2.25.1

___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to 

Re: [PATCH v7 2/5] seq_buf: Export seq_buf_printf() to external modules

2020-05-08 Thread Michael Ellerman
Borislav Petkov  writes:
> On Fri, May 08, 2020 at 04:19:19PM +0530, Vaibhav Jain wrote:
>> 'seq_buf' provides a very useful abstraction for writing to a string
>> buffer without needing to worry about it over-flowing. However even
>> though the API has been stable for couple of years now its stills not
>> exported to external modules limiting its usage.
>> 
>> Hence this patch proposes update to 'seq_buf.c' to mark
>> seq_buf_printf() which is part of the seq_buf API to be exported to
>> external GPL modules. This symbol will be used in later parts of this
>
> What is "external GPL modules"?

A module that has MODULE_LICENSE("GPL") ?

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v6 3/4] ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods

2020-04-30 Thread Michael Ellerman
Vaibhav Jain  writes:
> Introduce support for Papr nvDimm Specific Methods (PDSM) in papr_scm
> modules and add the command family to the white list of NVDIMM command
> sets. Also advertise support for ND_CMD_CALL for the dimm
> command mask and implement necessary scaffolding in the module to
> handle ND_CMD_CALL ioctl and PDSM requests that we receive.
>
> The layout of the PDSM request as we expect from libnvdimm/libndctl is
> described in newly introduced uapi header 'papr_scm_pdsm.h' which
> defines a new 'struct nd_pdsm_cmd_pkg' header. This header is used
> to communicate the PDSM request via member
> 'nd_pkg_papr_scm->nd_command' and size of payload that need to be
> sent/received for servicing the PDSM.
>
> A new function is_cmd_valid() is implemented that reads the args to
> papr_scm_ndctl() and performs sanity tests on them. A new function
> papr_scm_service_pdsm() is introduced and is called from
> papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL
> command from libnvdimm.
>
> Cc: Dan Williams 
> Cc: Michael Ellerman 
> Cc: "Aneesh Kumar K . V" 
> Signed-off-by: Vaibhav Jain 
> ---
> Changelog
>
> v5..v6 :
> * Changed the usage of the term DSM to PDSM to distinguish it from the
>   ACPI term [ Dan Williams ]
> * Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various struct
>   to reflect the new terminology.
> * Updated the patch description and title to reflect the new terminology.
> * Squashed patch to introduce new command family in 'ndctl.h' with
>   this patch [ Dan Williams ]
> * Updated the papr_scm_pdsm method starting index from 0x1 to 0x0
>   [ Dan Williams ]
> * Removed redundant license text from the papr_scm_psdm.h file.
>   [ Dan Williams ]
> * s/envelop/envelope/ at various places [ Dan Williams ]
> * Added '__packed' attribute to command package header to gaurd
>   against different compiler adding paddings between the fields.
>   [ Dan Williams]
> * Converted various pr_debug to dev_debug [ Dan Williams ]
>
> v4..v5 :
> * None
>
> v3..v4 :
> * None
>
> v2..v3 :
> * Updated the patch prefix to 'ndctl/uapi' [Aneesh]
>
> v1..v2 :
> * None
> ---
>  arch/powerpc/include/uapi/asm/papr_scm_pdsm.h | 153 ++
>  arch/powerpc/platforms/pseries/papr_scm.c | 101 +++-
>  include/uapi/linux/ndctl.h|   1 +
>  3 files changed, 249 insertions(+), 6 deletions(-)
>  create mode 100644 arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
>
> diff --git a/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h 
> b/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
> new file mode 100644
> index ..ec48b5c7fc18
> --- /dev/null
> +++ b/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
> @@ -0,0 +1,153 @@
> +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
> +/*
> + * PAPR-SCM Dimm specific methods (PDSM) and structs for libndctl
> + *
> + * (C) Copyright IBM 2020
> + *
> + * Author: Vaibhav Jain 
> + */
> +
> +#ifndef _UAPI_ASM_POWERPC_PAPR_SCM_PDSM_H_
> +#define _UAPI_ASM_POWERPC_PAPR_SCM_PDSM_H_
> +
> +#include 
> +
> +#ifdef __KERNEL__
> +#include 
> +#else
> +#include 
> +#endif

We shouldn't be checking __KERNEL__ in uapi headers, something has gone
wrong if this is necessary.

> +
> +/*
> + * PDSM Envelope:
> + *
> + * The ioctl ND_CMD_CALL transfers data between user-space and kernel via
> + * 'envelopes' which consists of a header and user-defined payload sections.
> + * The header is described by 'struct nd_pdsm_cmd_pkg' which expects a
> + * payload following it and offset of which relative to the struct is 
> provided
> + * by 'nd_pdsm_cmd_pkg.payload_offset'. *
> + *
> + *  +-+-+---+
> + *  |   64-Bytes  |   8-Bytes   |   Max 184-Bytes   |
> + *  +-+-+---+
> + *  |   nd_pdsm_cmd_pkg |   |
> + *  |-+ |   |
> + *  |  nd_cmd_pkg | |   |
> + *  +-+-+---+
> + *  | nd_family   |  |   |
> + *  | nd_size_out | cmd_status  ||
> + *  | nd_size_in  | payload_version |  PAYLOAD   |
> + *  | nd_command  | payload_offset ->|
> + *  | nd_fw_size  | ||
> + *  +-+-+---+
> + *
> + * PDSM Header:
> + *
> + * The header is defined as 's

Re: [PATCH v6 2/4] powerpc/papr_scm: Fetch nvdimm health information from PHYP

2020-04-30 Thread Michael Ellerman
Vaibhav Jain  writes:

> Implement support for fetching nvdimm health information via
> H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
> of 64-bit big-endian integers, bitwise-and of which is then stored in
> 'struct papr_scm_priv' and subsequently partially exposed to
> user-space via newly introduced dimm specific attribute
> 'papr/flags'. Since the hcall is costly, the health information is
> cached and only re-queried, 60s after the previous successful hcall.
>
> The patch also adds a new asm header named 'papr_scm.h' describing the
> interface between PHYP and guest kernel. A documentation text
> describing flags reported by the the new sysfs attribute 'papr/flags'
> is also introduced at Documentation/ABI/testing/sysfs-bus-papr-scm.
>
> [1] commit 58b278f568f0 ("powerpc: Provide initial documentation for
> PAPR hcalls")
>
> Cc: Dan Williams 
> Cc: Michael Ellerman 
> Cc: "Aneesh Kumar K . V" 
> Signed-off-by: Vaibhav Jain 
> ---
> Changelog
>
> v5..v6 :
> * Change the flags sysfs attribute from 'papr_flags' to 'papr/flags'
>   [Dan Williams]
> * Include documentation for 'papr/flags' attr [Dan Williams]
> * Change flag 'save_fail' to 'flush_fail' [Dan Williams]
> * Caching of health bitmap to reduce expensive hcalls [Dan Williams]
> * Removed usage of PPC_BIT from 'papr-scm.h' header [Mpe]
> * Replaced two __be64 integers from papr_scm_priv to a single u64
>   integer [Mpe]
> * Updated patch description to reflect the changes made in this
>   version.
> * Removed avoidable usage of 'papr_scm_priv.dimm_mutex' from
>   flags_show() [Dan Williams]
>
> v4..v5 :
> * None
>
> v3..v4 :
> * None
>
> v2..v3 :
> * Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
>NVDIMM unarmed [Aneesh]
>
> v1..v2 :
> * New patch in the series.
> ---
>  Documentation/ABI/testing/sysfs-bus-papr-scm |  27 
>  arch/powerpc/include/asm/papr_scm.h  |  49 
>  arch/powerpc/platforms/pseries/papr_scm.c| 126 ++-
>  3 files changed, 200 insertions(+), 2 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-scm
>  create mode 100644 arch/powerpc/include/asm/papr_scm.h
>
> diff --git a/Documentation/ABI/testing/sysfs-bus-papr-scm 
> b/Documentation/ABI/testing/sysfs-bus-papr-scm
> new file mode 100644
> index ..001e4d34ab5c
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-bus-papr-scm
> @@ -0,0 +1,27 @@
> +What:/sys/bus/nd/devices/nmemX/papr/flags
> +Date:Apr, 2020
> +KernelVersion:   v5.8
> +Contact: linuxppc-dev , 
> linux-nvdimm@lists.01.org,
> +Description:
> + (RO) Report flags indicating various states of a
> + papr-scm NVDIMM device. Each flag maps to a one or
> + more bits set in the dimm-health-bitmap retrieved in
> + response to H_SCM_HEALTH hcall. The details of the bit
> + flags returned in response to this hcall is available
> + at 'Documentation/powerpc/papr_hcalls.rst' . Below are
> + the flags reported in this sysfs file:
> +
> + * "not_armed"   : Indicating that nvdimm contents will not

NVDIMM?

> +   survive a power cycle.
> + * "flush_fail"  : Indicating that nvdimm contents
> +   couldn't be flushed during last
> +   shutdown event.
> + * "restore_fail": Indicating that nvdimm contents
> +   couldn't be restored during dimm

DIMM?

> +   initialization.
> + * "encrypted"   : Dimm contents are encrypted.
> + * "smart_notify": There is health event for the nvdimm.
> + * "scrubbed": Indicating that contents of the
> +   nvdimm have been scrubbed.
> + * "locked"  : Indicating that nvdimm contents cant
> +   be modified until next power cycle.

Some of these strings are not very meaningful to me, I would choose
different values.

But I assume you are using these because they are defined somewhere
else, where is that?

> diff --git a/arch/powerpc/include/asm/papr_scm.h 
> b/arch/powerpc/include/asm/papr_scm.h
> new file mode 100644
> index ..b51c048e906a
> --- /dev/null
> +++ b/arch/powerpc/include/asm/papr_scm.h
> @@ -0,0 +1,49 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/*
> + * Structures and defines needed to manage nvdimms for spapr guests.
> + */
> +#ifndef _AS

Re: [PATCH v6 1/4] powerpc: Document details on H_SCM_HEALTH hcall

2020-04-30 Thread Michael Ellerman
Vaibhav Jain  writes:

> Add documentation to 'papr_hcalls.rst' describing the bitmap flags
> that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
> specification.
>
> Cc: Dan Williams 
> Cc: Michael Ellerman 
> Cc: "Aneesh Kumar K . V" 
> Signed-off-by: Vaibhav Jain 
> ---
> Changelog:
>
> v5..v6
> * New patch in the series
> ---
>  Documentation/powerpc/papr_hcalls.rst | 43 ---
>  1 file changed, 39 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/powerpc/papr_hcalls.rst 
> b/Documentation/powerpc/papr_hcalls.rst
> index 3493631a60f8..9a5ba5eaf323 100644
> --- a/Documentation/powerpc/papr_hcalls.rst
> +++ b/Documentation/powerpc/papr_hcalls.rst
> @@ -220,13 +220,48 @@ from the LPAR memory.
>  **H_SCM_HEALTH**
>  
>  | Input: drcIndex
> -| Out: *health-bitmap, health-bit-valid-bitmap*
> +| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
>  | Return Value: *H_Success, H_Parameter, H_Hardware*
>  
>  Given a DRC Index return the info on predictive failure and overall health of
> -the NVDIMM. The asserted bits in the health-bitmap indicate a single 
> predictive
> -failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
> -valid.
> +the NVDIMM. The asserted bits in the health-bitmap indicate one or more 
> states
> +(described in table below) of the NVDIMM and health-bit-valid-bitmap indicate
> +which bits in health-bitmap are valid.
> +
> +Health Bitmap Flags:
> +
> ++--+---+
> +|  Bit |   Definition
>   |
> ++==+===+
> +|  00  | SCM device is unable to persist memory contents.
>   |
> +|  | If the system is powered down, nothing will be saved.   
>   |
> ++--+---+

Are these correct bit numbers or backward IBM big endian bit numbers?

ie. which bit is LSB?

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v4 03/25] powerpc/powernv: Map & release OpenCAPI LPC memory

2020-04-02 Thread Michael Ellerman
Benjamin Herrenschmidt  writes:
> On Wed, 2020-04-01 at 01:48 -0700, Dan Williams wrote:
>> > 
>> > +u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size)
>> > +{
>> > +   struct pci_controller *hose = pci_bus_to_host(pdev->bus);
>> > +   struct pnv_phb *phb = hose->private_data;
>> 
>> Is calling the local variable 'hose' instead of 'host' on purpose?
>
> Haha that's funny :-)
>
> It's an ld usage that comes iirc from sparc ? or maybe alpha ?

Yeah it was alpha, I found it in the history tree:

  
https://github.com/mpe/linux-fullhistory/blob/1928de59ba4209dc5e9f2cef63560c09ba0df73b/arch/alpha/kernel/mcpcia.c

And airlied found an old manual which confirms it:

  The TIOP module interfaces the AlphaServer 8000 system bus to four I/O 
channels, called "hoses."

  https://www.hpl.hp.com/hpjournal/dtj/vol7num1/vol7num1art4.pdf


So at least now we know where it comes from.

It's also used widely in mips, microblaze, sh and a little bit in drm.

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v5 1/4] powerpc/papr_scm: Fetch nvdimm health information from PHYP

2020-04-02 Thread Michael Ellerman
Vaibhav Jain  writes:
> Thanks for reviewing this patch Mpe,
> Michael Ellerman  writes:
>> Vaibhav Jain  writes:
...
>>
>>> +   /* Check for various masks in bitmap and set the buffer */
>>> +   if (health & PAPR_SCM_DIMM_UNARMED_MASK)
>>> +   rc += sprintf(buf, "not_armed ");
>>
>> I know buf is "big enough" but using sprintf() in 2020 is a bit ... :)
>>
>> seq_buf is a pretty thin wrapper over a buffer you can use to make this
>> cleaner and also handles overflow for you.
>>
>> See eg. show_user_instructions() for an example.
>
> Unfortunatly seq_buf_printf() is still not an exported symbol hence not
> usable in external modules.

Send a patch? :)

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v5 1/4] powerpc/papr_scm: Fetch nvdimm health information from PHYP

2020-04-02 Thread Michael Ellerman
Vaibhav Jain  writes:

> Implement support for fetching nvdimm health information via
> H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
> of 64-bit big-endian integers which are then stored in 'struct
> papr_scm_priv' and subsequently partially exposed to user-space via
> newly introduced dimm specific attribute 'papr_flags'. Also a new asm
> header named 'papr-scm.h' is added that describes the interface
> between PHYP and guest kernel.
>
> Following flags are reported via 'papr_flags' sysfs attribute contents
> of which are space separated string flags indicating various nvdimm
> states:
>
>  * "not_armed": Indicating that nvdimm contents wont survive a power
>  cycle.
>  * "save_fail": Indicating that nvdimm contents couldn't be flushed
>  during last shutdown event.
>  * "restore_fail": Indicating that nvdimm contents couldn't be restored
>  during dimm initialization.
>  * "encrypted": Dimm contents are encrypted.
>  * "smart_notify": There is health event for the nvdimm.
>  * "scrubbed" : Indicating that contents of the nvdimm have been
>  scrubbed.
>  * "locked"   : Indicating that nvdimm contents cant be modified
>  until next power cycle.
>
> [1]: commit 58b278f568f0 ("powerpc: Provide initial documentation for
> PAPR hcalls")
>
> Signed-off-by: Vaibhav Jain 
> ---
> Changelog:
>
> v4..v5 : None
>
> v3..v4 : None
>
> v2..v3 : Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
>NVDIMM unarmed [Aneesh]
>
> v1..v2 : New patch in the series.
> ---
>  arch/powerpc/include/asm/papr_scm.h   |  48 ++
>  arch/powerpc/platforms/pseries/papr_scm.c | 105 +-
>  2 files changed, 151 insertions(+), 2 deletions(-)
>  create mode 100644 arch/powerpc/include/asm/papr_scm.h
>
> diff --git a/arch/powerpc/include/asm/papr_scm.h 
> b/arch/powerpc/include/asm/papr_scm.h
> new file mode 100644
> index ..868d3360f56a
> --- /dev/null
> +++ b/arch/powerpc/include/asm/papr_scm.h
> @@ -0,0 +1,48 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/*
> + * Structures and defines needed to manage nvdimms for spapr guests.
> + */
> +#ifndef _ASM_POWERPC_PAPR_SCM_H_
> +#define _ASM_POWERPC_PAPR_SCM_H_
> +
> +#include 
> +#include 
> +
> +/* DIMM health bitmap bitmap indicators */
> +/* SCM device is unable to persist memory contents */
> +#define PAPR_SCM_DIMM_UNARMEDPPC_BIT(0)

Please don't use PPC_BIT, it's just unncessary obfuscation for folks
who are reading the code without access to the docs (ie. more or less
everyone other than you :)

> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c 
> b/arch/powerpc/platforms/pseries/papr_scm.c
> index 0b4467e378e5..aaf2e4ab1f75 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -14,6 +14,7 @@
>  #include 
>  
>  #include 
> +#include 
>  
>  #define BIND_ANY_ADDR (~0ul)
>  
> @@ -39,6 +40,13 @@ struct papr_scm_priv {
>   struct resource res;
>   struct nd_region *region;
>   struct nd_interleave_set nd_set;
> +
> + /* Protect dimm data from concurrent access */
> + struct mutex dimm_mutex;
> +
> + /* Health information for the dimm */
> + __be64 health_bitmap;
> + __be64 health_bitmap_valid;

It's much less error prone to store the data in CPU endian and do the
endian conversion only at the point where the data either comes from or
goes to firmware.

That would also mean you can define flags above without needing PPC_BIT
because they'll be in CPU endian too.

> @@ -144,6 +152,35 @@ static int drc_pmem_query_n_bind(struct papr_scm_priv *p)
>   return drc_pmem_bind(p);
>  }
>  
> +static int drc_pmem_query_health(struct papr_scm_priv *p)
> +{
> + unsigned long ret[PLPAR_HCALL_BUFSIZE];
> + int64_t rc;

Use kernel types please, ie. s64, or just long.

> + rc = plpar_hcall(H_SCM_HEALTH, ret, p->drc_index);
> + if (rc != H_SUCCESS) {
> + dev_err(>pdev->dev,
> +  "Failed to query health information, Err:%lld\n", rc);
> + return -ENXIO;
> + }
> +
> + /* Protect modifications to papr_scm_priv with the mutex */
> + rc = mutex_lock_interruptible(>dimm_mutex);
> + if (rc)
> + return rc;
> +
> + /* Store the retrieved health information in dimm platform data */
> + p->health_bitmap = ret[0];
> + p->health_bitmap_valid = ret[1];
> +
> + dev_dbg(>pdev->dev,
> + "Queried dimm health info. Bitmap:0x%016llx Mask:0x%016llx\n",
> + be64_to_cpu(p->health_bitmap),
> + be64_to_cpu(p->health_bitmap_valid));
> +
> + mutex_unlock(>dimm_mutex);
> + return 0;
> +}
>  
>  static int papr_scm_meta_get(struct papr_scm_priv *p,
>struct nd_cmd_get_config_data_hdr *hdr)
> @@ -304,6 +341,67 @@ static inline int 

Re: [PATCH v4 00/25] Add support for OpenCAPI Persistent Memory devices

2020-04-02 Thread Michael Ellerman
"Oliver O'Halloran"  writes:
> On Thu, Apr 2, 2020 at 2:42 PM Michael Ellerman  wrote:
>> "Alastair D'Silva"  writes:
>> >> -Original Message-
>> >> From: Dan Williams 
>> >>
>> >> On Sun, Mar 29, 2020 at 10:23 PM Alastair D'Silva 
>> >> wrote:
>> >> >
>> >> > *snip*
>> >> Are OPAL calls similar to ACPI DSMs? I.e. methods for the OS to invoke
>> >> platform firmware services? What's Skiboot?
>> >
>> > Yes, OPAL is the interface to firmware for POWER. Skiboot is the 
>> > open-source (and only) implementation of OPAL.
>>
>>   https://github.com/open-power/skiboot
>>
>> In particular the tokens for calls are defined here:
>>
>>   https://github.com/open-power/skiboot/blob/master/include/opal-api.h#L220
>>
>> And you can grep for the token to find the implementation:
>>
>>   https://github.com/open-power/skiboot/blob/master/hw/npu2-opencapi.c#L2328
>
> I'm not sure I'd encourage anyone to read npu2-opencapi.c. I find it
> hard enough to follow even with access to the workbooks.

Compared to certain firmwares that run on certain other platforms it's
actually pretty readable code ;)

> There's an OPAL call API reference here:
> http://open-power.github.io/skiboot/doc/opal-api/index.html

Even better.

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


RE: [PATCH v4 00/25] Add support for OpenCAPI Persistent Memory devices

2020-04-01 Thread Michael Ellerman
"Alastair D'Silva"  writes:
>> -Original Message-
>> From: Dan Williams 
>> 
>> On Sun, Mar 29, 2020 at 10:23 PM Alastair D'Silva 
>> wrote:
>> >
>> > This series adds support for OpenCAPI Persistent Memory devices on
>> > bare metal (arch/powernv), exposing them as nvdimms so that we can
>> > make use of the existing infrastructure. There already exists a driver
>> > for the same devices abstracted through PowerVM (arch/pseries):
>> > arch/powerpc/platforms/pseries/papr_scm.c
>> >
>> > These devices are connected via OpenCAPI, and present as LPC (lowest
>> coherence point) memory to the system, practically, that means that
>> memory on these cards could be treated as conventional, cache-coherent
>> memory.
>> >
>> > Since the devices are connected via OpenCAPI, they are not enumerated
>> via ACPI. Instead, OpenCAPI links present as pseudo-PCI bridges, with
>> devices below them.
>> >
>> > This series introduces a driver that exposes the memory on these cards as
>> nvdimms, with each card getting it's own bus. This is somewhat complicated
>> by the fact that the cards do not have out of band persistent storage for
>> metadata, so 1 SECTION_SIZE's (see SPARSEMEM) worth of storage is carved
>> out of the top of the card storage to implement the ndctl_config_* calls.
>> 
>> Is it really tied to section-size? Can't that change based on the configured
>> page-size? It's not clear to me why that would be the choice, but I'll dig 
>> into
>> the implementation.
>> 
>
> I had tried using PAGE_SIZE, but ran into problems carving off just 1 page 
> and handing it to the kernel, while leaving the rest as pmem. That was a 
> while ago though, so maybe I should retry it.
>
>> > The driver is not responsible for configuring the NPU (NVLink Processing
>> Unit) BARs to map the LPC memory from the card into the system's physical
>> address space, instead, it requests this to be done via OPAL calls (typically
>> implemented by Skiboot).
>> 
>> Are OPAL calls similar to ACPI DSMs? I.e. methods for the OS to invoke
>> platform firmware services? What's Skiboot?
>> 
>
> Yes, OPAL is the interface to firmware for POWER. Skiboot is the open-source 
> (and only) implementation of OPAL.

  https://github.com/open-power/skiboot

In particular the tokens for calls are defined here:

  https://github.com/open-power/skiboot/blob/master/include/opal-api.h#L220

And you can grep for the token to find the implementation:

  https://github.com/open-power/skiboot/blob/master/hw/npu2-opencapi.c#L2328


cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v4 1/5] mm/memremap_pages: Introduce memremap_compat_align()

2020-03-11 Thread Michael Ellerman
Dan Williams  writes:
> The "sub-section memory hotplug" facility allows memremap_pages() users
> like libnvdimm to compensate for hardware platforms like x86 that have a
> section size larger than their hardware memory mapping granularity.  The
> compensation that sub-section support affords is being tolerant of
> physical memory resources shifting by units smaller (64MiB on x86) than
> the memory-hotplug section size (128 MiB). Where the platform
> physical-memory mapping granularity is limited by the number and
> capability of address-decode-registers in the memory controller.
>
> While the sub-section support allows memremap_pages() to operate on
> sub-section (2MiB) granularity, the Power architecture may still
> require 16MiB alignment on "!radix_enabled()" platforms.
>
> In order for libnvdimm to be able to detect and manage this per-arch
> limitation, introduce memremap_compat_align() as a common minimum
> alignment across all driver-facing memory-mapping interfaces, and let
> Power override it to 16MiB in the "!radix_enabled()" case.
>
> The assumption / requirement for 16MiB to be a viable
> memremap_compat_align() value is that Power does not have platforms
> where its equivalent of address-decode-registers never hardware remaps a
> persistent memory resource on smaller than 16MiB boundaries. Note that I
> tried my best to not add a new Kconfig symbol, but header include
> entanglements defeated the #ifndef memremap_compat_align design pattern
> and the need to export it defeats the __weak design pattern for arch
> overrides.
>
> Based on an initial patch by Aneesh.
>
> Link: 
> http://lore.kernel.org/r/capcyv4gbgnp95apyabcsocea50tqj9b5h__83vgngjq3oug...@mail.gmail.com
> Reported-by: Aneesh Kumar K.V 
> Reported-by: Jeff Moyer 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Reviewed-by: Aneesh Kumar K.V 
> Signed-off-by: Dan Williams 
> ---
>  arch/powerpc/Kconfig  |1 +
>  arch/powerpc/mm/ioremap.c |   21 +
>  drivers/nvdimm/pfn_devs.c |2 +-
>  include/linux/memremap.h  |8 
>  include/linux/mmzone.h|1 +
>  lib/Kconfig   |3 +++
>  mm/memremap.c |   23 +++
>  7 files changed, 58 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 497b7d0b2d7e..e6ffe905e2b9 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -122,6 +122,7 @@ config PPC
>   select ARCH_HAS_GCOV_PROFILE_ALL
>   select ARCH_HAS_KCOV
>   select ARCH_HAS_HUGEPD  if HUGETLB_PAGE
> + select ARCH_HAS_MEMREMAP_COMPAT_ALIGN
>   select ARCH_HAS_MMIOWB  if PPC64
>   select ARCH_HAS_PHYS_TO_DMA
>   select ARCH_HAS_PMEM_API
> diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
> index fc669643ce6a..b1a0aebe8c48 100644
> --- a/arch/powerpc/mm/ioremap.c
> +++ b/arch/powerpc/mm/ioremap.c
> @@ -2,6 +2,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -97,3 +98,23 @@ void __iomem *do_ioremap(phys_addr_t pa, phys_addr_t 
> offset, unsigned long size,
>  
>   return NULL;
>  }
> +
> +#ifdef CONFIG_ZONE_DEVICE
> +/*
> + * Override the generic version in mm/memremap.c.
> + *
> + * With hash translation, the direct-map range is mapped with just one
> + * page size selected by htab_init_page_sizes(). Consult
> + * mmu_psize_defs[] to determine the minimum page size alignment.
> +*/
> +unsigned long memremap_compat_align(void)
> +{
> + unsigned int shift = mmu_psize_defs[mmu_linear_psize].shift;
> +
> + if (radix_enabled())
> + return SUBSECTION_SIZE;
> + return max(SUBSECTION_SIZE, 1UL << shift);
> +
> +}
> +EXPORT_SYMBOL_GPL(memremap_compat_align);
> +#endif

LGTM.

Acked-by: Michael Ellerman  (powerpc)

cheers
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH 2/5] mm/memremap_pages: Introduce memremap_compat_align()

2020-02-04 Thread Michael Ellerman
Dan Williams  writes:
> The "sub-section memory hotplug" facility allows memremap_pages() users
> like libnvdimm to compensate for hardware platforms like x86 that have a
> section size larger than their hardware memory mapping granularity.  The
> compensation that sub-section support affords is being tolerant of
> physical memory resources shifting by units smaller (64MiB on x86) than
> the memory-hotplug section size (128 MiB). Where the platform
> physical-memory mapping granularity is limited by the number and
> capability of address-decode-registers in the memory controller.
>
> While the sub-section support allows memremap_pages() to operate on
> sub-section (2MiB) granularity, the Power architecture may still
> require 16MiB alignment on "!radix_enabled()" platforms.
>
> In order for libnvdimm to be able to detect and manage this per-arch
> limitation, introduce memremap_compat_align() as a common minimum
> alignment across all driver-facing memory-mapping interfaces, and let
> Power override it to 16MiB in the "!radix_enabled()" case.
>
> The assumption / requirement for 16MiB to be a viable
> memremap_compat_align() value is that Power does not have platforms
> where its equivalent of address-decode-registers never hardware remaps a
> persistent memory resource on smaller than 16MiB boundaries.
>
> Based on an initial patch by Aneesh.
>
> Link: 
> http://lore.kernel.org/r/capcyv4gbgnp95apyabcsocea50tqj9b5h__83vgngjq3oug...@mail.gmail.com
> Reported-by: Aneesh Kumar K.V 
> Reported-by: Jeff Moyer 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Michael Ellerman 
> Signed-off-by: Dan Williams 
> ---
>  arch/powerpc/include/asm/io.h |   10 ++
>  drivers/nvdimm/pfn_devs.c |2 +-
>  include/linux/io.h|   23 +++
>  include/linux/mmzone.h|1 +
>  4 files changed, 35 insertions(+), 1 deletion(-)

The powerpc change here looks fine to me.

Acked-by: Michael Ellerman  (powerpc)

cheers

> diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
> index a63ec938636d..0fa2dc483008 100644
> --- a/arch/powerpc/include/asm/io.h
> +++ b/arch/powerpc/include/asm/io.h
> @@ -734,6 +734,16 @@ extern void __iomem * __ioremap_at(phys_addr_t pa, void 
> *ea,
>  unsigned long size, pgprot_t prot);
>  extern void __iounmap_at(void *ea, unsigned long size);
>  
> +#ifdef CONFIG_SPARSEMEM
> +static inline unsigned long memremap_compat_align(void)
> +{
> + if (radix_enabled())
> + return SUBSECTION_SIZE;
> + return (1UL << mmu_psize_defs[mmu_linear_psize].shift);
> +}
> +#define memremap_compat_align memremap_compat_align
> +#endif
> +
>  /*
>   * When CONFIG_PPC_INDIRECT_PIO is set, we use the generic iomap 
> implementation
>   * which needs some additional definitions here. They basically allow PIO
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v2 16/18] powerpc/papr_scm: Switch to numa_map_to_online_node()

2019-11-20 Thread Michael Ellerman
Dan Williams  writes:
> Now that the core exports numa_map_to_online_node() switch to that
> instead of the locally coded duplicate.
>
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Michael Ellerman 

Acked-by: Michael Ellerman 

cheers

> Cc: "Oliver O'Halloran" 
> Reported-by: "Aneesh Kumar K.V" 
> Signed-off-by: Dan Williams 
> ---
>  arch/powerpc/platforms/pseries/papr_scm.c |   21 +
>  1 file changed, 1 insertion(+), 20 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c 
> b/arch/powerpc/platforms/pseries/papr_scm.c
> index 33aa59e666e5..ef81515f3b6a 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -284,25 +284,6 @@ int papr_scm_ndctl(struct nvdimm_bus_descriptor 
> *nd_desc, struct nvdimm *nvdimm,
>   return 0;
>  }
>  
> -static inline int papr_scm_node(int node)
> -{
> - int min_dist = INT_MAX, dist;
> - int nid, min_node;
> -
> - if ((node == NUMA_NO_NODE) || node_online(node))
> - return node;
> -
> - min_node = first_online_node;
> - for_each_online_node(nid) {
> - dist = node_distance(node, nid);
> - if (dist < min_dist) {
> - min_dist = dist;
> - min_node = nid;
> - }
> - }
> - return min_node;
> -}
> -
>  static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
>  {
>   struct device *dev = >pdev->dev;
> @@ -347,7 +328,7 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
>  
>   memset(_desc, 0, sizeof(ndr_desc));
>   target_nid = dev_to_node(>pdev->dev);
> - online_nid = papr_scm_node(target_nid);
> + online_nid = numa_map_to_online_node(target_nid);
>   ndr_desc.numa_node = online_nid;
>   ndr_desc.target_node = target_nid;
>   ndr_desc.res = >res;
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH] mm/nvdimm: Add is_ioremap_addr and use that to check ioremap address

2019-07-04 Thread Michael Ellerman
"Aneesh Kumar K.V"  writes:
> Andrew Morton  writes:
>
>> On Mon,  1 Jul 2019 19:10:38 +0530 "Aneesh Kumar K.V" 
>>  wrote:
>>
>>> Architectures like powerpc use different address range to map ioremap
>>> and vmalloc range. The memunmap() check used by the nvdimm layer was
>>> wrongly using is_vmalloc_addr() to check for ioremap range which fails for
>>> ppc64. This result in ppc64 not freeing the ioremap mapping. The side effect
>>> of this is an unbind failure during module unload with papr_scm nvdimm 
>>> driver
>>
>> The patch applies to 5.1.  Does it need a Fixes: and a Cc:stable?
>
> Actually, we want it to be backported to an older kernel possibly one
> that added papr-scm driver, b5beae5e224f ("powerpc/pseries: Add driver
> for PAPR SCM regions"). But that doesn't apply easily. It does apply
> without conflicts to 5.0

Don't worry about where it applies or doesn't, just tag it with the
correct Fixes: and stable versions and then if it doesn't backport
cleanly then we deal with that later.

cheers
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH 2/5] mm/resource: move HMM pr_debug() deeper into resource code

2019-01-28 Thread Michael Ellerman
Dave Hansen  writes:
> On 1/25/19 1:18 PM, Bjorn Helgaas wrote:
>> On Thu, Jan 24, 2019 at 5:21 PM Dave Hansen  
>> wrote:
>>> diff -puN kernel/resource.c~move-request_region-check kernel/resource.c
>>> --- a/kernel/resource.c~move-request_region-check   2019-01-24 
>>> 15:13:14.453199539 -0800
>>> +++ b/kernel/resource.c 2019-01-24 15:13:14.458199539 -0800
>>> @@ -1123,6 +1123,16 @@ struct resource * __request_region(struc
>>> conflict = __request_resource(parent, res);
>>> if (!conflict)
>>> break;
>>> +   /*
>>> +* mm/hmm.c reserves physical addresses which then
>>> +* become unavailable to other users.  Conflicts are
>>> +* not expected.  Be verbose if one is encountered.
>>> +*/
>>> +   if (conflict->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY) {
>>> +   pr_debug("Resource conflict with unaddressable "
>>> +"device memory at %#010llx !\n",
>>> +(unsigned long long)start);
>> 
>> I don't object to the change, but are you really OK with this being a
>> pr_debug() message that is only emitted when enabled via either the
>> dynamic debug mechanism or DEBUG being defined?  From the comments, it
>> seems more like a KERN_INFO sort of message.
>
> I left it consistent with the original message that was in the code.
> I'm happy to change it, though, if the consumers of it (Jerome,
> basically) want something different.

At least using pr_debug() doesn't match the comment, ie. the comment
says "Be verbose" but pr_debug() is silent by default.

cheers
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH 1/5] mm/resource: return real error codes from walk failures

2019-01-28 Thread Michael Ellerman
Dave Hansen  writes:
> On 1/25/19 1:02 PM, Bjorn Helgaas wrote:
>>> @@ -453,7 +453,7 @@ int walk_system_ram_range(unsigned long
>>> unsigned long flags;
>>> struct resource res;
>>> unsigned long pfn, end_pfn;
>>> -   int ret = -1;
>>> +   int ret = -EINVAL;
>> Can you either make a similar change to the powerpc version of
>> walk_system_ram_range() in arch/powerpc/mm/mem.c or explain why it's
>> not needed?  It *seems* like we'd want both versions of
>> walk_system_ram_range() to behave similarly in this respect.
>
> Sure.  A quick grep shows powerpc being the only other implementation.

Ugh gross, why are we reimplementing it? ...

Oh right, memblock vs iomem. We should fix that one day :/

> I'll just add this hunk:
>
>> diff -puN 
>> arch/powerpc/mm/mem.c~memory-hotplug-walk_system_ram_range-returns-neg-1 
>> arch/powerpc/mm/mem.c
>> --- 
>> a/arch/powerpc/mm/mem.c~memory-hotplug-walk_system_ram_range-returns-neg-1  
>> 2019-01-25 12:57:00.04446 -0800
>> +++ b/arch/powerpc/mm/mem.c 2019-01-25 12:58:13.215004263 -0800 
>> @@ -188,7 +188,7 @@ walk_system_ram_range(unsigned long star 
>> struct memblock_region *reg; 
>> unsigned long end_pfn = start_pfn + nr_pages; 
>> unsigned long tstart, tend; 
>> -   int ret = -1; 
>> +   int ret = -EINVAL; 
>
> I'll also dust off the ol' cross-compiler and make sure I didn't
> fat-finger anything.

Modern Fedora & Ubuntu have packaged cross toolchains. Otherwise there's
the kernel.org ones, or bootlin has versions with libc if you need it.

Patch looks fine. That value could only get to userspace if we have no
memory, which would be interesting.

Acked-by: Michael Ellerman  (powerpc)

cheers
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [v3,1/2] powerpc/pseries: PAPR persistent memory support

2018-10-18 Thread Michael Ellerman
On Sun, 2018-10-14 at 23:18:27 UTC, Oliver O'Halloran wrote:
> This patch implements support for discovering storage class memory
> devices at boot and for handling hotplug of new regions via RTAS
> hotplug events.
> 
> Signed-off-by: Oliver O'Halloran 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/4c5d87db497832c493ed296157bd17

cheers
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH 2/2] powerpc/pseries: Add driver for PAPR SCM regions

2018-10-14 Thread Michael Ellerman
Dan Williams  writes:
> On Sat, Oct 13, 2018 at 5:08 AM Michael Ellerman  wrote:
>>
>> Dan Williams  writes:
>> > On Tue, Oct 9, 2018 at 11:21 PM Oliver O'Halloran  wrote:
>> >>
>> >> Adds a driver that implements support for enabling and accessing PAPR
>> >> SCM regions. Unfortunately due to how the PAPR interface works we can't
>> >> use the existing of_pmem driver (yet) because:
>> >>
>> ...
>> >> +
>> >> +static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
>> >> +{
>> >> +   struct device *dev = >pdev->dev;
>> >> +   struct nd_mapping_desc mapping;
>> >> +   struct nd_region_desc ndr_desc;
>> >> +   unsigned long dimm_flags;
>> >> +
>> >> +   p->bus_desc.ndctl = papr_scm_ndctl;
>> >> +   p->bus_desc.module = THIS_MODULE;
>> >> +   p->bus_desc.of_node = p->pdev->dev.of_node;
>> >> +   p->bus_desc.attr_groups = bus_attr_groups;
>> >> +   p->bus_desc.provider_name = kstrdup(p->pdev->name, GFP_KERNEL);
>> >> +
>> >> +   if (!p->bus_desc.provider_name)
>> >> +   return -ENOMEM;
>> >> +
>> >> +   p->bus = nvdimm_bus_register(NULL, >bus_desc);
>> >> +   if (!p->bus) {
>> >> +   dev_err(dev, "Error creating nvdimm bus %pOF\n", p->dn);
>> >> +   return -ENXIO;
>> >> +   }
>> >> +
>> >> +   dimm_flags = 0;
>> >> +   set_bit(NDD_ALIASING, _flags);
>> >> +
>> >> +   p->nvdimm = nvdimm_create(p->bus, p, papr_scm_dimm_groups,
>> >> +   dimm_flags, PAPR_SCM_DIMM_CMD_MASK, 0, 
>> >> NULL);
>> >
>> > Looks good, although I'm just about to push out commits that change
>> > this function signature to take a 'security_ops' pointer. If you need
>> > a stable branch to base this on, let me know.
>> ...
>> >
>> > Other than that looks ok to me:
>> >
>> > Acked-by: Dan Williams 
>> >
>> > ...just the matter of what to do about function signature change.
>>
>> Yeah that's a bit of a bother.
>>
>> The ideal for me would be that you put the commit that changes the
>> signature by itself in a branch based on 4.19-rc3 (or earlier), and then
>> we could both just merge that.
>>
>> But not sure if that will work with whatever else you're trying to sync
>> up with.
>
> How about this, I'll move the new signature to an 'advanced':
>
> __nvdimm_create()
>
> ...and make the existing:
>
>nvdimm_create()
>
> ...a simple wrapper around the new functionality. That way no matter
> what merge order we should be ok.

Yep that's much easier, thanks.

cheers
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH 2/2] powerpc/pseries: Add driver for PAPR SCM regions

2018-10-13 Thread Michael Ellerman
Dan Williams  writes:
> On Tue, Oct 9, 2018 at 11:21 PM Oliver O'Halloran  wrote:
>>
>> Adds a driver that implements support for enabling and accessing PAPR
>> SCM regions. Unfortunately due to how the PAPR interface works we can't
>> use the existing of_pmem driver (yet) because:
>>
...
>> +
>> +static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
>> +{
>> +   struct device *dev = >pdev->dev;
>> +   struct nd_mapping_desc mapping;
>> +   struct nd_region_desc ndr_desc;
>> +   unsigned long dimm_flags;
>> +
>> +   p->bus_desc.ndctl = papr_scm_ndctl;
>> +   p->bus_desc.module = THIS_MODULE;
>> +   p->bus_desc.of_node = p->pdev->dev.of_node;
>> +   p->bus_desc.attr_groups = bus_attr_groups;
>> +   p->bus_desc.provider_name = kstrdup(p->pdev->name, GFP_KERNEL);
>> +
>> +   if (!p->bus_desc.provider_name)
>> +   return -ENOMEM;
>> +
>> +   p->bus = nvdimm_bus_register(NULL, >bus_desc);
>> +   if (!p->bus) {
>> +   dev_err(dev, "Error creating nvdimm bus %pOF\n", p->dn);
>> +   return -ENXIO;
>> +   }
>> +
>> +   dimm_flags = 0;
>> +   set_bit(NDD_ALIASING, _flags);
>> +
>> +   p->nvdimm = nvdimm_create(p->bus, p, papr_scm_dimm_groups,
>> +   dimm_flags, PAPR_SCM_DIMM_CMD_MASK, 0, NULL);
>
> Looks good, although I'm just about to push out commits that change
> this function signature to take a 'security_ops' pointer. If you need
> a stable branch to base this on, let me know.
...
>
> Other than that looks ok to me:
>
> Acked-by: Dan Williams 
>
> ...just the matter of what to do about function signature change.

Yeah that's a bit of a bother.

The ideal for me would be that you put the commit that changes the
signature by itself in a branch based on 4.19-rc3 (or earlier), and then
we could both just merge that.

But not sure if that will work with whatever else you're trying to sync
up with.

cheers
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH 2/2] powerpc/pseries: Add driver for PAPR SCM regions

2018-10-10 Thread Michael Ellerman
Oliver O'Halloran  writes:

> Adds a driver that implements support for enabling and accessing PAPR
> SCM regions. Unfortunately due to how the PAPR interface works we can't
> use the existing of_pmem driver (yet) because:
>
>  a) The guest is required to use the H_SCM_BIND_MEM h-call to add
> add the SCM region to it's physical address space, and
>  b) There is currently no mechanism for relating a bare of_pmem region
> to the backing DIMM (or not-a-DIMM for our case).
>
> Both of these are easily handled by rolling the functionality into a
> seperate driver so here we are...
>
> Signed-off-by: Oliver O'Halloran 
> ---
> The alternative implementation here is that we have the pseries code
> do the h-calls and craft a pmem-region@ node based on that.
> However, that doesn't solve b) and mpe has expressed his dislike of
> adding new stuff to the DT at runtime so i'd say that's a non-starter.

Hmm, from memory you yelled something at me across the office about
that, so my response may not have been entirely well considered.

I'm not quite sure what the state of the OF overlays support is, but
that would be The Right Way to do that sort of modification to the
device tree at runtime.

If we merged this and then later got the of_pmem driver to work for us
would there be any user-visible change?

cheers
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH v3 3/4] doc/devicetree: Persistent memory region bindings

2018-04-07 Thread Michael Ellerman
Oliver O'Halloran <ooh...@gmail.com> writes:
> Add device-tree binding documentation for the nvdimm region driver.
>
> Cc: devicet...@vger.kernel.org
> Signed-off-by: Oliver O'Halloran <ooh...@gmail.com>
> ---
> v2: Changed name from nvdimm-region to pmem-region.
> Cleaned up the example binding and fixed the overlapping regions.
> Added support for multiple regions in a single reg.
> v3: Removed platform bus boilerplate from the example.
> Changed description of the volatile and reg properties
> to make them more clear.

Thanks.

Acked-by: Michael Ellerman <m...@ellerman.id.au>

cheers
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH v3 1/3] mm, powerpc: use vma_kernel_pagesize() in vma_mmu_pagesize()

2018-03-02 Thread Michael Ellerman
Dan Williams <dan.j.willi...@intel.com> writes:

> The current powerpc definition of vma_mmu_pagesize() open codes looking
> up the page size via hstate. It is identical to the generic
> vma_kernel_pagesize() implementation.
>
> Now, vma_kernel_pagesize() is growing support for determining the
> page size of Device-DAX vmas in addition to the existing Hugetlbfs page
> size determination.
>
> Ideally, if the powerpc vma_mmu_pagesize() used vma_kernel_pagesize() it
> would automatically benefit from any new vma-type support that is added
> to vma_kernel_pagesize(). However, the powerpc vma_mmu_pagesize() is
> prevented from calling vma_kernel_pagesize() due to a circular header
> dependency that requires vma_mmu_pagesize() to be defined before
> including .
>
> Break this circular dependency by defining the default
> vma_mmu_pagesize() as a __weak symbol to be overridden by the powerpc
> version.
>
> Cc: Benjamin Herrenschmidt <b...@kernel.crashing.org>
> Cc: Paul Mackerras <pau...@samba.org>
> Cc: Michael Ellerman <m...@ellerman.id.au>
> Signed-off-by: Dan Williams <dan.j.willi...@intel.com>
> ---
>  arch/powerpc/include/asm/hugetlb.h |6 --
>  arch/powerpc/mm/hugetlbpage.c  |5 +
>  mm/hugetlb.c   |8 +++-
>  3 files changed, 4 insertions(+), 15 deletions(-)

This looks OK to me. I was worried switching to a weak symbol would mean
it doesn't get inlined, but it's not inlined today anyway!

Acked-by: Michael Ellerman <m...@ellerman.id.au> (powerpc)

cheers
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH v3 0/3] create sysfs representation of ACPI HMAT

2017-12-21 Thread Michael Ellerman
Matthew Wilcox  writes:

> On Mon, Dec 18, 2017 at 01:35:47PM -0700, Ross Zwisler wrote:
>> What I'm hoping to do with this series is to just provide a sysfs
>> representation of the HMAT so that applications can know which NUMA nodes to
>> select with existing utilities like numactl.  This series does not currently
>> alter any kernel behavior, it only provides a sysfs interface.
>> 
>> Say for example you had a system with some high bandwidth memory (HBM), and
>> you wanted to use it for a specific application.  You could use the sysfs
>> representation of the HMAT to figure out which memory target held your HBM.
>> You could do this by looking at the local bandwidth values for the various
>> memory targets, so:
>> 
>>  # grep . /sys/devices/system/hmat/mem_tgt*/local_init/write_bw_MBps
>>  /sys/devices/system/hmat/mem_tgt2/local_init/write_bw_MBps:81920
>>  /sys/devices/system/hmat/mem_tgt3/local_init/write_bw_MBps:40960
>>  /sys/devices/system/hmat/mem_tgt4/local_init/write_bw_MBps:40960
>>  /sys/devices/system/hmat/mem_tgt5/local_init/write_bw_MBps:40960
>> 
>> and look for the one that corresponds to your HBM speed. (These numbers are
>> made up, but you get the idea.)
>
> Presumably ACPI-based platforms will not be the only ones who have the
> ability to expose different bandwidth memories in the future.  I think
> we need a platform-agnostic way ... right, PowerPC people?

Yes!

I don't have any detail at hand but will try and rustle something up.

cheers
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH 1/2] mm, hugetlbfs: introduce ->pagesize() to vm_operations_struct

2017-12-13 Thread Michael Ellerman
Michal Hocko <mho...@kernel.org> writes:

> On Thu 07-12-17 19:30:55, Dan Williams wrote:
>> When device-dax is operating in huge-page mode we want it to behave like
>> hugetlbfs and report the MMU page mapping size that is being enforced by
>> the vma. Similar to commit 31383c6865a5 "mm, hugetlbfs: introduce
>> ->split() to vm_operations_struct" it would be messy to teach
>> vma_mmu_pagesize() about device-dax page mapping sizes in the same
>> (hstate) way that hugetlbfs communicates this attribute.  Instead, these
>> patches introduce a new ->pagesize() vm operation.
>> 
>> Cc: Benjamin Herrenschmidt <b...@kernel.crashing.org>
>> Cc: Paul Mackerras <pau...@samba.org>
>> Cc: Michael Ellerman <m...@ellerman.id.au>
>> Reported-by: Jane Chu <jane@oracle.com>
>> Signed-off-by: Dan Williams <dan.j.willi...@intel.com>
>
> My build battery choked on the following
> In file included from drivers/infiniband/core/umem_odp.c:41:0:
> ./include/linux/hugetlb.h: In function 'vma_kernel_pagesize':
> ./include/linux/hugetlb.h:262:32: error: dereferencing pointer to incomplete 
> type
>   if (vma->vm_ops && vma->vm_ops->pagesize)
> ^
> ./include/linux/hugetlb.h:263:21: error: dereferencing pointer to incomplete 
> type
>return vma->vm_ops->pagesize(vma);
>
> I thought that adding #include  into linux/hugetlb.h would
> be sufficient but then it failed for powerpc defconfig which overrides
> vma_kernel_pagesize
> In file included from ./include/linux/hugetlb.h:452:0,
>  from arch/powerpc/mm/hugetlbpage.c:14:
> ./arch/powerpc/include/asm/hugetlb.h:131:26: error: redefinition of 
> 'vma_mmu_pagesize'
>  #define vma_mmu_pagesize vma_mmu_pagesize
>   ^
> arch/powerpc/mm/hugetlbpage.c:563:15: note: in expansion of macro 
> 'vma_mmu_pagesize'
>  unsigned long vma_mmu_pagesize(struct vm_area_struct *vma)
>^
> In file included from arch/powerpc/mm/hugetlbpage.c:14:0:
> ./include/linux/hugetlb.h:275:29: note: previous definition of 
> 'vma_mmu_pagesize' was here
>  static inline unsigned long vma_mmu_pagesize(struct vm_area_struct *vma)
>
> So it looks this needs something more laborous.

This builds for me.

cheers


diff --git a/arch/powerpc/include/asm/hugetlb.h 
b/arch/powerpc/include/asm/hugetlb.h
index 14c9d44f355b..3cc6ca1bdaf2 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -123,6 +123,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb, 
unsigned long addr,
  * to override the version in mm/hugetlb.c
  */
 #define vma_mmu_pagesize vma_mmu_pagesize
+unsigned long vma_mmu_pagesize(struct vm_area_struct *vma);
 
 /*
  * If the arch doesn't supply something else, assume that hugepage
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index a9b9083c5e49..c6a2e577e842 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -568,10 +568,7 @@ unsigned long vma_mmu_pagesize(struct vm_area_struct *vma)
if (!radix_enabled())
return 1UL << mmu_psize_to_shift(psize);
 #endif
-   if (!is_vm_hugetlb_page(vma))
-   return PAGE_SIZE;
-
-   return huge_page_size(hstate_vma(vma));
+   return vma_kernel_pagesize(vma);
 }
 
 static inline bool is_power_of_4(unsigned long x)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 6e3696c7b35a..fe7b74325856 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -394,10 +394,6 @@ static inline unsigned long huge_page_size(struct hstate 
*h)
return (unsigned long)PAGE_SIZE << h->order;
 }
 
-extern unsigned long vma_kernel_pagesize(struct vm_area_struct *vma);
-
-extern unsigned long vma_mmu_pagesize(struct vm_area_struct *vma);
-
 static inline unsigned long huge_page_mask(struct hstate *h)
 {
return h->mask;
@@ -430,6 +426,30 @@ static inline unsigned int blocks_per_huge_page(struct 
hstate *h)
 
 #include 
 
+/*
+ * Return the size of the pages allocated when backing a VMA. In the majority
+ * cases this will be same size as used by the page table entries.
+ */
+static inline unsigned long vma_kernel_pagesize(struct vm_area_struct *vma)
+{
+   if (vma->vm_ops && vma->vm_ops->pagesize)
+   return vma->vm_ops->pagesize(vma);
+   return PAGE_SIZE;
+}
+
+/*
+ * Return the page size being used by the MMU to back a VMA. In the majority
+ * of cases, the page size used by the kernel matches the MMU size. On
+ * architectures where it differs, an architecture-specific version of this
+ * function is required.
+ */
+#ifndef vma_mmu_pagesize
+static inline unsigned long vma_mmu_pagesize(struct vm_area_struct 

Re: [PATCH 2/3] libnvdimm: Add a device-tree interface

2017-11-23 Thread Michael Ellerman
Rob Herring  writes:
> On Thu, Nov 16, 2017 at 04:51:30AM +1100, Oliver O'Halloran wrote:
...
>> diff --git a/drivers/nvdimm/of_nvdimm.c b/drivers/nvdimm/of_nvdimm.c
>> new file mode 100644
>> index ..765cbbae8bcb
>> --- /dev/null
>> +++ b/drivers/nvdimm/of_nvdimm.c
>> @@ -0,0 +1,202 @@
>> +/*
>> + * Copyright 2017, IBM Corporation
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, you can access it online at
>> + * http://www.gnu.org/licenses/gpl-2.0.html.
>
> This can be SPDX tag.

Specifically:

// SPDX-License-Identifier: GPL-2.0+
/*
 * Copyright 2017, IBM Corporation
 */

...

cheers
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH 02/15] mm, dax: introduce pfn_t_special()

2017-11-02 Thread Michael Ellerman
Dan Williams  writes:

> In support of removing the VM_MIXEDMAP indication from DAX VMAs,
> introduce pfn_t_special() for drivers to indicate that _PAGE_SPECIAL
> should be used for DAX ptes. This also helps identify drivers like
> dccssblk that only want to use DAX in a read-only fashion without
> get_user_pages() support.
>
> Ideally we could delete axonram and dcssblk DAX support, but if we need
> to keep it better make it explicit that axonram and dcssblk only support
> a sub-set of DAX due to missing _PAGE_DEVMAP support.

I sent a patch to remove axonram (sorry meant to Cc you):

  http://patchwork.ozlabs.org/patch/833588/

Will see if there's any feedback.

cheers
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH 8/9] powerpc/mm: Wire up hpte_removebolted for powernv

2017-04-13 Thread Michael Ellerman
Oliver O'Halloran  writes:
> On Wed, Apr 12, 2017 at 11:53 AM, Balbir Singh  wrote:
>>
>> The API also does not clear linear_map_hash_slots[] under DEBUG_PAGEALLOC
>
> I'm not sure what API you're referring to here. The tracking for
> linear_map_hash_slots[] is agnostic of mmu_hash_ops so we shouldn't be
> touching it here. It also looks like DEBUG_PAGEALLOC is a bit broken
> with hotplugged memory anyway so I think that's a fix for a different
> patch.

It's old and has probably bit-rotted, so I'm happy for it to be fixed up
later if necessary.

cheers
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Re: [PATCH 6/9] powerpc, mm: Enable ZONE_DEVICE on powerpc

2017-04-11 Thread Michael Ellerman
Stephen Rothwell  writes:

> Hi Oliver,
>
> On Wed, 12 Apr 2017 03:42:30 +1000 Oliver O'Halloran  wrote:
>>
>> Flip the switch. Running around and screaming "IT'S ALIVE" is optional,
>> but recommended.
>> 
>> Signed-off-by: Oliver O'Halloran 
>> ---
>>  mm/Kconfig | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/mm/Kconfig b/mm/Kconfig
>> index 43d000e44424..d696af58f97f 100644
>> --- a/mm/Kconfig
>> +++ b/mm/Kconfig
>> @@ -724,7 +724,7 @@ config ZONE_DEVICE
>>  depends on MEMORY_HOTPLUG
>>  depends on MEMORY_HOTREMOVE
>>  depends on SPARSEMEM_VMEMMAP
>> -depends on X86_64 #arch_add_memory() comprehends device memory
>> +depends on (X86_64 || PPC_BOOK3S_64)  #arch_add_memory() comprehends 
>> device memory
>>  
>
> That's fine, but at what point do we create
> CONFIG_ARCH_HAVE_ZONE_DEVICE, replace the "depends on
> " above with "depends on ARCH_HAVE_ZONE_DEVICE" and
> select that from the appropriate places?

You mean CONFIG_HAVE_ZONE_DEVICE :)

A patch to do that, and update x86, would be a good precursor to this
series. It could probably go in right now, and be in place for when this
series lands.

cheers
___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm