Re: [PATCH v2 1/3] iommu/arm-smmu-v3: put off the execution of TLBI* to reduce lock confliction

2017-10-18 Thread Leizhen (ThunderTown)


On 2017/10/18 20:58, Will Deacon wrote:
> Hi Thunder,
> 
> On Tue, Sep 12, 2017 at 09:00:36PM +0800, Zhen Lei wrote:
>> Because all TLBI commands should be followed by a SYNC command, to make
>> sure that it has been completely finished. So we can just add the TLBI
>> commands into the queue, and put off the execution until meet SYNC or
>> other commands. To prevent the followed SYNC command waiting for a long
>> time because of too many commands have been delayed, restrict the max
>> delayed number.
>>
>> According to my test, I got the same performance data as I replaced writel
>> with writel_relaxed in queue_inc_prod.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 42 +-
>>  1 file changed, 37 insertions(+), 5 deletions(-)
> 
> If we want to go down the route of explicit command batching, I'd much
> rather do it by implementing the iotlb_range_add callback in the driver,
> and have a fixed-length array of batched ranges on the domain. We could
I think this patch is still valuable even if the iotlb_range_add callback is
implemented. The main purpose of this patch is to reduce the number of dsb
operations. Consider the scenario with iotlb_range_add implemented:
.iotlb_range_add:
spin_lock_irqsave(&smmu->cmdq.lock, flags);
...
add tlbi range-1 to cmdq
...
add tlbi range-n to cmdq   //n
dsb
...
spin_unlock_irqrestore(&smmu->cmdq.lock, flags);

.iotlb_sync:
spin_lock_irqsave(&smmu->cmdq.lock, flags);
...
add cmd_sync to cmdq
dsb
...
spin_unlock_irqrestore(&smmu->cmdq.lock, flags);

Although iotlb_range_add can save n-1 dsb operations, one still remains.
If n is not large enough, this patch is still helpful.
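
To make the counting above concrete, here is a minimal userspace sketch in C.
It is only a model under stated assumptions: struct cmdq_model,
cmdq_add_tlbi(), cmdq_add_sync() and barriers_for() are hypothetical names and
are not part of arm-smmu-v3.c. It simply counts how many dsb-style barriers
would be issued for n TLBIs followed by one SYNC, where max_delay = 1 models
"no deferral" and a larger max_delay models the deferral proposed here.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these names exist in the driver. */
struct cmdq_model {
	unsigned int pending;   /* commands queued but not yet published */
	unsigned int barriers;  /* how many times a dsb would have been issued */
	unsigned int max_delay; /* upper bound on deferred commands */
};

/* Publish everything queued so far: one barrier covers the whole batch. */
static void cmdq_publish(struct cmdq_model *q)
{
	if (q->pending == 0)
		return;
	q->barriers++;
	q->pending = 0;
}

/* Queue a TLBI; publishing is deferred until the batch limit is reached. */
static void cmdq_add_tlbi(struct cmdq_model *q)
{
	q->pending++;
	if (q->pending >= q->max_delay)
		cmdq_publish(q);
}

/* Queue a SYNC; it always publishes, together with any deferred TLBIs. */
static void cmdq_add_sync(struct cmdq_model *q)
{
	q->pending++;
	cmdq_publish(q);
}

/* Barriers needed for n TLBIs followed by one SYNC. */
static unsigned int barriers_for(unsigned int n, unsigned int max_delay)
{
	struct cmdq_model q = { 0, 0, max_delay };
	unsigned int i;

	for (i = 0; i < n; i++)
		cmdq_add_tlbi(&q);
	cmdq_add_sync(&q);
	return q.barriers;
}
```

In this model, three TLBIs plus a SYNC cost four barriers without deferral
(max_delay = 1) and a single barrier when the TLBIs are deferred until the
SYNC (max_delay = 32).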


> potentially toggle this function pointer based on the compatible string too,
> if it shows only to benefit some systems.
[
On 2017/9/19 12:31, Nate Watterson wrote:
I tested these (2) patches on QDF2400 hardware and saw performance
improvements in line with those I reported when testing the original
series.
]

I'm not sure whether this patch by itself improves performance on QDF2400,
because the two patches were tested together. But at least it seems harmless,
and other hardware platforms may behave the same.

> 
> Will
> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH 1/1] Input: ims-pcu - fix typo in an error log

2017-11-23 Thread Leizhen (ThunderTown)


On 2017/11/24 15:17, Joe Perches wrote:
> On Fri, 2017-11-24 at 14:59 +0800, Zhen Lei wrote:
>> Tiny typo fixed in an error log.
>>
>> I found this when I backported the CVE-2017-16645 patch:
>> ea04efee7635 ("Input: ims-psu - check if CDC union descriptor is sane")
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  drivers/input/misc/ims-pcu.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/input/misc/ims-pcu.c b/drivers/input/misc/ims-pcu.c
> []
>> @@ -1651,7 +1651,7 @@ static void ims_pcu_buffers_free(struct ims_pcu *pcu)
>>  return union_desc;
>>
>>  dev_err(>dev,
>> -"Union descriptor to short (%d vs %zd\n)",
>> +"Union descriptor too short (%d vs %zd\n)",
> 
> And this format is incorrect too.  It should be:
> 
> + "Union descriptor too short (%d vs %zd)\n",
> 
> with the close parenthesis before the newline, not after.
You are very observant. Do I need to post a v2? It seems that it could simply
be modified directly.

> 
> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH 1/1] aio: make sure the input "timeout" value is valid

2017-12-13 Thread Leizhen (ThunderTown)


On 2017/12/14 3:31, Matthew Wilcox wrote:
> On Wed, Dec 13, 2017 at 11:27:00AM -0500, Jeff Moyer wrote:
>> Matthew Wilcox  writes:
>>
>>> On Wed, Dec 13, 2017 at 09:42:52PM +0800, Zhen Lei wrote:
>>>> Below information is reported by a lower kernel version, and I saw the
>>>> problem still exist in current version.
>>>
>>> I think you're right, but what an awful interface we have here!
>>> The user must not only fetch it, they must validate it separately?
>>> And if they forget, then userspace is provoking undefined behaviour?  Ugh.
>>> Why not this:
>>
>> Why not go a step further and have get_timespec64 check for validity?
>> I wonder what caller doesn't want that to happen...
I tried this before, but I found that some callers of get_timespec64 go on to
call a function that performs the validity check itself. If we also do the
check in get_timespec64, it will be duplicated for those callers.

For example:
static long do_pselect(int n, fd_set __user *inp, fd_set __user *outp,
	...
	if (get_timespec64(&ts, tsp))
		return -EFAULT;

	to = &end_time;
	if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec))
	...
int poll_select_set_timeout(struct timespec64 *to, time64_t sec, long nsec)
{
	struct timespec64 ts = {.tv_sec = sec, .tv_nsec = nsec};

	if (!timespec64_valid(&ts))
		return -EINVAL;

> 
> There are some which don't today.  I'm hoping Deepa takes this and goes
> off and fixes them all up.
According to my search, the case I mentioned above is the only one where the
check would be duplicated. So if we don't care about the slight performance
drop, maybe we should do the timespec64_valid check in get_timespec64. I can
try this in v2. Otherwise, I will use your method.
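
As a rough illustration of what "fetch and validate in one step" would mean,
here is a userspace sketch in C. It is not kernel code: struct ts64,
ts64_valid() and get_ts64_checked() are hypothetical stand-ins for
struct timespec64, timespec64_valid() and a checking variant of
get_timespec64(), and a plain struct copy stands in for copy_from_user().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NSEC_PER_SEC 1000000000L

/* Hypothetical stand-in for struct timespec64. */
struct ts64 {
	int64_t tv_sec;
	long    tv_nsec;
};

/* Reject negative seconds and nanoseconds outside [0, 1s). */
static bool ts64_valid(const struct ts64 *ts)
{
	if (ts->tv_sec < 0)
		return false;
	/* The unsigned cast also rejects negative tv_nsec values. */
	if ((unsigned long)ts->tv_nsec >= (unsigned long)MODEL_NSEC_PER_SEC)
		return false;
	return true;
}

/*
 * Hypothetical combined helper: "fetch" the value and validate it, so the
 * caller cannot forget the second step. Returns 0, -EFAULT or -EINVAL.
 */
static int get_ts64_checked(struct ts64 *dst, const struct ts64 *src)
{
	if (src == NULL)
		return -EFAULT;
	*dst = *src;
	return ts64_valid(dst) ? 0 : -EINVAL;
}
```

With such a helper, a caller that later passes the value to a function doing
its own validation would check twice, which is the small duplicated cost
mentioned above.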

> 
> .
> 

-- 
Thanks!
BestRegards



Re: Is this a kernel BUG? ///Re: [Question] Can we use SIGRTMIN when vdso disabled on X86?

2018-06-06 Thread Leizhen (ThunderTown)
I found that glibc already handles this case, so this issue must have been
encountered before. Should it be handled by libc or by the user?

if (GLRO(dl_sysinfo_dso) == NULL)
{
	kact.sa_flags |= SA_RESTORER;

	kact.sa_restorer = ((act->sa_flags & SA_SIGINFO)
			    ? &restore_rt : &restore);
}


On 2018/6/6 15:52, Leizhen (ThunderTown) wrote:
> 
> 
> On 2018/6/5 19:24, Leizhen (ThunderTown) wrote:
>> After I executed "echo 0 > /proc/sys/abi/vsyscall32" to disable vdso, the 
>> rt_sigaction01 test case from ltp_2015 failed.
>> The test case source code please refer to the attachment, and the output as 
>> blow:
>>
>> -
>> ./rt_sigaction01
>> rt_sigaction010  TINFO  :  signal: 34
>> rt_sigaction011  TPASS  :  rt_sigaction call succeeded: result = 0
>> rt_sigaction010  TINFO  :  sa.sa_flags = SA_RESETHAND|SA_SIGINFO
>> rt_sigaction010  TINFO  :  Signal Handler Called with signal number 34
>>
>> Segmentation fault
>> --
>>
>>
>> Is this the desired result? In function ia32_setup_rt_frame, I found below 
>> code:
>>
>>  if (ksig->ka.sa.sa_flags & SA_RESTORER)
>>  restorer = ksig->ka.sa.sa_restorer;
>>  else
>>  restorer = current->mm->context.vdso +
>>  vdso_image_32.sym___kernel_rt_sigreturn;
>>  put_user_ex(ptr_to_compat(restorer), &frame->pretcode);
>>
>> Because the vdso is disabled, so current->mm->context.vdso is NULL, which 
>> cause the result of frame->pretcode invalid.
>>
>> I'm not sure whether this is a kernel bug or just an error of test case 
>> itself. Can anyone help me?
>>
> 

-- 
Thanks!
BestRegards



[Question] Can we use SIGRTMIN when vdso disabled on X86?

2018-06-05 Thread Leizhen (ThunderTown)
After I executed "echo 0 > /proc/sys/abi/vsyscall32" to disable the vdso, the
rt_sigaction01 test case from ltp_2015 failed.
Please refer to the attachment for the test case source code; the output is as
below:

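--------------------------------------------------------------------
./rt_sigaction01
rt_sigaction01    0  TINFO  :  signal: 34
rt_sigaction01    1  TPASS  :  rt_sigaction call succeeded: result = 0
rt_sigaction01    0  TINFO  :  sa.sa_flags = SA_RESETHAND|SA_SIGINFO
rt_sigaction01    0  TINFO  :  Signal Handler Called with signal number 34

Segmentation fault
--------------------------------------------------------------------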


Is this the desired result? In function ia32_setup_rt_frame, I found the code
below:

	if (ksig->ka.sa.sa_flags & SA_RESTORER)
		restorer = ksig->ka.sa.sa_restorer;
	else
		restorer = current->mm->context.vdso +
			vdso_image_32.sym___kernel_rt_sigreturn;
	put_user_ex(ptr_to_compat(restorer), &frame->pretcode);

Because the vdso is disabled, current->mm->context.vdso is NULL, which makes
the value written to frame->pretcode invalid.

I'm not sure whether this is a kernel bug or just an error in the test case
itself. Can anyone help me?
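
The selection logic quoted above can be modelled in a few lines of userspace
C. This is only a sketch: pick_restorer() and MODEL_SA_RESTORER are
hypothetical names, and the offset 0xb90 used in the note below is a made-up
value, not the real position of __kernel_rt_sigreturn in any vdso image.

```c
#include <assert.h>
#include <stdint.h>

/* Same bit value as SA_RESTORER on x86; renamed to mark this as a model. */
#define MODEL_SA_RESTORER 0x04000000u

/*
 * Model of the restorer choice: use the caller-supplied restorer if the
 * flag is set, otherwise fall back to an address inside the vdso.
 */
static uint32_t pick_restorer(uint32_t sa_flags, uint32_t sa_restorer,
			      uint32_t vdso_base, uint32_t sigreturn_off)
{
	if (sa_flags & MODEL_SA_RESTORER)
		return sa_restorer;
	return vdso_base + sigreturn_off;
}
```

With the vdso disabled (vdso_base == 0) and no SA_RESTORER, the result is just
the symbol offset, for example pick_restorer(0, 0, 0, 0xb90) yields 0xb90, an
unmapped address slightly above zero. Returning from the handler through such
an address is consistent with the segmentation fault seen in the test.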

-- 
Thanks!
BestRegards
/*
 * Copyright (c) Crackerjack Project., 2007
 *
 * This program is free software;  you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY;  without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See
 * the GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program;  if not, write to the Free Software Foundation,
 * Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 *
 * History: Porting from Crackerjack to LTP is done by
 *          Manas Kumar Nayak <makna...@in.ibm.com>
 */

/*
 * Description: This tests the rt_sigaction() syscall
 *      rt_sigaction alters an action taken by a process on receipt
 *      of a particular signal. The action is specified by the
 *      sigaction structure. The previous action on the signal is
 *      saved in oact. sigsetsize should indicate the size of a
 *      sigset_t type.
 */

#include 
#include 
#include 
#include 
#include 
#include 
#include 

#include "test.h"
#include "linux_syscall_numbers.h"
#include "lapi/rt_sigaction.h"

char *TCID = "rt_sigaction01";
static int testno;
int TST_TOTAL = 1;

static void cleanup(void)
{
	tst_rmdir();
}

static void setup(void)
{
	TEST_PAUSE;
	tst_tmpdir();
}

static int test_flags[] =
    { SA_RESETHAND | SA_SIGINFO, SA_RESETHAND, SA_RESETHAND | SA_SIGINFO,
	SA_RESETHAND | SA_SIGINFO, SA_NOMASK };
char *test_flags_list[] =
    { "SA_RESETHAND|SA_SIGINFO", "SA_RESETHAND", "SA_RESETHAND|SA_SIGINFO",
	"SA_RESETHAND|SA_SIGINFO", "SA_NOMASK" };

static void handler(int sig)
{
	tst_resm(TINFO, "Signal Handler Called with signal number %d\n", sig);
	return;
}

static int set_handler(int sig, int sig_to_mask, int mask_flags)
{
	struct sigaction sa, oldaction;

	sa.sa_handler = (void *)handler;
	sa.sa_flags = mask_flags;
	sigemptyset(&sa.sa_mask);
	sigaddset(&sa.sa_mask, sig);

	return ltp_rt_sigaction(sig, &sa, &oldaction, SIGSETSIZE);
}

int main(int ac, char **av)
{
unsigned int flag;
int signal;
int lc;

tst_parse_opts(ac, av, NULL, NULL);

setup();

for (lc = 0; TEST_LOOPING(lc); ++lc) {

tst_count = 0;

for (testno = 0; testno < TST_TOTAL; ++testno) {

for (signal = SIGRTMIN; signal <= SIGRTMAX; signal++) {
for (flag = 0;
 flag <
 

Is this a kernel BUG? ///Re: [Question] Can we use SIGRTMIN when vdso disabled on X86?

2018-06-06 Thread Leizhen (ThunderTown)



On 2018/6/5 19:24, Leizhen (ThunderTown) wrote:
> After I executed "echo 0 > /proc/sys/abi/vsyscall32" to disable vdso, the 
> rt_sigaction01 test case from ltp_2015 failed.
> The test case source code please refer to the attachment, and the output as 
> blow:
> 
> -
> ./rt_sigaction01
> rt_sigaction010  TINFO  :  signal: 34
> rt_sigaction011  TPASS  :  rt_sigaction call succeeded: result = 0
> rt_sigaction010  TINFO  :  sa.sa_flags = SA_RESETHAND|SA_SIGINFO
> rt_sigaction010  TINFO  :  Signal Handler Called with signal number 34
> 
> Segmentation fault
> --
> 
> 
> Is this the desired result? In function ia32_setup_rt_frame, I found below 
> code:
> 
>   if (ksig->ka.sa.sa_flags & SA_RESTORER)
>   restorer = ksig->ka.sa.sa_restorer;
>   else
>   restorer = current->mm->context.vdso +
>   vdso_image_32.sym___kernel_rt_sigreturn;
>   put_user_ex(ptr_to_compat(restorer), &frame->pretcode);
> 
> Because the vdso is disabled, so current->mm->context.vdso is NULL, which 
> cause the result of frame->pretcode invalid.
> 
> I'm not sure whether this is a kernel bug or just an error of test case 
> itself. Can anyone help me?
> 

-- 
Thanks!
BestRegards



Re: Is this a kernel BUG? ///Re: [Question] Can we use SIGRTMIN when vdso disabled on X86?

2018-06-06 Thread Leizhen (ThunderTown)



On 2018/6/7 1:48, h...@zytor.com wrote:
> On June 6, 2018 2:17:42 AM PDT, "Leizhen (ThunderTown)" 
>  wrote:
>> I found that glibc has already dealt with this case. So this issue must
>> have been met before, should it be maintained by libc/user?
>>
>>  if (GLRO(dl_sysinfo_dso) == NULL)
>>  {
>>  kact.sa_flags |= SA_RESTORER;
>>
>>  kact.sa_restorer = ((act->sa_flags & SA_SIGINFO)
>>      ? &restore_rt : &restore);
>>  }
>>
>>
>> On 2018/6/6 15:52, Leizhen (ThunderTown) wrote:
>>>
>>>
>>> On 2018/6/5 19:24, Leizhen (ThunderTown) wrote:
>>>> After I executed "echo 0 > /proc/sys/abi/vsyscall32" to disable
>> vdso, the rt_sigaction01 test case from ltp_2015 failed.
>>>> The test case source code please refer to the attachment, and the
>> output as blow:
>>>>
>>>> -
>>>> ./rt_sigaction01
>>>> rt_sigaction010  TINFO  :  signal: 34
>>>> rt_sigaction011  TPASS  :  rt_sigaction call succeeded: result =
>> 0
>>>> rt_sigaction010  TINFO  :  sa.sa_flags = SA_RESETHAND|SA_SIGINFO
>>>> rt_sigaction010  TINFO  :  Signal Handler Called with signal
>> number 34
>>>>
>>>> Segmentation fault
>>>> --
>>>>
>>>>
>>>> Is this the desired result? In function ia32_setup_rt_frame, I found
>> below code:
>>>>
>>>>if (ksig->ka.sa.sa_flags & SA_RESTORER)
>>>>restorer = ksig->ka.sa.sa_restorer;
>>>>else
>>>>restorer = current->mm->context.vdso +
>>>>vdso_image_32.sym___kernel_rt_sigreturn;
>>>>put_user_ex(ptr_to_compat(restorer), &frame->pretcode);
>>>>
>>>> Because the vdso is disabled, so current->mm->context.vdso is NULL,
>> which cause the result of frame->pretcode invalid.
>>>>
>>>> I'm not sure whether this is a kernel bug or just an error of test
>> case itself. Can anyone help me?
>>>>
>>>
> 
> The use of signals without SA_RESTORER is considered obsolete, but it's 
> somewhat surprising that the vdso isn't there; it should be mapped even for 
> static binaries esp. on i386 since it is the preferred way to do system calls 
> (you don't need to parse the ELF for that.) Are you explicitly disabling the 
> VDSO? If so, Don't Do That.

Yes, the vdso was explicitly disabled by the tester. Thanks.

> 

-- 
Thanks!
BestRegards



Re: Is this a kernel BUG? ///Re: [Question] Can we use SIGRTMIN when vdso disabled on X86?

2018-06-06 Thread Leizhen (ThunderTown)



On 2018/6/7 1:01, Andy Lutomirski wrote:
> On Wed, Jun 6, 2018 at 2:18 AM Leizhen (ThunderTown)
>  wrote:
>>
>> I found that glibc has already dealt with this case. So this issue must have 
>> been met before, should it be maintained by libc/user?
>>
>> if (GLRO(dl_sysinfo_dso) == NULL)
>> {
>> kact.sa_flags |= SA_RESTORER;
>>
>> kact.sa_restorer = ((act->sa_flags & SA_SIGINFO)
>>         ? &restore_rt : &restore);
>> }
>>
>>
>> On 2018/6/6 15:52, Leizhen (ThunderTown) wrote:
>>>
>>>
>>> On 2018/6/5 19:24, Leizhen (ThunderTown) wrote:
>>>> After I executed "echo 0 > /proc/sys/abi/vsyscall32" to disable vdso, the 
>>>> rt_sigaction01 test case from ltp_2015 failed.
>>>> The test case source code please refer to the attachment, and the output 
>>>> as blow:
>>>>
>>>> -
>>>> ./rt_sigaction01
>>>> rt_sigaction010  TINFO  :  signal: 34
>>>> rt_sigaction011  TPASS  :  rt_sigaction call succeeded: result = 0
>>>> rt_sigaction010  TINFO  :  sa.sa_flags = SA_RESETHAND|SA_SIGINFO
>>>> rt_sigaction010  TINFO  :  Signal Handler Called with signal number 34
>>>>
>>>> Segmentation fault
>>>> --
>>>>
>>>>
>>>> Is this the desired result? In function ia32_setup_rt_frame, I found below 
>>>> code:
>>>>
>>>>  if (ksig->ka.sa.sa_flags & SA_RESTORER)
>>>>  restorer = ksig->ka.sa.sa_restorer;
>>>>  else
>>>>  restorer = current->mm->context.vdso +
>>>>  vdso_image_32.sym___kernel_rt_sigreturn;
>>>>  put_user_ex(ptr_to_compat(restorer), &frame->pretcode);
>>>>
>>>> Because the vdso is disabled, so current->mm->context.vdso is NULL, which 
>>>> cause the result of frame->pretcode invalid.
>>>>
>>>> I'm not sure whether this is a kernel bug or just an error of test case 
>>>> itself. Can anyone help me?
>>>>
>>>
>>
>>
> 
> I can't tell from your email what you're testing, what behavior you
> expect, and what you saw.  A program that sets up a signal handler
> without supplying a restorer will not work if the vDSO is off, and
> this is by design.
OK, so the user should check for itself whether the vDSO is disabled, and use
a different strategy to handle each case appropriately, as glibc does.

> 
> (FWIW, there is a very longstanding libc bug that causes this case to
> get severely screwed up if the user's SS is not the expected value,
> and that bug was just fixed very recently.  But I doubt this is what
> you're seeing.)
> 
> I suppose we could improve the kernel to at least push NULL instead of
> some random address a bit above 0, but it'll still crash.
Should we add a warning? It may help the user become aware of this error in
time.

> 
> .
> 

-- 
Thanks!
BestRegards



Re: Is this a kernel BUG? ///Re: [Question] Can we use SIGRTMIN when vdso disabled on X86?

2018-06-06 Thread Leizhen (ThunderTown)



On 2018/6/7 10:39, Andy Lutomirski wrote:
> 
> 
>> On Jun 6, 2018, at 7:05 PM, Leizhen (ThunderTown) 
>>  wrote:
>>
>>
>>
>>> On 2018/6/7 1:01, Andy Lutomirski wrote:
>>> On Wed, Jun 6, 2018 at 2:18 AM Leizhen (ThunderTown)
>>>  wrote:
>>>>
>>>> I found that glibc has already dealt with this case. So this issue must 
>>>> have been met before, should it be maintained by libc/user?
>>>>
>>>>if (GLRO(dl_sysinfo_dso) == NULL)
>>>>{
>>>>kact.sa_flags |= SA_RESTORER;
>>>>
>>>>    kact.sa_restorer = ((act->sa_flags & SA_SIGINFO)
>>>>? &restore_rt : &restore);
>>>>}
>>>>
>>>>
>>>>> On 2018/6/6 15:52, Leizhen (ThunderTown) wrote:
>>>>>
>>>>>
>>>>>> On 2018/6/5 19:24, Leizhen (ThunderTown) wrote:
>>>>>> After I executed "echo 0 > /proc/sys/abi/vsyscall32" to disable vdso, 
>>>>>> the rt_sigaction01 test case from ltp_2015 failed.
>>>>>> The test case source code please refer to the attachment, and the output 
>>>>>> as blow:
>>>>>>
>>>>>> -
>>>>>> ./rt_sigaction01
>>>>>> rt_sigaction010  TINFO  :  signal: 34
>>>>>> rt_sigaction011  TPASS  :  rt_sigaction call succeeded: result = 0
>>>>>> rt_sigaction010  TINFO  :  sa.sa_flags = SA_RESETHAND|SA_SIGINFO
>>>>>> rt_sigaction010  TINFO  :  Signal Handler Called with signal number 
>>>>>> 34
>>>>>>
>>>>>> Segmentation fault
>>>>>> --
>>>>>>
>>>>>>
>>>>>> Is this the desired result? In function ia32_setup_rt_frame, I found 
>>>>>> below code:
>>>>>>
>>>>>> if (ksig->ka.sa.sa_flags & SA_RESTORER)
>>>>>> restorer = ksig->ka.sa.sa_restorer;
>>>>>> else
>>>>>> restorer = current->mm->context.vdso +
>>>>>> vdso_image_32.sym___kernel_rt_sigreturn;
>>>>>> put_user_ex(ptr_to_compat(restorer), &frame->pretcode);
>>>>>>
>>>>>> Because the vdso is disabled, so current->mm->context.vdso is NULL, 
>>>>>> which cause the result of frame->pretcode invalid.
>>>>>>
>>>>>> I'm not sure whether this is a kernel bug or just an error of test case 
>>>>>> itself. Can anyone help me?
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>> I can't tell from your email what you're testing, what behavior you
>>> expect, and what you saw.  A program that sets up a signal handler
>>> without supplying a restorer will not work if the vDSO is off, and
>>> this is by design.
>> OK, so that the user should take care whether the vDSO is disabled by itself 
>> or not, and use different strategies to process it appropriately, like glibc.
>>
>>>
>>> (FWIW, there is a very longstanding libc bug that causes this case to
>>> get severely screwed up if the user's SS is not the expected value,
>>> and that bug was just fixed very recently.  But I doubt this is what
>>> you're seeing.)
>>>
>>> I suppose we could improve the kernel to at least push NULL instead of
>>> some random address a bit above 0, but it'll still crash.
>> Should we add a warning? Which may help the user to aware this error in time.
>>
> 
> It’s entirely valid to have a non working restorer if you never plan to 
> return from a signal handler. And anyone who writes their own libc should be 
> able to figure this out on their own, I think.

OK. Thanks a lot.

> 
>>>
>>> .
>>>
>>
>> -- 
>> Thanks!
>> BestRegards
>>
> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH v5 3/5] iommu/io-pgtable-arm: add support for non-strict mode

2018-08-27 Thread Leizhen (ThunderTown)



On 2018/8/23 1:52, Robin Murphy wrote:
> On 15/08/18 02:28, Zhen Lei wrote:
>> To support the non-strict mode, now we only tlbi and sync for the strict
>> mode. But for the non-leaf case, always follow strict mode.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>   drivers/iommu/io-pgtable-arm.c | 20 ++--
>>   drivers/iommu/io-pgtable.h |  3 +++
>>   2 files changed, 17 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>> index 010a254..20d3e98 100644
>> --- a/drivers/iommu/io-pgtable-arm.c
>> +++ b/drivers/iommu/io-pgtable-arm.c
>> @@ -538,6 +538,7 @@ static size_t arm_lpae_split_blk_unmap(struct 
>> arm_lpae_io_pgtable *data,
>>   phys_addr_t blk_paddr;
>>   size_t tablesz = ARM_LPAE_GRANULE(data);
>>   size_t split_sz = ARM_LPAE_BLOCK_SIZE(lvl, data);
>> +size_t unmapped = size;
>>   int i, unmap_idx = -1;
>>
>>   if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
>> @@ -575,11 +576,16 @@ static size_t arm_lpae_split_blk_unmap(struct 
>> arm_lpae_io_pgtable *data,
>>   tablep = iopte_deref(pte, data);
>>   }
>>
>> -if (unmap_idx < 0)
> 
> [ side note: the more I see this test the more it looks slightly wrong, but 
> that's a separate issue. I'll have to sit down and really remember what's 
> going on here... ]
> 
>> -return __arm_lpae_unmap(data, iova, size, lvl, tablep);
>> +if (unmap_idx < 0) {
>> +unmapped = __arm_lpae_unmap(data, iova, size, lvl, tablep);
>> +if (!(data->iop.cfg.quirks & IO_PGTABLE_QUIRK_NON_STRICT))
>> +return unmapped;
>> +}
> 
> I don't quite get this change - we should only be recursing back into 
> __arm_lpae_unmap() here if nothing's actually been unmapped at this point - 
> the block entry is simply replaced by a full next-level table and we're going 
> to come back and split another block at that next level, or we raced against 
> someone else splitting the same block and that's their table instead. Since 
> that's reverting back to a "regular" unmap, I don't see where the need to 
> introduce an additional flush to that path comes from (relative to the 
> existing behaviour, at least).

The old block mapping may be cached in the TLBs, so it should be invalidated
completely before the new next-level mapping is used. This just ensures that.

In fact, I think the code of arm_lpae_split_blk_unmap may have some mistakes.
For example:
	if (size == split_sz)
		unmap_idx = ARM_LPAE_LVL_IDX(iova, lvl, data);

This means that "size" can only be the block/page size of the next level.
Suppose the current level is a 2M block and we want to unmap 12K; the above
"if" limits us to unmapping only 4K.

Furthermore, the "if (unmap_idx < 0)" situation should not appear.

Maybe my analysis is wrong; I will try to test it.
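
To show the size check concretely, here is a small userspace sketch in C. It
assumes a 4K granule with 9 bits per level, so that level 2 entries cover 2M
and level 3 entries cover 4K. block_size() and split_unmap_idx() are
hypothetical helpers that only mimic the "size == split_sz" test quoted
above; they are not the io-pgtable-arm implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry: 4K granule, 9 index bits per level, levels 0..3. */
#define GRANULE_SHIFT  12
#define BITS_PER_LEVEL 9
#define LAST_LEVEL     3

/* Size covered by one entry at the given level. */
static uint64_t block_size(int lvl)
{
	assert(lvl >= 0 && lvl <= LAST_LEVEL);
	return (uint64_t)1 << (GRANULE_SHIFT +
			       BITS_PER_LEVEL * (LAST_LEVEL - lvl));
}

/*
 * Mimics the quoted check: an index is only chosen when the requested size
 * equals exactly one entry of the next level; otherwise -1 is returned.
 */
static int split_unmap_idx(uint64_t iova, uint64_t size, int next_lvl)
{
	uint64_t split_sz = block_size(next_lvl);

	if (size != split_sz)
		return -1;
	return (int)((iova / split_sz) & ((1u << BITS_PER_LEVEL) - 1));
}
```

In this model a 4K request at offset 0x3000 inside a 2M block selects index 3,
while a 12K request at the same offset matches no single next-level entry and
gives -1, which is the "unmap_idx < 0" path discussed above.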


> 
>>   io_pgtable_tlb_add_flush(&data->iop, iova, size, size, true);
> 
> This is the flush which corresponds to whatever page split_blk_unmap() 
> actually unmapped itself (and also covers any recursively-split 
> intermediate-level entries)...
> 
>> -return size;
>> +io_pgtable_tlb_sync(&data->iop);
> 
> ...which does want this sync, but only technically for non-strict mode, since 
> it's otherwise covered by the sync in iommu_unmap().

Because split_blk_unmap() is rarely called, it has little impact on the
overall performance, so I omitted the non-strict "if" statement. I will add it
back.

> 
> I'm not *against* tightening up the TLB maintenance here in general, but if 
> so that should be a separately-reasoned patch, not snuck in with other 
> changes.

OK

> 
> Robin.
> 
>> +
>> +return unmapped;
>>   }
>>
>>   static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>> @@ -609,7 +615,7 @@ static size_t __arm_lpae_unmap(struct 
>> arm_lpae_io_pgtable *data,
>>   io_pgtable_tlb_sync(iop);
>>   ptep = iopte_deref(pte, data);
>>   __arm_lpae_free_pgtable(data, lvl + 1, ptep);
>> -} else {
>> +} else if (!(iop->cfg.quirks & IO_PGTABLE_QUIRK_NON_STRICT)) {
>>   io_pgtable_tlb_add_flush(iop, iova, size, size, true);
>>   }
>>
>> @@ -771,7 +777,8 @@ static void arm_lpae_restrict_pgsizes(struct 
>> io_pgtable_cfg *cfg)
>>   u64 reg;
>>   struct arm_lpae_io_pgtable *data;
>>
>> -if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA))
>> +if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA |
>> +IO_PGTABLE_QUIRK_NON_STRICT))
>>   return NULL;
>>
>>   data = arm_lpae_alloc_pgtable(cfg);
>> @@ -863,7 +870,8 @@ static void arm_lpae_restrict_pgsizes(struct 
>> io_pgtable_cfg *cfg)
>>   struct arm_lpae_io_pgtable *data;
>>
>>   /* The NS quirk doesn't apply at stage 2 */
>> -if (cfg->quirks & ~IO_PGTABLE_QUIRK_NO_DMA)
>> +if (cfg->quirks & ~(IO_PGTABLE_QUIRK_NO_DMA |
>> +IO_PGTABLE_QUIRK_NON_STRICT))
>>   return NULL;
>>

Re: [PATCH v5 5/5] iommu/arm-smmu-v3: add bootup option "iommu.non_strict"

2018-08-27 Thread Leizhen (ThunderTown)



On 2018/8/23 1:02, Robin Murphy wrote:
> On 15/08/18 02:28, Zhen Lei wrote:
>> Add a bootup option to make the system manager can choose which mode to
>> be used. The default mode is strict.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>   Documentation/admin-guide/kernel-parameters.txt | 13 +
>>   drivers/iommu/arm-smmu-v3.c | 22 +-
>>   2 files changed, 34 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
>> b/Documentation/admin-guide/kernel-parameters.txt
>> index 5cde1ff..cb9d043e 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -1720,6 +1720,19 @@
>>   nobypass[PPC/POWERNV]
>>   Disable IOMMU bypass, using IOMMU for PCI devices.
>>
>> +iommu.non_strict=[ARM64]
>> +Format: { "0" | "1" }
>> +0 - strict mode, default.
>> +Release IOVAs after the related TLBs are invalid
>> +completely.
>> +1 - non-strict mode.
>> +Put off TLBs invalidation and release memory first.
>> +It's good for scatter-gather performance but lacks
>> +full isolation, an untrusted device can access the
>> +reused memory because the TLBs may still valid.
>> +Please takefull consideration before choosing this
>> +mode. Note that, VFIO will always use strict mode.
>> +
>>   iommu.passthrough=
>>   [ARM64] Configure DMA to bypass the IOMMU by default.
>>   Format: { "0" | "1" }
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 61eb7ec..0eda90e 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -631,6 +631,26 @@ struct arm_smmu_option_prop {
>>   { 0, NULL},
>>   };
>>
>> +static bool smmu_non_strict __read_mostly;
>> +
>> +static int __init arm_smmu_setup(char *str)
>> +{
>> +int ret;
>> +
>> +ret = kstrtobool(str, &smmu_non_strict);
>> +if (ret)
>> +return ret;
>> +
>> +if (smmu_non_strict) {
>> +pr_warn("WARNING: iommu non-strict mode is chosen.\n"
>> +"It's good for scatter-gather performance but lacks full 
>> isolation\n");
>> +add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
>> +}
>> +
>> +return 0;
>> +}
>> +early_param("iommu.non_strict", arm_smmu_setup);
> 
> As I said on v3, the option should be parsed by iommu-dma, since that's where 
> it takes effect, and I'm sure SMMUv2 users will be interested in trying it 
> out too.

OK, I am sorry that I did not understand your opinion correctly.

> 
> In other words, if iommu_dma_init_domain() does something like:
> 
> if (iommu_dma_non_strict && domain->ops->flush_iotlb_all) {
> domain->non_strict = true;
> cookie->domain = domain;
> init_iova_flush_queue(...);
> }
> 
>> +
>>   static inline void __iomem *arm_smmu_page1_fixup(unsigned long offset,
>>struct arm_smmu_device *smmu)
>>   {
>> @@ -1622,7 +1642,7 @@ static int arm_smmu_domain_finalise(struct 
>> iommu_domain *domain)
>>   if (smmu->features & ARM_SMMU_FEAT_COHERENCY)
>>   pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
>>
>> -if (domain->type == IOMMU_DOMAIN_DMA) {
>> +if ((domain->type == IOMMU_DOMAIN_DMA) && smmu_non_strict) {
>>   domain->non_strict = true;
>>   pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
>>   }
> 
> ...then all the driver should need to do is:
> 
> if (domain->non_strict)
> pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> 
> 
> Now, that would make it possible to request non-strict mode even with drivers 
> which *don't* understand it, but I think that's not actually harmful, just 
> means that some TLBIs will still get issued synchronously and the flush queue 
> might not do much. If you wanted to avoid even that, you could replace 
> domain->non_strict with an iommu_domain_set_attr() call, so iommu_dma could 
> tell up-front whether the driver understands non-strict mode and it's worth 
> setting the queue up or not.

OK, I will think seriously about it, thanks. I've been busy these days; I will
reply to you as soon as possible.
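
For my own understanding, the split of responsibilities suggested above can be
sketched in userspace C as follows. All names here (struct model_ops,
struct model_domain, model_dma_init_domain(), model_driver_quirks(),
MODEL_QUIRK_NON_STRICT) are hypothetical and only model the idea: the common
DMA layer decides whether non-strict mode is enabled, and the driver merely
translates that decision into a page-table quirk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical quirk bit; the real value in io-pgtable.h may differ. */
#define MODEL_QUIRK_NON_STRICT (1u << 0)

struct model_ops {
	void (*flush_iotlb_all)(void); /* NULL if the driver lacks it */
};

struct model_domain {
	const struct model_ops *ops;
	bool non_strict;
};

static void model_flush_all(void)
{
}

static const struct model_ops ops_with_flush = {
	.flush_iotlb_all = model_flush_all,
};
static const struct model_ops ops_without_flush = {
	.flush_iotlb_all = NULL,
};

/* Common layer: enable non-strict mode only if it was requested on the
 * command line and the driver can flush the whole IOTLB. */
static bool model_dma_init_domain(struct model_domain *d, bool requested)
{
	d->non_strict = requested && d->ops->flush_iotlb_all != NULL;
	return d->non_strict;
}

/* Driver side: only translate the domain flag into a page-table quirk. */
static unsigned int model_driver_quirks(const struct model_domain *d,
					unsigned int quirks)
{
	if (d->non_strict)
		quirks |= MODEL_QUIRK_NON_STRICT;
	return quirks;
}
```

In this arrangement the driver no longer parses any boot option itself, which
matches the point that the option takes effect in the common DMA code.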

> 
> Robin.
> 
> .
> 

-- 
Thanks!
BestRegards



[Question] Are the trace APIs declared by "TRACE_EVENT(irq_handler_entry" allowed to be used in Ko?

2018-09-11 Thread Leizhen (ThunderTown)
After patch 7e066fb870fc ("tracepoints: add DECLARE_TRACE() and
DEFINE_TRACE()"), the trace APIs declared by "TRACE_EVENT(irq_handler_entry"
cannot be used directly by a ko (kernel module), because they are not
explicitly exported by EXPORT_TRACEPOINT_SYMBOL_GPL or
EXPORT_TRACEPOINT_SYMBOL.

Did we miss it? Or is it not recommended to use them in a ko?


-

commit 7e066fb870fcd1025ec3ba7bbde5d541094f4ce1
Author: Mathieu Desnoyers 
Date:   Fri Nov 14 17:47:47 2008 -0500

tracepoints: add DECLARE_TRACE() and DEFINE_TRACE()

Impact: API *CHANGE*. Must update all tracepoint users.

Add DEFINE_TRACE() to tracepoints to let them declare the tracepoint
structure in a single spot for all the kernel. It helps reducing memory
consumption, especially when declaring a lot of tracepoints, e.g. for
kmalloc tracing.

*API CHANGE WARNING*: now, DECLARE_TRACE() must be used in headers for
tracepoint declarations rather than DEFINE_TRACE(). This is the sane way
to do it. The name previously used was misleading.

Updates scheduler instrumentation to follow this API change.


-- 
Thanks!
BestRegards



Re: [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc

2018-10-16 Thread Leizhen (ThunderTown)



On 2018/10/15 20:46, Andrew Murray wrote:
> Hi Zhen,
> 
> On Mon, Oct 15, 2018 at 04:36:16PM +0800, Zhen Lei wrote:
>> ITS translation register map:
>> 0x0000-0x003C   Reserved
>> 0x0040          GITS_TRANSLATER
>> 0x0044-0xFFFC   Reserved
>>
>> The standard GITS_TRANSLATER register in ITS is only 4 bytes, but Hisilicon
>> expands the next 4 bytes to carry some IMPDEF information. That means, 8 
>> bytes
>> data will be written to MSIAddress each time.
>>
>> MSIAddr: |4bytes|4bytes|
>>   |MSIData   |IMPDEF|
>>
>> There is no problem for ITS, because the next 4 bytes space is reserved in 
>> ITS.
>> But it will overwrite the 4 bytes memory following "sync_count". It's very
>> luckly that the previous and the next neighbour of "sync_count" are both 
>> aligned
>> by 8 bytes, so no problem is met now.
> 
> My understanding is that MSI's are 32bit memory writes and as such the SMMU
> performs a 32bit write in response to the MSI. If so then what is different
> with the Hi16xx that causes a problem? Have you been able to able to adjust
> the layout of the arm_smmu_device struct to demonstrate this?

Normally, only the 32-bit MSI data is written into sync_count:
|4bytes|4bytes|
|  sync_count  |  |

But on Hi16xx, the ITS hardware writes an extra 32 bits of IMPDEF data into
the 4 bytes that follow sync_count. If those 4 bytes belong to the next struct
member, its value will be overwritten.

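The effect can be reproduced in a userspace sketch in C, under the assumption
that the hardware behaves like an 8-byte store to the MSI address. The struct
names, msi_write64() and the helper functions are hypothetical; only the
union with __attribute__((aligned(8))) is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout without the workaround: a 4-byte neighbour follows directly. */
struct dev_unpadded {
	uint32_t sync_count;
	uint32_t neighbour;
};

/* Layout with the workaround from the patch: 8 aligned bytes reserved. */
struct dev_padded {
	union {
		uint64_t padding;
		uint32_t sync_count;
	} __attribute__((aligned(8)));
	uint32_t neighbour;
};

/* Model of the assumed 8-byte MSI write: 4 bytes data + 4 bytes IMPDEF. */
static void msi_write64(unsigned char *msi_addr, uint32_t data,
			uint32_t impdef)
{
	uint32_t words[2] = { data, impdef };

	memcpy(msi_addr, words, sizeof(words));
}

/* Returns the neighbour's value after an MSI write to sync_count. */
static uint32_t neighbour_unpadded(void)
{
	struct dev_unpadded d = { 0, 0xAAAAAAAAu };

	msi_write64((unsigned char *)&d +
		    offsetof(struct dev_unpadded, sync_count), 1, 0x55555555u);
	return d.neighbour;
}

static uint32_t neighbour_padded(void)
{
	struct dev_padded d;

	memset(&d, 0, sizeof(d));
	d.neighbour = 0xAAAAAAAAu;
	msi_write64((unsigned char *)&d +
		    offsetof(struct dev_padded, sync_count), 1, 0x55555555u);
	return d.neighbour;
}
```

In the unpadded layout the neighbour ends up holding the IMPDEF word, while in
the padded layout the extra 4 bytes land in the reserved space and the
neighbour keeps its value.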
> 
> Thanks,
> 
> Andrew Murray
> 
>>
>> It's good to explicitly add a workaround:
>> 1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is 
>> always
>>aligned by 8 bytes.
>> 2. Add a "u64" union member to make sure the 4 bytes padding is always exist.
>>
>> There is no functional change.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 5059d09..a07bc0d 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -586,7 +586,10 @@ struct arm_smmu_device {
>>  
>>  struct arm_smmu_strtab_cfg  strtab_cfg;
>>  
>> +union {
>> +u64 padding; /* workaround for Hisilicon */
>>  u32 sync_count;
>> +} __attribute__((aligned(8)));
>>  
>>  /* IOMMU core code handle */
>>  struct iommu_device iommu;
>> -- 
>> 1.8.3
>>
>>
>> ___
>> iommu mailing list
>> io...@lists.linux-foundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/iommu
> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc

2018-10-16 Thread Leizhen (ThunderTown)



On 2018/10/15 19:17, John Garry wrote:
> On 15/10/2018 09:36, Zhen Lei wrote:
>> ITS translation register map:
>> 0x0000-0x003C    Reserved
>> 0x0040           GITS_TRANSLATER
>> 0x0044-0xFFFC    Reserved
>>
> 
> Can you add a better opening than the ITS translation register map?

OK

> 
>> The standard GITS_TRANSLATER register in ITS is only 4 bytes, but Hisilicon
>> expands the next 4 bytes to carry some IMPDEF information. That means, 8 bytes
>> data will be written to MSIAddress each time.
>>
>> MSIAddr: |----4bytes----|----4bytes----|
>>          |    MSIData   |    IMPDEF    |
>>
>> There is no problem for ITS, because the next 4 bytes space is reserved in ITS.
>> But it will overwrite the 4 bytes memory following "sync_count". It's very
> 
> I think arm_smmu_device.sync_count is better, or "sync_count member in the smmu driver control struct".

OK, I will use "struct" in v2.

+   struct {
u32 sync_count;
+   u32 padding;
+   } __attribute__((aligned(8)));

> 
>> luckly that the previous and the next neighbour of "sync_count" are both aligned
> 
> /s/luckly/luckily or fortunately/

OK, thanks

> 
>> by 8 bytes, so no problem is met now.
>>
>> It's good to explicitly add a workaround:
>> 1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is always
>>    aligned by 8 bytes.
>> 2. Add a "u64" union member to make sure the 4 bytes padding is always exist.
>>
>> There is no functional change.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 5059d09..a07bc0d 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -586,7 +586,10 @@ struct arm_smmu_device {
>>
>>  struct arm_smmu_strtab_cfgstrtab_cfg;
>>
>> +union {
>> +u64padding; /* workaround for Hisilicon */
> 
> I think that a more detailed comment is required.

OK, I will try to describe it more clearly.

> 
>>  u32sync_count;
> 
> Can you indent these 2 members? However - as discussed internally - this may have endian issue so better to declare full 64b struct.

This indentation is inherited; it keeps the members aligned with the others.

There is no endian issue: I have tested it on both little-endian and big-endian.

$gdb vmlinux
..
(gdb) p &((struct arm_smmu_device *)0)->sync_count
$1 = (u32 *) 0x4178
(gdb) p &((struct arm_smmu_device *)0)->tst1
$2 = (int *) 0x4170
(gdb) p &((struct arm_smmu_device *)0)->tst2
$3 = (int *) 0x4180

testcase

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 5059d09..7c6f7ac 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -586,7 +586,14 @@ struct arm_smmu_device {

struct arm_smmu_strtab_cfg  strtab_cfg;

+ int tst1;
+
+ union {
+ u64 padding;
u32 sync_count;
+ } __attribute__((aligned(8)));
+
+ int tst2;

/* IOMMU core code handle */
struct iommu_device iommu;
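
The same layout property can be checked at compile time without gdb. This is only a sketch with a cut-down, hypothetical struct (toy_smmu_device and its odd-sized first member are invented for illustration); it assumes a GCC-compatible compiler, since it relies on __attribute__((aligned(8))) and on offsetof() working for members of an anonymous union.

```c
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-in for struct arm_smmu_device; not the real layout. */
struct toy_smmu_device {
	char odd_sized[3];	/* hypothetical member to disturb alignment */
	int tst1;

	union {
		uint64_t padding;
		uint32_t sync_count;
	} __attribute__((aligned(8)));

	int tst2;
};

/* sync_count must start on an 8-byte boundary ... */
_Static_assert(offsetof(struct toy_smmu_device, sync_count) % 8 == 0,
	       "sync_count is not 8-byte aligned");

/* ... and the next member must start at least 8 bytes later. */
_Static_assert(offsetof(struct toy_smmu_device, tst2) -
	       offsetof(struct toy_smmu_device, sync_count) >= 8,
	       "no room for the extra 4 bytes after sync_count");

/*
 * Both union members start at the same address, so sync_count always
 * overlays the lowest-addressed 4 bytes, whatever the byte order is.
 */
_Static_assert(offsetof(struct toy_smmu_device, sync_count) ==
	       offsetof(struct toy_smmu_device, padding),
	       "union members do not share a start address");

static size_t toy_sync_count_offset(void)
{
	return offsetof(struct toy_smmu_device, sync_count);
}

static size_t toy_tst2_offset(void)
{
	return offsetof(struct toy_smmu_device, tst2);
}
```

If a later change to the struct broke the alignment or removed the padding, the build would fail instead of the corruption showing up at run time.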

> 
>> +} __attribute__((aligned(8)));
>>
>>  /* IOMMU core code handle */
>>  struct iommu_deviceiommu;
>>
> Thanks
> 
> 
> 
> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH v2 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc

2018-10-29 Thread Leizhen (ThunderTown)



On 2018/10/30 1:59, Will Deacon wrote:
> On Sat, Oct 20, 2018 at 03:36:54PM +0800, Zhen Lei wrote:
>> The standard GITS_TRANSLATER register in ITS is only 4 bytes, but
>> Hisilicon expands the next 4 bytes to carry some IMPDEF information. That
>> means, total 8 bytes data will be written to MSIAddress each time.
>>
>> MSIAddr: |4bytes|4bytes|
>>   |MSIData   |IMPDEF|
>>
>> There is no problem for ITS, because the next 4 bytes space is reserved
>> in ITS. But it will overwrite the 4 bytes memory following "sync_count".
>> It's very fortunately that the previous and the next neighbour of the
>> "sync_count" are both aligned by 8 bytes, so no problem is met now.
>>
>> It's good to explicitly add a workaround:
>> 1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is
>>always aligned by 8 bytes.
>> 2. Add a "int" struct member to make sure the 4 bytes padding is always
>>exist.
>>
>> There is no functional change.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 15 ++-
>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 5059d09..624fdd0 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -586,7 +586,20 @@ struct arm_smmu_device {
>>  
>>  struct arm_smmu_strtab_cfg  strtab_cfg;
>>  
>> -u32 sync_count;
>> +/*
>> + * The alignment and padding is required by Hi16xx of Hisilicon.
>> + * Because the ITS hardware on Hi16xx will truncate the MSIAddress(Here
>> + * it's the address of "sync_count") to 8 bytes boundary first, then
>> + * write 32 bits MSIdata at offset 0, and 32 bits IMPDEF data at offset
>> + * 4. Without this workaround, the adjacent member maybe overwritten.
>> + *
>> + *|---4bytes---|---4bytes---|
>> + * MSIAddress & (~0x7):   MSIdata  | IMPDEF data|
>> + */
>> +struct {
>> +u32 sync_count;
>> +int padding;
>> +} __attribute__((aligned(8)));
> 
> I thought the conclusion after reviewing your original patch was to maintain
> the union and drop the alignment directive? e.g.
> 
>   union {
>   u32 sync_count;
>   u64 padding; /* Hi16xx writes an extra 32 bits of goodness */
>   };
OK, I will send v3.

> 
> Will
> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH v2 3/4] iommu/iova: Extend rbtree node caching

2017-08-31 Thread Leizhen (ThunderTown)


On 2017/8/4 3:41, Nate Watterson wrote:
> Hi Robin,
> 
> On 7/31/2017 7:42 AM, Robin Murphy wrote:
>> Hi Nate,
>>
>> On 29/07/17 04:57, Nate Watterson wrote:
>>> Hi Robin,
>>> I am seeing a crash when performing very basic testing on this series
>>> with a Mellanox CX4 NIC. I dug into the crash a bit, and think this
>>> patch is the culprit, but this rcache business is still mostly
>>> witchcraft to me.
>>>
>>> # ifconfig eth5 up
>>> # ifconfig eth5 down
>>>  Unable to handle kernel NULL pointer dereference at virtual address
>>> 0020
>>>  user pgtable: 64k pages, 48-bit VAs, pgd = 8007dbf47c00
>>>  [0020] *pgd=0006efab0003, *pud=0006efab0003,
>>> *pmd=0007d8720003, *pte=
>>>  Internal error: Oops: 9607 [#1] SMP
>>>  Modules linked in:
>>>  CPU: 47 PID: 5082 Comm: ifconfig Not tainted 4.13.0-rtp-enablement+ #3
>>>  task: 8007da1e5780 task.stack: 8007ddcb8000
>>>  PC is at __cached_rbnode_delete_update+0x2c/0x58
>>>  LR is at private_free_iova+0x2c/0x60
>>>  pc : [] lr : [] pstate: 204001c5
>>>  sp : 8007ddcbba00
>>>  x29: 8007ddcbba00 x28: 8007c8350210
>>>  x27: 8007d1a8 x26: 8007dcc20800
>>>  x25: 0140 x24: 8007c98f0008
>>>  x23: fe4e x22: 0140
>>>  x21: 8007c98f0008 x20: 8007c9adb240
>>>  x19: 8007c98f0018 x18: 0010
>>>  x17:  x16: 
>>>  x15: 4000 x14: 
>>>  x13:  x12: 0001
>>>  x11: dead0200 x10: 
>>>  x9 :  x8 : 8007c9adb1c0
>>>  x7 : 40002000 x6 : 00210d00
>>>  x5 :  x4 : c57e
>>>  x3 : ffcf x2 : ffcf
>>>  x1 : 8007c9adb240 x0 : 
>>>  [...]
>>>  [] __cached_rbnode_delete_update+0x2c/0x58
>>>  [] private_free_iova+0x2c/0x60
>>>  [] iova_magazine_free_pfns+0x4c/0xa0
>>>  [] free_iova_fast+0x1b0/0x230
>>>  [] iommu_dma_free_iova+0x5c/0x80
>>>  [] __iommu_dma_unmap+0x5c/0x98
>>>  [] iommu_dma_unmap_resource+0x24/0x30
>>>  [] iommu_dma_unmap_page+0xc/0x18
>>>  [] __iommu_unmap_page+0x40/0x60
>>>  [] mlx5e_page_release+0xbc/0x128
>>>  [] mlx5e_dealloc_rx_wqe+0x30/0x40
>>>  [] mlx5e_close_channel+0x70/0x1f8
>>>  [] mlx5e_close_channels+0x2c/0x50
>>>  [] mlx5e_close_locked+0x54/0x68
>>>  [] mlx5e_close+0x30/0x58
>>>  [...]
>>>
>>> ** Disassembly for __cached_rbnode_delete_update() near the fault **
>>>92|if (free->pfn_hi < iovad->dma_32bit_pfn)
>>> 0852C6C4|ldr x3,[x1,#0x18]; x3,[free,#24]
>>> 0852C6C8|ldr x2,[x0,#0x30]; x2,[iovad,#48]
>>> 0852C6CC|cmp x3,x2
>>> 0852C6D0|b.cs0x0852C708
>>>  |curr = &iovad->cached32_node;
>>>94|if (!curr)
>>> 0852C6D4|addsx19,x0,#0x18 ; x19,iovad,#24
>>> 0852C6D8|b.eq0x0852C708
>>>  |
>>>  |cached_iova = rb_entry(*curr, struct iova, node);
>>>  |
>>>99|if (free->pfn_lo >= cached_iova->pfn_lo)
>>> 0852C6DC|ldr x0,[x19] ; xiovad,[curr]
>>> 0852C6E0|ldr x2,[x1,#0x20]; x2,[free,#32]
>>> 0852C6E4|ldr x0,[x0,#0x20]; x0,[x0,#32]
>>> Apparently cached_iova was NULL so the pfn_lo access faulted.
>>>
>>> 0852C6E8|cmp x2,x0
>>> 0852C6EC|b.cc0x0852C6FC
>>> 0852C6F0|mov x0,x1; x0,free
>>>   100|*curr = rb_next(&free->node);
>>> After instrumenting the code a bit, this seems to be the culprit. In the
>>> previous call, free->pfn_lo was 0x_ which is actually the
>>> dma_limit for the domain so rb_next() returns NULL.
>>>
>>> Let me know if you have any questions or would like additional tests
>>> run. I also applied your "DMA domain debug info" patches and dumped the
>>> contents of the domain at each of the steps above in case that would be
>>> useful. If nothing else, they reinforce how thirsty the CX4 NIC is
>>> especially when using 64k pages and many CPUs.
>>
>> Thanks for the report - I somehow managed to reason myself out of
>> keeping the "no cached node" check in __cached_rbnode_delete_update() on
>> the assumption that it must always be set by a previous allocation.
>> However, there is indeed just one case for which that fails: when
>> you free any IOVA immediately after freeing the very topmost one. Which
>> is something that freeing an entire magazine's worth of IOVAs back to
>> the tree all at once has a very real chance of doing...
>>
>> The obvious 

Re: [PATCH 0/5] arm-smmu: performance optimization

2017-08-17 Thread Leizhen (ThunderTown)


On 2017/8/17 22:36, Will Deacon wrote:
> Thunder, Nate, Robin,
> 
> On Mon, Jun 26, 2017 at 09:38:45PM +0800, Zhen Lei wrote:
>> I described the optimization more detail in patch 1 and 2, and patch 3-5 are
>> the implementation on arm-smmu/arm-smmu-v3 of patch 2.
>>
>> Patch 1 is v2. In v1, I directly replaced writel with writel_relaxed in
>> queue_inc_prod. But Robin figured that it may lead SMMU consume stale
>> memory contents. I thought more than 3 whole days and got this one.
>>
>> This patchset is based on Robin Murphy's [PATCH v2 0/8] io-pgtable lock 
>> removal.
> 
> For the time being, I think we should focus on the new TLB flushing
> interface posted by Joerg:
> 
> http://lkml.kernel.org/r/1502974596-23835-1-git-send-email-j...@8bytes.org
> 
> which looks like it can give us most of the benefits of this series. Once
> we've got that, we can see what's left in the way of performance and focus
> on the cmdq batching separately (because I'm still not convinced about it).
OK, this is good news.

But I have a review comment (sorry, I have not subscribed to it yet, so I cannot reply to it directly):
I don't think we should add a tlb sync for the map operation:
1. At init time, all tlbs will be invalidated.
2. When we try to map a new range, no related ptes are buffered in the tlb, because of 1 above and 3 below.
3. When we unmap the above range, we make sure that all related ptes buffered in the tlb are invalidated before the unmap finishes.

> 
> Thanks,
> 
> Will
> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH 1/1] mm: only dispaly online cpus of the numa node

2017-09-29 Thread Leizhen (ThunderTown)


On 2017/8/28 21:13, Michal Hocko wrote:
> On Fri 25-08-17 18:34:33, Will Deacon wrote:
>> On Thu, Aug 24, 2017 at 10:32:26AM +0200, Michal Hocko wrote:
>>> It seems this has slipped through cracks. Let's CC arm64 guys
>>>
>>> On Tue 20-06-17 20:43:28, Zhen Lei wrote:
>>>> When I executed numactl -H(which read /sys/devices/system/node/nodeX/cpumap
>>>> and display cpumask_of_node for each node), but I got different result on
>>>> X86 and arm64. For each numa node, the former only displayed online CPUs,
>>>> and the latter displayed all possible CPUs. Unfortunately, both Linux
>>>> documentation and numactl manual have not described it clear.
>>>>
>>>> I sent a mail to ask for help, and Michal Hocko  replied
>>>> that he preferred to print online cpus because it doesn't really make much
>>>> sense to bind anything on offline nodes.
>>>
>>> Yes printing offline CPUs is just confusing and more so when the
>>> behavior is not consistent over architectures. I believe that x86
>>> behavior is the more appropriate one because it is more logical to dump
>>> the NUMA topology and use it for affinity setting than adding one
>>> additional step to check the cpu state to achieve the same.
>>>
>>> It is true that the online/offline state might change at any time so the
>>> above might be tricky on its own but if we should at least make the
>>> behavior consistent.
>>>
>>>> Signed-off-by: Zhen Lei 
>>>
>>> Acked-by: Michal Hocko 
>>
>> The concept looks fine to me, but shouldn't we use cpumask_var_t and
>> alloc/free_cpumask_var?
> 
> This will be safer but both callers of node_read_cpumap are shallow
> stack so I am not sure a stack is a limiting factor here.
> 
> Zhen Lei, would you care to update that part please?
> 
Sure, I will send v2 immediately.

I'm so sorry that I missed this email until someone told me.

-- 
Thanks!
BestRegards



Re: [PATCH v2 0/3] arm-smmu: performance optimization

2017-09-19 Thread Leizhen (ThunderTown)


On 2017/9/19 12:31, Nate Watterson wrote:
> Hi Leizhen,
> 
> On 9/12/2017 9:00 AM, Zhen Lei wrote:
>> v1 -> v2:
>> base on (add02cfdc9bc2 "iommu: Introduce Interface for IOMMU TLB Flushing")
>>
>> Zhen Lei (3):
>>iommu/arm-smmu-v3: put off the execution of TLBI* to reduce lock
>>  confliction
>>iommu/arm-smmu-v3: add support for unmap an iova range with only one
>>  tlb sync
> 
> I tested these (2) patches on QDF2400 hardware and saw performance
> improvements in line with those I reported when testing the original
> series. I don't have any hardware close at hand to test the 3rd patch
> in the series so that will have to come from someone else.
Thanks a lot.

> 
> Tested-by: Nate Watterson 
> 
> Thanks,
> Nate
> 
>>iommu/arm-smmu: add support for unmap a memory range with only one tlb
>>  sync
>>
>>   drivers/iommu/arm-smmu-v3.c| 52 
>> ++
>>   drivers/iommu/arm-smmu.c   | 10 
>>   drivers/iommu/io-pgtable-arm-v7s.c | 32 +++
>>   drivers/iommu/io-pgtable-arm.c | 30 ++
>>   drivers/iommu/io-pgtable.h |  1 +
>>   5 files changed, 99 insertions(+), 26 deletions(-)
>>
> 

-- 
Thanks!
BestRegards



Re: [PATCH v2 3/4] iommu/iova: Extend rbtree node caching

2017-09-19 Thread Leizhen (ThunderTown)


On 2017/7/31 19:42, Robin Murphy wrote:
> Hi Nate,
> 
> On 29/07/17 04:57, Nate Watterson wrote:
>> Hi Robin,
>> I am seeing a crash when performing very basic testing on this series
>> with a Mellanox CX4 NIC. I dug into the crash a bit, and think this
>> patch is the culprit, but this rcache business is still mostly
>> witchcraft to me.
>>
>> # ifconfig eth5 up
>> # ifconfig eth5 down
>> Unable to handle kernel NULL pointer dereference at virtual address
>> 0020
>> user pgtable: 64k pages, 48-bit VAs, pgd = 8007dbf47c00
>> [0020] *pgd=0006efab0003, *pud=0006efab0003,
>> *pmd=0007d8720003, *pte=
>> Internal error: Oops: 9607 [#1] SMP
>> Modules linked in:
>> CPU: 47 PID: 5082 Comm: ifconfig Not tainted 4.13.0-rtp-enablement+ #3
>> task: 8007da1e5780 task.stack: 8007ddcb8000
>> PC is at __cached_rbnode_delete_update+0x2c/0x58
>> LR is at private_free_iova+0x2c/0x60
>> pc : [] lr : [] pstate: 204001c5
>> sp : 8007ddcbba00
>> x29: 8007ddcbba00 x28: 8007c8350210
>> x27: 8007d1a8 x26: 8007dcc20800
>> x25: 0140 x24: 8007c98f0008
>> x23: fe4e x22: 0140
>> x21: 8007c98f0008 x20: 8007c9adb240
>> x19: 8007c98f0018 x18: 0010
>> x17:  x16: 
>> x15: 4000 x14: 
>> x13:  x12: 0001
>> x11: dead0200 x10: 
>> x9 :  x8 : 8007c9adb1c0
>> x7 : 40002000 x6 : 00210d00
>> x5 :  x4 : c57e
>> x3 : ffcf x2 : ffcf
>> x1 : 8007c9adb240 x0 : 
>> [...]
>> [] __cached_rbnode_delete_update+0x2c/0x58
>> [] private_free_iova+0x2c/0x60
>> [] iova_magazine_free_pfns+0x4c/0xa0
>> [] free_iova_fast+0x1b0/0x230
>> [] iommu_dma_free_iova+0x5c/0x80
>> [] __iommu_dma_unmap+0x5c/0x98
>> [] iommu_dma_unmap_resource+0x24/0x30
>> [] iommu_dma_unmap_page+0xc/0x18
>> [] __iommu_unmap_page+0x40/0x60
>> [] mlx5e_page_release+0xbc/0x128
>> [] mlx5e_dealloc_rx_wqe+0x30/0x40
>> [] mlx5e_close_channel+0x70/0x1f8
>> [] mlx5e_close_channels+0x2c/0x50
>> [] mlx5e_close_locked+0x54/0x68
>> [] mlx5e_close+0x30/0x58
>> [...]
>>
>> ** Disassembly for __cached_rbnode_delete_update() near the fault **
>>   92|if (free->pfn_hi < iovad->dma_32bit_pfn)
>> 0852C6C4|ldr x3,[x1,#0x18]; x3,[free,#24]
>> 0852C6C8|ldr x2,[x0,#0x30]; x2,[iovad,#48]
>> 0852C6CC|cmp x3,x2
>> 0852C6D0|b.cs0x0852C708
>> |curr = &iovad->cached32_node;
>>   94|if (!curr)
>> 0852C6D4|addsx19,x0,#0x18 ; x19,iovad,#24
>> 0852C6D8|b.eq0x0852C708
>> |
>> |cached_iova = rb_entry(*curr, struct iova, node);
>> |
>>   99|if (free->pfn_lo >= cached_iova->pfn_lo)
>> 0852C6DC|ldr x0,[x19] ; xiovad,[curr]
>> 0852C6E0|ldr x2,[x1,#0x20]; x2,[free,#32]
>> 0852C6E4|ldr x0,[x0,#0x20]; x0,[x0,#32]
>> Apparently cached_iova was NULL so the pfn_lo access faulted.
>>
>> 0852C6E8|cmp x2,x0
>> 0852C6EC|b.cc0x0852C6FC
>> 0852C6F0|mov x0,x1; x0,free
>>  100|*curr = rb_next(&free->node);
>> After instrumenting the code a bit, this seems to be the culprit. In the
>> previous call, free->pfn_lo was 0x_ which is actually the
>> dma_limit for the domain so rb_next() returns NULL.
>>
>> Let me know if you have any questions or would like additional tests
>> run. I also applied your "DMA domain debug info" patches and dumped the
>> contents of the domain at each of the steps above in case that would be
>> useful. If nothing else, they reinforce how thirsty the CX4 NIC is
>> especially when using 64k pages and many CPUs.
> 
> Thanks for the report - I somehow managed to reason myself out of
> keeping the "no cached node" check in __cached_rbnode_delete_update() on
> the assumption that it must always be set by a previous allocation.
> However, there is indeed just one case for which that fails: when
> you free any IOVA immediately after freeing the very topmost one. Which
> is something that freeing an entire magazine's worth of IOVAs back to
> the tree all at once has a very real chance of doing...
> 
> The obvious straightforward fix is inline below, but I'm now starting to
> understand the appeal of reserving a sentinel node to ensure the tree
> can never be empty, so I might have a quick go at that to see if it
> results in 

Re: [PATCH v2 0/4] Optimise 64-bit IOVA allocations

2017-08-08 Thread Leizhen (ThunderTown)


On 2017/8/8 20:03, Ganapatrao Kulkarni wrote:
> On Wed, Jul 26, 2017 at 4:47 PM, Leizhen (ThunderTown)
>  wrote:
>>
>>
>> On 2017/7/26 19:08, Joerg Roedel wrote:
>>> Hi Robin.
>>>
>>> On Fri, Jul 21, 2017 at 12:41:57PM +0100, Robin Murphy wrote:
>>>> Hi all,
>>>>
>>>> In the wake of the ARM SMMU optimisation efforts, it seems that certain
>>>> workloads (e.g. storage I/O with large scatterlists) probably remain quite
>>>> heavily influenced by IOVA allocation performance. Separately, Ard also
>>>> reported massive performance drops for a graphical desktop on AMD Seattle
>>>> when enabling SMMUs via IORT, which we traced to dma_32bit_pfn in the DMA
>>>> ops domain getting initialised differently for ACPI vs. DT, and exposing
>>>> the overhead of the rbtree slow path. Whilst we could go around trying to
>>>> close up all the little gaps that lead to hitting the slowest case, it
>>>> seems a much better idea to simply make said slowest case a lot less slow.
>>>
>>> Do you have some numbers here? How big was the impact before these
>>> patches and how is it with the patches?
>> Here are some numbers:
>>
>> (before)$ iperf -s
>> 
>> Server listening on TCP port 5001
>> TCP window size: 85.3 KByte (default)
>> 
>> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35898
>> [ ID] Interval   Transfer Bandwidth
>> [  4]  0.0-10.2 sec  7.88 MBytes  6.48 Mbits/sec
>> [  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35900
>> [  5]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
>> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35902
>> [  4]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
>>
>> (after)$ iperf -s
>> 
>> Server listening on TCP port 5001
>> TCP window size: 85.3 KByte (default)
>> 
>> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36330
>> [ ID] Interval   Transfer Bandwidth
>> [  4]  0.0-10.0 sec  1.09 GBytes   933 Mbits/sec
>> [  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36332
>> [  5]  0.0-10.0 sec  1.10 GBytes   939 Mbits/sec
>> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36334
>> [  4]  0.0-10.0 sec  1.10 GBytes   938 Mbits/sec
>>
> 
> Is this testing done on Host or on Guest/VM?
Host

> 
>>>
>>>
>>>   Joerg
>>>
>>>
>>> .
>>>
>>
>> --
>> Thanks!
>> BestRegards
>>
>>
>> ___
>> linux-arm-kernel mailing list
>> linux-arm-ker...@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 
> thanks
> Ganapat
> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH v2 0/4] Optimise 64-bit IOVA allocations

2017-08-08 Thread Leizhen (ThunderTown)


On 2017/8/9 11:24, Ganapatrao Kulkarni wrote:
> On Wed, Aug 9, 2017 at 7:12 AM, Leizhen (ThunderTown)
>  wrote:
>>
>>
>> On 2017/8/8 20:03, Ganapatrao Kulkarni wrote:
>>> On Wed, Jul 26, 2017 at 4:47 PM, Leizhen (ThunderTown)
>>>  wrote:
>>>>
>>>>
>>>> On 2017/7/26 19:08, Joerg Roedel wrote:
>>>>> Hi Robin.
>>>>>
>>>>> On Fri, Jul 21, 2017 at 12:41:57PM +0100, Robin Murphy wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> In the wake of the ARM SMMU optimisation efforts, it seems that certain
>>>>>> workloads (e.g. storage I/O with large scatterlists) probably remain 
>>>>>> quite
>>>>>> heavily influenced by IOVA allocation performance. Separately, Ard also
>>>>>> reported massive performance drops for a graphical desktop on AMD Seattle
>>>>>> when enabling SMMUs via IORT, which we traced to dma_32bit_pfn in the DMA
>>>>>> ops domain getting initialised differently for ACPI vs. DT, and exposing
>>>>>> the overhead of the rbtree slow path. Whilst we could go around trying to
>>>>>> close up all the little gaps that lead to hitting the slowest case, it
>>>>>> seems a much better idea to simply make said slowest case a lot less 
>>>>>> slow.
>>>>>
>>>>> Do you have some numbers here? How big was the impact before these
>>>>> patches and how is it with the patches?
>>>> Here are some numbers:
>>>>
>>>> (before)$ iperf -s
>>>> 
>>>> Server listening on TCP port 5001
>>>> TCP window size: 85.3 KByte (default)
>>>> 
>>>> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35898
>>>> [ ID] Interval   Transfer Bandwidth
>>>> [  4]  0.0-10.2 sec  7.88 MBytes  6.48 Mbits/sec
>>>> [  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35900
>>>> [  5]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
>>>> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35902
>>>> [  4]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
>>>>
>>>> (after)$ iperf -s
>>>> 
>>>> Server listening on TCP port 5001
>>>> TCP window size: 85.3 KByte (default)
>>>> 
>>>> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36330
>>>> [ ID] Interval   Transfer Bandwidth
>>>> [  4]  0.0-10.0 sec  1.09 GBytes   933 Mbits/sec
>>>> [  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36332
>>>> [  5]  0.0-10.0 sec  1.10 GBytes   939 Mbits/sec
>>>> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36334
>>>> [  4]  0.0-10.0 sec  1.10 GBytes   938 Mbits/sec
>>>>
>>>
>>> Is this testing done on Host or on Guest/VM?
>> Host
> 
> As per your log, iperf throughput is improved to 938 Mbits/sec
> from  6.43 Mbits/sec.
> IMO, this seems to be unrealistic, some thing wrong with the testing?
For 64-bit non-PCI devices, the iova allocation search always starts from the last rb-tree node.
When many iovas are allocated and held for a long time, the search has to check many rb nodes
before it finds a suitable free space. In my tracing, the average number of checks exceeds 10K.
[free-space][free][used][...][used]
  ^ ^  ^
  | |  |-rb_last
  | |- maybe more than 10K allocated iova nodes
  |--- for 32-bit devices, cached32_node remembers the latest freed node, which helps reduce the number of checks

This patch series adds a new member, "cached_node", to serve 64-bit devices, just as cached32_node serves 32-bit devices.
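
The effect of the starting point can be illustrated with a toy model. The sketch below is not the kernel code: it replaces the rb-tree with a sorted array, treats the "cached" node simply as the index where the downward walk starts, and all names (struct toy_range, toy_find_free_below, toy_demo_visits) are made up for illustration.

```c
#include <stddef.h>

#define TOY_NODES	10000
#define TOY_NO_SPACE	(~0UL)

/* One allocated pfn range, inclusive; the array is sorted by address. */
struct toy_range {
	unsigned long lo, hi;
};

/*
 * Walk downwards from r[start] looking for a gap of at least "size" pfns.
 * Nothing above r[start] is considered. Returns the number of nodes
 * visited and stores the base of the chosen gap in *base.
 * Assumes every r[i].lo is at least 1.
 */
static size_t toy_find_free_below(const struct toy_range *r, size_t start,
				  unsigned long size, unsigned long *base)
{
	unsigned long limit = r[start].hi;	/* highest pfn still usable */
	size_t visited = 0;
	size_t i = start + 1;

	while (i > 0) {
		i--;
		visited++;
		if (r[i].hi < limit && limit - r[i].hi >= size) {
			*base = limit - size + 1;
			return visited;
		}
		limit = r[i].lo - 1;
	}
	/* Fell off the bottom: use the space below the lowest node. */
	*base = (limit + 1 >= size) ? limit - size + 1 : TOY_NO_SPACE;
	return visited;
}

/* Densely packed single-pfn allocations at pfns 1000..10999. */
static size_t toy_demo_visits(size_t start, unsigned long *base)
{
	static struct toy_range r[TOY_NODES];
	size_t k;

	for (k = 0; k < TOY_NODES; k++)
		r[k].lo = r[k].hi = 1000 + k;
	return toy_find_free_below(r, start, 1, base);
}
```

Starting from the last node (index TOY_NODES - 1) walks all 10000 entries before reaching the free space, while starting from a remembered low node (index 0) finds the same pfn after a single visit, which is the kind of saving the cached node is meant to give.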

> 
>>
>>>
>>>>>
>>>>>
>>>>>   Joerg
>>>>>
>>>>>
>>>>> .
>>>>>
>>>>
>>>> --
>>>> Thanks!
>>>> BestRegards
>>>>
>>>>
>>>
>>> thanks
>>> Ganapat
>>>
>>> .
>>>
>>
>> --
>> Thanks!
>> BestRegards
>>
> 
> thanks
> Ganapat
> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH 1/5] iommu/arm-smmu-v3: put off the execution of TLBI* to reduce lock confliction

2017-08-22 Thread Leizhen (ThunderTown)


On 2017/8/22 23:41, Joerg Roedel wrote:
> On Mon, Jun 26, 2017 at 09:38:46PM +0800, Zhen Lei wrote:
>> -static int queue_insert_raw(struct arm_smmu_queue *q, u64 *ent)
>> +static int queue_insert_raw(struct arm_smmu_queue *q, u64 *ent, int optimize)
>>  {
>>  if (queue_full(q))
>>  return -ENOSPC;
>>  
>>  queue_write(Q_ENT(q, q->prod), ent, q->ent_dwords);
>> -queue_inc_prod(q);
>> +
>> +/*
>> + * We don't want too many commands to be delayed, this may lead the
>> + * followed sync command to wait for a long time.
>> + */
>> +if (optimize && (++q->nr_delay < CMDQ_MAX_DELAYED)) {
>> +queue_inc_swprod(q);
>> +} else {
>> +queue_inc_prod(q);
>> +q->nr_delay = 0;
>> +}
>> +
>>  return 0;
>>  }
>>  
>> @@ -909,6 +928,7 @@ static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu)
>>  static void arm_smmu_cmdq_issue_cmd(struct arm_smmu_device *smmu,
>>  struct arm_smmu_cmdq_ent *ent)
>>  {
>> +int optimize = 0;
>>  u64 cmd[CMDQ_ENT_DWORDS];
>>  unsigned long flags;
>>  bool wfe = !!(smmu->features & ARM_SMMU_FEAT_SEV);
>> @@ -920,8 +940,17 @@ static void arm_smmu_cmdq_issue_cmd(struct arm_smmu_device *smmu,
>>  return;
>>  }
>>  
>> +/*
>> + * All TLBI commands should be followed by a sync command later.
>> + * The CFGI commands is the same, but they are rarely executed.
>> + * So just optimize TLBI commands now, to reduce the "if" judgement.
>> + */
>> +if ((ent->opcode >= CMDQ_OP_TLBI_NH_ALL) &&
>> +(ent->opcode <= CMDQ_OP_TLBI_NSNH_ALL))
>> +optimize = 1;
>> +
>>  spin_lock_irqsave(&smmu->cmdq.lock, flags);
>> -while (queue_insert_raw(q, cmd) == -ENOSPC) {
>> +while (queue_insert_raw(q, cmd, optimize) == -ENOSPC) {
>>  if (queue_poll_cons(q, false, wfe))
>>  dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
>>  }
> 
> This doesn't look correct. How do you make sure that a given IOVA range
> is flushed before the addresses are reused?
Hi, Joerg:
It's actually guaranteed by the upper layer functions, for example:
static int arm_lpae_unmap(
...
unmapped = __arm_lpae_unmap(data, iova, size, lvl, ptep);
//__arm_lpae_unmap will indirectly call arm_smmu_cmdq_issue_cmd to invalidate tlbs
if (unmapped)
io_pgtable_tlb_sync(&data->iop);	//a tlb_sync waits until all tlbi operations have finished


I also described it in the next patch (2/5), as shown below:

Some people might ask: Is it safe to do so? The answer is yes. The standard
processing flow is:
alloc iova
map
process data
unmap
tlb invalidation and sync
free iova

What must be guaranteed is that the "free iova" action comes after the "unmap"
and "tlbi operation" actions, which is what we are doing right now. This ensures
that all TLB entries of an iova range have been invalidated before the iova is reallocated.
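
The delayed-publish logic in the quoted hunk can be modelled in a few lines of userspace C. This is only a sketch: struct toy_queue, toy_insert and toy_unmap are invented names, the "doorbells" counter stands in for writes to the producer register, and the value of CMDQ_MAX_DELAYED here is an assumption, since the patch's real value is not shown in this mail.

```c
#include <stdbool.h>

#define CMDQ_MAX_DELAYED	32	/* assumed value, for illustration */

struct toy_queue {
	unsigned int sw_prod;	/* commands written into queue memory */
	unsigned int hw_prod;	/* producer index last shown to hardware */
	unsigned int nr_delay;	/* commands written but not yet published */
	unsigned int doorbells;	/* number of producer register writes */
};

/* Same shape as the patched queue_insert_raw(), minus the full check. */
static void toy_insert(struct toy_queue *q, bool optimize)
{
	q->sw_prod++;
	if (optimize && (++q->nr_delay < CMDQ_MAX_DELAYED))
		return;			/* delayed: only sw_prod moves */

	q->hw_prod = q->sw_prod;	/* publish everything queued so far */
	q->doorbells++;
	q->nr_delay = 0;
}

/* An unmap issues n_tlbi TLBI commands and then one sync command. */
static unsigned int toy_unmap(struct toy_queue *q, unsigned int n_tlbi)
{
	unsigned int i;

	for (i = 0; i < n_tlbi; i++)
		toy_insert(q, true);	/* TLBI: may be delayed */
	toy_insert(q, false);		/* sync: always published */
	return q->doorbells;
}
```

In this model an unmap with 4 TLBIs rings the doorbell once instead of five times, and the trailing sync always leaves hw_prod equal to sw_prod, so no invalidation is still pending when the iova is freed.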

Best regards,
LeiZhen

> 
> 
> Regards,
> 
>   Joerg
> 
> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH v2 0/4] Optimise 64-bit IOVA allocations

2017-07-26 Thread Leizhen (ThunderTown)


On 2017/7/26 19:08, Joerg Roedel wrote:
> Hi Robin.
> 
> On Fri, Jul 21, 2017 at 12:41:57PM +0100, Robin Murphy wrote:
>> Hi all,
>>
>> In the wake of the ARM SMMU optimisation efforts, it seems that certain
>> workloads (e.g. storage I/O with large scatterlists) probably remain quite
>> heavily influenced by IOVA allocation performance. Separately, Ard also
>> reported massive performance drops for a graphical desktop on AMD Seattle
>> when enabling SMMUs via IORT, which we traced to dma_32bit_pfn in the DMA
>> ops domain getting initialised differently for ACPI vs. DT, and exposing
>> the overhead of the rbtree slow path. Whilst we could go around trying to
>> close up all the little gaps that lead to hitting the slowest case, it
>> seems a much better idea to simply make said slowest case a lot less slow.
> 
> Do you have some numbers here? How big was the impact before these
> patches and how is it with the patches?
Here are some numbers:

(before)$ iperf -s

Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)

[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35898
[ ID] Interval   Transfer Bandwidth
[  4]  0.0-10.2 sec  7.88 MBytes  6.48 Mbits/sec
[  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35900
[  5]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35902
[  4]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec

(after)$ iperf -s

Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)

[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36330
[ ID] Interval   Transfer Bandwidth
[  4]  0.0-10.0 sec  1.09 GBytes   933 Mbits/sec
[  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36332
[  5]  0.0-10.0 sec  1.10 GBytes   939 Mbits/sec
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36334
[  4]  0.0-10.0 sec  1.10 GBytes   938 Mbits/sec

> 
> 
>   Joerg
> 
> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH 1/1] iommu/arm-smmu-v3: replace writel with writel_relaxed in queue_inc_prod

2017-06-26 Thread Leizhen (ThunderTown)


On 2017/6/21 17:08, Will Deacon wrote:
> On Wed, Jun 21, 2017 at 09:28:23AM +0800, Leizhen (ThunderTown) wrote:
>> On 2017/6/20 19:35, Robin Murphy wrote:
>>> On 20/06/17 12:04, Zhen Lei wrote:
>>>> This function is protected by spinlock, and the latter will do memory
>>>> barrier implicitly. So that we can safely use writel_relaxed. In fact, the
>>>> dmb operation will lengthen the time protected by lock, which indirectly
>>>> increase the locking confliction in the stress scene.
>>>
>>> If you remove the DSB between writing the commands (to Normal memory)
>>> and writing the pointer (to Device memory), how can you guarantee that
>>> the complete command is visible to the SMMU and it isn't going to try to
>>> consume stale memory contents? The spinlock is irrelevant since it's
>>> taken *before* the command is written.
>> OK, I see, thanks. Let's me see if there are any other methods. And I think
>> that this may should be done well by hardware.
> 
> FWIW, I did use the _relaxed variants wherever I could when I wrote the
> driver. There might, of course, be bugs, but it's not like the normal case
> for drivers where the author didn't consider the _relaxed accessors
> initially.
Good news: I have a new idea and will post v2 later.
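
For reference, the ordering requirement Robin describes (the command must be visible before the producer index moves) can be sketched with C11 atomics. This is only a CPU-to-CPU analogy with made-up names (toy_ring, toy_publish, toy_consume); a release store does not order Normal-memory writes against a Device-memory register write, which is what the barrier in writel() is for.

```c
#include <stdatomic.h>
#include <stdint.h>

#define TOY_RING_ENTRIES 8

struct toy_ring {
	uint64_t cmd[TOY_RING_ENTRIES];	/* stands in for queue memory */
	_Atomic uint32_t prod;		/* stands in for the producer index */
};

/* Producer: write the command first, then publish the new index. */
static void toy_publish(struct toy_ring *r, uint64_t cmd)
{
	uint32_t p = atomic_load_explicit(&r->prod, memory_order_relaxed);

	r->cmd[p % TOY_RING_ENTRIES] = cmd;
	/* Release: the command write cannot be reordered after this store. */
	atomic_store_explicit(&r->prod, p + 1, memory_order_release);
}

/* Consumer: returns 1 and fills *out if entry "cons" has been published. */
static int toy_consume(struct toy_ring *r, uint32_t cons, uint64_t *out)
{
	uint32_t p = atomic_load_explicit(&r->prod, memory_order_acquire);

	if (cons == p)
		return 0;		/* nothing new */
	*out = r->cmd[cons % TOY_RING_ENTRIES];
	return 1;
}
```

With a relaxed store in toy_publish() the consumer could observe the new index and still read a stale command, which is the analogue of the SMMU consuming stale memory contents.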

> 
> Will
> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH 1/1] iommu/arm-smmu-v3: replace writel with writel_relaxed in queue_inc_prod

2017-06-26 Thread Leizhen (ThunderTown)


On 2017/6/26 21:29, Leizhen (ThunderTown) wrote:
> 
> 
> On 2017/6/21 17:08, Will Deacon wrote:
>> On Wed, Jun 21, 2017 at 09:28:23AM +0800, Leizhen (ThunderTown) wrote:
>>> On 2017/6/20 19:35, Robin Murphy wrote:
>>>> On 20/06/17 12:04, Zhen Lei wrote:
>>>>> This function is protected by spinlock, and the latter will do memory
>>>>> barrier implicitly. So that we can safely use writel_relaxed. In fact, the
>>>>> dmb operation will lengthen the time protected by lock, which indirectly
>>>>> increase the locking confliction in the stress scene.
>>>>
>>>> If you remove the DSB between writing the commands (to Normal memory)
>>>> and writing the pointer (to Device memory), how can you guarantee that
>>>> the complete command is visible to the SMMU and it isn't going to try to
>>>> consume stale memory contents? The spinlock is irrelevant since it's
>>>> taken *before* the command is written.
>>> OK, I see, thanks. Let's me see if there are any other methods. And I think
>>> that this may should be done well by hardware.
>>
>> FWIW, I did use the _relaxed variants wherever I could when I wrote the
>> driver. There might, of course, be bugs, but it's not like the normal case
>> for drivers where the author didn't consider the _relaxed accessors
>> initially.
> A good news. I got a new idea and I will post v2 later.
I have just sent the new series:
[PATCH 0/5] arm-smmu: performance optimization
[PATCH 1/5] iommu/arm-smmu-v3: put off the execution of TLBI* to reduce lock confliction

> 
>>
>> Will
>>
>> .
>>
> 

-- 
Thanks!
BestRegards



Re: [PATCH 1/5] iommu/arm-smmu-v3: put off the execution of TLBI* to reduce lock confliction

2017-06-28 Thread Leizhen (ThunderTown)


On 2017/6/28 17:32, Will Deacon wrote:
> Hi Zhen Lei,
> 
> Nate (CC'd), Robin and I have been working on something very similar to
> this series, but this patch is different to what we had planned. More below.
> 
> On Mon, Jun 26, 2017 at 09:38:46PM +0800, Zhen Lei wrote:
>> Because all TLBI commands should be followed by a SYNC command, to make
>> sure that it has been completely finished. So we can just add the TLBI
>> commands into the queue, and put off the execution until meet SYNC or
>> other commands. To prevent the followed SYNC command waiting for a long
>> time because of too many commands have been delayed, restrict the max
>> delayed number.
>>
>> According to my test, I got the same performance data as I replaced writel
>> with writel_relaxed in queue_inc_prod.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 42 +-
>>  1 file changed, 37 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 291da5f..4481123 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -337,6 +337,7 @@
>>  /* Command queue */
>>  #define CMDQ_ENT_DWORDS 2
>>  #define CMDQ_MAX_SZ_SHIFT   8
>> +#define CMDQ_MAX_DELAYED    32
>>  
>>  #define CMDQ_ERR_SHIFT  24
>>  #define CMDQ_ERR_MASK   0x7f
>> @@ -472,6 +473,7 @@ struct arm_smmu_cmdq_ent {
>>  };
>>  } cfgi;
>>  
>> +#define CMDQ_OP_TLBI_NH_ALL 0x10
>>  #define CMDQ_OP_TLBI_NH_ASID    0x11
>>  #define CMDQ_OP_TLBI_NH_VA  0x12
>>  #define CMDQ_OP_TLBI_EL2_ALL    0x20
>> @@ -499,6 +501,7 @@ struct arm_smmu_cmdq_ent {
>>  
>>  struct arm_smmu_queue {
>>  int irq; /* Wired interrupt */
>> +u32 nr_delay;
>>  
>>  __le64  *base;
>>  dma_addr_t  base_dma;
>> @@ -722,11 +725,16 @@ static int queue_sync_prod(struct arm_smmu_queue *q)
>>  return ret;
>>  }
>>  
>> -static void queue_inc_prod(struct arm_smmu_queue *q)
>> +static void queue_inc_swprod(struct arm_smmu_queue *q)
>>  {
>> -u32 prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + 1;
>> +u32 prod = q->prod + 1;
>>  
>>  q->prod = Q_OVF(q, q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
>> +}
>> +
>> +static void queue_inc_prod(struct arm_smmu_queue *q)
>> +{
>> +queue_inc_swprod(q);
>>  writel(q->prod, q->prod_reg);
>>  }
>>  
>> @@ -761,13 +769,24 @@ static void queue_write(__le64 *dst, u64 *src, size_t 
>> n_dwords)
>>  *dst++ = cpu_to_le64(*src++);
>>  }
>>  
>> -static int queue_insert_raw(struct arm_smmu_queue *q, u64 *ent)
>> +static int queue_insert_raw(struct arm_smmu_queue *q, u64 *ent, int 
>> optimize)
>>  {
>>  if (queue_full(q))
>>  return -ENOSPC;
>>  
>>  queue_write(Q_ENT(q, q->prod), ent, q->ent_dwords);
>> -queue_inc_prod(q);
>> +
>> +/*
>> + * We don't want too many commands to be delayed, this may lead the
>> + * followed sync command to wait for a long time.
>> + */
>> +if (optimize && (++q->nr_delay < CMDQ_MAX_DELAYED)) {
>> +queue_inc_swprod(q);
>> +} else {
>> +queue_inc_prod(q);
>> +q->nr_delay = 0;
>> +}
>> +
> 
> So here, you're effectively putting invalidation commands into the command
> queue without updating PROD. Do you actually see a performance advantage
> from doing so? Another side of the argument would be that we should be
Yes. My SAS SSD performance test showed an improvement of about 100-150K/s
(the same as when I directly replaced writel with writel_relaxed), and the
average execution time of iommu_unmap (which is called by
iommu_dma_unmap_sg) dropped from 10us to 5us.

> moving PROD as soon as we can, so that the SMMU can process invalidation
> commands in the background and reduce the cost of the final SYNC operation
> when the high-level unmap operation is complete.
It may be that __iowmb() is more expensive than waiting for the TLBI to
complete. Besides the cost of __iowmb() itself, it is executed under the
spinlock, so lock contention rises rapidly under stress. __iowmb() costs
300-500ns on average (sorry, I forget the exact value).
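
To make the trade-off concrete, here is a minimal userspace model of the deferral in this patch. It is only a sketch: struct model_queue, model_insert() and the nr_publish counter are hypothetical names used for illustration, queue wrap-around and the full-queue check are ignored, and the assignment to hw_prod merely stands in for the writel() to the PROD register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DELAYED 32u	/* mirrors CMDQ_MAX_DELAYED from the patch */

/* Hypothetical userspace model of the command queue, not the driver's struct. */
struct model_queue {
	uint32_t sw_prod;	/* producer index known to software only */
	uint32_t hw_prod;	/* value last published to the modelled PROD register */
	uint32_t nr_delay;	/* commands queued since the last publish */
	uint32_t nr_publish;	/* modelled register writes, each implying a barrier */
};

/*
 * Queue one command. 'defer' is true for TLBI-style commands that a later
 * SYNC will cover, false for SYNC and any other command.
 */
static void model_insert(struct model_queue *q, bool defer)
{
	assert(q != NULL);

	q->sw_prod++;
	if (defer && ++q->nr_delay < MAX_DELAYED)
		return;		/* stay pending: no register write, no barrier */

	q->hw_prod = q->sw_prod;	/* stands in for writel(q->prod, q->prod_reg) */
	q->nr_publish++;
	q->nr_delay = 0;
}
```

In this model, three deferred commands followed by one SYNC cost a single publish instead of four, and a run of deferred commands is forced out once the MAX_DELAYED limit is reached.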

In addition, after applying this patchset, Robin's v2 and my earlier dma64
IOVA optimization patchset, our network performance test got the same
results as global bypass, but SAS SSD still shows a drop of more than 20%.
Maybe we should still focus on map/unmap, because the average execution
time of IOVA alloc/free is only about 400ns.

By the way, patches 2-5 are more effective than this one; they improve
performance by more than 350K/s. With them applied, Robin's v2 gives a
further improvement of about 100-150K/s; without them, I saw no effect
from Robin's v2. Sorry, I have not tested this patch without patches 2-5.
Furthermore, I got 

Re: [PATCH 4/4] iommu/iova: Make dma_32bit_pfn implicit

2017-07-19 Thread Leizhen (ThunderTown)


On 2017/7/19 23:07, kbuild test robot wrote:
> Hi Zhen,
> 
> [auto build test WARNING on iommu/next]
> [also build test WARNING on v4.13-rc1]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
> 
> url:
> https://github.com/0day-ci/linux/commits/Robin-Murphy/Optimise-64-bit-IOVA-allocations/20170719-060847
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
> config: arm-multi_v7_defconfig (attached as .config)
> compiler: arm-linux-gnueabi-gcc (Debian 6.1.1-9) 6.1.1 20160705
> reproduce:
> wget 
> https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O 
> ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # save the attached .config to linux build tree
> make.cross ARCH=arm 
> 
> All warnings (new ones prefixed by >>):
> 
>drivers/iommu/iova.c: In function 'init_iova_domain':
>>> drivers/iommu/iova.c:53:41: warning: large integer implicitly truncated to 
>>> unsigned type [-Woverflow]
>  iovad->dma_32bit_pfn = iova_pfn(iovad, 1ULL << 32);
OK, I see. I think the problem is that "1ULL << 32" exceeds the range of a
32-bit value. We should replace "1ULL << 32" with DMA_BIT_MASK(32), which
subtracts one so that the value fits safely in 32 bits, and then add one to
the resulting PFN:

iovad->dma_32bit_pfn = iova_pfn(iovad, DMA_BIT_MASK(32)) + 1;
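
A small userspace sketch of the difference, assuming a build where the address type is 32 bits wide. pfn32() is a hypothetical stand-in for iova_pfn(), and the shift of 12 assumes a 4KiB granule; only the DMA_BIT_MASK() definition is copied from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Same definition as the kernel's DMA_BIT_MASK(), reproduced for userspace. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical stand-in for iova_pfn() when addresses are 32 bits wide. */
static uint32_t pfn32(uint32_t iova, unsigned int shift)
{
	assert(shift < 32);
	return iova >> shift;
}

/* The warned-about form: the argument is truncated to 0 before the shift. */
static uint32_t limit_truncated(unsigned int shift)
{
	return pfn32((uint32_t)(1ULL << 32), shift);
}

/* The proposed form: PFN of the last 32-bit address, plus one. */
static uint32_t limit_fixed(unsigned int shift)
{
	return pfn32((uint32_t)DMA_BIT_MASK(32), shift) + 1;
}
```

With a shift of 12, limit_truncated() yields 0 while limit_fixed() yields 0x100000, which is the first PFN above the 32-bit boundary.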

> ^~~~
> 
> vim +53 drivers/iommu/iova.c
> 
>    35
>    36  void
>    37  init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>    38          unsigned long start_pfn)
>    39  {
>    40          /*
>    41           * IOVA granularity will normally be equal to the smallest
>    42           * supported IOMMU page size; both *must* be capable of
>    43           * representing individual CPU pages exactly.
>    44           */
>    45          BUG_ON((granule > PAGE_SIZE) || !is_power_of_2(granule));
>    46
>    47          spin_lock_init(&iovad->iova_rbtree_lock);
>    48          iovad->rbroot = RB_ROOT;
>    49          iovad->cached_node = NULL;
>    50          iovad->cached32_node = NULL;
>    51          iovad->granule = granule;
>    52          iovad->start_pfn = start_pfn;
>  > 53          iovad->dma_32bit_pfn = iova_pfn(iovad, 1ULL << 32);
>    54          init_iova_rcaches(iovad);
>    55  }
>    56  EXPORT_SYMBOL_GPL(init_iova_domain);
>    57
> 
> ---
> 0-DAY kernel test infrastructureOpen Source Technology Center
> https://lists.01.org/pipermail/kbuild-all   Intel Corporation
> 

-- 
Thanks!
BestRegards



Re: [PATCH 0/4] Optimise 64-bit IOVA allocations

2017-07-21 Thread Leizhen (ThunderTown)


On 2017/7/19 18:23, Robin Murphy wrote:
> On 19/07/17 09:37, Ard Biesheuvel wrote:
>> On 18 July 2017 at 17:57, Robin Murphy  wrote:
>>> Hi all,
>>>
>>> In the wake of the ARM SMMU optimisation efforts, it seems that certain
>>> workloads (e.g. storage I/O with large scatterlists) probably remain quite
>>> heavily influenced by IOVA allocation performance. Separately, Ard also
>>> reported massive performance drops for a graphical desktop on AMD Seattle
>>> when enabling SMMUs via IORT, which we traced to dma_32bit_pfn in the DMA
>>> ops domain getting initialised differently for ACPI vs. DT, and exposing
>>> the overhead of the rbtree slow path. Whilst we could go around trying to
>>> close up all the little gaps that lead to hitting the slowest case, it
>>> seems a much better idea to simply make said slowest case a lot less slow.
>>>
>>> I had a go at rebasing Leizhen's last IOVA series[1], but ended up finding
>>> the changes rather too hard to follow, so I've taken the liberty here of
>>> picking the whole thing up and reimplementing the main part in a rather
>>> less invasive manner.
>>>
>>> Robin.
>>>
>>> [1] 
>>> https://www.mail-archive.com/iommu@lists.linux-foundation.org/msg17753.html
>>>
>>> Robin Murphy (1):
>>>   iommu/iova: Extend rbtree node caching
>>>
>>> Zhen Lei (3):
>>>   iommu/iova: Optimise rbtree searching
>>>   iommu/iova: Optimise the padding calculation
>>>   iommu/iova: Make dma_32bit_pfn implicit
>>>
>>>  drivers/gpu/drm/tegra/drm.c  |   3 +-
>>>  drivers/gpu/host1x/dev.c |   3 +-
>>>  drivers/iommu/amd_iommu.c|   7 +--
>>>  drivers/iommu/dma-iommu.c|  18 +--
>>>  drivers/iommu/intel-iommu.c  |  11 ++--
>>>  drivers/iommu/iova.c | 112 
>>> ---
>>>  drivers/misc/mic/scif/scif_rma.c |   3 +-
>>>  include/linux/iova.h |   8 +--
>>>  8 files changed, 60 insertions(+), 105 deletions(-)
>>>
>>
>> These patches look suspiciously like the ones I have been using over
>> the past couple of weeks (modulo the tegra and host1x changes) from
>> your git tree. They work fine on my AMD Overdrive B1, both in DT and
>> in ACPI/IORT modes, although it is difficult to quantify any
>> performance deltas on my setup.
> 
> Indeed - this is a rebase (to account for those new callers) with a
> couple of trivial tweaks to error paths and corner cases that normal
> usage shouldn't have been hitting anyway. "No longer unusably awful" is
> a good enough performance delta for me :)
> 
>> Tested-by: Ard Biesheuvel 
I got the same performance data as with my own version of the patches. It works well.

Tested-by: Zhen Lei 

> 
> Thanks!
> 
> Robin.
> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH 1/1] Input: ims-pcu - fix typo in an error log

2017-11-23 Thread Leizhen (ThunderTown)


On 2017/11/24 15:17, Joe Perches wrote:
> On Fri, 2017-11-24 at 14:59 +0800, Zhen Lei wrote:
>> Tiny typo fixed in an error log.
>>
>> I found this when I backported the CVE-2017-16645 patch:
>> ea04efee7635 ("Input: ims-psu - check if CDC union descriptor is sane")
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  drivers/input/misc/ims-pcu.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/input/misc/ims-pcu.c b/drivers/input/misc/ims-pcu.c
> []
>> @@ -1651,7 +1651,7 @@ static void ims_pcu_buffers_free(struct ims_pcu *pcu)
>>  return union_desc;
>>
>>  dev_err(>dev,
>> -"Union descriptor to short (%d vs %zd\n)",
>> +"Union descriptor too short (%d vs %zd\n)",
> 
> And this format is incorrect too.  It should be:
> 
> + "Union descriptor too short (%d vs %zd)\n",
> 
> with the close parenthesis before the newline, not after.
You are very observant. Do I need to post v2, or can this simply be fixed
up directly when the patch is applied?
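
For reference, the corrected ordering can be checked with a small userspace sketch. format_union_desc_msg() is a hypothetical helper, and snprintf() stands in for dev_err(); the sketch also uses %zu for the size_t argument, whereas the driver line uses %zd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Builds the message with the closing parenthesis before the newline. */
static int format_union_desc_msg(char *buf, size_t len, int got, size_t want)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "Union descriptor too short (%d vs %zu)\n",
			got, want);
}
```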

> 
> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH v2 1/1] mm: only dispaly online cpus of the numa node

2017-10-09 Thread Leizhen (ThunderTown)


On 2017/10/3 21:56, Michal Hocko wrote:
> On Tue 03-10-17 14:47:26, Will Deacon wrote:
>> On Mon, Oct 02, 2017 at 02:54:46PM -0700, Andrew Morton wrote:
>>> On Mon, 2 Oct 2017 11:38:07 +0100 Will Deacon  wrote:
>>>
>>>>> When I executed numactl -H(which read /sys/devices/system/node/nodeX/cpumap
>>>>> and display cpumask_of_node for each node), but I got different result on
>>>>> X86 and arm64. For each numa node, the former only displayed online CPUs,
>>>>> and the latter displayed all possible CPUs. Unfortunately, both Linux
>>>>> documentation and numactl manual have not described it clear.
>>>>>
>>>>> I sent a mail to ask for help, and Michal Hocko replied
>>>>> that he preferred to print online cpus because it doesn't really make much
>>>>> sense to bind anything on offline nodes.
>>>>>
>>>>> Signed-off-by: Zhen Lei 
>>>>> Acked-by: Michal Hocko 
>>>>> ---
>>>>>  drivers/base/node.c | 12 ++--
>>>>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>>>
>>>> Which tree is this intended to go through? I'm happy to take it via arm64,
>>>> but I don't want to tread on anybody's toes in linux-next and it looks like
>>>> there are already queued changes to this file via Andrew's tree.
>>>
>>> I grabbed it.  I suppose there's some small risk of userspace breakage
>>> so I suggest it be a 4.15-rc1 thing?
>>
>> To be honest, I suspect the vast majority (if not all) code that reads this
>> file was developed for x86, so having the same behaviour for arm64 sounds
>> like something we should do ASAP before people try to special case with
>> things like #ifdef __aarch64__.
>>
>> I'd rather have this in 4.14 if possible.
> 
> Agreed!
> 

+1

-- 
Thanks!
BestRegards



Re: [PATCH 1/1] aio: make sure the input "timeout" value is valid

2017-12-13 Thread Leizhen (ThunderTown)


On 2017/12/14 3:31, Matthew Wilcox wrote:
> On Wed, Dec 13, 2017 at 11:27:00AM -0500, Jeff Moyer wrote:
>> Matthew Wilcox  writes:
>>
>>> On Wed, Dec 13, 2017 at 09:42:52PM +0800, Zhen Lei wrote:
>>>> Below information is reported by a lower kernel version, and I saw the
>>>> problem still exist in current version.
>>>
>>> I think you're right, but what an awful interface we have here!
>>> The user must not only fetch it, they must validate it separately?
>>> And if they forget, then userspace is provoking undefined behaviour?  Ugh.
>>> Why not this:
>>
>> Why not go a step further and have get_timespec64 check for validity?
>> I wonder what caller doesn't want that to happen...
I tried this before, but I found that some callers of get_timespec64 pass
the result on to a function that already performs the check. If we do the
check in get_timespec64, it will be duplicated.

For example:
static long do_pselect(int n, fd_set __user *inp, fd_set __user *outp,

	if (get_timespec64(&ts, tsp))
		return -EFAULT;

	to = &end_time;
	if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec))

int poll_select_set_timeout(struct timespec64 *to, time64_t sec, long nsec)
{
	struct timespec64 ts = {.tv_sec = sec, .tv_nsec = nsec};

	if (!timespec64_valid(&ts))
		return -EINVAL;

> 
> There are some which don't today.  I'm hoping Deepa takes this and goes
> off and fixes them all up.
According to my search, only the case I mentioned above would result in a
duplicate check. So if we don't mind the slight performance drop, maybe we
should do the timespec64_valid check in get_timespec64. I can try this in
v2. Otherwise, I will use your method.
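
A rough userspace sketch of what "fetch and validate in one place" could look like. struct ts64, ts64_valid() and ts64_get_checked() are hypothetical names modelled on timespec64_valid(); the struct copy merely stands in for the copy from user space, so this is not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical userspace mirror of struct timespec64. */
struct ts64 {
	int64_t tv_sec;
	long tv_nsec;
};

/* Rejects negative seconds and nanoseconds outside [0, NSEC_PER_SEC). */
static bool ts64_valid(const struct ts64 *ts)
{
	if (ts->tv_sec < 0)
		return false;
	if (ts->tv_nsec < 0 || ts->tv_nsec >= NSEC_PER_SEC)
		return false;
	return true;
}

/* Copy, then validate, so callers cannot forget the check. */
static int ts64_get_checked(struct ts64 *dst, const struct ts64 *src)
{
	assert(dst != NULL && src != NULL);

	*dst = *src;	/* stands in for the copy from user space */
	return ts64_valid(dst) ? 0 : -EINVAL;
}
```

A caller that later validates again, as in the poll_select_set_timeout path above, would then pay for the check twice, which is the duplication in question.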

> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH 2/2] Revert "iommu/arm-smmu-v3: Don't reserve implementation defined register space"

2021-01-20 Thread Leizhen (ThunderTown)



On 2021/1/20 23:02, Robin Murphy wrote:
> On 2021-01-19 01:59, Zhen Lei wrote:
>> This reverts commit 52f3fab0067d6fa9e99c1b7f63265dd48ca76046.
>>
>> This problem has been fixed by another patch. The original method had side
>> effects, it was not mapped to the user-specified resource size. The code
>> will become more complex when ECMDQ is supported later.
> 
> FWIW I don't think that's a significant issue either way - there could be any 
> number of imp-def pages between SMMU page 0 and the ECMDQ control pages, so 
> it will still be logical to map them as another separate thing anyway.

Yes, so now I'm thinking of keeping the reservation of the SMMUv3 register 
resources with the imp-def area excluded, and then using a separate 
devm_ioremap() that covers the entire resource and assigning the result to 
smmu->base.
Otherwise, a base pointer needs to be defined for each separate register 
space, or a function has to be called to convert the offset on every access.
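
To illustrate the alternative I would like to avoid, here is a minimal userspace model of per-region base pointers with an offset conversion on each access. struct model_smmu and model_reg() are hypothetical names and plain byte arrays stand in for the ioremapped regions. This sketch compares with >= so that offset SZ_64K itself falls in page 1, while the helper in the patch compares with >.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_64K 0x10000u

/* Hypothetical model: one mapping per register page. */
struct model_smmu {
	uint8_t *page0;	/* mapping of register page 0 */
	uint8_t *page1;	/* mapping of register page 1 */
};

/* Convert a register offset into a pointer inside the right mapping. */
static uint8_t *model_reg(const struct model_smmu *smmu, size_t offset)
{
	assert(smmu != NULL);

	if (offset >= SZ_64K)
		return smmu->page1 + (offset - SZ_64K);
	return smmu->page0 + offset;
}
```

With a single mapping of the whole resource, every access reduces to smmu->base + offset and no such conversion is needed.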

> 
> Robin.
> 
>> Signed-off-by: Zhen Lei 
>> ---
>>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 32 
>> -
>>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  3 ---
>>   2 files changed, 4 insertions(+), 31 deletions(-)
>>
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> index 8ca7415d785d9bf..477f473842e5272 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> @@ -91,8 +91,9 @@ struct arm_smmu_option_prop {
>>   static inline void __iomem *arm_smmu_page1_fixup(unsigned long offset,
>>    struct arm_smmu_device *smmu)
>>   {
>> -    if (offset > SZ_64K)
>> -    return smmu->page1 + offset - SZ_64K;
>> +    if ((offset > SZ_64K) &&
>> +    (smmu->options & ARM_SMMU_OPT_PAGE0_REGS_ONLY))
>> +    offset -= SZ_64K;
>>     return smmu->base + offset;
>>   }
>> @@ -3486,18 +3487,6 @@ static int arm_smmu_set_bus_ops(struct iommu_ops *ops)
>>   return err;
>>   }
>>   -static void __iomem *arm_smmu_ioremap(struct device *dev, resource_size_t 
>> start,
>> -  resource_size_t size)
>> -{
>> -    struct resource res = {
>> -    .flags = IORESOURCE_MEM,
>> -    .start = start,
>> -    .end = start + size - 1,
>> -    };
>> -
>> -    return devm_ioremap_resource(dev, );
>> -}
>> -
>>   static int arm_smmu_device_probe(struct platform_device *pdev)
>>   {
>>   int irq, ret;
>> @@ -3533,23 +3522,10 @@ static int arm_smmu_device_probe(struct 
>> platform_device *pdev)
>>   }
>>   ioaddr = res->start;
>>   -    /*
>> - * Don't map the IMPLEMENTATION DEFINED regions, since they may contain
>> - * the PMCG registers which are reserved by the PMU driver.
>> - */
>> -    smmu->base = arm_smmu_ioremap(dev, ioaddr, ARM_SMMU_REG_SZ);
>> +    smmu->base = devm_ioremap_resource(dev, res);
>>   if (IS_ERR(smmu->base))
>>   return PTR_ERR(smmu->base);
>>   -    if (arm_smmu_resource_size(smmu) > SZ_64K) {
>> -    smmu->page1 = arm_smmu_ioremap(dev, ioaddr + SZ_64K,
>> -   ARM_SMMU_REG_SZ);
>> -    if (IS_ERR(smmu->page1))
>> -    return PTR_ERR(smmu->page1);
>> -    } else {
>> -    smmu->page1 = smmu->base;
>> -    }
>> -
>>   /* Interrupt lines */
>>     irq = platform_get_irq_byname_optional(pdev, "combined");
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>> index 96c2e9565e00282..0c3090c60840c22 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>> @@ -152,8 +152,6 @@
>>   #define ARM_SMMU_PRIQ_IRQ_CFG1    0xd8
>>   #define ARM_SMMU_PRIQ_IRQ_CFG2    0xdc
>>   -#define ARM_SMMU_REG_SZ    0xe00
>> -
>>   /* Common MSI config fields */
>>   #define MSI_CFG0_ADDR_MASK    GENMASK_ULL(51, 2)
>>   #define MSI_CFG2_SH    GENMASK(5, 4)
>> @@ -584,7 +582,6 @@ struct arm_smmu_strtab_cfg {
>>   struct arm_smmu_device {
>>   struct device    *dev;
>>   void __iomem    *base;
>> -    void __iomem    *page1;
>>     #define ARM_SMMU_FEAT_2_LVL_STRTAB    (1 << 0)
>>   #define ARM_SMMU_FEAT_2_LVL_CDTAB    (1 << 1)
>>
> 
> .
> 



Re: [PATCH 1/2] perf/smmuv3: Don't reserve the register space that overlaps with the SMMUv3

2021-01-20 Thread Leizhen (ThunderTown)



On 2021/1/20 23:54, Robin Murphy wrote:
> On 2021-01-20 14:14, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2021/1/20 21:27, Robin Murphy wrote:
>>> On 2021-01-20 09:26, Leizhen (ThunderTown) wrote:
>>>>
>>>>
>>>> On 2021/1/20 11:37, Leizhen (ThunderTown) wrote:
>>>>>
>>>>>
>>>>> On 2021/1/19 20:32, Robin Murphy wrote:
>>>>>> On 2021-01-19 01:59, Zhen Lei wrote:
>>>>>>> Some SMMUv3 implementation embed the Perf Monitor Group Registers (PMCG)
>>>>>>> inside the first 64kB region of the SMMU. Since SMMU and PMCG are 
>>>>>>> managed
>>>>>>> by two separate drivers, and this driver depends on ARM_SMMU_V3, so the
>>>>>>> SMMU driver reserves the corresponding resource first, this driver 
>>>>>>> should
>>>>>>> not reserve the corresponding resource again. Otherwise, a resource
>>>>>>> reservation conflict is reported during boot.
>>>>>>>
>>>>>>> Signed-off-by: Zhen Lei 
>>>>>>> ---
>>>>>>>     drivers/perf/arm_smmuv3_pmu.c | 42 
>>>>>>> --
>>>>>>>     1 file changed, 40 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/perf/arm_smmuv3_pmu.c 
>>>>>>> b/drivers/perf/arm_smmuv3_pmu.c
>>>>>>> index 74474bb322c3f26..dcce085431c6ce8 100644
>>>>>>> --- a/drivers/perf/arm_smmuv3_pmu.c
>>>>>>> +++ b/drivers/perf/arm_smmuv3_pmu.c
>>>>>>> @@ -761,6 +761,44 @@ static void smmu_pmu_get_acpi_options(struct 
>>>>>>> smmu_pmu *smmu_pmu)
>>>>>>>     dev_notice(smmu_pmu->dev, "option mask 0x%x\n", 
>>>>>>> smmu_pmu->options);
>>>>>>>     }
>>>>>>>     +static void __iomem *
>>>>>>> +smmu_pmu_get_and_ioremap_resource(struct platform_device *pdev,
>>>>>>> +  unsigned int index,
>>>>>>> +  struct resource **out_res)
>>>>>>> +{
>>>>>>> +    int ret;
>>>>>>> +    void __iomem *base;
>>>>>>> +    struct resource *res;
>>>>>>> +
>>>>>>> +    res = platform_get_resource(pdev, IORESOURCE_MEM, index);
>>>>>>> +    if (!res) {
>>>>>>> +    dev_err(>dev, "invalid resource\n");
>>>>>>> +    return IOMEM_ERR_PTR(-EINVAL);
>>>>>>> +    }
>>>>>>> +    if (out_res)
>>>>>>> +    *out_res = res;
>>>>>>> +
>>>>>>> +    ret = region_intersects(res->start, resource_size(res),
>>>>>>> +    IORESOURCE_MEM, IORES_DESC_NONE);
>>>>>>> +    if (ret == REGION_INTERSECTS) {
>>>>>>> +    /*
>>>>>>> + * The resource has already been reserved by the SMMUv3 driver.
>>>>>>> + * Don't reserve it again, just do devm_ioremap().
>>>>>>> + */
>>>>>>> +    base = devm_ioremap(>dev, res->start, 
>>>>>>> resource_size(res));
>>>>>>> +    } else {
>>>>>>> +    /*
>>>>>>> + * The resource may have not been reserved by any driver, or
>>>>>>> + * has been reserved but not type IORESOURCE_MEM. In the latter
>>>>>>> + * case, devm_ioremap_resource() reports a conflict and returns
>>>>>>> + * IOMEM_ERR_PTR(-EBUSY).
>>>>>>> + */
>>>>>>> +    base = devm_ioremap_resource(>dev, res);
>>>>>>> +    }
>>>>>>
>>>>>> What if the PMCG driver simply happens to probe first?
>>>>>
>>>>> There are 4 cases:
>>>>> 1) ARM_SMMU_V3=m, ARM_SMMU_V3_PMU=y
>>>>>  It's not allowed. Becase: ARM_SMMU_V3_PMU depends on ARM_SMMU_V3
>>>>>  config ARM_SMMU_V3_PMU
>>>>>    tristate "ARM SMMUv3 Performance Monitors Extension"
>>>>>    depends on ARM64 && ACPI && ARM_SMMU_V3
>>>>>
>>>>> 2) ARM

Re: [PATCH 1/1] iommu/arm-smmu-v3: add support for BBML

2021-01-23 Thread Leizhen (ThunderTown)



On 2021/1/22 20:51, Will Deacon wrote:
> On Thu, Nov 26, 2020 at 11:42:30AM +0800, Zhen Lei wrote:
>> When changing from a set of pages/smaller blocks to a larger block for an
>> address, the software should follow the sequence of BBML processing.
>>
>> When changing from a block to a set of pages/smaller blocks for an
>> address, there's no need to use nT bit. If an address in the large block
>> is accessed before page table switching, the TLB caches the large block
>> mapping. After the page table is switched and before TLB invalidation
>> finished, new access requests are still based on large block mapping.
>> After the block or page is invalidated, the system reads the small block
>> or page mapping from the memory; If the address in the large block is not
>> accessed before page table switching, the TLB has no cache. After the
>> page table is switched, a new access is initiated to read the small block
>> or page mapping from the memory.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  2 +
>>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 +
>>  drivers/iommu/io-pgtable-arm.c  | 46 -
>>  include/linux/io-pgtable.h  |  1 +
>>  4 files changed, 40 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> index e634bbe60573..14a1a11565fb 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> @@ -1977,6 +1977,7 @@ static int arm_smmu_domain_finalise(struct 
>> iommu_domain *domain,
>>  .coherent_walk  = smmu->features & ARM_SMMU_FEAT_COHERENCY,
>>  .tlb= _smmu_flush_ops,
>>  .iommu_dev  = smmu->dev,
>> +.bbml   = smmu->bbml,
>>  };
>>  
>>  if (smmu_domain->non_strict)
>> @@ -3291,6 +3292,7 @@ static int arm_smmu_device_hw_probe(struct 
>> arm_smmu_device *smmu)
>>  
>>  /* IDR3 */
>>  reg = readl_relaxed(smmu->base + ARM_SMMU_IDR3);
>> +smmu->bbml = FIELD_GET(IDR3_BBML, reg);
>>  if (FIELD_GET(IDR3_RIL, reg))
>>  smmu->features |= ARM_SMMU_FEAT_RANGE_INV;
>>  
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>> index d4b7f40ccb02..aa7eb460fa09 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>> @@ -51,6 +51,7 @@
>>  #define IDR1_SIDSIZE    GENMASK(5, 0)
>>  
>>  #define ARM_SMMU_IDR3   0xc
>> +#define IDR3_BBML   GENMASK(12, 11)
>>  #define IDR3_RIL    (1 << 10)
>>  
>>  #define ARM_SMMU_IDR5   0x14
>> @@ -617,6 +618,7 @@ struct arm_smmu_device {
>>  
>>  int gerr_irq;
>>  int combined_irq;
>> +int bbml;
>>  
>>  unsigned long   ias; /* IPA */
>>  unsigned long   oas; /* PA */
>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>> index a7a9bc08dcd1..341581337ad0 100644
>> --- a/drivers/iommu/io-pgtable-arm.c
>> +++ b/drivers/iommu/io-pgtable-arm.c
>> @@ -72,6 +72,7 @@
>>  
>>  #define ARM_LPAE_PTE_NSTABLE    (((arm_lpae_iopte)1) << 63)
>>  #define ARM_LPAE_PTE_XN (((arm_lpae_iopte)3) << 53)
>> +#define ARM_LPAE_PTE_nT (((arm_lpae_iopte)1) << 16)
>>  #define ARM_LPAE_PTE_AF (((arm_lpae_iopte)1) << 10)
>>  #define ARM_LPAE_PTE_SH_NS  (((arm_lpae_iopte)0) << 8)
>>  #define ARM_LPAE_PTE_SH_OS  (((arm_lpae_iopte)2) << 8)
>> @@ -255,7 +256,7 @@ static size_t __arm_lpae_unmap(struct 
>> arm_lpae_io_pgtable *data,
>>  
>>  static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
>>  phys_addr_t paddr, arm_lpae_iopte prot,
>> -int lvl, arm_lpae_iopte *ptep)
>> +int lvl, arm_lpae_iopte *ptep, arm_lpae_iopte 
>> nT)
>>  {
>>  arm_lpae_iopte pte = prot;
>>  
>> @@ -265,37 +266,60 @@ static void __arm_lpae_init_pte(struct 
>> arm_lpae_io_pgtable *data,
>>  pte |= ARM_LPAE_PTE_TYPE_BLOCK;
>>  
>>  pte |= paddr_to_iopte(paddr, data);
>> +pte |= nT;
>>  
>>  __arm_lpae_set_pte(ptep, pte, >iop.cfg);
>>  }
>>  
>> +static void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, int 
>> lvl,
>> +arm_lpae_iopte *ptep);
>>  static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
>>   unsigned long iova, phys_addr_t paddr,
>>   arm_lpae_iopte prot, int lvl,
>>   arm_lpae_iopte *ptep)
>>  {
>>  arm_lpae_iopte pte = *ptep;
>> +struct io_pgtable_cfg *cfg = >iop.cfg;
>>  
>>  if 

Re: [PATCH 1/1] iommu/arm-smmu-v3: add support for BBML

2021-01-23 Thread Leizhen (ThunderTown)



On 2021/1/22 21:00, Robin Murphy wrote:
> On 2021-01-22 12:51, Will Deacon wrote:
>> On Thu, Nov 26, 2020 at 11:42:30AM +0800, Zhen Lei wrote:
>>> When changing from a set of pages/smaller blocks to a larger block for an
>>> address, the software should follow the sequence of BBML processing.
>>>
>>> When changing from a block to a set of pages/smaller blocks for an
>>> address, there's no need to use nT bit. If an address in the large block
>>> is accessed before page table switching, the TLB caches the large block
>>> mapping. After the page table is switched and before TLB invalidation
>>> finished, new access requests are still based on large block mapping.
>>> After the block or page is invalidated, the system reads the small block
>>> or page mapping from the memory; If the address in the large block is not
>>> accessed before page table switching, the TLB has no cache. After the
>>> page table is switched, a new access is initiated to read the small block
>>> or page mapping from the memory.
>>>
>>> Signed-off-by: Zhen Lei 
>>> ---
>>>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  2 +
>>>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 +
>>>   drivers/iommu/io-pgtable-arm.c  | 46 -
>>>   include/linux/io-pgtable.h  |  1 +
>>>   4 files changed, 40 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
>>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> index e634bbe60573..14a1a11565fb 100644
>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> @@ -1977,6 +1977,7 @@ static int arm_smmu_domain_finalise(struct 
>>> iommu_domain *domain,
>>>   .coherent_walk    = smmu->features & ARM_SMMU_FEAT_COHERENCY,
>>>   .tlb    = _smmu_flush_ops,
>>>   .iommu_dev    = smmu->dev,
>>> +    .bbml    = smmu->bbml,
>>>   };
>>>     if (smmu_domain->non_strict)
>>> @@ -3291,6 +3292,7 @@ static int arm_smmu_device_hw_probe(struct 
>>> arm_smmu_device *smmu)
>>>     /* IDR3 */
>>>   reg = readl_relaxed(smmu->base + ARM_SMMU_IDR3);
>>> +    smmu->bbml = FIELD_GET(IDR3_BBML, reg);
>>>   if (FIELD_GET(IDR3_RIL, reg))
>>>   smmu->features |= ARM_SMMU_FEAT_RANGE_INV;
>>>   diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
>>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>> index d4b7f40ccb02..aa7eb460fa09 100644
>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>> @@ -51,6 +51,7 @@
>>>   #define IDR1_SIDSIZE    GENMASK(5, 0)
>>>     #define ARM_SMMU_IDR3    0xc
>>> +#define IDR3_BBML    GENMASK(12, 11)
>>>   #define IDR3_RIL    (1 << 10)
>>>     #define ARM_SMMU_IDR5    0x14
>>> @@ -617,6 +618,7 @@ struct arm_smmu_device {
>>>     int    gerr_irq;
>>>   int    combined_irq;
>>> +    int    bbml;
>>>     unsigned long    ias; /* IPA */
>>>   unsigned long    oas; /* PA */
>>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>>> index a7a9bc08dcd1..341581337ad0 100644
>>> --- a/drivers/iommu/io-pgtable-arm.c
>>> +++ b/drivers/iommu/io-pgtable-arm.c
>>> @@ -72,6 +72,7 @@
>>>     #define ARM_LPAE_PTE_NSTABLE    (((arm_lpae_iopte)1) << 63)
>>>   #define ARM_LPAE_PTE_XN    (((arm_lpae_iopte)3) << 53)
>>> +#define ARM_LPAE_PTE_nT    (((arm_lpae_iopte)1) << 16)
>>>   #define ARM_LPAE_PTE_AF    (((arm_lpae_iopte)1) << 10)
>>>   #define ARM_LPAE_PTE_SH_NS    (((arm_lpae_iopte)0) << 8)
>>>   #define ARM_LPAE_PTE_SH_OS    (((arm_lpae_iopte)2) << 8)
>>> @@ -255,7 +256,7 @@ static size_t __arm_lpae_unmap(struct 
>>> arm_lpae_io_pgtable *data,
>>>     static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
>>>   phys_addr_t paddr, arm_lpae_iopte prot,
>>> -    int lvl, arm_lpae_iopte *ptep)
>>> +    int lvl, arm_lpae_iopte *ptep, arm_lpae_iopte nT)
>>>   {
>>>   arm_lpae_iopte pte = prot;
>>>   @@ -265,37 +266,60 @@ static void __arm_lpae_init_pte(struct 
>>> arm_lpae_io_pgtable *data,
>>>   pte |= ARM_LPAE_PTE_TYPE_BLOCK;
>>>     pte |= paddr_to_iopte(paddr, data);
>>> +    pte |= nT;
>>>     __arm_lpae_set_pte(ptep, pte, >iop.cfg);
>>>   }
>>>   +static void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, 
>>> int lvl,
>>> +    arm_lpae_iopte *ptep);
>>>   static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
>>>    unsigned long iova, phys_addr_t paddr,
>>>    arm_lpae_iopte prot, int lvl,
>>>    arm_lpae_iopte *ptep)
>>>   {
>>>   arm_lpae_iopte pte = *ptep;
>>> +    struct io_pgtable_cfg *cfg = &data->iop.cfg;
>>>     if (iopte_leaf(pte, lvl, data->iop.fmt)) {
>>>   /* We require an unmap first */
>>>   

Re: [PATCH 2/2] Revert "iommu/arm-smmu-v3: Don't reserve implementation defined register space"

2021-01-21 Thread Leizhen (ThunderTown)



On 2021/1/21 20:50, Robin Murphy wrote:
> On 2021-01-21 02:04, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2021/1/20 23:02, Robin Murphy wrote:
>>> On 2021-01-19 01:59, Zhen Lei wrote:
>>>> This reverts commit 52f3fab0067d6fa9e99c1b7f63265dd48ca76046.
>>>>
>>>> This problem has been fixed by another patch. The original method had side
>>>> effects, it was not mapped to the user-specified resource size. The code
>>>> will become more complex when ECMDQ is supported later.
>>>
>>> FWIW I don't think that's a significant issue either way - there could be 
>>> any number of imp-def pages between SMMU page 0 and the ECMDQ control 
>>> pages, so it will still be logical to map them as another separate thing 
>>> anyway.
>>
>> Yes, so now I'm thinking of preserving the SMMUv3 resources and eliminating 
>> the imp-def area. Then use another devm_ioremap() to cover the entire 
>> resource, and assign it to smmu->base.
>> Otherwise, a base pointer needs to be defined for each separate register 
>> space, or a function has to be called to convert the offset each time.
> 
> But we'll almost certainly want to maintain a pointer to start of the ECMDQ 
> control page block anyway, since that's not fixed relative to smmu->base. 
> Therefore what's the harm in handling that via a dedicated mapping, once 
> we've determined that we *do* intend to use ECMDQs? Otherwise we end up 
> in the complicated dance of trying to map "everything" up-front in order to 
> be able to read the ID registers to determine what the actual extent of 
> "everything" is supposed to be.

Currently, we only map the first 0xe00 bytes, so the
SMMU_CMDQ_CONTROL_PAGE_XXXn register space at offset 0x4000 would have to be
mapped separately.
The size of this ECMDQ resource is not fixed; it depends on
SMMU_IDR6.CMDQ_CONTROL_PAGE_LOG2NUMQ.
Reserving this resource without conflicting with the PMCG is a bit more
complicated.
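
To make the point concrete, here is a minimal userspace sketch (not driver
code) of why the ECMDQ control registers need a mapping of their own. The
0xe00 and 0x4000 values are the ones mentioned above; struct window,
smmu_windows and offset_is_mapped() are hypothetical names used only for
this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One ioremap()ed window, expressed as offsets from the SMMU base. */
struct window {
	uint64_t start;	/* first mapped offset */
	uint64_t size;	/* number of mapped bytes */
};

/* Return true if [offset, offset + len) lies fully inside one window. */
static bool offset_is_mapped(const struct window *w, size_t n,
			     uint64_t offset, uint64_t len)
{
	assert(w != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (offset >= w[i].start &&
		    offset + len <= w[i].start + w[i].size)
			return true;
	}
	return false;
}

/* Assumption for this sketch: only the first 0xe00 bytes are mapped. */
static const struct window smmu_windows[] = {
	{ .start = 0x0, .size = 0xe00 },
};
```

A 4-byte register at offset 0x4000 is not covered by the single 0xe00
window, so it would need either a second window or a larger first one.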

> 
> (also this reminds me that I was going to remove arm_smmu_page1_fixup() 
> entirely - I'd totally forgotten about that...)

Ah, that patch you made is so clever.

> 
> Robin.
> 
>>>> Signed-off-by: Zhen Lei 
>>>> ---
>>>>    drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 32 
>>>> -
>>>>    drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  3 ---
>>>>    2 files changed, 4 insertions(+), 31 deletions(-)
>>>>
>>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
>>>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>> index 8ca7415d785d9bf..477f473842e5272 100644
>>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>> @@ -91,8 +91,9 @@ struct arm_smmu_option_prop {
>>>>    static inline void __iomem *arm_smmu_page1_fixup(unsigned long offset,
>>>>     struct arm_smmu_device *smmu)
>>>>    {
>>>> -    if (offset > SZ_64K)
>>>> -    return smmu->page1 + offset - SZ_64K;
>>>> +    if ((offset > SZ_64K) &&
>>>> +    (smmu->options & ARM_SMMU_OPT_PAGE0_REGS_ONLY))
>>>> +    offset -= SZ_64K;
>>>>      return smmu->base + offset;
>>>>    }
>>>> @@ -3486,18 +3487,6 @@ static int arm_smmu_set_bus_ops(struct iommu_ops 
>>>> *ops)
>>>>    return err;
>>>>    }
>>>>    -static void __iomem *arm_smmu_ioremap(struct device *dev, 
>>>> resource_size_t start,
>>>> -  resource_size_t size)
>>>> -{
>>>> -    struct resource res = {
>>>> -    .flags = IORESOURCE_MEM,
>>>> -    .start = start,
>>>> -    .end = start + size - 1,
>>>> -    };
>>>> -
>>>> -    return devm_ioremap_resource(dev, &res);
>>>> -}
>>>> -
>>>>    static int arm_smmu_device_probe(struct platform_device *pdev)
>>>>    {
>>>>    int irq, ret;
>>>> @@ -3533,23 +3522,10 @@ static int arm_smmu_device_probe(struct 
>>>> platform_device *pdev)
>>>>    }
>>>>    ioaddr = res->start;
>>>>    -    /*
>>>> - * Don't map the IMPLEMENTATION DEFINED regions, since they may 
>>>> contain
>>>> - * the PMCG registers which are reserved by the PMU driver.
>>>> - */
>>>> -    smmu->base = arm_smmu_ioremap(dev, ioaddr, ARM_SMM

Re: [PATCH 1/1] iommu/arm-smmu-v3: add support for BBML

2021-01-22 Thread Leizhen (ThunderTown)



On 2021/1/22 21:00, Robin Murphy wrote:
> On 2021-01-22 12:51, Will Deacon wrote:
>> On Thu, Nov 26, 2020 at 11:42:30AM +0800, Zhen Lei wrote:
>>> When changing from a set of pages/smaller blocks to a larger block for an
>>> address, the software should follow the sequence of BBML processing.
>>>
>>> When changing from a block to a set of pages/smaller blocks for an
>>> address, there's no need to use nT bit. If an address in the large block
>>> is accessed before page table switching, the TLB caches the large block
>>> mapping. After the page table is switched and before TLB invalidation
>>> finished, new access requests are still based on large block mapping.
>>> After the block or page is invalidated, the system reads the small block
>>> or page mapping from the memory; If the address in the large block is not
>>> accessed before page table switching, the TLB has no cache. After the
>>> page table is switched, a new access is initiated to read the small block
>>> or page mapping from the memory.
>>>
>>> Signed-off-by: Zhen Lei 
>>> ---
>>>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  2 +
>>>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 +
>>>   drivers/iommu/io-pgtable-arm.c  | 46 -
>>>   include/linux/io-pgtable.h  |  1 +
>>>   4 files changed, 40 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
>>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> index e634bbe60573..14a1a11565fb 100644
>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> @@ -1977,6 +1977,7 @@ static int arm_smmu_domain_finalise(struct 
>>> iommu_domain *domain,
>>>   .coherent_walk    = smmu->features & ARM_SMMU_FEAT_COHERENCY,
>>>   .tlb    = &arm_smmu_flush_ops,
>>>   .iommu_dev    = smmu->dev,
>>> +    .bbml    = smmu->bbml,
>>>   };
>>>     if (smmu_domain->non_strict)
>>> @@ -3291,6 +3292,7 @@ static int arm_smmu_device_hw_probe(struct 
>>> arm_smmu_device *smmu)
>>>     /* IDR3 */
>>>   reg = readl_relaxed(smmu->base + ARM_SMMU_IDR3);
>>> +    smmu->bbml = FIELD_GET(IDR3_BBML, reg);
>>>   if (FIELD_GET(IDR3_RIL, reg))
>>>   smmu->features |= ARM_SMMU_FEAT_RANGE_INV;
>>>   diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
>>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>> index d4b7f40ccb02..aa7eb460fa09 100644
>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>> @@ -51,6 +51,7 @@
>>>   #define IDR1_SIDSIZE    GENMASK(5, 0)
>>>     #define ARM_SMMU_IDR3    0xc
>>> +#define IDR3_BBML    GENMASK(12, 11)
>>>   #define IDR3_RIL    (1 << 10)
>>>     #define ARM_SMMU_IDR5    0x14
>>> @@ -617,6 +618,7 @@ struct arm_smmu_device {
>>>     int    gerr_irq;
>>>   int    combined_irq;
>>> +    int    bbml;
>>>     unsigned long    ias; /* IPA */
>>>   unsigned long    oas; /* PA */
>>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>>> index a7a9bc08dcd1..341581337ad0 100644
>>> --- a/drivers/iommu/io-pgtable-arm.c
>>> +++ b/drivers/iommu/io-pgtable-arm.c
>>> @@ -72,6 +72,7 @@
>>>     #define ARM_LPAE_PTE_NSTABLE    (((arm_lpae_iopte)1) << 63)
>>>   #define ARM_LPAE_PTE_XN    (((arm_lpae_iopte)3) << 53)
>>> +#define ARM_LPAE_PTE_nT    (((arm_lpae_iopte)1) << 16)
>>>   #define ARM_LPAE_PTE_AF    (((arm_lpae_iopte)1) << 10)
>>>   #define ARM_LPAE_PTE_SH_NS    (((arm_lpae_iopte)0) << 8)
>>>   #define ARM_LPAE_PTE_SH_OS    (((arm_lpae_iopte)2) << 8)
>>> @@ -255,7 +256,7 @@ static size_t __arm_lpae_unmap(struct 
>>> arm_lpae_io_pgtable *data,
>>>     static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
>>>   phys_addr_t paddr, arm_lpae_iopte prot,
>>> -    int lvl, arm_lpae_iopte *ptep)
>>> +    int lvl, arm_lpae_iopte *ptep, arm_lpae_iopte nT)
>>>   {
>>>   arm_lpae_iopte pte = prot;
>>>   @@ -265,37 +266,60 @@ static void __arm_lpae_init_pte(struct 
>>> arm_lpae_io_pgtable *data,
>>>   pte |= ARM_LPAE_PTE_TYPE_BLOCK;
>>>     pte |= paddr_to_iopte(paddr, data);
>>> +    pte |= nT;
>>>     __arm_lpae_set_pte(ptep, pte, &data->iop.cfg);
>>>   }
>>>   +static void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, 
>>> int lvl,
>>> +    arm_lpae_iopte *ptep);
>>>   static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
>>>    unsigned long iova, phys_addr_t paddr,
>>>    arm_lpae_iopte prot, int lvl,
>>>    arm_lpae_iopte *ptep)
>>>   {
>>>   arm_lpae_iopte pte = *ptep;
>>> +    struct io_pgtable_cfg *cfg = &data->iop.cfg;
>>>     if (iopte_leaf(pte, lvl, data->iop.fmt)) {
>>>   /* We require an unmap first */
>>>   

Re: [PATCH 1/2] perf/smmuv3: Don't reserve the register space that overlaps with the SMMUv3

2021-01-19 Thread Leizhen (ThunderTown)



On 2021/1/19 20:32, Robin Murphy wrote:
> On 2021-01-19 01:59, Zhen Lei wrote:
>> Some SMMUv3 implementation embed the Perf Monitor Group Registers (PMCG)
>> inside the first 64kB region of the SMMU. Since SMMU and PMCG are managed
>> by two separate drivers, and this driver depends on ARM_SMMU_V3, so the
>> SMMU driver reserves the corresponding resource first, this driver should
>> not reserve the corresponding resource again. Otherwise, a resource
>> reservation conflict is reported during boot.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>   drivers/perf/arm_smmuv3_pmu.c | 42 
>> --
>>   1 file changed, 40 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c
>> index 74474bb322c3f26..dcce085431c6ce8 100644
>> --- a/drivers/perf/arm_smmuv3_pmu.c
>> +++ b/drivers/perf/arm_smmuv3_pmu.c
>> @@ -761,6 +761,44 @@ static void smmu_pmu_get_acpi_options(struct smmu_pmu 
>> *smmu_pmu)
>>   dev_notice(smmu_pmu->dev, "option mask 0x%x\n", smmu_pmu->options);
>>   }
>>   +static void __iomem *
>> +smmu_pmu_get_and_ioremap_resource(struct platform_device *pdev,
>> +  unsigned int index,
>> +  struct resource **out_res)
>> +{
>> +    int ret;
>> +    void __iomem *base;
>> +    struct resource *res;
>> +
>> +    res = platform_get_resource(pdev, IORESOURCE_MEM, index);
>> +    if (!res) {
>> +    dev_err(&pdev->dev, "invalid resource\n");
>> +    return IOMEM_ERR_PTR(-EINVAL);
>> +    }
>> +    if (out_res)
>> +    *out_res = res;
>> +
>> +    ret = region_intersects(res->start, resource_size(res),
>> +    IORESOURCE_MEM, IORES_DESC_NONE);
>> +    if (ret == REGION_INTERSECTS) {
>> +    /*
>> + * The resource has already been reserved by the SMMUv3 driver.
>> + * Don't reserve it again, just do devm_ioremap().
>> + */
>> +    base = devm_ioremap(&pdev->dev, res->start, resource_size(res));
>> +    } else {
>> +    /*
>> + * The resource may have not been reserved by any driver, or
>> + * has been reserved but not type IORESOURCE_MEM. In the latter
>> + * case, devm_ioremap_resource() reports a conflict and returns
>> + * IOMEM_ERR_PTR(-EBUSY).
>> + */
>> +    base = devm_ioremap_resource(&pdev->dev, res);
>> +    }
> 
> What if the PMCG driver simply happens to probe first?

There are 4 cases:
1) ARM_SMMU_V3=m, ARM_SMMU_V3_PMU=y
   This is not allowed, because ARM_SMMU_V3_PMU depends on ARM_SMMU_V3:
   config ARM_SMMU_V3_PMU
     tristate "ARM SMMUv3 Performance Monitors Extension"
     depends on ARM64 && ACPI && ARM_SMMU_V3

2) ARM_SMMU_V3=y, ARM_SMMU_V3_PMU=m
   No problem, SMMUv3 will be initialized first.

3) ARM_SMMU_V3=y, ARM_SMMU_V3_PMU=y
   vi drivers/Makefile
   60 obj-y   += iommu/
   172 obj-$(CONFIG_PERF_EVENTS)   += perf/

   The two drivers are currently at the same initialization level, so this
   link order ensures that the SMMUv3 driver is initialized first.

4) ARM_SMMU_V3=m, ARM_SMMU_V3_PMU=m
   Sorry, I thought module dependencies were generated based on "depends on".
   But I tried it today: module dependencies are generated only when symbol
   dependencies exist. I should use MODULE_SOFTDEP() to explicitly mark the
   dependency. I will send V2 later.
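
As a rough illustration of why case 1 cannot be configured while the other
three can, here is a small userspace model of the Kconfig tristate rule. It
is only a sketch: enum tristate, tri_and() and pmu_value_allowed() are
hypothetical names, and the model covers the "depends on" limit only, not
the module load order that case 4 is about.

```c
#include <assert.h>
#include <stdbool.h>

/* Kconfig tristate values, ordered so that n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* In Kconfig, "A && B" evaluates to the minimum of its operands. */
static enum tristate tri_and(enum tristate a, enum tristate b)
{
	return a < b ? a : b;
}

/*
 * A tristate symbol may not be set higher than the value of its
 * "depends on" expression. ARM64 and ACPI are bool symbols, so they
 * can only be n or y.
 */
static bool pmu_value_allowed(enum tristate pmu, enum tristate arm64,
			      enum tristate acpi, enum tristate smmu_v3)
{
	enum tristate limit = tri_and(tri_and(arm64, acpi), smmu_v3);

	assert(arm64 != TRI_M && acpi != TRI_M);
	return pmu <= limit;
}
```

With ARM64=y and ACPI=y, the limit is simply the value of ARM_SMMU_V3, so
ARM_SMMU_V3_PMU=y is rejected only when ARM_SMMU_V3=m.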


> 
> Robin.
> 
>> +
>> +    return base;
>> +}
>> +
>>   static int smmu_pmu_probe(struct platform_device *pdev)
>>   {
>>   struct smmu_pmu *smmu_pmu;
>> @@ -793,7 +831,7 @@ static int smmu_pmu_probe(struct platform_device *pdev)
>>   .capabilities    = PERF_PMU_CAP_NO_EXCLUDE,
>>   };
>>   -    smmu_pmu->reg_base = devm_platform_get_and_ioremap_resource(pdev, 0, 
>> &res_0);
>> +    smmu_pmu->reg_base = smmu_pmu_get_and_ioremap_resource(pdev, 0, &res_0);
>>   if (IS_ERR(smmu_pmu->reg_base))
>>   return PTR_ERR(smmu_pmu->reg_base);
>>   @@ -801,7 +839,7 @@ static int smmu_pmu_probe(struct platform_device *pdev)
>>     /* Determine if page 1 is present */
>>   if (cfgr & SMMU_PMCG_CFGR_RELOC_CTRS) {
>> -    smmu_pmu->reloc_base = devm_platform_ioremap_resource(pdev, 1);
>> +    smmu_pmu->reloc_base = smmu_pmu_get_and_ioremap_resource(pdev, 1, 
>> NULL);
>>   if (IS_ERR(smmu_pmu->reloc_base))
>>   return PTR_ERR(smmu_pmu->reloc_base);
>>   } else {
>>
> 
> .
> 



Re: [PATCH 1/2] perf/smmuv3: Don't reserve the register space that overlaps with the SMMUv3

2021-01-20 Thread Leizhen (ThunderTown)



On 2021/1/20 11:37, Leizhen (ThunderTown) wrote:
> 
> 
> On 2021/1/19 20:32, Robin Murphy wrote:
>> On 2021-01-19 01:59, Zhen Lei wrote:
>>> Some SMMUv3 implementation embed the Perf Monitor Group Registers (PMCG)
>>> inside the first 64kB region of the SMMU. Since SMMU and PMCG are managed
>>> by two separate drivers, and this driver depends on ARM_SMMU_V3, so the
>>> SMMU driver reserves the corresponding resource first, this driver should
>>> not reserve the corresponding resource again. Otherwise, a resource
>>> reservation conflict is reported during boot.
>>>
>>> Signed-off-by: Zhen Lei 
>>> ---
>>>   drivers/perf/arm_smmuv3_pmu.c | 42 
>>> --
>>>   1 file changed, 40 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c
>>> index 74474bb322c3f26..dcce085431c6ce8 100644
>>> --- a/drivers/perf/arm_smmuv3_pmu.c
>>> +++ b/drivers/perf/arm_smmuv3_pmu.c
>>> @@ -761,6 +761,44 @@ static void smmu_pmu_get_acpi_options(struct smmu_pmu 
>>> *smmu_pmu)
>>>   dev_notice(smmu_pmu->dev, "option mask 0x%x\n", smmu_pmu->options);
>>>   }
>>>   +static void __iomem *
>>> +smmu_pmu_get_and_ioremap_resource(struct platform_device *pdev,
>>> +  unsigned int index,
>>> +  struct resource **out_res)
>>> +{
>>> +    int ret;
>>> +    void __iomem *base;
>>> +    struct resource *res;
>>> +
>>> +    res = platform_get_resource(pdev, IORESOURCE_MEM, index);
>>> +    if (!res) {
>>> +    dev_err(&pdev->dev, "invalid resource\n");
>>> +    return IOMEM_ERR_PTR(-EINVAL);
>>> +    }
>>> +    if (out_res)
>>> +    *out_res = res;
>>> +
>>> +    ret = region_intersects(res->start, resource_size(res),
>>> +    IORESOURCE_MEM, IORES_DESC_NONE);
>>> +    if (ret == REGION_INTERSECTS) {
>>> +    /*
>>> + * The resource has already been reserved by the SMMUv3 driver.
>>> + * Don't reserve it again, just do devm_ioremap().
>>> + */
>>> +    base = devm_ioremap(&pdev->dev, res->start, resource_size(res));
>>> +    } else {
>>> +    /*
>>> + * The resource may have not been reserved by any driver, or
>>> + * has been reserved but not type IORESOURCE_MEM. In the latter
>>> + * case, devm_ioremap_resource() reports a conflict and returns
>>> + * IOMEM_ERR_PTR(-EBUSY).
>>> + */
>>> +    base = devm_ioremap_resource(&pdev->dev, res);
>>> +    }
>>
>> What if the PMCG driver simply happens to probe first?
> 
> There are 4 cases:
> 1) ARM_SMMU_V3=m, ARM_SMMU_V3_PMU=y
>    This is not allowed, because ARM_SMMU_V3_PMU depends on ARM_SMMU_V3:
>    config ARM_SMMU_V3_PMU
>      tristate "ARM SMMUv3 Performance Monitors Extension"
>      depends on ARM64 && ACPI && ARM_SMMU_V3
> 
> 2) ARM_SMMU_V3=y, ARM_SMMU_V3_PMU=m
>    No problem, SMMUv3 will be initialized first.
> 
> 3) ARM_SMMU_V3=y, ARM_SMMU_V3_PMU=y
>    vi drivers/Makefile
>    60 obj-y   += iommu/
>    172 obj-$(CONFIG_PERF_EVENTS)   += perf/
> 
>    The two drivers are currently at the same initialization level, so this
>    link order ensures that the SMMUv3 driver is initialized first.
> 
> 4) ARM_SMMU_V3=m, ARM_SMMU_V3_PMU=m
>    Sorry, I thought module dependencies were generated based on "depends on".
>    But I tried it today: module dependencies are generated only when symbol
>    dependencies exist. I should use MODULE_SOFTDEP() to explicitly mark the
>    dependency. I will send V2 later.
> 

Hi Robin:
  I think I misunderstood your question. It is probe(), not module_init(),
that determines when the register space resources are reserved. So, to cope
with the PMCG probe() running first, we'd better reserve multiple small blocks
of resources in SMMUv3 but perform ioremap() for the entire resource.
  I'll refine these patches to make both initialization sequences work well.
I'm trying to send V2 this week.
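
To show why reserving small blocks makes the probe order irrelevant, here is
a minimal userspace sketch of a resource table that rejects overlapping
claims. It is not kernel code: claim_region() only mimics the conflict check
behind the -EBUSY case, and SMMU_BASE, PMCG_BASE and PMCG_SIZE are made-up
example addresses. The 0xe00 block size and the 64K page offset are the
values that appear elsewhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CLAIMS 8

/* A claimed physical range [start, end], both ends inclusive. */
struct claim {
	uint64_t start;
	uint64_t end;
};

struct claim_table {
	struct claim c[MAX_CLAIMS];
	size_t n;
};

/* Claim a range; fail if it overlaps any earlier claim (think -EBUSY). */
static bool claim_region(struct claim_table *t, uint64_t start, uint64_t size)
{
	uint64_t end = start + size - 1;

	assert(size > 0 && t->n < MAX_CLAIMS);
	for (size_t i = 0; i < t->n; i++) {
		if (start <= t->c[i].end && t->c[i].start <= end)
			return false;
	}
	t->c[t->n].start = start;
	t->c[t->n].end = end;
	t->n++;
	return true;
}

/* Hypothetical layout: PMCG embedded in SMMU page 0 at offset 0x2000. */
#define SMMU_BASE	0x2b400000ULL
#define SMMU_REG_SZ	0xe00ULL	/* architected registers per page */
#define SZ_64K		0x10000ULL
#define PMCG_BASE	(SMMU_BASE + 0x2000ULL)
#define PMCG_SIZE	0x1000ULL

/* The SMMU claims only the two small architected register blocks. */
static bool smmu_claim_small(struct claim_table *t)
{
	return claim_region(t, SMMU_BASE, SMMU_REG_SZ) &&
	       claim_region(t, SMMU_BASE + SZ_64K, SMMU_REG_SZ);
}
```

With the small blocks, both claim orders succeed; if the SMMU claimed the
whole 128K instead, a later PMCG claim inside it would be rejected.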

> 
>>
>> Robin.
>>
>>> +
>>> +    return base;
>>> +}
>>> +
>>>   static int smmu_pmu_probe(struct platform_device *pdev)
>>>   {
>>>   struct smmu_pmu *smmu_pmu;
>>> @@ -793,7 +831,7 @@ static int smmu_pmu_probe(struct platform_device *pdev)
>>>   .capabilities    = PERF_PMU_CAP_NO_EXCLUDE,
>>>   };
>>>   -    smmu_pmu->reg_base = devm_platform_get_and_ioremap_resource(pdev, 0, 
>>> &res_0);
>>> +    smmu_pmu->reg_base = smmu_pmu_get_and_ioremap_resource(pdev, 0, 
>>> &res_0);
>>>   if (IS_ERR(smmu_pmu->reg_base))
>>>   return PTR_ERR(smmu_pmu->reg_base);
>>>   @@ -801,7 +839,7 @@ static int smmu_pmu_probe(struct platform_device 
>>> *pdev)
>>>     /* Determine if page 1 is present */
>>>   if (cfgr & SMMU_PMCG_CFGR_RELOC_CTRS) {
>>> -    smmu_pmu->reloc_base = devm_platform_ioremap_resource(pdev, 1);
>>> +    smmu_pmu->reloc_base = smmu_pmu_get_and_ioremap_resource(pdev, 1, 
>>> NULL);
>>>   if (IS_ERR(smmu_pmu->reloc_base))
>>>   return PTR_ERR(smmu_pmu->reloc_base);
>>>   } else {
>>>
>>
>> .
>>



Re: [PATCH 1/2] perf/smmuv3: Don't reserve the register space that overlaps with the SMMUv3

2021-01-20 Thread Leizhen (ThunderTown)



On 2021/1/20 21:27, Robin Murphy wrote:
> On 2021-01-20 09:26, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2021/1/20 11:37, Leizhen (ThunderTown) wrote:
>>>
>>>
>>> On 2021/1/19 20:32, Robin Murphy wrote:
>>>> On 2021-01-19 01:59, Zhen Lei wrote:
>>>>> Some SMMUv3 implementation embed the Perf Monitor Group Registers (PMCG)
>>>>> inside the first 64kB region of the SMMU. Since SMMU and PMCG are managed
>>>>> by two separate drivers, and this driver depends on ARM_SMMU_V3, so the
>>>>> SMMU driver reserves the corresponding resource first, this driver should
>>>>> not reserve the corresponding resource again. Otherwise, a resource
>>>>> reservation conflict is reported during boot.
>>>>>
>>>>> Signed-off-by: Zhen Lei 
>>>>> ---
>>>>>    drivers/perf/arm_smmuv3_pmu.c | 42 
>>>>> --
>>>>>    1 file changed, 40 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c
>>>>> index 74474bb322c3f26..dcce085431c6ce8 100644
>>>>> --- a/drivers/perf/arm_smmuv3_pmu.c
>>>>> +++ b/drivers/perf/arm_smmuv3_pmu.c
>>>>> @@ -761,6 +761,44 @@ static void smmu_pmu_get_acpi_options(struct 
>>>>> smmu_pmu *smmu_pmu)
>>>>>    dev_notice(smmu_pmu->dev, "option mask 0x%x\n", smmu_pmu->options);
>>>>>    }
>>>>>    +static void __iomem *
>>>>> +smmu_pmu_get_and_ioremap_resource(struct platform_device *pdev,
>>>>> +  unsigned int index,
>>>>> +  struct resource **out_res)
>>>>> +{
>>>>> +    int ret;
>>>>> +    void __iomem *base;
>>>>> +    struct resource *res;
>>>>> +
>>>>> +    res = platform_get_resource(pdev, IORESOURCE_MEM, index);
>>>>> +    if (!res) {
>>>>> +    dev_err(&pdev->dev, "invalid resource\n");
>>>>> +    return IOMEM_ERR_PTR(-EINVAL);
>>>>> +    }
>>>>> +    if (out_res)
>>>>> +    *out_res = res;
>>>>> +
>>>>> +    ret = region_intersects(res->start, resource_size(res),
>>>>> +    IORESOURCE_MEM, IORES_DESC_NONE);
>>>>> +    if (ret == REGION_INTERSECTS) {
>>>>> +    /*
>>>>> + * The resource has already been reserved by the SMMUv3 driver.
>>>>> + * Don't reserve it again, just do devm_ioremap().
>>>>> + */
>>>>> +    base = devm_ioremap(&pdev->dev, res->start, resource_size(res));
>>>>> +    } else {
>>>>> +    /*
>>>>> + * The resource may have not been reserved by any driver, or
>>>>> + * has been reserved but not type IORESOURCE_MEM. In the latter
>>>>> + * case, devm_ioremap_resource() reports a conflict and returns
>>>>> + * IOMEM_ERR_PTR(-EBUSY).
>>>>> + */
>>>>> +    base = devm_ioremap_resource(&pdev->dev, res);
>>>>> +    }
>>>>
>>>> What if the PMCG driver simply happens to probe first?
>>>
>>> There are 4 cases:
>>> 1) ARM_SMMU_V3=m, ARM_SMMU_V3_PMU=y
>>>     This is not allowed, because ARM_SMMU_V3_PMU depends on ARM_SMMU_V3:
>>>     config ARM_SMMU_V3_PMU
>>>       tristate "ARM SMMUv3 Performance Monitors Extension"
>>>       depends on ARM64 && ACPI && ARM_SMMU_V3
>>>
>>> 2) ARM_SMMU_V3=y, ARM_SMMU_V3_PMU=m
>>>     No problem, SMMUv3 will be initialized first.
>>>
>>> 3) ARM_SMMU_V3=y, ARM_SMMU_V3_PMU=y
>>>     vi drivers/Makefile
>>>     60 obj-y   += iommu/
>>>     172 obj-$(CONFIG_PERF_EVENTS)   += perf/
>>>
>>>     The two drivers are currently at the same initialization level, so this
>>>     link order ensures that the SMMUv3 driver is initialized first.
>>>
>>> 4) ARM_SMMU_V3=m, ARM_SMMU_V3_PMU=m
>>>     Sorry, I thought module dependencies were generated based on "depends on".
>>>     But I tried it today: module dependencies are generated only when symbol
>>>     dependencies exist. I should use MODULE_SOFTDEP() to explicitly mark the
>>>     dependency. I

Re: [v2] Old platforms: bring out your dead

2021-01-15 Thread Leizhen (ThunderTown)



On 2021/1/15 17:26, Arnd Bergmann wrote:
> On Fri, Jan 15, 2021 at 8:08 AM Wei Xu  wrote:
>> On 2021/1/14 0:14, Arnd Bergmann wrote:
>>> On Fri, Jan 8, 2021 at 11:55 PM Arnd Bergmann  wrote:
>>> * mmp -- added in 2009, DT support is active, but board files might go
>>> * cns3xxx -- added in 2010, last fixed in 2019, probably no users left
>>> * hisi (hip01/hip05) -- servers added in 2013, replaced with arm64 in 2016
>>
>> I think it is OK to drop the support of the hip01(arm32) and hip05(arm64).
>> Could you also help to drop the support of the hip04(arm32) which I think 
>> nobody use as well?
> 
> Thank you for your reply! I actually meant to write hip04 instead of hip05,
> so I was only asking about the two 32-bit targets. I would expect that
> hip05 still has a few users, but wouldn't mind removing that as well if you
> are sure there are none.
> 
> Since Zhen Lei is starting to upstream Kunpeng506 and Kunpeng509
> support, can you clarify how much reuse of IP blocks there is between
> hip04 and those? In particular, hip04 has custom code for (at least)
> platmcpm, clk, irqchip, ethernet, and hw_rng, probably more as those
> were only the ones I see on a quick grep.
> 
> If we remove hip04, should we remove all these drivers right away,
> or keep some of them around?

I think the drivers should be kept. Currently, at least hip04_eth.c and
irq-hip04.c are still in use. These drivers were originally written for Hip04,
but the hardware used on other boards may be similar to it, so the existing
drivers were extended instead of adding new ones.

> 
>   Arnd
> 
> .
> 



Re: [PATCH v3 2/3] dt-bindings: arm: hisilicon: Add binding for L3 cache controller

2021-01-12 Thread Leizhen (ThunderTown)



On 2021/1/12 21:55, Arnd Bergmann wrote:
> On Tue, Jan 12, 2021 at 1:35 PM Leizhen (ThunderTown)
>  wrote:
>> On 2021/1/12 16:46, Arnd Bergmann wrote:
>>> On Tue, Jan 12, 2021 at 2:56 AM Zhen Lei  wrote:
>>>
>>>> +---
>>>> +$id: http://devicetree.org/schemas/arm/hisilicon/l3cache.yaml#
>>>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>>>> +
>>>> +title: Hisilicon L3 cache controller
>>>> +
>>>> +maintainers:
>>>> +  - Wei Xu 
>>>> +
>>>> +description: |
>>>> +  The Hisilicon L3 outer cache controller supports a maximum of 36-bit 
>>>> physical
>>>> +  addresses. The data cached in the L3 outer cache can be operated based 
>>>> on the
>>>> +  physical address range or the entire cache.
>>>> +
>>>> +properties:
>>>> +  compatible:
>>>> +items:
>>>> +  - const: hisilicon,l3cache
>>>> +
>>>
>>> The compatible string needs to be a little more specific, I'm sure
>>> you cannot guarantee that this is the only L3 cache controller ever
>>> designed in the past or future by HiSilicon.
>>>
>>> Normally when you have an IP block that is itself unnamed and that is
>>> specific to one or a few SoCs, the convention is to include the name
>>> of the first SoC that contained it.
>>
>> Right, thanks for your suggestion, I will rename it to 
>> "hisilicon,hi1381-l3cache"
>> and "hisilicon,hi1215-l3cache".

Sorry, I just received a response from the hardware developers: the SoC names
need to be changed:
hi1381 --> kunpeng509
hi1215 --> kunpeng506

So I want to rename the compatible string to "hisilicon,kunpeng-l3v1", meaning
Kunpeng L3 cache controller version 1. This is enough to distinguish it from
other versions of the cache controller. It also makes it easier to name the
config option and files.

> 
> Sounds good.
> 
>>> Can you share which products actually use this L3 cache controller?
>>
>> This L3 cache controller is used on Hi1381 and Hi1215 board. I don't know 
>> where
>> these two boards are used. Our company is too large. Software is delivered 
>> level
>> by level. I'm only involved in the Kernel-related part.
>>
>>>
>>> On a related note, what does the memory map look like on this chip?
>>
>> memory@a0 {
>>  device_type = "memory";
>>  reg = <0x0 0xa0 0x0 0x1aa0>, <0x1 0xe000 0x0 0x1d00>, 
>> <0x0 0x1f40 0x0 0xb5c0>;
>> };
>>
>> Currently, the DTS is being maintained by ourselves, I'll try to upstream it 
>> later.
>>
>>> Do you support more than 4GB of total installed memory? If you
>>
>> Currently, the total size does not exceed 4 GB. However, the physical 
>> address is wider than 32 bits.
> 
> Ok, so it appears that the memory is actually contiguous in the first
> 3.5GB (with a few holes), plus the remaining 0.5GB being offset in
> the physical memory by 4GB (starting at 0x1e000 instead of
> 0xe000), presumably to allow the use of 32-bit DMA addresses.
> 
> This works fine for the moment, but it does require support for
> a nonlinear virt_to_phys()/phys_to_virt() translation after highmem
> gets removed, and you would get at most 3.75GB anyway, so it
> might be easier at that point to just drop the entire last block at
> 0x1e000, but this will depend on how well we get the 4G:4G
> code to work, and whether the users will still need kernel updates for
> this platform then.
>
>  Arnd
> 
> .
> 



Re: [PATCH v3 2/3] dt-bindings: arm: hisilicon: Add binding for L3 cache controller

2021-01-13 Thread Leizhen (ThunderTown)



On 2021/1/13 15:44, Leizhen (ThunderTown) wrote:
> 
> 
> On 2021/1/12 21:55, Arnd Bergmann wrote:
>> On Tue, Jan 12, 2021 at 1:35 PM Leizhen (ThunderTown)
>>  wrote:
>>> On 2021/1/12 16:46, Arnd Bergmann wrote:
>>>> On Tue, Jan 12, 2021 at 2:56 AM Zhen Lei  
>>>> wrote:
>>>>
>>>>> +---
>>>>> +$id: http://devicetree.org/schemas/arm/hisilicon/l3cache.yaml#
>>>>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>>>>> +
>>>>> +title: Hisilicon L3 cache controller
>>>>> +
>>>>> +maintainers:
>>>>> +  - Wei Xu 
>>>>> +
>>>>> +description: |
>>>>> +  The Hisilicon L3 outer cache controller supports a maximum of 36-bit 
>>>>> physical
>>>>> +  addresses. The data cached in the L3 outer cache can be operated based 
>>>>> on the
>>>>> +  physical address range or the entire cache.
>>>>> +
>>>>> +properties:
>>>>> +  compatible:
>>>>> +items:
>>>>> +  - const: hisilicon,l3cache
>>>>> +
>>>>
>>>> The compatible string needs to be a little more specific, I'm sure
>>>> you cannot guarantee that this is the only L3 cache controller ever
>>>> designed in the past or future by HiSilicon.
>>>>
>>>> Normally when you have an IP block that is itself unnamed and that is
>>>> specific to one or a few SoCs, the convention is to include the name
>>>> of the first SoC that contained it.
>>>
>>> Right, thanks for your suggestion, I will rename it to 
>>> "hisilicon,hi1381-l3cache"
>>> and "hisilicon,hi1215-l3cache".
> 
> Sorry, Just received a response from the hardware developers, the SoC names 
> need to
> be changed:
> hi1381 --> kunpeng509
> hi1215 --> kunpeng506
> 
> So I want to rename the compatible string to "hisilicon,kunpeng-l3v1", 
> Kunpeng L3

I thought about it again. Let's name it "hisilicon,kunpeng-l3cache", and add a
v2 name in the future if one is needed. The SoC name may change later, in
which case v2 would not be required.
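
For illustration only, here is a userspace sketch of how a compatible string
is matched against a driver's table. It is a simplified model, not the
kernel's device-tree matching code; l3cache_match and compatible_matches()
are hypothetical names. Only the "hisilicon,kunpeng-l3cache" string comes
from the proposal above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Strings a hypothetical driver would accept; NULL-terminated. */
static const char *const l3cache_match[] = {
	"hisilicon,kunpeng-l3cache",
	NULL,
};

/*
 * Return true if any string in the node's "compatible" list equals any
 * entry in the driver's table. Both arrays are NULL-terminated.
 */
static bool compatible_matches(const char *const *node_compat,
			       const char *const *table)
{
	assert(node_compat && table);
	for (size_t i = 0; node_compat[i]; i++)
		for (size_t j = 0; table[j]; j++)
			if (strcmp(node_compat[i], table[j]) == 0)
				return true;
	return false;
}
```

In this model a later controller version would simply get its own entry in
the table, and nodes still using the old "hisilicon,l3cache" string would no
longer match.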

> cache controller version 1. This is enough to distinguish other versions of 
> cache
> controller. It also facilitates the naming of the config option and files.
> 
>>
>> Sounds good.
>>
>>>> Can you share which products actually use this L3 cache controller?
>>>
>>> This L3 cache controller is used on Hi1381 and Hi1215 board. I don't know 
>>> where
>>> these two boards are used. Our company is too large. Software is delivered 
>>> level
>>> by level. I'm only involved in the Kernel-related part.
>>>
>>>>
>>>> On a related note, what does the memory map look like on this chip?
>>>
>>> memory@a0 {
>>>  device_type = "memory";
>>>  reg = <0x0 0xa0 0x0 0x1aa0>, <0x1 0xe000 0x0 0x1d00>, 
>>> <0x0 0x1f40 0x0 0xb5c0>;
>>> };
>>>
>>> Currently, the DTS is being maintained by ourselves, I'll try to upstream 
>>> it later.
>>>
>>>> Do you support more than 4GB of total installed memory? If you
>>>
>>> Currently, the total size does not exceed 4 GB. However, the physical 
>>> address is wider than 32 bits.
>>
>> Ok, so it appears that the memory is actually contiguous in the first
>> 3.5GB (with a few holes), plus the remaining 0.5GB being offset in
>> the physical memory by 4GB (starting at 0x1e000 instead of
>> 0xe000), presumably to allow the use of 32-bit DMA addresses.
>>
>> This works fine for the moment, but it does require support for
>> a nonlinear virt_to_phys()/phys_to_virt() translation after highmem
>> gets removed, and you would get at most 3.75GB anyway, so it
>> might be easier at that point to just drop the entire last block at
>> 0x1e000, but this will depend on how well we get the 4G:4G
>> code to work, and whether the users will still need kernel updates for
>> this platform then.
>>
>>  Arnd
>>
>> .
>>
> 
> 
> 
> .
> 



Re: [PATCH v4 12/20] dt-bindings: arm: hisilicon: convert hisilicon,hi3798cv200-perictrl bindings to json-schema

2020-09-29 Thread Leizhen (ThunderTown)



On 2020/9/29 17:21, Leizhen (ThunderTown) wrote:
> 
> 
> On 2020/9/29 11:18, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2020/9/29 3:14, Rob Herring wrote:
>>> On Mon, Sep 28, 2020 at 11:13:16PM +0800, Zhen Lei wrote:
>>>> Convert the Hisilicon Hi3798CV200 Peripheral Controller binding to DT
>>>> schema format using json-schema.
>>>>
>>>> Signed-off-by: Zhen Lei 
>>>> ---
>>>>  .../controller/hisilicon,hi3798cv200-perictrl.txt  | 21 --
>>>>  .../controller/hisilicon,hi3798cv200-perictrl.yaml | 45 
>>>> ++
>>>>  2 files changed, 45 insertions(+), 21 deletions(-)
>>>>  delete mode 100644 
>>>> Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.txt
>>>>  create mode 100644 
>>>> Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.yaml
>>>>
>>>> diff --git 
>>>> a/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.txt
>>>>  
>>>> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.txt
>>>> deleted file mode 100644
>>>> index 0d5282f4670658d..000
>>>> --- 
>>>> a/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.txt
>>>> +++ /dev/null
>>>> @@ -1,21 +0,0 @@
>>>> -Hisilicon Hi3798CV200 Peripheral Controller
>>>> -
>>>> -The Hi3798CV200 Peripheral Controller controls peripherals, queries
>>>> -their status, and configures some functions of peripherals.
>>>> -
>>>> -Required properties:
>>>> -- compatible: Should contain "hisilicon,hi3798cv200-perictrl", "syscon"
>>>> -  and "simple-mfd".
>>>> -- reg: Register address and size of Peripheral Controller.
>>>> -- #address-cells: Should be 1.
>>>> -- #size-cells: Should be 1.
>>>> -
>>>> -Examples:
>>>> -
>>>> -  perictrl: peripheral-controller@8a2 {
>>>> -  compatible = "hisilicon,hi3798cv200-perictrl", "syscon",
>>>> -   "simple-mfd";
>>>> -  reg = <0x8a2 0x1000>;
>>>> -  #address-cells = <1>;
>>>> -  #size-cells = <1>;
>>>> -  };
>>>> diff --git 
>>>> a/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.yaml
>>>>  
>>>> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.yaml
>>>> new file mode 100644
>>>> index 000..4e547017e368393
>>>> --- /dev/null
>>>> +++ 
>>>> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.yaml
>>>> @@ -0,0 +1,45 @@
>>>> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
>>>> +%YAML 1.2
>>>> +---
>>>> +$id: 
>>>> http://devicetree.org/schemas/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.yaml#
>>>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>>>> +
>>>> +title: Hisilicon Hi3798CV200 Peripheral Controller
>>>> +
>>>> +maintainers:
>>>> +  - Wei Xu 
>>>> +
>>>> +description: |
>>>> +  The Hi3798CV200 Peripheral Controller controls peripherals, queries
>>>> +  their status, and configures some functions of peripherals.
>>>> +
>>>> +properties:
>>>> +  compatible:
>>>> +    items:
>>>> +      - const: hisilicon,hi3798cv200-perictrl
>>>> +      - const: syscon
>>>> +      - const: simple-mfd
>>>> +
>>>> +  reg:
>>>> +    description: Register address and size
>>>> +    maxItems: 1
>>>> +
>>>> +  '#address-cells':
>>>> +    const: 1
>>>> +
>>>> +  '#size-cells':
>>>> +    const: 1
>>>
>>> That implies child nodes. You need some sort of schema for them.
>>
>> OK, I will drop #address-cells and #size-cells in this binding.
> 
> I think I misunderstood. I should describe the child nodes here.
> 
> National Day is the day after tomorrow, with eight days off in total, so time is short.
> I'll drop this patch for now and redo it for v5.11.

I searched the dtsi: these two properties are required by the "ranges" property, so
I will add it.
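
A rough sketch of why "ranges" depends on both cell-count properties: each
"ranges" entry is a (child address, parent address, length) triple, and the
width of each field comes from the node's own #address-cells, the parent's
#address-cells and the node's own #size-cells. The helper name and the sample
values below are made up for illustration; they are not taken from the
hi3798cv200 dtsi.

```python
def split_ranges(cells, child_addr_cells, parent_addr_cells, size_cells):
    """Split a flat list of 32-bit cells into (child, parent, size) entries."""
    entry_len = child_addr_cells + parent_addr_cells + size_cells
    if entry_len == 0 or len(cells) % entry_len:
        raise ValueError("ranges length is not a multiple of the entry size")

    def to_int(chunk):
        # Cells are big-endian: the first cell holds the most significant bits.
        value = 0
        for cell in chunk:
            value = (value << 32) | cell
        return value

    entries = []
    for i in range(0, len(cells), entry_len):
        entry = cells[i:i + entry_len]
        parent_end = child_addr_cells + parent_addr_cells
        child = to_int(entry[:child_addr_cells])
        parent = to_int(entry[child_addr_cells:parent_end])
        size = to_int(entry[parent_end:])
        entries.append((child, parent, size))
    return entries


# Hypothetical values: one window with 1 cell for each field.
example = split_ranges([0x0, 0x8a20000, 0x1000], 1, 1, 1)
```

Without #address-cells and #size-cells on the node, the field widths of such
an entry would be ambiguous, which is presumably why the dtsi carries them.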

> 
>>
>>>
>>>> +
>>>> +required:
>>>> +  - compatible
>>>> +  - reg
>>>> +
>>>> +examples:
>>>> +  - |
>>>> +perictrl: peripheral-controller@8a2 {
>>>> +compatible = "hisilicon,hi3798cv200-perictrl", "syscon", 
>>>> "simple-mfd";
>>>> +reg = <0x8a2 0x1000>;
>>>> +#address-cells = <1>;
>>>> +#size-cells = <1>;
>>>> +};
>>>> +...
>>>> -- 
>>>> 1.8.3
>>>>
>>>>
>>>
>>> .
>>>



Re: [PATCH v5 15/17] dt-bindings: arm: hisilicon: convert Hi6220 domain controller bindings to json-schema

2020-09-29 Thread Leizhen (ThunderTown)
Hi, Rob:
  I'm glad to see that you applied my patches this morning. However, this patch
was not applied and received no comment. Did you miss it?


On 2020/9/29 22:14, Zhen Lei wrote:
> Convert the Hisilicon Hi6220 domain controllers binding to DT schema
> format using json-schema. All of them are grouped into one yaml file, to
> help users understand differences and avoid repeated descriptions.
> 
> Signed-off-by: Zhen Lei 
> ---
>  .../hisilicon/controller/hi6220-domain-ctrl.yaml   | 64 
> ++
>  .../controller/hisilicon,hi6220-aoctrl.txt | 18 --
>  .../controller/hisilicon,hi6220-mediactrl.txt  | 18 --
>  .../controller/hisilicon,hi6220-pmctrl.txt | 18 --
>  4 files changed, 64 insertions(+), 54 deletions(-)
>  create mode 100644 
> Documentation/devicetree/bindings/arm/hisilicon/controller/hi6220-domain-ctrl.yaml
>  delete mode 100644 
> Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi6220-aoctrl.txt
>  delete mode 100644 
> Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi6220-mediactrl.txt
>  delete mode 100644 
> Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi6220-pmctrl.txt
> 
> diff --git 
> a/Documentation/devicetree/bindings/arm/hisilicon/controller/hi6220-domain-ctrl.yaml
>  
> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hi6220-domain-ctrl.yaml
> new file mode 100644
> index 000..32c562720d877c9
> --- /dev/null
> +++ 
> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hi6220-domain-ctrl.yaml
> @@ -0,0 +1,64 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: 
> http://devicetree.org/schemas/arm/hisilicon/controller/hi6220-domain-ctrl.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Hisilicon Hi6220 domain controller
> +
> +maintainers:
> +  - Wei Xu 
> +
> +description: |
> +  Hisilicon designs some special domain controllers for mobile platform,
> +  such as: the power Always On domain controller, the Media domain
> +  controller(e.g. codec, G3D ...) and the Power Management domain
> +  controller.
> +
> +  The compatible names of each domain controller are as follows:
> +  Power Always ON domain controller  --> hisilicon,hi6220-aoctrl
> +  Media domain controller            --> hisilicon,hi6220-mediactrl
> +  Power Management domain controller --> hisilicon,hi6220-pmctrl
> +
> +properties:
> +  compatible:
> +    items:
> +      - enum:
> +          - hisilicon,hi6220-aoctrl
> +          - hisilicon,hi6220-mediactrl
> +          - hisilicon,hi6220-pmctrl
> +      - const: syscon
> +
> +  reg:
> +    maxItems: 1
> +
> +  '#clock-cells':
> +    const: 1
> +
> +required:
> +  - compatible
> +  - reg
> +  - '#clock-cells'
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +ao_ctrl@f780 {
> +compatible = "hisilicon,hi6220-aoctrl", "syscon";
> +reg = <0xf780 0x2000>;
> +#clock-cells = <1>;
> +};
> +
> +media_ctrl@f441 {
> +compatible = "hisilicon,hi6220-mediactrl", "syscon";
> +reg = <0xf441 0x1000>;
> +#clock-cells = <1>;
> +};
> +
> +pm_ctrl@f7032000 {
> +compatible = "hisilicon,hi6220-pmctrl", "syscon";
> +reg = <0xf7032000 0x1000>;
> +#clock-cells = <1>;
> +};
> +...
> diff --git 
> a/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi6220-aoctrl.txt
>  
> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi6220-aoctrl.txt
> deleted file mode 100644
> index 5a723c1d45f4a17..000
> --- 
> a/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi6220-aoctrl.txt
> +++ /dev/null
> @@ -1,18 +0,0 @@
> -Hisilicon Hi6220 Power Always ON domain controller
> -
> -Required properties:
> -- compatible : "hisilicon,hi6220-aoctrl"
> -- reg : Register address and size
> -- #clock-cells: should be set to 1, many clock registers are defined
> -  under this controller and this property must be present.
> -
> -Hisilicon designs this system controller to control the power always
> -on domain for mobile platform.
> -
> -Example:
> - /*for Hi6220*/
> - ao_ctrl: ao_ctrl@f780 {
> - compatible = "hisilicon,hi6220-aoctrl", "syscon";
> - reg = <0x0 0xf780 0x0 0x2000>;
> - #clock-cells = <1>;
> - };
> diff --git 
> a/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi6220-mediactrl.txt
>  
> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi6220-mediactrl.txt
> deleted file mode 100644
> index dcfdcbcb6455771..000
> --- 
> a/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi6220-mediactrl.txt
> +++ /dev/null
> @@ -1,18 +0,0 @@
> -Hisilicon Hi6220 Media domain controller
> -
> -Required properties:
> -- compatible : "hisilicon,hi6220-mediactrl"
> -- reg : Register address and 

Re: [PATCH v4 12/20] dt-bindings: arm: hisilicon: convert hisilicon,hi3798cv200-perictrl bindings to json-schema

2020-09-29 Thread Leizhen (ThunderTown)



On 2020/9/29 21:52, Rob Herring wrote:
> On Tue, Sep 29, 2020 at 8:25 AM Leizhen (ThunderTown)
>  wrote:
>>
>>
>>
>> On 2020/9/29 17:21, Leizhen (ThunderTown) wrote:
>>>
>>>
>>> On 2020/9/29 11:18, Leizhen (ThunderTown) wrote:
>>>>
>>>>
>>>> On 2020/9/29 3:14, Rob Herring wrote:
>>>>> On Mon, Sep 28, 2020 at 11:13:16PM +0800, Zhen Lei wrote:
>>>>>> Convert the Hisilicon Hi3798CV200 Peripheral Controller binding to DT
>>>>>> schema format using json-schema.
>>>>>>
>>>>>> Signed-off-by: Zhen Lei 
>>>>>> ---
>>>>>>  .../controller/hisilicon,hi3798cv200-perictrl.txt  | 21 --
>>>>>>  .../controller/hisilicon,hi3798cv200-perictrl.yaml | 45 
>>>>>> ++
>>>>>>  2 files changed, 45 insertions(+), 21 deletions(-)
>>>>>>  delete mode 100644 
>>>>>> Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.txt
>>>>>>  create mode 100644 
>>>>>> Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.yaml
>>>>>>
>>>>>> diff --git 
>>>>>> a/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.txt
>>>>>>  
>>>>>> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.txt
>>>>>> deleted file mode 100644
>>>>>> index 0d5282f4670658d..000
>>>>>> --- 
>>>>>> a/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.txt
>>>>>> +++ /dev/null
>>>>>> @@ -1,21 +0,0 @@
>>>>>> -Hisilicon Hi3798CV200 Peripheral Controller
>>>>>> -
>>>>>> -The Hi3798CV200 Peripheral Controller controls peripherals, queries
>>>>>> -their status, and configures some functions of peripherals.
>>>>>> -
>>>>>> -Required properties:
>>>>>> -- compatible: Should contain "hisilicon,hi3798cv200-perictrl", "syscon"
>>>>>> -  and "simple-mfd".
>>>>>> -- reg: Register address and size of Peripheral Controller.
>>>>>> -- #address-cells: Should be 1.
>>>>>> -- #size-cells: Should be 1.
>>>>>> -
>>>>>> -Examples:
>>>>>> -
>>>>>> -  perictrl: peripheral-controller@8a2 {
>>>>>> -  compatible = "hisilicon,hi3798cv200-perictrl", "syscon",
>>>>>> -   "simple-mfd";
>>>>>> -  reg = <0x8a2 0x1000>;
>>>>>> -  #address-cells = <1>;
>>>>>> -  #size-cells = <1>;
>>>>>> -  };
>>>>>> diff --git 
>>>>>> a/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.yaml
>>>>>>  
>>>>>> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.yaml
>>>>>> new file mode 100644
>>>>>> index 000..4e547017e368393
>>>>>> --- /dev/null
>>>>>> +++ 
>>>>>> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.yaml
>>>>>> @@ -0,0 +1,45 @@
>>>>>> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
>>>>>> +%YAML 1.2
>>>>>> +---
>>>>>> +$id: 
>>>>>> http://devicetree.org/schemas/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.yaml#
>>>>>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>>>>>> +
>>>>>> +title: Hisilicon Hi3798CV200 Peripheral Controller
>>>>>> +
>>>>>> +maintainers:
>>>>>> +  - Wei Xu 
>>>>>> +
>>>>>> +description: |
>>>>>> +  The Hi3798CV200 Peripheral Controller controls peripherals, queries
>>>>>> +  their status, and configures some functions of peripherals.
>>>>>> +
>>>>>> +properties:
>>>>>> +  compatible:
>>>>>> +    items:
>>>>>> +      - const: hisilicon,hi3798cv200-perictrl
>>>>>> +      - const: syscon
>>>>>> +      - const: simple-mfd
>>>>>> +
>>>>>> +  reg:
>>>>>> +    description: Register address and size
>>>>>> +    maxItems: 1
>>>>>> +
>>>>>> +  '#address-cells':
>>>>>> +    const: 1
>>>>>> +
>>>>>> +  '#size-cells':
>>>>>> +    const: 1
>>>>>
>>>>> That implies child nodes. You need some sort of schema for them.
>>>>
>>>> OK, I will drop #address-cells and #size-cells in this binding.
>>>
>>> I think I misunderstood. I should describe the child nodes here.
>>>
>>> National Day is the day after tomorrow, with eight days off in total, so time is short.
>>> I'll drop this patch for now and redo it for v5.11.
>>
>> I searched the dtsi: these two properties are required by the "ranges" property,
>> so I will add it.
> 
> 'ranges' also implies there are child nodes as does 'simple-mfd', so
> whatever child nodes you have are missing and need to be documented
> too. Also, 'ranges' implies the child nodes are memory-mapped, but
> 'simple-mfd' implies they are not. 'simple-bus' is what should be used
> for memory-mapped children.

Sorry, this is because of the time difference: I went straight home after sending
version 5 of these patches last night after 10 p.m. I see that you have applied
the new one. Thanks for the information you gave me here.

> 
> Rob
> 
> .
> 



Re: [PATCH v5 15/17] dt-bindings: arm: hisilicon: convert Hi6220 domain controller bindings to json-schema

2020-09-29 Thread Leizhen (ThunderTown)



On 2020/9/30 9:38, Leizhen (ThunderTown) wrote:
> Hi, Rob:
>   I'm glad to see that you applied my patches this morning. However, this patch
> was not applied and received no comment. Did you miss it?

Oh, I see: I missed the property "#reset-cells". How embarrassing! I will post a
new version.

> 
> 
> On 2020/9/29 22:14, Zhen Lei wrote:
>> Convert the Hisilicon Hi6220 domain controllers binding to DT schema
>> format using json-schema. All of them are grouped into one yaml file, to
>> help users understand differences and avoid repeated descriptions.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  .../hisilicon/controller/hi6220-domain-ctrl.yaml   | 64 
>> ++
>>  .../controller/hisilicon,hi6220-aoctrl.txt | 18 --
>>  .../controller/hisilicon,hi6220-mediactrl.txt  | 18 --
>>  .../controller/hisilicon,hi6220-pmctrl.txt | 18 --
>>  4 files changed, 64 insertions(+), 54 deletions(-)
>>  create mode 100644 
>> Documentation/devicetree/bindings/arm/hisilicon/controller/hi6220-domain-ctrl.yaml
>>  delete mode 100644 
>> Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi6220-aoctrl.txt
>>  delete mode 100644 
>> Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi6220-mediactrl.txt
>>  delete mode 100644 
>> Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi6220-pmctrl.txt
>>
>> diff --git 
>> a/Documentation/devicetree/bindings/arm/hisilicon/controller/hi6220-domain-ctrl.yaml
>>  
>> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hi6220-domain-ctrl.yaml
>> new file mode 100644
>> index 000..32c562720d877c9
>> --- /dev/null
>> +++ 
>> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hi6220-domain-ctrl.yaml
>> @@ -0,0 +1,64 @@
>> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
>> +%YAML 1.2
>> +---
>> +$id: 
>> http://devicetree.org/schemas/arm/hisilicon/controller/hi6220-domain-ctrl.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: Hisilicon Hi6220 domain controller
>> +
>> +maintainers:
>> +  - Wei Xu 
>> +
>> +description: |
>> +  Hisilicon designs some special domain controllers for mobile platform,
>> +  such as: the power Always On domain controller, the Media domain
>> +  controller(e.g. codec, G3D ...) and the Power Management domain
>> +  controller.
>> +
>> +  The compatible names of each domain controller are as follows:
>> +  Power Always ON domain controller  --> hisilicon,hi6220-aoctrl
>> +  Media domain controller            --> hisilicon,hi6220-mediactrl
>> +  Power Management domain controller --> hisilicon,hi6220-pmctrl
>> +
>> +properties:
>> +  compatible:
>> +    items:
>> +      - enum:
>> +          - hisilicon,hi6220-aoctrl
>> +          - hisilicon,hi6220-mediactrl
>> +          - hisilicon,hi6220-pmctrl
>> +      - const: syscon
>> +
>> +  reg:
>> +    maxItems: 1
>> +
>> +  '#clock-cells':
>> +    const: 1
>> +
>> +required:
>> +  - compatible
>> +  - reg
>> +  - '#clock-cells'
>> +
>> +additionalProperties: false
>> +
>> +examples:
>> +  - |
>> +ao_ctrl@f780 {
>> +compatible = "hisilicon,hi6220-aoctrl", "syscon";
>> +reg = <0xf780 0x2000>;
>> +#clock-cells = <1>;
>> +};
>> +
>> +media_ctrl@f441 {
>> +compatible = "hisilicon,hi6220-mediactrl", "syscon";
>> +reg = <0xf441 0x1000>;
>> +#clock-cells = <1>;
>> +};
>> +
>> +pm_ctrl@f7032000 {
>> +compatible = "hisilicon,hi6220-pmctrl", "syscon";
>> +reg = <0xf7032000 0x1000>;
>> +#clock-cells = <1>;
>> +};
>> +...
>> diff --git 
>> a/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi6220-aoctrl.txt
>>  
>> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi6220-aoctrl.txt
>> deleted file mode 100644
>> index 5a723c1d45f4a17..000
>> --- 
>> a/Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi6220-aoctrl.txt
>> +++ /dev/null
>> @@ -1,18 +0,0 @@
>> -Hisilicon Hi6220 Power Always ON domain controller
>> -
>> -Required properties:
>> -- compatible : "hisilicon,hi6220-aoctrl"

Re: [PATCH v6 01/17] dt-bindings: mfd: syscon: add some compatible strings for Hisilicon

2020-09-30 Thread Leizhen (ThunderTown)



On 2020/9/30 15:11, Lee Jones wrote:
> On Wed, 30 Sep 2020, Zhen Lei wrote:
> 
>> Add some compatible strings for Hisilicon controllers:
>> hisilicon,hi6220-sramctrl  --> Hi6220 SRAM controller
>> hisilicon,pcie-sas-subctrl --> HiP05/HiP06 PCIe-SAS subsystem controller
>> hisilicon,peri-subctrl --> HiP05/HiP06 PERI subsystem controller
>> hisilicon,dsa-subctrl  --> HiP05/HiP06 DSA subsystem controller
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  Documentation/devicetree/bindings/mfd/syscon.yaml | 5 -
>>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> This was already applied by the time you re-sent it.
> 
> Any reason for sending it again?

Patch 15 was modified. The documentation patches except Patch 15 had been applied,
but the config/DTS patches had not (they were applied after I re-sent).

> 



Re: [PATCH 6/6] dt-bindings: misc: correct the property name cmd-gpios to cmd-gpio

2020-10-14 Thread Leizhen (ThunderTown)



On 2020/10/14 21:50, Rob Herring wrote:
> On Wed, Oct 14, 2020 at 09:29:26AM +0800, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2020/10/14 1:32, Dan Murphy wrote:
>>> Zhen
>>>
>>> On 10/13/20 11:08 AM, Zhen Lei wrote:
>>>> The property name used in arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts is
>>>> cmd-gpio.
>>>>
>>>> arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts:235:
>>>> cmd-gpio = < 155 GPIO_ACTIVE_HIGH>;
>>>>
>>>> Signed-off-by: Zhen Lei 
>>>> ---
>>>>   Documentation/devicetree/bindings/misc/olpc,xo1.75-ec.yaml | 6 +++---
>>>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/Documentation/devicetree/bindings/misc/olpc,xo1.75-ec.yaml 
>>>> b/Documentation/devicetree/bindings/misc/olpc,xo1.75-ec.yaml
>>>> index b3c45c046ba5e37..c7a06a9650db2ed 100644
>>>> --- a/Documentation/devicetree/bindings/misc/olpc,xo1.75-ec.yaml
>>>> +++ b/Documentation/devicetree/bindings/misc/olpc,xo1.75-ec.yaml
>>>> @@ -24,7 +24,7 @@ properties:
>>>>     compatible:
>>>>   const: olpc,xo1.75-ec
>>>>   -  cmd-gpios:
>>>> +  cmd-gpio:
>>>
>>> Preference is gpios, not gpio. But Rob H can accept or reject.
>>
>> Look at the search result below. It seems that the driver has not been 
>> merged into mainline.
> 
> Yes, in drivers/platform/olpc/olpc-xo175-ec.c.
> 
> Your mistake is the gpiod api takes just 'cmd' as the GPIO core handles 
> both forms.

OK, thanks for the information. I have found that it is defined by 
gpio_suffixes[].
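
A minimal sketch of the lookup described above, assuming the core simply tries
each suffix in turn. It is a simplified model, not the kernel code: the
function name and the dict-based "node" are made up for illustration, and only
the suffix order follows the gpio_suffixes[] definition quoted elsewhere in
this thread.

```python
# Same order as the gpio_suffixes[] definition quoted in this thread.
GPIO_SUFFIXES = ("gpios", "gpio")


def find_gpio_property(node, con_id):
    """Return the first '<con_id>-<suffix>' property present in node, or None.

    node is a plain dict standing in for a device tree node (an assumption
    made for this sketch); con_id is the name the driver asks for, e.g. 'cmd'.
    """
    for suffix in GPIO_SUFFIXES:
        name = f"{con_id}-{suffix}" if con_id else suffix
        if name in node:
            return name
    return None


# The driver only asks for "cmd"; either spelling in the dts is found.
old_style = find_gpio_property({"cmd-gpio": (155, "GPIO_ACTIVE_HIGH")}, "cmd")
new_style = find_gpio_property({"cmd-gpios": (155, "GPIO_ACTIVE_HIGH")}, "cmd")
```

Under this model the driver works with either property name, which matches
the statement that the GPIO core handles both forms.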

> 
>> But the property name really is used as cmd-gpio at 
>> mmp2-olpc-xo-1-75.dts:235, and I don't think
>> mmp2-olpc-xo-1-75.dts can be mistaken; otherwise, the driver would not 
>> work properly.
>> Meanwhile, both names cmd-gpios and cmd-gpio seem to be in use. I prefer 
>> cmd-gpio because,
>> after all, only one GPIO is assigned now. motorola,cmd-gpios has the "s" because 
>> it contains 3 GPIOs.
> 
> The preference is it is always '-gpios' just like it's always 
> 'interrupts' or 'clocks'.
> 
> However, whether to change this is really up to the OLPC folks. Given 
> the driver has always supported both forms, it should be okay to change 
> the dts. Though there could be other users besides the kernel.

If both "cmd-gpios" and "cmd-gpio" are supported, should we use an enum to list
both of them in the yaml, or use patternProperties?

I'm going to send v2 based on this idea.
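
As a quick sanity check of the patternProperties idea, the regular expression
that appears in the v2 patch elsewhere in this thread can be tried against the
candidate names. This only exercises the regex with Python's re module; it is
not a run of the real dt-schema tooling, and the helper name is hypothetical.

```python
import re

# Pattern taken from the v2 patch in this thread.
CMD_GPIO_PATTERN = re.compile(r"^cmd-gpio[s]?$")


def matches_cmd_gpio(name):
    """Return True if a property name is accepted by the pattern."""
    return CMD_GPIO_PATTERN.search(name) is not None


accepted = [n for n in ("cmd-gpio", "cmd-gpios", "cmd-gpioss", "ready-gpio")
            if matches_cmd_gpio(n)]
```

Both spellings are accepted and nothing else is, so the pattern itself does
what the question intends.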

> 
> Rob
> 
> .
> 



Re: [PATCH 2/6] dt-bindings: mfd: google,cros-ec: explicitly allow additional properties

2020-10-15 Thread Leizhen (ThunderTown)



On 2020/10/14 21:38, Rob Herring wrote:
> On Wed, Oct 14, 2020 at 12:08:41AM +0800, Zhen Lei wrote:
>> Many properties have not been described in this yaml file,
>> and a lot of errors will be reported. In particular, some yaml files such as
>> google,cros-ec-typec.yaml, extcon-usbc-cros-ec.yaml can not pass the
>> self-check, because of the examples. So temporarily allow additional
>> properties to keep the comprehensive dt_binding_check result clean.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  Documentation/devicetree/bindings/mfd/google,cros-ec.yaml | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> There's proper fixes for these under review.

That's good news.

> 
> Rob
> 
> .
> 



Re: [PATCH 6/6] dt-bindings: misc: correct the property name cmd-gpios to cmd-gpio

2020-10-15 Thread Leizhen (ThunderTown)



On 2020/10/15 15:12, Lubomir Rintel wrote:
> Hi,
> 
> On Wed, Oct 14, 2020 at 12:08:45AM +0800, Zhen Lei wrote:
>> The property name used in arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts is
>> cmd-gpio.
>>
>> arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts:235:
>> cmd-gpio = < 155 GPIO_ACTIVE_HIGH>;
>>
>> Signed-off-by: Zhen Lei 
> 
> Thanks for the patch.
> 
> I've sent out an equivalent one some time ago:
> https://lore.kernel.org/lkml/20200925234805.228251-3-lkund...@v3.sk/
> 
> In any case, either is fine with me.

Geert Uytterhoeven just replied to me that the *-gpio form is deprecated, so your
patch is the correct one.

> 
> Acked-by: Lubomir Rintel 
> 
>> ---
>>  Documentation/devicetree/bindings/misc/olpc,xo1.75-ec.yaml | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/Documentation/devicetree/bindings/misc/olpc,xo1.75-ec.yaml 
>> b/Documentation/devicetree/bindings/misc/olpc,xo1.75-ec.yaml
>> index b3c45c046ba5e37..c7a06a9650db2ed 100644
>> --- a/Documentation/devicetree/bindings/misc/olpc,xo1.75-ec.yaml
>> +++ b/Documentation/devicetree/bindings/misc/olpc,xo1.75-ec.yaml
>> @@ -24,7 +24,7 @@ properties:
>>compatible:
>>  const: olpc,xo1.75-ec
>>  
>> -  cmd-gpios:
>> +  cmd-gpio:
>>  description: GPIO uspecifier of the CMD pin
>>  maxItems: 1
>>  
>> @@ -32,7 +32,7 @@ properties:
>>  
>>  required:
>>- compatible
>> -  - cmd-gpios
>> +  - cmd-gpio
>>  
>>  additionalProperties: false
>>  
>> @@ -49,7 +49,7 @@ examples:
>>slave {
>>  compatible = "olpc,xo1.75-ec";
>>  spi-cpha;
>> -cmd-gpios = < 155 GPIO_ACTIVE_HIGH>;
>> +cmd-gpio = < 155 GPIO_ACTIVE_HIGH>;
>>};
>>  };
>>  
>> -- 
>> 1.8.3
>>
>>
> 
> 



Re: [PATCH v2 1/1] dt-bindings: misc: add support for both property names cmd-gpios and cmd-gpio

2020-10-15 Thread Leizhen (ThunderTown)



On 2020/10/15 15:01, Geert Uytterhoeven wrote:
> Hi Zhen,
> 
> Thanks for your patch!
> 
> On Thu, Oct 15, 2020 at 6:52 AM Zhen Lei  wrote:
>> The definition "gpio_suffixes[] = { "gpios", "gpio" }" shows that both
>> property names "cmd-gpios" and "cmd-gpio" are supported. But currently
>> only "cmd-gpios" is allowed in this yaml, and the name used in
>> mmp2-olpc-xo-1-75.dts is cmd-gpio. As a result, the following errors are
>> reported.
>>
>> slave: 'cmd-gpios' is a required property
>> slave: 'cmd-gpio' does not match any of the regexes: 'pinctrl-[0-9]+'
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  Documentation/devicetree/bindings/misc/olpc,xo1.75-ec.yaml | 14 
>> ++
>>  1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/devicetree/bindings/misc/olpc,xo1.75-ec.yaml 
>> b/Documentation/devicetree/bindings/misc/olpc,xo1.75-ec.yaml
>> index b3c45c046ba5e37..dd549380a085709 100644
>> --- a/Documentation/devicetree/bindings/misc/olpc,xo1.75-ec.yaml
>> +++ b/Documentation/devicetree/bindings/misc/olpc,xo1.75-ec.yaml
>> @@ -24,15 +24,21 @@ properties:
>>compatible:
>>  const: olpc,xo1.75-ec
>>
>> -  cmd-gpios:
>> +  spi-cpha: true
>> +
>> +patternProperties:
>> +  "^cmd-gpio[s]?$":
>>  description: GPIO uspecifier of the CMD pin
>>  maxItems: 1
> 
> In general, the *-gpio form is deprecated.  So why complicate the DT
> bindings by adding support for deprecated properties?

I did not know that. So this patch can be ignored.

> 
>   1. Explicitly allowing deprecated properties means new users may be
>  added,
>   2. Once all in-tree DTS files are converted, the warnings will be gone
>  anyway,
>   3. Out-of-tree DTB will still work, as it's very unlikely support for
>  the "gpio" suffix can/will be dropped anytime soon,
>   4. If anyone runs the validator on out-of-tree DTS files, the most
>  probable intention is to fix any detected issues anyway, and the
>  files can be updated, too,
>   5. If any out-of-tree code or tooling relies on the *-gpio form, it
>  may already be broken.
> 
>> -  spi-cpha: true
>> -
>>  required:
>>- compatible
>> -  - cmd-gpios
>> +
>> +oneOf:
>> +  - required:
>> +      - cmd-gpio
>> +  - required:
>> +      - cmd-gpios
>>
>>  additionalProperties: false
> 
> Gr{oetje,eeting}s,
> 
> Geert
> 



Re: [PATCH 5/6] ARM: dts: mmp2-olpc-xo-1-75: explicitly add #address-cells=<0> for slave mode

2020-10-15 Thread Leizhen (ThunderTown)
Hi Lubomir:
 Can you review this patch? The results of all other patches are clear.


On 2020/10/14 0:08, Zhen Lei wrote:
> Delete the old property "#address-cells" and then explicitly add it with
> zero value. The value of "#size-cells" is already zero, so keep it no
> change.
> 
> Signed-off-by: Zhen Lei 
> ---
>  arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts 
> b/arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts
> index f1a41152e9dd70d..be88b6e551d58e9 100644
> --- a/arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts
> +++ b/arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts
> @@ -224,7 +224,7 @@
>  
>   {
>   /delete-property/ #address-cells;
> - /delete-property/ #size-cells;
> + #address-cells = <0>;
>   spi-slave;
>   status = "okay";
>   ready-gpio = < 125 GPIO_ACTIVE_HIGH>;
> 



Re: [PATCH 1/2] arm64: dts: broadcom: remove an unused property dma-ranges

2020-10-16 Thread Leizhen (ThunderTown)



On 2020/10/14 22:02, Arnd Bergmann wrote:
> On Wed, Oct 14, 2020 at 3:36 PM Leizhen (ThunderTown)
>  wrote:
>> On 2020/10/14 15:38, Arnd Bergmann wrote:
>>> On Wed, Oct 14, 2020 at 5:15 AM Florian Fainelli  
>>> wrote:
>>>> On 10/12/2020 11:06 PM, Zhen Lei wrote:
>>>>> stingray-usb.dtsi is finally included by three dts files:
>>>>> bcm958802a802x.dts, bcm958742k.dts and bcm958742t.dts. I searched all
>>>>> these three entire expanded dts files, and each of them contains only one
>>>>> dma-ranges. No conversion range is specified, so it cannot work properly.
>>>>> I think this property "dma-ranges" is added by mistake, just remove it.
>>>>> Otherwise, the following error will be reported when any YAML detection
>>>>> is performed on arm64.
>>>>>
>>>>> arch/arm64/boot/dts/broadcom/stingray/stingray-usb.dtsi:7.3-14: Warning \
>>>>> (dma_ranges_format): /usb:dma-ranges: empty "dma-ranges" property but \
>>>>> its #address-cells (1) differs from / (2)
>>>>> arch/arm64/boot/dts/broadcom/stingray/stingray-usb.dtsi:7.3-14: Warning \
>>>>> (dma_ranges_format): /usb:dma-ranges: empty "dma-ranges" property but \
>>>>> its #size-cells (1) differs from / (2)
>>>>>
>>>>> Signed-off-by: Zhen Lei 
>>>>
>>>> This looks fine to me, Scott, Ray do you want to Ack this patch before I
>>>> take it?
>>>
>>> Does it mean that there are no devices on this bus that can do DMA?
>>>
>>> Usually there should be a dma-ranges property to identify that DMA
>>> is possible and what the limits are, though we have failed to enforce
>>> that.
>>
>> Documentation/devicetree/bindings/iommu/iommu.txt +79
>> When an "iommus" property is specified in a device tree node, the IOMMU will
>> be used for address translation. If a "dma-ranges" property exists in the
>> device's parent node it will be ignored. An exception to this rule is if the
>> referenced IOMMU is disabled, in which case the "dma-ranges" property of the
>> parent shall take effect.
>>
>> dma-ranges is only required when the IOMMU is disabled, and it should exist in
>> the parent node of the IOMMU device. But the deleted dma-ranges is under the usb
>> bus node.
> 
> The USB hosts here don't use an IOMMU though, right?

Generally, USB devices are accessed through the IOMMU. However, even in this
case, dma-ranges is not necessarily required. There are many examples of this
in arch/arm64/boot/dts/. For example: arch/arm64/boot/dts/arm/juno.dt.yaml.

Not sure, but maybe I found the answer.

vi drivers/of/address.c +457
457   Thus we treat the absence of
458  * "ranges" as equivalent to an empty "ranges" property which means
459  * a 1:1 translation at that level.

466  * This quirk also applies for 'dma-ranges' which frequently exist in
467  * child nodes without 'dma-ranges' in the parent nodes. --RobH

475 if (ranges == NULL || rlen == 0) {
476 offset = of_read_number(addr, na);
477 memset(addr, 0, pna * 4);
478 pr_debug("empty ranges; 1:1 translation\n");
479 goto finish;
480 }
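
A simplified model of what the excerpt above implies, assuming that the
"finish" step adds the saved offset to the zeroed parent address. It is a
sketch only, not the kernel implementation: the function name and the tuple
layout of "ranges" are assumptions made for illustration.

```python
def translate_one(addr, ranges):
    """Translate addr through one bus level.

    ranges is a list of (child_base, parent_base, size) tuples. None or an
    empty list stands for a missing or empty property and means 1:1.
    """
    if not ranges:
        # Mirrors the quoted branch: keep the whole address as the offset
        # and zero the parent base, so the result equals the input.
        offset, parent_base = addr, 0
        return parent_base + offset
    for child_base, parent_base, size in ranges:
        if child_base <= addr < child_base + size:
            return parent_base + (addr - child_base)
    return None  # address is not covered by any window


same = translate_one(0x1000, None)
moved = translate_one(0x1100, [(0x1000, 0x80000000, 0x1000)])
```

Under this model, dropping an empty dma-ranges does not change the translated
address, since missing and empty are handled by the same branch.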

By the way: at first, I thought that these errors were detected by the YAML checks.
Now I have found that they are generated by "make dtbs", which is why they are
reported whenever any YAML check is run. Thus, the need to fix them is even more urgent.


> 
>>> Also note that the #address-cells=<1> means that any device under
>>> this bus is assumed to only support 32-bit addressing, and DMA will
>>> have to go through a slow swiotlb in the absence of an IOMMU.
>>
>> dma_alloc_coherent() will allocate memory with the GFP_DMA32 flag and
>> try the 0-4G range first. The reserved swiotlb buffer memory is used only
>> when that allocation fails.
> 
> The swiotlb is primarily about the streaming mappings with dma_map_*(),
> which has to copy all data sent to the device. dma_alloc_coherent()
> is a rare operation and less impacted by DMA limitations.

OK, I got it.

> 
>   Arnd
> 
> 



Re: [PATCH 1/2] arm64: dts: broadcom: remove an unused property dma-ranges

2020-10-16 Thread Leizhen (ThunderTown)



On 2020/10/16 15:06, Leizhen (ThunderTown) wrote:
> 
> 
> On 2020/10/14 22:02, Arnd Bergmann wrote:
>> On Wed, Oct 14, 2020 at 3:36 PM Leizhen (ThunderTown)
>>  wrote:
>>> On 2020/10/14 15:38, Arnd Bergmann wrote:
>>>> On Wed, Oct 14, 2020 at 5:15 AM Florian Fainelli  
>>>> wrote:
>>>>> On 10/12/2020 11:06 PM, Zhen Lei wrote:
>>>>>> stingray-usb.dtsi is finally included by three dts files:
>>>>>> bcm958802a802x.dts, bcm958742k.dts and bcm958742t.dts. I searched all
>>>>>> these three entire expanded dts files, and each of them contains only one
>>>>>> dma-ranges. No conversion range is specified, so it cannot work properly.
>>>>>> I think this property "dma-ranges" is added by mistake, just remove it.
>>>>>> Otherwise, the following error will be reported when any YAML detection
>>>>>> is performed on arm64.
>>>>>>
>>>>>> arch/arm64/boot/dts/broadcom/stingray/stingray-usb.dtsi:7.3-14: Warning \
>>>>>> (dma_ranges_format): /usb:dma-ranges: empty "dma-ranges" property but \
>>>>>> its #address-cells (1) differs from / (2)
>>>>>> arch/arm64/boot/dts/broadcom/stingray/stingray-usb.dtsi:7.3-14: Warning \
>>>>>> (dma_ranges_format): /usb:dma-ranges: empty "dma-ranges" property but \
>>>>>> its #size-cells (1) differs from / (2)
>>>>>>
>>>>>> Signed-off-by: Zhen Lei 
>>>>>
>>>>> This looks fine to me, Scott, Ray do you want to Ack this patch before I
>>>>> take it?
>>>>
>>>> Does it mean that there are no devices on this bus that can do DMA?
>>>>
>>>> Usually there should be a dma-ranges property to identify that DMA
>>>> is possible and what the limits are, though we have failed to enforce
>>>> that.
>>>
>>> Documentation/devicetree/bindings/iommu/iommu.txt +79
>>> When an "iommus" property is specified in a device tree node, the IOMMU will
>>> be used for address translation. If a "dma-ranges" property exists in the
>>> device's parent node it will be ignored. An exception to this rule is if the
>>> referenced IOMMU is disabled, in which case the "dma-ranges" property of the
>>> parent shall take effect.
>>>
>>> The dma-ranges property is only required when the IOMMU is disabled, and it
>>> should exist in the parent node of the IOMMU device. But this deleted
>>> dma-ranges is under the usb bus node.
>>
>> The USB hosts here don't use an IOMMU though, right?
> 
> Generally, USB devices are accessed through the IOMMU. However, even in this
> case, dma-ranges is not necessarily required. There are many examples of this
> in arch/arm64/boot/dts/. For example: arch/arm64/boot/dts/arm/juno.dt.yaml.
> 
> Not sure, but maybe I found the answer.
> 
> vi drivers/of/address.c +457
> 457   Thus we treat the absence of
> 458  * "ranges" as equivalent to an empty "ranges" property which 
> means
> 459  * a 1:1 translation at that level.
> 
> 466  * This quirk also applies for 'dma-ranges' which frequently 
> exist in
> 467  * child nodes without 'dma-ranges' in the parent nodes. --RobH
> 
> 475 if (ranges == NULL || rlen == 0) {
> 476 offset = of_read_number(addr, na);
> 477 memset(addr, 0, pna * 4);
> 478 pr_debug("empty ranges; 1:1 translation\n");
> 479 goto finish;
> 480 }
> 
> By the way: At first, I thought that these errors were detected by YAML. Now
> I have found that they are generated by "make dtbs". That's why they are reported by
> any YAML check. Thus, the need to fix these errors is even more urgent.
> 
> 
>>
>>>> Also note that the #address-cells=<1> means that any device under
>>>> this bus is assumed to only support 32-bit addressing, and DMA will

Hi, Arnd:
  I know what you mean now. I will rewrite the patch. Thanks.

>>>> have to go through a slow swiotlb in the absence of an IOMMU.
>>>
>>> The dma_alloc_coherent() will allocate memory with GFP_DMA32 flag and
>>> try the 0-4G first. The reserved swiotlb buffer memory is used only
>>> when the allocation failed.
>>
>> The swiotlb is primarily about the streaming mappings with dma_map_*(),
>> which has to copy all data sent to the device. dma_alloc_coherent()
>> is a rare operation and less impacted by DMA limitations.
> 
> OK, I got it.
> 
>>
>>   Arnd
>>



Re: [PATCH v6 01/17] dt-bindings: mfd: syscon: add some compatible strings for Hisilicon

2020-10-10 Thread Leizhen (ThunderTown)



On 2020/10/1 14:59, Lee Jones wrote:
> On Wed, 30 Sep 2020, Leizhen (ThunderTown) wrote:
> 
>>
>>
>> On 2020/9/30 15:11, Lee Jones wrote:
>>> On Wed, 30 Sep 2020, Zhen Lei wrote:
>>>
>>>> Add some compatible strings for Hisilicon controllers:
>>>> hisilicon,hi6220-sramctrl  --> Hi6220 SRAM controller
>>>> hisilicon,pcie-sas-subctrl --> HiP05/HiP06 PCIe-SAS subsystem controller
>>>> hisilicon,peri-subctrl --> HiP05/HiP06 PERI subsystem controller
>>>> hisilicon,dsa-subctrl  --> HiP05/HiP06 DSA subsystem controller
>>>>
>>>> Signed-off-by: Zhen Lei 
>>>> ---
>>>>  Documentation/devicetree/bindings/mfd/syscon.yaml | 5 -
>>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> This was already applied by the time you re-sent it.
>>>
>>> Any reason for sending it again?
>>
>> Patch 15 was modified. The documentation patches except patch 15 had been applied,
>> but the config/DTS patches had not (they were applied after I re-sent).
> 
> Could you please only send patches which have not been applied.

No experience. I'll pay attention next time.

> 



Re: [PATCH v6 14/17] dt-bindings: arm: hisilicon: convert hisilicon,hip04-bootwrapper bindings to json-schema

2020-10-10 Thread Leizhen (ThunderTown)



On 2020/10/1 14:41, Krzysztof Kozlowski wrote:
> On Wed, Sep 30, 2020 at 11:17:09AM +0800, Zhen Lei wrote:
>> Convert the Hisilicon Bootwrapper boot method binding to DT schema format
>> using json-schema.
>>
>> The property boot-method contains two groups of physical address range
>> information: bootwrapper and relocation. The "uint32-array" type is not
>> suitable for it, because the field "address" and "size" may occupy one or
>> two cells respectively. Use "minItems: 1" and "maxItems: 2" to allow it
>> can be written in "" or ", "
>> format.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  .../hisilicon/controller/hip04-bootwrapper.yaml| 34 
>> ++
>>  .../controller/hisilicon,hip04-bootwrapper.txt |  9 --
>>  2 files changed, 34 insertions(+), 9 deletions(-)
>>  create mode 100644 
>> Documentation/devicetree/bindings/arm/hisilicon/controller/hip04-bootwrapper.yaml
>>  delete mode 100644 
>> Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hip04-bootwrapper.txt
>>
>> diff --git 
>> a/Documentation/devicetree/bindings/arm/hisilicon/controller/hip04-bootwrapper.yaml
>>  
>> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hip04-bootwrapper.yaml
>> new file mode 100644
>> index 000..7378159e61df998
>> --- /dev/null
>> +++ 
>> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hip04-bootwrapper.yaml
>> @@ -0,0 +1,34 @@
>> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
>> +%YAML 1.2
>> +---
>> +$id: 
>> http://devicetree.org/schemas/arm/hisilicon/controller/hip04-bootwrapper.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: Bootwrapper boot method
>> +
>> +maintainers:
>> +  - Wei Xu 
>> +
>> +description: Bootwrapper boot method (software protocol on SMP)
>> +
>> +properties:
>> +  compatible:
>> +items:
>> +  - const: hisilicon,hip04-bootwrapper
>> +
>> +  boot-method:
>> +description: |
>> +  Address and size of boot method.
>> +  [0]: bootwrapper physical address
>> +  [1]: bootwrapper size
>> +  [2]: relocation physical address
>> +  [3]: relocation size
> 
> Instead: items with each item description (bootwrapper address,
> relocation address). This way also min/max Items should not be needed.

I think it's needed. "reg" also specifies maxItems.

> 
> Best regards,
> Krzysztof
> 
> 
>> +minItems: 1
>> +maxItems: 2
>> +
> 
> .
> 



Re: linux-next: manual merge of the devicetree tree with the mfd tree

2020-10-10 Thread Leizhen (ThunderTown)



On 2020/10/1 20:31, Rob Herring wrote:
> On Thu, Oct 1, 2020 at 1:26 AM Krzysztof Kozlowski  wrote:
>>
>> On Thu, 1 Oct 2020 at 08:22, Stephen Rothwell  wrote:
>>>
>>> Hi all,
>>>
>>> Today's linux-next merge of the devicetree tree got a conflict in:
>>>
>>>   Documentation/devicetree/bindings/mfd/syscon.yaml
>>>
>>> between commit:
>>>
>>>   18394297562a ("dt-bindings: mfd: syscon: Merge Samsung Exynos Sysreg 
>>> bindings")
>>>   05027df1b94f ("dt-bindings: mfd: syscon: Document Exynos3 and Exynos5433 
>>> compatibles")
>>>
>>> from the mfd tree and commit:
>>>
>>>   35b096dd6353 ("dt-bindings: mfd: syscon: add some compatible strings for 
>>> Hisilicon")
>>>
>>> from the devicetree tree.
>>>
>>> I fixed it up (see below) and can carry the fix as necessary. This
>>> is now fixed as far as linux-next is concerned, but any non trivial
>>> conflicts should be mentioned to your upstream maintainer when your tree
>>> is submitted for merging.  You may also want to consider cooperating
>>> with the maintainer of the conflicting tree to minimise any particularly
>>> complex conflicts.
>>>
>>> --
>>> Cheers,
>>> Stephen Rothwell
>>>
>>> diff --cc Documentation/devicetree/bindings/mfd/syscon.yaml
>>> index 0f21943dea28,fc2e85004d36..
>>> --- a/Documentation/devicetree/bindings/mfd/syscon.yaml
>>> +++ b/Documentation/devicetree/bindings/mfd/syscon.yaml
>>> @@@ -40,11 -40,10 +40,14 @@@ properties
>>> - allwinner,sun50i-a64-system-controller
>>> - microchip,sparx5-cpu-syscon
>>> - mstar,msc313-pmsleep
>>>  +  - samsung,exynos3-sysreg
>>>  +  - samsung,exynos4-sysreg
>>>  +  - samsung,exynos5-sysreg
>>>  +  - samsung,exynos5433-sysreg
>>> -
>>> +   - hisilicon,hi6220-sramctrl
>>> +   - hisilicon,pcie-sas-subctrl
>>> +   - hisilicon,peri-subctrl
>>> +   - hisilicon,dsa-subctrl
>>
>> Thanks Stephen, looks good.
>>
>> Zhei,
>> However the Huawei compatibles in the original patch were added not
>> alphabetically which messes the order and increases the possibility of
>> conflicts. It would be better if the entries were kept ordered.
> 
> I've fixed up the order.

Thanks.

> 
> Rob
> 
> .
> 



Re: [PATCH v6 11/17] dt-bindings: arm: hisilicon: convert hisilicon,cpuctrl bindings to json-schema

2020-10-10 Thread Leizhen (ThunderTown)



On 2020/10/1 14:40, Krzysztof Kozlowski wrote:
> On Wed, Sep 30, 2020 at 11:17:06AM +0800, Zhen Lei wrote:
>> Convert the Hisilicon CPU controller binding to DT schema format using
>> json-schema.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  .../bindings/arm/hisilicon/controller/cpuctrl.yaml | 29 
>> ++
>>  .../arm/hisilicon/controller/hisilicon,cpuctrl.txt |  8 --
>>  2 files changed, 29 insertions(+), 8 deletions(-)
>>  create mode 100644 
>> Documentation/devicetree/bindings/arm/hisilicon/controller/cpuctrl.yaml
>>  delete mode 100644 
>> Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,cpuctrl.txt
>>
>> diff --git 
>> a/Documentation/devicetree/bindings/arm/hisilicon/controller/cpuctrl.yaml 
>> b/Documentation/devicetree/bindings/arm/hisilicon/controller/cpuctrl.yaml
>> new file mode 100644
>> index 000..f6a314db3a59416
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/arm/hisilicon/controller/cpuctrl.yaml
>> @@ -0,0 +1,29 @@
>> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
>> +%YAML 1.2
>> +---
>> +$id: http://devicetree.org/schemas/arm/hisilicon/controller/cpuctrl.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: Hisilicon CPU controller
>> +
>> +maintainers:
>> +  - Wei Xu 
>> +
>> +description: |
>> +  The clock registers and power registers of secondary cores are defined
>> +  in CPU controller, especially in HIX5HD2 SoC.
>> +
>> +properties:
>> +  compatible:
>> +items:
>> +  - const: hisilicon,cpuctrl
>> +
>> +  reg:
>> +maxItems: 1
>> +
>> +required:
>> +  - compatible
>> +  - reg
> 
> Your own DTS file (arch/arm/boot/dts/hisi-x5hd2.dtsi) does not validate
> against this dtschema.

OK, I see it. I just sent out a set of patches to clean up all Hisilicon-related
errors detected by DT schema on arm32. Because many new YAML files were generated
this time, I used dtbs_check to check all the files at once. The error
information did not contain the compatible string, so I didn't notice it.

> 
> Best regards,
> Krzysztof
> 
> .
> 



Re: [PATCH v6 16/17] dt-bindings: arm: hisilicon: convert hisilicon,hi3798cv200-perictrl bindings to json-schema

2020-10-10 Thread Leizhen (ThunderTown)



On 2020/10/1 14:35, Krzysztof Kozlowski wrote:
> On Wed, Sep 30, 2020 at 11:17:11AM +0800, Zhen Lei wrote:
>> Convert the Hisilicon Hi3798CV200 Peripheral Controller binding to DT
>> schema format using json-schema.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  .../hisilicon/controller/hi3798cv200-perictrl.yaml | 64 
>> ++
>>  .../controller/hisilicon,hi3798cv200-perictrl.txt  | 21 ---
>>  2 files changed, 64 insertions(+), 21 deletions(-)
>>  create mode 100644 
>> Documentation/devicetree/bindings/arm/hisilicon/controller/hi3798cv200-perictrl.yaml
>>  delete mode 100644 
>> Documentation/devicetree/bindings/arm/hisilicon/controller/hisilicon,hi3798cv200-perictrl.txt
>>
>> diff --git 
>> a/Documentation/devicetree/bindings/arm/hisilicon/controller/hi3798cv200-perictrl.yaml
>>  
>> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hi3798cv200-perictrl.yaml
>> new file mode 100644
>> index 000..cba1937aad9a8d3
>> --- /dev/null
>> +++ 
>> b/Documentation/devicetree/bindings/arm/hisilicon/controller/hi3798cv200-perictrl.yaml
>> @@ -0,0 +1,64 @@
>> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
>> +%YAML 1.2
>> +---
>> +$id: 
>> http://devicetree.org/schemas/arm/hisilicon/controller/hi3798cv200-perictrl.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: Hisilicon Hi3798CV200 Peripheral Controller
>> +
>> +maintainers:
>> +  - Wei Xu 
>> +
>> +description: |
>> +  The Hi3798CV200 Peripheral Controller controls peripherals, queries
>> +  their status, and configures some functions of peripherals.
>> +
>> +properties:
>> +  compatible:
>> +items:
>> +  - const: hisilicon,hi3798cv200-perictrl
>> +  - const: syscon
>> +  - const: simple-mfd
>> +
>> +  reg:
>> +maxItems: 1
>> +
>> +  "#address-cells":
>> +const: 1
>> +
>> +  "#size-cells":
>> +const: 1
>> +
>> +  ranges: true
>> +
>> +required:
>> +  - compatible
>> +  - reg
>> +  - "#address-cells"
>> +  - "#size-cells"
>> +  - ranges
>> +
>> +additionalProperties:
>> +  type: object
> 
> You need to describe all additional properties or objects.

OK, I will do it in v5.11

> 
> Best regards,
> Krzysztof
> 
> .
> 



Re: [PATCH v3 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary

2020-09-16 Thread Leizhen (ThunderTown)



On 2020/9/16 19:15, Ard Biesheuvel wrote:
> (+ Arnd, Nico)
> 
> On Wed, 16 Sep 2020 at 05:51, Zhen Lei  wrote:
>>
>> Currently, only support the kernels where the base of physical memory is
>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits
>> unrotated value. But we can use one more "add/sub" instructions to handle
>> bits 23-16. The performance will be slightly affected.
>>
>> Since most boards meet 16 MiB alignment, so add a new configuration
>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if
>> anyone really needs it.
>>
>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are
>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in
>> the whole head.S file. So choose it.
>>
>> Because the calculation of "y = x + __pv_offset[63:24]" have been done,
>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why
>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub()
>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t"
>> (above y).
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  arch/arm/Kconfig  | 17 -
>>  arch/arm/include/asm/memory.h | 16 +---
>>  arch/arm/kernel/head.S| 25 +++--
>>  3 files changed, 48 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>> index e00d94b16658765..073dafa428f3c87 100644
>> --- a/arch/arm/Kconfig
>> +++ b/arch/arm/Kconfig
>> @@ -240,12 +240,27 @@ config ARM_PATCH_PHYS_VIRT
>>   kernel in system memory.
>>
>>   This can only be used with non-XIP MMU kernels where the base
>> - of physical memory is at a 16MB boundary.
>> + of physical memory is at a 16MiB boundary.
>>
>>   Only disable this option if you know that you do not require
>>   this feature (eg, building a kernel for a single machine) and
>>   you need to shrink the kernel to the minimal size.
>>
>> +config ARM_PATCH_PHYS_VIRT_RADICAL
>> +   bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary"
>> +   depends on ARM_PATCH_PHYS_VIRT
>> +   depends on !THUMB2_KERNEL
> 
> Why is this not implemented for Thumb2 too?

No Thumb2 boards.

> 
> Also, as Russell points out as well, this may end up being enabled for
> all multiarch kernels, so it makes sense to explore whether we can
> enable this unconditionally. 

Yes. In fact, I think we can consider enabling this unconditionally after
the THUMB2 branch is implemented. Performance and code size should not be
a problem.

> Do you have any numbers wrt the impact on
> text size? I would assume it is negligible, but numbers help.

The text size increased a bit more than 2 KB (2164 Bytes), about 0.0146%.

make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- distclean defconfig

Before:
$ size vmlinux
    text    data     bss      dec     hex filename
14781964 7508366  420080 22710410 15a888a vmlinux

After:
$ size vmlinux
    text    data     bss      dec     hex filename
14784128 7508366  420080 22712574 15a90fe vmlinux
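
For anyone who wants to re-check the figures, the growth can be recomputed from the two "text" columns. This is only a small sketch; the helper names text_growth_bytes() and text_growth_percent() are made up for illustration and are not part of the patch.

```c
/* Growth of the text segment in bytes (hypothetical helper). */
unsigned long text_growth_bytes(unsigned long before, unsigned long after)
{
	return after - before;
}

/* Growth relative to the original text size, in percent (hypothetical helper). */
double text_growth_percent(unsigned long before, unsigned long after)
{
	return 100.0 * (double)(after - before) / (double)before;
}
```

Calling these with 14781964 and 14784128 should reproduce the 2164 bytes and roughly 0.0146% quoted above.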


> 
> Being able to decompress the image to any 2MiB aligned base address is
> also quite useful for EFI boot, and it may also help to get rid of the
> TEXT_OFFSET hacks we have for some platforms in the future.
> 
>> +   help
>> + This can only be used with non-XIP MMU kernels where the base
>> + of physical memory is at a 64KiB boundary.
>> +
>> + Compared with ARM_PATCH_PHYS_VIRT, one or two more instructions
>> + need to be added to implement the conversion of bits 23-16 of
>> + the VA/PA in phys-to-virt and virt-to-phys. The performance is
>> + slightly affected.
>> +
> 
> Does it affect performance in other ways beyond code size/Icache density?

I just meant that it will be slightly slower than !ARM_PATCH_PHYS_VIRT_RADICAL
because of the one or two extra instructions. It certainly will not affect
overall system performance.

Because of your doubts, I think I should remove the statement: "The performance
is slightly affected."
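
To make the "one more instruction" point concrete, here is a rough user-space model of the idea, not the actual __pv_stub() assembly. It assumes a 32-bit phys_addr_t and an offset whose low 16 bits are zero (64KiB alignment); the function names are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical model: each ARM add/sub immediate carries only one
 * 8-bit field, so a 64KiB-aligned offset is applied in two steps.
 */
uint32_t phys_to_virt_two_step(uint32_t phys, uint32_t pv_offset)
{
	uint32_t t;

	t = phys - (pv_offset & 0xff000000u);	/* existing sub: bits 31-24 */
	t = t - (pv_offset & 0x00ff0000u);	/* extra sub: bits 23-16 */
	return t;
}

/* Reference result: a single full-width subtraction. */
uint32_t phys_to_virt_direct(uint32_t phys, uint32_t pv_offset)
{
	return phys - pv_offset;
}
```

As long as bits 15-0 of the offset are zero, both functions should agree, which is why the second stub can take "t" as both input and output.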

> 
>> + If unsure say N here.
>> +
>>  config NEED_MACH_IO_H
>> bool
>> help
>> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
>> index 99035b5891ef442..f97b37303a00f60 100644
>> --- a/arch/arm/include/asm/memory.h
>> +++ b/arch/arm/include/asm/memory.h
>> @@ -173,6 +173,7 @@
>>   * so that all we need to do is modify the 8-bit constant field.
>>   */
>>  #define __PV_BITS_31_24        0x81000000
>> +#define __PV_BITS_23_16        0x00810000
>>  #define __PV_BITS_7_0  0x81
>>
>>  extern unsigned long __pv_phys_pfn_offset;
>> @@ -201,7 +202,7 @@
>> : "=r" (t)  \
>> : "I" (__PV_BITS_7_0))
>>
>> -#define __pv_add_carry_stub(x, y)  \
>> +#define __pv_add_carry_stub(x, y, type)\
>> __asm__ volatile("@ __pv_add_carry_stub\n"   

Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary

2020-09-16 Thread Leizhen (ThunderTown)



On 2020/9/16 15:57, Russell King - ARM Linux admin wrote:
> On Wed, Sep 16, 2020 at 09:57:15AM +0800, Leizhen (ThunderTown) wrote:
>> On 2020/9/16 3:01, Russell King - ARM Linux admin wrote:
>>> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote:
>>>> Currently, only support the kernels where the base of physical memory is
>>>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits
>>>> unrotated value. But we can use one more "add/sub" instructions to handle
>>>> bits 23-16. The performance will be slightly affected.
>>>>
>>>> Since most boards meet 16 MiB alignment, so add a new configuration
>>>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if
>>>> anyone really needs it.
>>>>
>>>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are
>>>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in
>>>> the whole head.S file. So choose it.
>>>>
>>>> Because the calculation of "y = x + __pv_offset[63:24]" have been done,
>>>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why
>>>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub()
>>>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t"
>>>> (above y).
>>>>
>>>> Signed-off-by: Zhen Lei 
>>>> ---
>>>>  arch/arm/Kconfig  | 18 +-
>>>>  arch/arm/include/asm/memory.h | 16 +---
>>>>  arch/arm/kernel/head.S| 25 +++--
>>>>  3 files changed, 49 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>>>> index e00d94b16658765..19fc2c746e2ce29 100644
>>>> --- a/arch/arm/Kconfig
>>>> +++ b/arch/arm/Kconfig
>>>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT
>>>>  kernel in system memory.
>>>>  
>>>>  This can only be used with non-XIP MMU kernels where the base
>>>> -of physical memory is at a 16MB boundary.
>>>> +of physical memory is at a 16MiB boundary.
>>>>  
>>>>  Only disable this option if you know that you do not require
>>>>  this feature (eg, building a kernel for a single machine) and
>>>>  you need to shrink the kernel to the minimal size.
>>>>  
>>>> +config ARM_PATCH_PHYS_VIRT_RADICAL
>>>> +  bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary"
>>>> +  default n
>>>
>>> Please drop the "default n" - this is the default anyway.
>>
>> OK, I will remove it.
>>
>>>
>>>> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t 
>>>> x)
>>>> * in place where 'r' 32 bit operand is expected.
>>>> */
>>>>__pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24);
>>>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL
>>>> +  __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16);
>>>
>>> t is already unsigned long, so this cast is not necessary.
>>
>> Oh, yes, yes, I copied from the above statement, but forgot to remove it.
>>
>>>
>>> I've been debating whether it would be better to use "movw" for this
>>> for ARMv7.  In other words:
>>>
>>>     movw    tmp, #16-bit
>>>     adds    %Q0, %1, tmp, lsl #16
>>>     adc     %R0, %R0, #0
>>>
>>> It would certainly be less instructions, but at the cost of an
>>> additional register - and we'd have to change the fixup code to
>>> know about movw.
>>
>> It's one less instruction for 64KiB boundary && (sizeof(phys_addr_t) == 8),
>> and no increase or decrease for 64KiB boundary && (sizeof(phys_addr_t) == 4),
>> but one more instruction for 16MiB boundary.
>>
>> And maybe: 16MiB is widely used, but 64KiB is rarely used.
>>
>> So I'm inclined to the current revision.
> 
> Multiplatform kernels (which will be what distros build) will have to
> enable this option if they wish to support this platform. So, in that
> case it doesn't just impact a single platform, but all platforms.

I will try movw. But it may take a few days, because I feel that the changes
will be a little big.

> 



Re: [PATCH 1/2] dt-bindings: interrupt-controller: add Hisilicon SD5203 vector interrupt controller

2020-09-16 Thread Leizhen (ThunderTown)



On 2020/9/15 14:12, Leizhen (ThunderTown) wrote:
> 
> 
> On 2020/9/15 4:31, Rob Herring wrote:
>> On Thu, Sep 03, 2020 at 08:05:03PM +0800, Zhen Lei wrote:
>>> Add DT bindings for the Hisilicon SD5203 vector interrupt controller.
>>>
>>> Signed-off-by: Zhen Lei 
>>> ---
>>>  .../hisilicon,sd5203-vic.txt  | 27 +++
>>
>> Bindings should be in DT schema format now.

Do I need to change the existing "snps,dw-apb-ictl.txt" to DT schema format?

> 
> Hi, Rob Herring:
> 
> As Marc Zyngier's suggestion, I discarded adding an independent SD5203-VIC
> driver, but make the dw-apb-ictl irqchip driver to support hierarchy irq 
> domain.
> So this new file was also dropped. Now, I updated the descriptions in the 
> existing
> file "snps,dw-apb-ictl.txt" in the following versions.
> 
>>
>>>  1 file changed, 27 insertions(+)
>>>  create mode 100644 
>>> Documentation/devicetree/bindings/interrupt-controller/hisilicon,sd5203-vic.txt
>>>
>>> diff --git 
>>> a/Documentation/devicetree/bindings/interrupt-controller/hisilicon,sd5203-vic.txt
>>>  
>>> b/Documentation/devicetree/bindings/interrupt-controller/hisilicon,sd5203-vic.txt
>>> new file mode 100644
>>> index ..a08292e868b0
>>> --- /dev/null
>>> +++ 
>>> b/Documentation/devicetree/bindings/interrupt-controller/hisilicon,sd5203-vic.txt
>>> @@ -0,0 +1,27 @@
>>> +Hisilicon SD5203 vector interrupt controller (VIC)
>>> +
>>> +Hisilicon SD5203 VIC based on Synopsys DesignWare APB interrupt 
>>> controller, but
>>> +there's something special:
>>> +1. The maximum number of irqs supported is 32. The registers ENABLE, MASK 
>>> and
>>> +   FINALSTATUS are 32 bits.
>>> +2. There is only one VIC, it's used as primary interrupt controller.
>>> +
>>> +Required properties:
>>> +- compatible: shall be "hisilicon,sd5203-vic"
>>> +- reg: physical base address of the controller and length of memory mapped
>>> +  region starting with ENABLE_LOW register
>>> +- interrupt-controller: identifies the node as an interrupt controller
>>> +- #interrupt-cells: number of cells to encode an interrupt-specifier, 
>>> shall be 1
>>> +
>>> +The interrupt sources map to the corresponding bits in the interrupt
>>> +registers, i.e.
>>> +- 0 maps to bit 0 of low interrupts,
>>> +- 1 maps to bit 1 of low interrupts,
>>> +
>>> +Example:
>>> +   vic: interrupt-controller@1013 {
>>> +   compatible = "hisilicon,sd5203-vic";
>>> +   reg = <0x1013 0x1000>;
>>> +   interrupt-controller;
>>> +   #interrupt-cells = <1>;
>>> +   };
>>> -- 
>>> 2.26.0.106.g9fadedd
>>>
>>>
>>
>> .
>>



Re: [PATCH v4 1/4] genirq: define an empty function set_handle_irq() if !GENERIC_IRQ_MULTI_HANDLER

2020-09-16 Thread Leizhen (ThunderTown)



On 2020/9/15 16:43, Zhen Lei wrote:
> To avoid compilation error if an irqchip driver references the function
> set_handle_irq() but may not select GENERIC_IRQ_MULTI_HANDLER on some
> systems.

Hi, Marc:
  Do you agree with this method?

  Otherwise, I should use "#ifdef CONFIG_GENERIC_IRQ_MULTI_HANDLER ... #endif"
to perform the compilation isolation. This may make the code less beautiful.

> 
> For example, the Synopsys DesignWare APB interrupt controller
> (dw_apb_ictl) is used as the secondary interrupt controller on arc, csky,
> arm64, and most arm32 SoCs, and it's also used as the primary interrupt
> controller on Hisilicon SD5203 (an arm32 SoC). The latter need to use
> set_handle_irq() to register the top-level IRQ handler, but this multi
> irq handler registration mechanism is not implemented on arc system.
> 
> The input parameter "handle_irq" maybe defined as static and only
> set_handle_irq() references it. This will trigger "defined but not used"
> warning. So add "(void)handle_irq" to suppress it.
> 
> Signed-off-by: Zhen Lei 
> ---
>  include/linux/irq.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index 1b7f4dfee35b397..0848a2aaa9b40b1 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -1252,6 +1252,8 @@ void irq_matrix_free(struct irq_matrix *m, unsigned int 
> cpu,
>   * top-level IRQ handler.
>   */
>  extern void (*handle_arch_irq)(struct pt_regs *) __ro_after_init;
> +#else
> +#define set_handle_irq(handle_irq)   do { (void)handle_irq; } while (0)
>  #endif
>  
>  #endif /* _LINUX_IRQ_H */
> 



Re: [PATCH v4 1/4] genirq: define an empty function set_handle_irq() if !GENERIC_IRQ_MULTI_HANDLER

2020-09-17 Thread Leizhen (ThunderTown)



On 2020/9/17 17:32, Marc Zyngier wrote:
> On 2020-09-17 04:46, Leizhen (ThunderTown) wrote:
>> On 2020/9/15 16:43, Zhen Lei wrote:
>>> To avoid compilation error if an irqchip driver references the function
>>> set_handle_irq() but may not select GENERIC_IRQ_MULTI_HANDLER on some
>>> systems.
>>
>> Hi, Marc:
>>   Do you agree with this method?
>>
>>   Otherwise, I should use "#ifdef CONFIG_GENERIC_IRQ_MULTI_HANDLER ... 
>> #endif"
>> to perform the compilation isolation. This may make the code less beautiful.
>>
>>>
>>> For example, the Synopsys DesignWare APB interrupt controller
>>> (dw_apb_ictl) is used as the secondary interrupt controller on arc, csky,
>>> arm64, and most arm32 SoCs, and it's also used as the primary interrupt
>>> controller on Hisilicon SD5203 (an arm32 SoC). The latter need to use
>>> set_handle_irq() to register the top-level IRQ handler, but this multi
>>> irq handler registration mechanism is not implemented on arc system.
>>>
>>> The input parameter "handle_irq" maybe defined as static and only
>>> set_handle_irq() references it. This will trigger "defined but not used"
>>> warning. So add "(void)handle_irq" to suppress it.
>>>
>>> Signed-off-by: Zhen Lei 
>>> ---
>>>  include/linux/irq.h | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/include/linux/irq.h b/include/linux/irq.h
>>> index 1b7f4dfee35b397..0848a2aaa9b40b1 100644
>>> --- a/include/linux/irq.h
>>> +++ b/include/linux/irq.h
>>> @@ -1252,6 +1252,8 @@ void irq_matrix_free(struct irq_matrix *m, unsigned 
>>> int cpu,
>>>   * top-level IRQ handler.
>>>   */
>>>  extern void (*handle_arch_irq)(struct pt_regs *) __ro_after_init;
>>> +#else
>>> +#define set_handle_irq(handle_irq)   do { (void)handle_irq; } while (0)
>>>  #endif
>>>
>>>  #endif /* _LINUX_IRQ_H */
>>>
> 
> You shouldn't just make it a NOP. Consider adding a WARN_ON(1), so that
> people can realize this cannot work without the required architecture support.

Oh, right. I will add it.
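
Roughly, the fallback could then look like the sketch below. This is a user-space mock only: warn_on_report(), the WARN_ON() stand-in, demo_handle_irq() and register_demo_handler() are invented here so the snippet compiles outside the kernel, and the final form in include/linux/irq.h may differ.

```c
#include <stdio.h>

/* Hypothetical user-space stand-in for the kernel's WARN_ON(). */
static int warn_count;

static int warn_on_report(int cond, const char *file, int line)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING at %s:%d\n", file, line);
	}
	return cond;
}
#define WARN_ON(cond) warn_on_report(!!(cond), __FILE__, __LINE__)

/*
 * Fallback for !GENERIC_IRQ_MULTI_HANDLER: still reference the handler
 * (avoids "defined but not used"), but warn instead of being a silent NOP.
 */
#define set_handle_irq(handle_irq) \
	do { (void)(handle_irq); (void)WARN_ON(1); } while (0)

struct pt_regs;

/* Hypothetical top-level handler, only referenced by set_handle_irq(). */
static void demo_handle_irq(struct pt_regs *regs)
{
	(void)regs;
}

/* Returns how many warnings have fired so far. */
int register_demo_handler(void)
{
	set_handle_irq(demo_handle_irq);
	return warn_count;
}
```

Each call to register_demo_handler() goes through the stub and bumps the warning count, so a driver that relies on set_handle_irq() on an unsupported architecture would be noticed at run time.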

> 
>     M.



Re: [PATCH v3 0/7] bugfix and optimize for drivers/nvdimm

2020-08-27 Thread Leizhen (ThunderTown)
Hi all:
  Any comments? I would like to merge patches 1 and 2 into one, and then send
the other patches separately.

On 2020/8/20 10:16, Zhen Lei wrote:
> v2 --> v3:
> 1. Fix spelling error of patch 1 subject: memmory --> memory
> 2. Add "Reviewed-by: Oliver O'Halloran " into patch 1
> 3. Rewrite patch descriptions of Patch 1, 3, 4
> 4. Add 3 new trivial patches 5-7, I just found that yesterday.
> 5. Unify all "subsystem" names to "libnvdimm:"
> 
> v1 --> v2:
> 1. Add Fixes for Patch 1-2
> 2. Slightly change the subject and description of Patch 1
> 3. Add a new trivial Patch 4, I just found that yesterday.
> 
> v1:
> I found a memleak when I learned the drivers/nvdimm code today. And I also
> added a sanity check for priv->bus_desc.provider_name, because strdup()
> maybe failed. Patch 3 is a trivial source code optimization.
> 
> 
> Zhen Lei (7):
>   libnvdimm: fix memory leaks in of_pmem.c
>   libnvdimm: add sanity check for provider_name in
> of_pmem_region_probe()
>   libnvdimm: simplify walk_to_nvdimm_bus()
>   libnvdimm: reduce an unnecessary if branch in nd_region_create()
>   libnvdimm: reduce an unnecessary if branch in nd_region_activate()
>   libnvdimm: make sure EXPORT_SYMBOL_GPL(nvdimm_flush) close to its
> function
>   libnvdimm: slightly simplify available_slots_show()
> 
>  drivers/nvdimm/bus.c |  7 +++
>  drivers/nvdimm/dimm_devs.c   |  5 ++---
>  drivers/nvdimm/of_pmem.c |  7 +++
>  drivers/nvdimm/region_devs.c | 13 -
>  4 files changed, 16 insertions(+), 16 deletions(-)
> 



Re: [PATCH 3/4] libnvdimm: eliminate two unnecessary zero initializations in badrange.c

2020-08-27 Thread Leizhen (ThunderTown)
I will drop this patch, because badrange_add() is unlikely to be called.
There's no need to care about trivial performance improvements.

On 2020/8/20 22:30, Zhen Lei wrote:
> Currently, the "struct badrange_entry" has three members: start, length,
> list. In append_badrange_entry(), "start" and "length" will be assigned
> later, and "list" does not need to be initialized before calling
> list_add_tail(). That means, the kzalloc() in badrange_add() or
> alloc_and_append_badrange_entry() can be replaced with kmalloc(), because
> the zero initialization is not required.
> 
> Signed-off-by: Zhen Lei 
> ---
>  drivers/nvdimm/badrange.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvdimm/badrange.c b/drivers/nvdimm/badrange.c
> index 7f78b659057902d..13145001c52ff39 100644
> --- a/drivers/nvdimm/badrange.c
> +++ b/drivers/nvdimm/badrange.c
> @@ -37,7 +37,7 @@ static int alloc_and_append_badrange_entry(struct badrange 
> *badrange,
>  {
>   struct badrange_entry *bre;
>  
> - bre = kzalloc(sizeof(*bre), flags);
> + bre = kmalloc(sizeof(*bre), flags);
>   if (!bre)
>   return -ENOMEM;
>  
> @@ -49,7 +49,7 @@ int badrange_add(struct badrange *badrange, u64 addr, u64 
> length)
>  {
>   struct badrange_entry *bre, *bre_new;
>  
> - bre_new = kzalloc(sizeof(*bre_new), GFP_KERNEL);
> + bre_new = kmalloc(sizeof(*bre_new), GFP_KERNEL);
>  
>   spin_lock(&badrange->lock);
>  
> 



Re: [PATCH v3 5/7] libnvdimm: reduce an unnecessary if branch in nd_region_activate()

2020-08-27 Thread Leizhen (ThunderTown)
I will drop this patch, because I have a doubt about the existing size
calculation:
Suppose nd_region->ndr_mappings is 4, and the num_flush values of the four
nd_region->mapping[] entries are "0, 0, 4, 0". Then flush_data_size is
(1 + 1 + 5 + 1) * sizeof(void *).
But ndrd_get_flush_wpq() and ndrd_set_flush_wpq() index the array with
"ndrd->flush_wpq[dimm * num + (hint & mask)]", so I don't think the memory
allocated for "ndrd" is large enough.
Please refer to the call chain: nd_region_activate() --> nvdimm_map_flush() -->
ndrd_set_flush_wpq()

	for (i = 0; i < nd_region->ndr_mappings; i++) {
		struct nd_mapping *nd_mapping = &nd_region->mapping[i];
		struct nvdimm *nvdimm = nd_mapping->nvdimm;

		/* at least one null hint slot per-dimm for the "no-hint" case */
		flush_data_size += sizeof(void *);
		num_flush = min_not_zero(num_flush, nvdimm->num_flush);
		if (!nvdimm->num_flush)
			continue;
		flush_data_size += nvdimm->num_flush * sizeof(void *);
	}

	ndrd = devm_kzalloc(dev, sizeof(*ndrd) + flush_data_size, GFP_KERNEL);
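
The doubt above can be checked with a small userspace model. This is only a
sketch under assumptions: slots_allocated(), hint_slots_per_dimm() and
max_index_used() are hypothetical helpers, and "num" is assumed to be the
smallest non-zero num_flush rounded down to a power of two. Only the size loop
and the index expression are taken from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Slots reserved by the size loop: one "no-hint" slot per dimm plus num_flush. */
static size_t slots_allocated(const unsigned int *num_flush, size_t n)
{
	size_t slots = 0;

	for (size_t i = 0; i < n; i++)
		slots += 1 + num_flush[i];
	return slots;
}

/* Assumption: num is the smallest non-zero num_flush, rounded down to 2^k. */
static unsigned int hint_slots_per_dimm(const unsigned int *num_flush, size_t n)
{
	unsigned int min = 0, num = 1;

	for (size_t i = 0; i < n; i++)
		if (num_flush[i] && (!min || num_flush[i] < min))
			min = num_flush[i];
	while (num * 2 <= min)
		num *= 2;
	return num;
}

/* Largest index that "dimm * num + (hint & mask)" can yield for n dimms. */
static size_t max_index_used(const unsigned int *num_flush, size_t n)
{
	unsigned int num = hint_slots_per_dimm(num_flush, n);
	unsigned int mask = num - 1;

	assert(n > 0);
	return (n - 1) * num + ((num - 1) & mask);
}
```

With num_flush = {0, 0, 4, 0} this model reserves 8 slots, while the index
expression can reach 15. Whether the real code ever touches those indexes is
exactly the open question.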




On 2020/8/20 10:16, Zhen Lei wrote:
> According to the original code logic:
> if (!nvdimm->num_flush) {
>   flush_data_size += sizeof(void *);
>   //nvdimm->num_flush is zero now, add 1) have no side effects
> } else {
>   flush_data_size += sizeof(void *);
> 1)flush_data_size += nvdimm->num_flush * sizeof(void *);
> }
> 
> Obviously, the above code snippet can be reduced to one statement:
> flush_data_size += (nvdimm->num_flush + 1) * sizeof(void *);
> 
> No functional change.
> 
> Signed-off-by: Zhen Lei 
> ---
>  drivers/nvdimm/region_devs.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> index 7cf9c7d857909ce..49be115c9189eff 100644
> --- a/drivers/nvdimm/region_devs.c
> +++ b/drivers/nvdimm/region_devs.c
> @@ -77,11 +77,8 @@ int nd_region_activate(struct nd_region *nd_region)
>   }
>  
>   /* at least one null hint slot per-dimm for the "no-hint" case 
> */
> - flush_data_size += sizeof(void *);
> + flush_data_size += (nvdimm->num_flush + 1) * sizeof(void *);
>   num_flush = min_not_zero(num_flush, nvdimm->num_flush);
> - if (!nvdimm->num_flush)
> - continue;
> - flush_data_size += nvdimm->num_flush * sizeof(void *);
>   }
>   nvdimm_bus_unlock(&nd_region->dev);
>  
> 



Re: [PATCH v2 1/1] samples/seccomp: eliminate two compile warnings in user-trap.c

2020-09-08 Thread Leizhen (ThunderTown)



On 2020/9/9 7:42, Kees Cook wrote:
> On Wed, Sep 02, 2020 at 09:33:06AM +0800, Leizhen (ThunderTown) wrote:
>> On 2020/9/1 16:39, Zhen Lei wrote:
>>> samples/seccomp/user-trap.c is compiled with $(userccflags), and the
>>> latter does not contain -fno-strict-aliasing, so the warnings reported as
>>> below. Due to add "userccflags += -fno-strict-aliasing" will impact other
>>> files, so use __attribute__((__may_alias__)) to suppress it exactly.
>>>
>>> My gcc version is 5.5.0 20171010.
>>>
>>> --
>>> samples/seccomp/user-trap.c: In function ‘send_fd’:
>>> samples/seccomp/user-trap.c:50:2: warning: dereferencing type-punned 
>>> pointer will break strict-aliasing rules [-Wstrict-aliasing]
>>>   *((int *)CMSG_DATA(cmsg)) = fd;
>>>   ^
>>> samples/seccomp/user-trap.c: In function ‘recv_fd’:
>>> samples/seccomp/user-trap.c:83:2: warning: dereferencing type-punned 
>>> pointer will break strict-aliasing rules [-Wstrict-aliasing]
>>>   return *((int *)CMSG_DATA(cmsg));
>>>   ^
>>
>> Doesn't anyone care about this? Or is it that everyone hasn't encountered 
>> this problem?
>> Why do these two warnings occur every time I compiled?
> 
> Hi!
> 
> I think the samples have been a bit ignored lately because they have a
> lot of weird build issues with regard to native vs compat and needing
> the kernel headers to be built first, etc.
> 
> That said, yes, I'd like to fix warnings. However, I can't reproduce
> this. How are you building? I tried x86_64 and cross-compiled to i386.

I can reproduce it both on X86 and ARM64.

On X86:
make distclean allmodconfig
make -j64 2>err.txt
vi err.txt

$ arch
x86_64
$ ls -l samples/seccomp/user-trap
user-trap    user-trap.c
$ gcc -v
gcc version 5.5.0 20171010 (Ubuntu 5.5.0-12ubuntu5~16.04)

On ARM64:
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- distclean allmodconfig
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j64 2>err.txt
vi err.txt

$ ls -l samples/seccomp/user-trap
user-trap    user-trap.c
$ aarch64-linux-gnu-gcc -v
gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9)
> 



Re: [PATCH v4 0/1] libnvdimm: fix memory leaks in of_pmem.c

2020-09-01 Thread Leizhen (ThunderTown)



On 2020/9/1 18:14, Markus Elfring wrote:
>> v3 --> v4
>> 1. Merge patch 1 and 2 into one:
> 
> How do you think about to omit a cover letter for a single patch?

After all, the code hasn't changed except this merge.

> 
> Regards,
> Markus
> 
> 



Re: [PATCH v2 1/1] samples/seccomp: eliminate two compile warnings in user-trap.c

2020-09-01 Thread Leizhen (ThunderTown)
Does anyone care about this? Or has nobody else encountered this problem?
Why do these two warnings occur every time I compile?

On 2020/9/1 16:39, Zhen Lei wrote:
> samples/seccomp/user-trap.c is compiled with $(userccflags), and the
> latter does not contain -fno-strict-aliasing, so the warnings reported as
> below. Due to add "userccflags += -fno-strict-aliasing" will impact other
> files, so use __attribute__((__may_alias__)) to suppress it exactly.
> 
> My gcc version is 5.5.0 20171010.
> 
> --
> samples/seccomp/user-trap.c: In function ‘send_fd’:
> samples/seccomp/user-trap.c:50:2: warning: dereferencing type-punned pointer 
> will break strict-aliasing rules [-Wstrict-aliasing]
>   *((int *)CMSG_DATA(cmsg)) = fd;
>   ^
> samples/seccomp/user-trap.c: In function ‘recv_fd’:
> samples/seccomp/user-trap.c:83:2: warning: dereferencing type-punned pointer 
> will break strict-aliasing rules [-Wstrict-aliasing]
>   return *((int *)CMSG_DATA(cmsg));
>   ^
> 
> Signed-off-by: Zhen Lei 
> ---
>  samples/seccomp/user-trap.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/samples/seccomp/user-trap.c b/samples/seccomp/user-trap.c
> index 20291ec6489f31e..e36696b7f41517f 100644
> --- a/samples/seccomp/user-trap.c
> +++ b/samples/seccomp/user-trap.c
> @@ -23,6 +23,8 @@
>  
>  #define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
>  
> +typedef int __attribute__((__may_alias__)) __int_alias_t;
> +
>  static int seccomp(unsigned int op, unsigned int flags, void *args)
>  {
>   errno = 0;
> @@ -47,7 +49,7 @@ static int send_fd(int sock, int fd)
>   cmsg->cmsg_level = SOL_SOCKET;
>   cmsg->cmsg_type = SCM_RIGHTS;
>   cmsg->cmsg_len = CMSG_LEN(sizeof(int));
> - *((int *)CMSG_DATA(cmsg)) = fd;
> + *(__int_alias_t *)CMSG_DATA(cmsg) = fd;
>   msg.msg_controllen = cmsg->cmsg_len;
>  
>   if (sendmsg(sock, &msg, 0) < 0) {
> @@ -80,7 +82,7 @@ static int recv_fd(int sock)
>  
>   cmsg = CMSG_FIRSTHDR(&msg);
>  
> - return *((int *)CMSG_DATA(cmsg));
> + return *(__int_alias_t *)CMSG_DATA(cmsg);
>  }
>  
>  static int user_trap_syscall(int nr, unsigned int flags)
> 
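
For comparison, the same warning can usually be avoided without any attribute
by copying the fd through memcpy() instead of dereferencing a cast pointer.
This is only a sketch of that alternative, not what the patch above does;
cmsg_put_fd() and cmsg_get_fd() are hypothetical helper names.

```c
#include <string.h>
#include <sys/socket.h>

/* Store an fd in the control message payload without type-punning. */
static void cmsg_put_fd(struct cmsghdr *cmsg, int fd)
{
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));
}

/* Read the fd back out of the control message payload. */
static int cmsg_get_fd(struct cmsghdr *cmsg)
{
	int fd;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}
```

The may_alias typedef keeps the diff smaller; the memcpy() form does not depend
on a compiler-specific attribute.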



Re: [PATCH v4 00/23] device-dax: Support sub-dividing soft-reserved ranges

2020-08-21 Thread Leizhen (ThunderTown)



On 8/22/2020 7:21 AM, Andrew Morton wrote:
> On Wed, 19 Aug 2020 18:53:57 -0700 Dan Williams  
> wrote:
> 
>>> I think I am missing some important pieces. Bear with me.
>>
>> No worries, also bear with me, I'm going to be offline intermittently
>> until at least mid-September. Hopefully Joao and/or Vishal can jump in
>> on this discussion.
> 
> Ordinarily I'd prefer a refresh for 2+ week-old series such as
> this.
> 
> But given that v4 all applies OK and that Dan has pending outages, I'll
> scoop up this version, even though at least one change has been suggested.
> 
> Also, this series has killed Zhen Lei's little cleanup
> (http://lkml.kernel.org/r/20200817065926.2239-1-thunder.leiz...@huawei.com).
> I don't think the affected code was moved elsewhere, so I'll drop that
> patch.

OK, this patch is really optional.

> 
> 
> .
> 



Re: [PATCH v3 2/3] irqchip: dw-apb-ictl: support hierarchy irq domain

2020-09-14 Thread Leizhen (ThunderTown)



On 2020/9/13 23:10, Marc Zyngier wrote:
> On Wed, 09 Sep 2020 07:58:35 +0100,
> Zhen Lei  wrote:
>>
>> Add support to use dw-apb-ictl as primary interrupt controller.
>>
>> Suggested-by: Marc Zyngier 
>> Signed-off-by: Zhen Lei 
>> Tested-by: Haoyu Lv 
>> ---
>>  drivers/irqchip/Kconfig   |  2 +-
>>  drivers/irqchip/irq-dw-apb-ictl.c | 76 +++
>>  2 files changed, 69 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
>> index bfc9719dbcdc..7c2d1c8fa551 100644
>> --- a/drivers/irqchip/Kconfig
>> +++ b/drivers/irqchip/Kconfig
>> @@ -148,7 +148,7 @@ config DAVINCI_CP_INTC
>>  config DW_APB_ICTL
>>  bool
>>  select GENERIC_IRQ_CHIP
>> -select IRQ_DOMAIN
>> +select IRQ_DOMAIN_HIERARCHY
>>  
>>  config FARADAY_FTINTC010
>>  bool
>> diff --git a/drivers/irqchip/irq-dw-apb-ictl.c 
>> b/drivers/irqchip/irq-dw-apb-ictl.c
>> index 5458004242e9..3c7bebe1b947 100644
>> --- a/drivers/irqchip/irq-dw-apb-ictl.c
>> +++ b/drivers/irqchip/irq-dw-apb-ictl.c
>> @@ -17,6 +17,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  #define APB_INT_ENABLE_L0x00
>>  #define APB_INT_ENABLE_H0x04
>> @@ -26,6 +27,27 @@
>>  #define APB_INT_FINALSTATUS_H   0x34
>>  #define APB_INT_BASE_OFFSET 0x04
>>  
>> +/* irq domain of the primary interrupt controller. */
>> +static struct irq_domain *dw_apb_ictl_irq_domain;
>> +
>> +static void __irq_entry dw_apb_ictl_handle_irq(struct pt_regs *regs)
>> +{
>> +struct irq_domain *d = dw_apb_ictl_irq_domain;
>> +int n;
>> +
>> +for (n = 0; n < d->revmap_size; n += 32) {
>> +struct irq_chip_generic *gc = irq_get_domain_generic_chip(d, n);
>> +u32 stat = readl_relaxed(gc->reg_base + APB_INT_FINALSTATUS_L);
>> +
>> +while (stat) {
>> +u32 hwirq = ffs(stat) - 1;
>> +
>> +handle_domain_irq(d, hwirq, regs);
>> +stat &= ~BIT(hwirq);
>> +}
>> +}
>> +}
>> +
>>  static void dw_apb_ictl_handle_irq_cascaded(struct irq_desc *desc)
>>  {
>>  struct irq_domain *d = irq_desc_get_handler_data(desc);
>> @@ -50,6 +72,30 @@ static void dw_apb_ictl_handle_irq_cascaded(struct 
>> irq_desc *desc)
>>  chained_irq_exit(chip, desc);
>>  }
>>  
>> +static int dw_apb_ictl_irq_domain_alloc(struct irq_domain *domain, unsigned 
>> int virq,
>> +unsigned int nr_irqs, void *arg)
>> +{
>> +int i, ret;
>> +irq_hw_number_t hwirq;
>> +unsigned int type = IRQ_TYPE_NONE;
>> +struct irq_fwspec *fwspec = arg;
>> +
>> +ret = irq_domain_translate_onecell(domain, fwspec, &hwirq, &type);
>> +if (ret)
>> +return ret;
>> +
>> +for (i = 0; i < nr_irqs; i++)
>> +irq_map_generic_chip(domain, virq + i, hwirq + i);
>> +
>> +return 0;
>> +}
>> +
>> +static const struct irq_domain_ops dw_apb_ictl_irq_domain_ops = {
>> +.translate = irq_domain_translate_onecell,
>> +.alloc = dw_apb_ictl_irq_domain_alloc,
>> +.free = irq_domain_free_irqs_top,
>> +};
>> +
>>  #ifdef CONFIG_PM
>>  static void dw_apb_ictl_resume(struct irq_data *d)
>>  {
>> @@ -75,13 +121,20 @@ static int __init dw_apb_ictl_init(struct device_node 
>> *np,
>>  void __iomem *iobase;
>>  int ret, nrirqs, parent_irq, i;
>>  u32 reg;
>> -const struct irq_domain_ops *domain_ops = &irq_generic_chip_ops;
>> -
>> -/* Map the parent interrupt for the chained handler */
>> -parent_irq = irq_of_parse_and_map(np, 0);
>> -if (parent_irq <= 0) {
>> -pr_err("%pOF: unable to parse irq\n", np);
>> -return -EINVAL;
>> +const struct irq_domain_ops *domain_ops;
>> +
>> +if (!parent || (np == parent)) {
>> +/* It's used as the primary interrupt controller */
>> +parent_irq = 0;
>> +domain_ops = &dw_apb_ictl_irq_domain_ops;
>> +} else {
>> +/* Map the parent interrupt for the chained handler */
>> +parent_irq = irq_of_parse_and_map(np, 0);
>> +if (parent_irq <= 0) {
>> +pr_err("%pOF: unable to parse irq\n", np);
>> +return -EINVAL;
>> +}
>> +domain_ops = &irq_generic_chip_ops;
>>  }
>>  
>>  ret = of_address_to_resource(np, 0, );
>> @@ -144,10 +197,17 @@ static int __init dw_apb_ictl_init(struct device_node 
>> *np,
>>  gc->chip_types[0].chip.irq_mask = irq_gc_mask_set_bit;
>>  gc->chip_types[0].chip.irq_unmask = irq_gc_mask_clr_bit;
>>  gc->chip_types[0].chip.irq_resume = dw_apb_ictl_resume;
>> +if (!parent_irq)
>> +gc->chip_types[0].chip.irq_eoi = irq_gc_noop;
> 
> Again: what is that for? The level flow doesn't use any EOI callback.

OK, I will remove it. Yes, irq_eoi is only needed by handle_fasteoi_irq().

> 
>>  }
>>  
>> -irq_set_chained_handler_and_data(parent_irq,
>> +if (parent_irq) {
>> 
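
The pending-bit walk in dw_apb_ictl_handle_irq() above can be tried in
userspace. This is a minimal sketch, assuming a single 32-bit status word;
collect_pending() is a hypothetical helper, and storing the bit number stands
in for the handle_domain_irq() call.

```c
#include <stddef.h>
#include <stdint.h>
#include <strings.h>

/*
 * Walk the set bits of a 32-bit status word from lowest to highest,
 * storing each bit number in out[] (room for 32 entries).
 * Returns the number of bits found.
 */
static size_t collect_pending(uint32_t stat, unsigned int *out)
{
	size_t count = 0;

	while (stat) {
		unsigned int hwirq = (unsigned int)ffs((int)stat) - 1;

		out[count++] = hwirq;			/* stands in for handle_domain_irq() */
		stat &= ~(UINT32_C(1) << hwirq);	/* clear the bit just handled */
	}
	return count;
}
```

Each iteration clears exactly one bit, so the loop runs at most 32 times per
status word.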

Re: [PATCH 3/3] ARM: dts: add SD5203 dts

2020-09-14 Thread Leizhen (ThunderTown)



On 2020/9/14 17:29, Wei Xu wrote:
> Hi Zhen,
> 
> On 2020/9/3 20:27, Zhen Lei wrote:
>> From: Kefeng Wang 
>>
>> Add sd5203.dts for Hisilicon SD5203 SoC platform.
>>
>> Signed-off-by: Kefeng Wang 
>> Signed-off-by: Zhen Lei 
>> ---
>>  arch/arm/boot/dts/Makefile   |  2 +
>>  arch/arm/boot/dts/sd5203.dts | 90 
>>  2 files changed, 92 insertions(+)
>>  create mode 100644 arch/arm/boot/dts/sd5203.dts
>>
>> diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
>> index 4572db3fa5ae..1d1262df5c55 100644
>> --- a/arch/arm/boot/dts/Makefile
>> +++ b/arch/arm/boot/dts/Makefile
>> @@ -357,6 +357,8 @@ dtb-$(CONFIG_ARCH_MPS2) += \
>>  mps2-an399.dtb
>>  dtb-$(CONFIG_ARCH_MOXART) += \
>>  moxart-uc7112lx.dtb
>> +dtb-$(CONFIG_ARCH_SD5203) += \
>> +sd5203.dtb
>>  dtb-$(CONFIG_SOC_IMX1) += \
>>  imx1-ads.dtb \
>>  imx1-apf9328.dtb
>> diff --git a/arch/arm/boot/dts/sd5203.dts b/arch/arm/boot/dts/sd5203.dts
>> new file mode 100644
>> index ..99da46072f72
>> --- /dev/null
>> +++ b/arch/arm/boot/dts/sd5203.dts
>> @@ -0,0 +1,90 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2020 Hisilicon Limited.
>> + *
>> + * DTS file for Hisilicon SD5203 Board
>> + */
>> +
>> +/dts-v1/;
>> +
>> +/ {
>> +model = "Hisilicon SD5203";
>> +compatible = "hisilicon,sd5203";
> 
> Can you please add the binding document as well?

OK, I will do it.

> 
>> +interrupt-parent = <>;
>> +#address-cells = <1>;
>> +#size-cells = <1>;
>> +
>> +chosen {
>> +bootargs="console=ttyS0,9600 
>> earlycon=uart8250,mmio32,0x1600d000";
>> +};
>> +
>> +aliases {
>> +serial0 = 
>> +};
>> +
>> +cpu {
>> +compatible = "arm,arm926ej-s";
>> +device_type = "cpu";
>> +};
>> +
>> +memory@3000 {
>> +device_type = "memory";
>> +reg = <0x3000 0x800>;
>> +};
>> +
>> +soc {
>> +#address-cells = <1>;
>> +#size-cells = <1>;
>> +compatible = "simple-bus";
>> +ranges;
>> +
>> +vic: interrupt-controller@1013 {
>> +compatible = "hisilicon,sd5203-vic";

Following Marc Zyngier's suggestion, I have dropped the independent SD5203-VIC
driver and instead made the dw-apb-ictl irqchip driver support a hierarchy irq
domain. I will send V4 of this irqchip driver today.

Here is the link to V3:
https://lkml.org/lkml/2020/9/9/94

> 
> Ditto.
> Thanks!
> 
> Best Regards,
> Wei
> 
>> +reg = <0x1013 0x1000>;
>> +interrupt-controller;
>> +#interrupt-cells = <1>;
>> +};
>> +
>> +refclk125mhz: refclk125mhz {
>> +compatible = "fixed-clock";
>> +#clock-cells = <0>;
>> +clock-frequency = <12500>;
>> +};
>> +
>> +timer0: timer@16002000 {
>> +compatible = "arm,sp804", "arm,primecell";
>> +reg = <0x16002000 0x1000>;
>> +interrupts = <4>;
>> +clocks = <>;
>> +clock-names = "apb_pclk";
>> +};
>> +
>> +timer1: timer@16003000 {
>> +compatible = "arm,sp804", "arm,primecell";
>> +reg = <0x16003000 0x1000>;
>> +interrupts = <5>;
>> +clocks = <>;
>> +clock-names = "apb_pclk";
>> +};
>> +
>> +uart0: serial@1600D000 {
>> +compatible = "snps,dw-apb-uart";
>> +reg = <0x1600D000 0x1000>;
>> +bus_id = "uart0";
>> +clocks = <>;
>> +clock-names = "apb_pclk";
>> +reg-shift = <2>;
>> +interrupts = <17>;
>> +};
>> +
>> +uart1: serial@1600C000 {
>> +compatible = "snps,dw-apb-uart";
>> +reg = <0x1600C000 0x1000>;
>> +clocks = <>;
>> +clock-names = "apb_pclk";
>> +reg-shift = <2>;
>> +interrupts = <16>;
>> +status = "disabled";
>> +};
>> +};
>> +};
>>
> 
> .
> 



Re: [PATCH v2 0/9] clocksource: sp804: add support for Hisilicon sp804 timer

2020-09-14 Thread Leizhen (ThunderTown)
Hi, Daniel Lezcano, Thomas Gleixner:
  Do you have time to review these patches?

On 2020/9/12 19:45, Zhen Lei wrote:
> v1 --> v2:
> 1. Split the Patch 3 of v1 into three patches: Patch 3-5
> 2. Change compatible "hisi,sp804" to "hisilicon,sp804" in Patch 7.
> 3. Add dt-binding description of "hisilicon,sp804", Patch 9
> 
> Other patches are not changed.
> 
> 
> v1:
> The ARM SP804 supports at most a 32-bit counter, but Hisilicon extends
> it to 64 bits. That means the registers TimerXload, TimerXValue and
> TimerXBGLoad are 64 bits wide; all other registers are the same as those
> in the SP804. The driver code can be completely reused except that the
> register offsets are different.
> 
> The register offset differences between ARM-SP804 and HISI-SP804 are as 
> follows:
> 
>     ARM-SP804                   HISI-SP804
> TIMER_LOAD      0x00            HISI_TIMER_LOAD     0x00
>                                 HISI_TIMER_LOAD_H   0x04
> TIMER_VALUE     0x04            HISI_TIMER_VALUE    0x08
>                                 HISI_TIMER_VALUE_H  0x0c
> TIMER_CTRL      0x08            HISI_TIMER_CTRL     0x10
> TIMER_INTCLR    0x0c            HISI_TIMER_INTCLR   0x14
> TIMER_RIS       0x10            HISI_TIMER_RIS      0x18
> TIMER_MIS       0x14            HISI_TIMER_MIS      0x1c
> TIMER_BGLOAD    0x18            HISI_TIMER_BGLOAD   0x20
>                                 HISI_TIMER_BGLOAD_H 0x24
> TIMER_2_BASE    0x20            HISI_TIMER_2_BASE   0x40
> 
> 
> To make the timer-sp804 driver support both ARM-SP804 and HISI-SP804,
> create a new structure "sp804_clkevt" that records the register addresses
> calculated in advance. This avoids checking the variant and calculating
> the register address at every place where a register is used.
> 
> For example:
>   struct sp804_timer arm_sp804_timer = {
>   .ctrl   = TIMER_CTRL,
>   };
> 
>   struct sp804_timer hisi_sp804_timer = {
>   .ctrl   = HISI_TIMER_CTRL,
>   };
> 
>   struct sp804_clkevt clkevt;
> 
> In the initialization phase:
>   if (hisi_sp804)
>   clkevt.ctrl = base + hisi_sp804_timer.ctrl;
>   else if (arm_sp804)
>   clkevt.ctrl = base + arm_sp804_timer.ctrl;
> 
> After initialization:
> - writel(0, base + TIMER_CTRL);
> + writel(0, clkevt.ctrl);
> 
> 
> Additional information:
> These patch series are the V2 of 
> https://lore.kernel.org/patchwork/cover/681876/
> And many of the main ideas in https://lore.kernel.org/patchwork/patch/681875/ 
> have been considered.
> Thanks for Daniel Lezcano's review comments.
> 
> Kefeng Wang (1):
>   clocksource: sp804: cleanup clk_get_sys()
> 
> Zhen Lei (8):
>   clocksource: sp804: remove unused sp804_timer_disable() and
> timer-sp804.h
>   clocksource: sp804: delete the leading "__" of some functions
>   clocksource: sp804: remove a mismatched comment
>   clocksource: sp804: prepare for support non-standard register offset
>   clocksource: sp804: support non-standard register offset
>   clocksource: sp804: add support for Hisilicon sp804 timer
>   clocksource: sp804: enable Hisilicon sp804 timer 64bit mode
>   dt-bindings: sp804: add support for Hisilicon sp804 timer
> 
>  .../devicetree/bindings/timer/arm,sp804.txt   |   2 +
>  drivers/clocksource/timer-sp.h|  47 +
>  drivers/clocksource/timer-sp804.c | 195 --
>  include/clocksource/timer-sp804.h |  29 ---
>  4 files changed, 181 insertions(+), 92 deletions(-)
>  delete mode 100644 include/clocksource/timer-sp804.h
> 
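
The offset table and the "calculate once" idea from the cover letter can be
modelled in plain C. This is only a sketch: the struct layouts, the
sp804_clkevt_init() helper and its second_timer parameter are assumptions for
illustration and not necessarily what the driver uses. The offsets are the
ones listed in the table above.

```c
#include <stdint.h>

/* Register offsets of one timer variant (values from the table above). */
struct sp804_timer {
	uint32_t load;
	uint32_t value;
	uint32_t ctrl;
	uint32_t intclr;
	uint32_t timer_2_base;
};

static const struct sp804_timer arm_sp804_timer = {
	.load = 0x00, .value = 0x04, .ctrl = 0x08, .intclr = 0x0c,
	.timer_2_base = 0x20,
};

static const struct sp804_timer hisi_sp804_timer = {
	.load = 0x00, .value = 0x08, .ctrl = 0x10, .intclr = 0x14,
	.timer_2_base = 0x40,
};

/* Register addresses calculated once for a given timer instance. */
struct sp804_clkevt {
	uintptr_t load;
	uintptr_t value;
	uintptr_t ctrl;
	uintptr_t intclr;
};

/* Hypothetical init helper: second_timer selects the second timer of the pair. */
static void sp804_clkevt_init(struct sp804_clkevt *clkevt,
			      const struct sp804_timer *timer,
			      uintptr_t base, int second_timer)
{
	if (second_timer)
		base += timer->timer_2_base;

	clkevt->load = base + timer->load;
	clkevt->value = base + timer->value;
	clkevt->ctrl = base + timer->ctrl;
	clkevt->intclr = base + timer->intclr;
}
```

After initialization, callers use clkevt->ctrl and friends directly, so the
ARM/Hisilicon distinction is made in exactly one place.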



Re: [PATCH 1/2] dt-bindings: interrupt-controller: add Hisilicon SD5203 vector interrupt controller

2020-09-15 Thread Leizhen (ThunderTown)



On 2020/9/15 4:31, Rob Herring wrote:
> On Thu, Sep 03, 2020 at 08:05:03PM +0800, Zhen Lei wrote:
>> Add DT bindings for the Hisilicon SD5203 vector interrupt controller.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  .../hisilicon,sd5203-vic.txt  | 27 +++
> 
> Bindings should be in DT schema format now.

Hi, Rob Herring:

Following Marc Zyngier's suggestion, I have dropped the independent SD5203-VIC
driver and instead made the dw-apb-ictl irqchip driver support a hierarchy irq
domain. So this new file has been dropped as well. In the later versions, I
update the description in the existing file "snps,dw-apb-ictl.txt" instead.

> 
>>  1 file changed, 27 insertions(+)
>>  create mode 100644 
>> Documentation/devicetree/bindings/interrupt-controller/hisilicon,sd5203-vic.txt
>>
>> diff --git 
>> a/Documentation/devicetree/bindings/interrupt-controller/hisilicon,sd5203-vic.txt
>>  
>> b/Documentation/devicetree/bindings/interrupt-controller/hisilicon,sd5203-vic.txt
>> new file mode 100644
>> index ..a08292e868b0
>> --- /dev/null
>> +++ 
>> b/Documentation/devicetree/bindings/interrupt-controller/hisilicon,sd5203-vic.txt
>> @@ -0,0 +1,27 @@
>> +Hisilicon SD5203 vector interrupt controller (VIC)
>> +
>> +Hisilicon SD5203 VIC based on Synopsys DesignWare APB interrupt controller, 
>> but
>> +there's something special:
>> +1. The maximum number of irqs supported is 32. The registers ENABLE, MASK 
>> and
>> +   FINALSTATUS are 32 bits.
>> +2. There is only one VIC, it's used as primary interrupt controller.
>> +
>> +Required properties:
>> +- compatible: shall be "hisilicon,sd5203-vic"
>> +- reg: physical base address of the controller and length of memory mapped
>> +  region starting with ENABLE_LOW register
>> +- interrupt-controller: identifies the node as an interrupt controller
>> +- #interrupt-cells: number of cells to encode an interrupt-specifier, shall 
>> be 1
>> +
>> +The interrupt sources map to the corresponding bits in the interrupt
>> +registers, i.e.
>> +- 0 maps to bit 0 of low interrupts,
>> +- 1 maps to bit 1 of low interrupts,
>> +
>> +Example:
>> +vic: interrupt-controller@1013 {
>> +compatible = "hisilicon,sd5203-vic";
>> +reg = <0x1013 0x1000>;
>> +interrupt-controller;
>> +#interrupt-cells = <1>;
>> +};
>> -- 
>> 2.26.0.106.g9fadedd
>>
>>
> 
> .
> 



Re: [PATCH v4 1/1] libnvdimm: fix memory leaks in of_pmem.c

2020-09-19 Thread Leizhen (ThunderTown)
Hi, all:
  Is this patch acceptable?


On 2020/9/1 16:14, Zhen Lei wrote:
> Currently, in the last error path of of_pmem_region_probe() and in
> of_pmem_region_remove(), free the memory allocated by kstrdup() is
> missing. Add kfree(priv->bus_desc.provider_name) to fix it.
> 
> In addition, add a sanity check to kstrdup() to prevent a
> NULL-pointer dereference.
> 
> Fixes: 49bddc73d15c ("libnvdimm/of_pmem: Provide a unique name for bus 
> provider")
> Signed-off-by: Zhen Lei 
> Reviewed-by: Oliver O'Halloran 
> ---
>  drivers/nvdimm/of_pmem.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/nvdimm/of_pmem.c b/drivers/nvdimm/of_pmem.c
> index 10dbdcdfb9ce913..13c4c274ca6ea88 100644
> --- a/drivers/nvdimm/of_pmem.c
> +++ b/drivers/nvdimm/of_pmem.c
> @@ -31,11 +31,17 @@ static int of_pmem_region_probe(struct platform_device 
> *pdev)
>   return -ENOMEM;
>  
>   priv->bus_desc.provider_name = kstrdup(pdev->name, GFP_KERNEL);
> + if (!priv->bus_desc.provider_name) {
> + kfree(priv);
> + return -ENOMEM;
> + }
> +
>   priv->bus_desc.module = THIS_MODULE;
>   priv->bus_desc.of_node = np;
>  
>   priv->bus = bus = nvdimm_bus_register(&pdev->dev, &priv->bus_desc);
>   if (!bus) {
> + kfree(priv->bus_desc.provider_name);
>   kfree(priv);
>   return -ENODEV;
>   }
> @@ -83,6 +89,7 @@ static int of_pmem_region_remove(struct platform_device 
> *pdev)
>   struct of_pmem_private *priv = platform_get_drvdata(pdev);
>  
>   nvdimm_bus_unregister(priv->bus);
> + kfree(priv->bus_desc.provider_name);
>   kfree(priv);
>  
>   return 0;
> 
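
The cleanup order in the patch above can be illustrated with a userspace
analogue. This is a minimal sketch: fake_priv, fake_bus_register(),
fake_probe() and fake_remove() are hypothetical stand-ins, with malloc()/free()
in place of the kernel allocators.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct fake_priv {
	char *provider_name;
	void *bus;
};

/* Hypothetical stand-in for bus registration; returns NULL on failure. */
static void *fake_bus_register(struct fake_priv *priv, int should_fail)
{
	return should_fail ? NULL : priv;
}

static int fake_probe(const char *name, int register_fails,
		      struct fake_priv **out)
{
	struct fake_priv *priv = calloc(1, sizeof(*priv));
	size_t len = strlen(name) + 1;

	if (!priv)
		return -ENOMEM;

	/* Check the duplicated name before it is ever used. */
	priv->provider_name = malloc(len);
	if (!priv->provider_name) {
		free(priv);
		return -ENOMEM;
	}
	memcpy(priv->provider_name, name, len);

	priv->bus = fake_bus_register(priv, register_fails);
	if (!priv->bus) {
		/* Free in reverse order of allocation on the error path. */
		free(priv->provider_name);
		free(priv);
		return -ENODEV;
	}

	*out = priv;
	return 0;
}

static void fake_remove(struct fake_priv *priv)
{
	/* The name is owned by priv, so it must be released here too. */
	free(priv->provider_name);
	free(priv);
}
```

Every path that frees priv also frees provider_name first, which is the
pairing the patch adds.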



Re: [PATCH v3 0/9] clocksource: sp804: add support for Hisilicon sp804 timer

2020-09-19 Thread Leizhen (ThunderTown)



On 2020/9/19 19:51, Daniel Lezcano wrote:
> On 18/09/2020 15:22, Zhen Lei wrote:
> 
> [ ... ]
> 
>>
>> Zhen Lei (8):
>>   clocksource: sp804: remove unused sp804_timer_disable() and
>> timer-sp804.h
>>   clocksource: sp804: delete the leading "__" of some functions
>>   clocksource: sp804: remove a mismatched comment
>>   clocksource: sp804: prepare for support non-standard register offset
>>   clocksource: sp804: support non-standard register offset
>>   clocksource: sp804: add support for Hisilicon sp804 timer
>>   clocksource: sp804: enable Hisilicon sp804 timer 64bit mode
>>   dt-bindings: sp804: add support for Hisilicon sp804 timer
> 
> Applied all patches, except 9/9 which should go through Rob's tree as
> the yaml conversion is there.

OK, I will send 9/9 to Rob Herring individually. Thanks.

> 
> 



Re: [PATCH v3 9/9] dt-bindings: sp804: add support for Hisilicon sp804 timer

2020-09-19 Thread Leizhen (ThunderTown)



On 2020/9/19 19:43, Daniel Lezcano wrote:
> On 18/09/2020 15:22, Zhen Lei wrote:
>> Some Hisilicon SoCs, such as Hi1212, use the Hisilicon extended sp804
>> timer.
>>
>> Signed-off-by: Zhen Lei 
>> ---
> 
> I'm not able to apply this patch, the file does not exists.

Hi Rob Herring:
  I will send you a new one, because I didn't notice that there was a
"select" property in arm,sp804.yaml.

> 
> 



Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary

2020-09-20 Thread Leizhen (ThunderTown)



On 2020/9/17 22:00, Ard Biesheuvel wrote:
> On Tue, 15 Sep 2020 at 22:06, Russell King - ARM Linux admin
>  wrote:
>>
>> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote:
>>> Currently, only support the kernels where the base of physical memory is
>>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits
>>> unrotated value. But we can use one more "add/sub" instructions to handle
>>> bits 23-16. The performance will be slightly affected.
>>>
>>> Since most boards meet 16 MiB alignment, so add a new configuration
>>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if
>>> anyone really needs it.
>>>
>>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are
>>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in
>>> the whole head.S file. So choose it.
>>>
>>> Because the calculation of "y = x + __pv_offset[63:24]" have been done,
>>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why
>>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub()
>>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t"
>>> (above y).
>>>
>>> Signed-off-by: Zhen Lei 
>>> ---
>>>  arch/arm/Kconfig  | 18 +-
>>>  arch/arm/include/asm/memory.h | 16 +---
>>>  arch/arm/kernel/head.S| 25 +++--
>>>  3 files changed, 49 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>>> index e00d94b16658765..19fc2c746e2ce29 100644
>>> --- a/arch/arm/Kconfig
>>> +++ b/arch/arm/Kconfig
>>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT
>>> kernel in system memory.
>>>
>>> This can only be used with non-XIP MMU kernels where the base
>>> -   of physical memory is at a 16MB boundary.
>>> +   of physical memory is at a 16MiB boundary.
>>>
>>> Only disable this option if you know that you do not require
>>> this feature (eg, building a kernel for a single machine) and
>>> you need to shrink the kernel to the minimal size.
>>>
>>> +config ARM_PATCH_PHYS_VIRT_RADICAL
>>> + bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary"
>>> + default n
>>
>> Please drop the "default n" - this is the default anyway.
>>
>>> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t 
>>> x)
>>>* in place where 'r' 32 bit operand is expected.
>>>*/
>>>   __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24);
>>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL
>>> + __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16);
>>
>> t is already unsigned long, so this cast is not necessary.
>>
>> I've been debating whether it would be better to use "movw" for this
>> for ARMv7.  In other words:
>>
>> movw    tmp, #16-bit
>> adds    %Q0, %1, tmp, lsl #16
>> adc     %R0, %R0, #0
>>
>> It would certainly be less instructions, but at the cost of an
>> additional register - and we'd have to change the fixup code to
>> know about movw.
>>
>> Thoughts?
>>
> 
> Since LPAE implies v7, we can use movw unconditionally, which is nice.
> 
> There is no need to use an additional temp register, as we can use the
> register holding the high word. (There is no need for the mov_hi macro
> to be separate)
> 
> 0: movw    %R0, #low offset >> 16
>    adds    %Q0, %1, %R0, lsl #16
> 1: mov     %R0, #high offset
>    adc     %R0, %R0, #0
>.pushsection .pv_table,"a"
>.long 0b, 1b
>.popsection
> 
> The only problem is distinguishing the two mov instructions from each

We could also consider using movw for the #high offset; it only saves two
bytes in the Thumb-2 scenario. We can store different imm16 values for
high_offset and low_offset, so that we can distinguish them in
__fixup_a_pv_table().

This will make the final implementation look clearer and more consistent,
especially for THUMB2.

Let me try it.

> other, but that should not be too hard I think.
> 
> .
> 



Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary

2020-09-21 Thread Leizhen (ThunderTown)



On 2020/9/21 14:47, Ard Biesheuvel wrote:
> On Mon, 21 Sep 2020 at 05:35, Leizhen (ThunderTown)
>  wrote:
>>
>>
>>
>> On 2020/9/17 22:00, Ard Biesheuvel wrote:
>>> On Tue, 15 Sep 2020 at 22:06, Russell King - ARM Linux admin
>>>  wrote:
>>>>
>>>> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote:
>>>>> Currently, only support the kernels where the base of physical memory is
>>>>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits
>>>>> unrotated value. But we can use one more "add/sub" instructions to handle
>>>>> bits 23-16. The performance will be slightly affected.
>>>>>
>>>>> Since most boards meet 16 MiB alignment, so add a new configuration
>>>>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if
>>>>> anyone really needs it.
>>>>>
>>>>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are
>>>>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in
>>>>> the whole head.S file. So choose it.
>>>>>
>>>>> Because the calculation of "y = x + __pv_offset[63:24]" have been done,
>>>>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why
>>>>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub()
>>>>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t"
>>>>> (above y).
>>>>>
>>>>> Signed-off-by: Zhen Lei 
>>>>> ---
>>>>>  arch/arm/Kconfig  | 18 +-
>>>>>  arch/arm/include/asm/memory.h | 16 +---
>>>>>  arch/arm/kernel/head.S| 25 +++--
>>>>>  3 files changed, 49 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>>>>> index e00d94b16658765..19fc2c746e2ce29 100644
>>>>> --- a/arch/arm/Kconfig
>>>>> +++ b/arch/arm/Kconfig
>>>>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT
>>>>> kernel in system memory.
>>>>>
>>>>> This can only be used with non-XIP MMU kernels where the base
>>>>> -   of physical memory is at a 16MB boundary.
>>>>> +   of physical memory is at a 16MiB boundary.
>>>>>
>>>>> Only disable this option if you know that you do not require
>>>>> this feature (eg, building a kernel for a single machine) and
>>>>> you need to shrink the kernel to the minimal size.
>>>>>
>>>>> +config ARM_PATCH_PHYS_VIRT_RADICAL
>>>>> + bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary"
>>>>> + default n
>>>>
>>>> Please drop the "default n" - this is the default anyway.
>>>>
>>>>> @@ -236,6 +243,9 @@ static inline unsigned long 
>>>>> __phys_to_virt(phys_addr_t x)
>>>>>* in place where 'r' 32 bit operand is expected.
>>>>>*/
>>>>>   __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24);
>>>>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL
>>>>> + __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16);
>>>>
>>>> t is already unsigned long, so this cast is not necessary.
>>>>
>>>> I've been debating whether it would be better to use "movw" for this
>>>> for ARMv7.  In other words:
>>>>
>>>> movw    tmp, #16-bit
>>>> adds    %Q0, %1, tmp, lsl #16
>>>> adc     %R0, %R0, #0
>>>>
>>>> It would certainly be less instructions, but at the cost of an
>>>> additional register - and we'd have to change the fixup code to
>>>> know about movw.
>>>>
>>>> Thoughts?
>>>>
>>>
>>> Since LPAE implies v7, we can use movw unconditionally, which is nice.
>>>
>>> There is no need to use an additional temp register, as we can use the
>>> register holding the high word. (There is no need for the mov_hi macro
>>> to be separate)
>>>
>>> 0: movw    %R0, #low offset >> 16
>>>    adds    %Q0, %1, %R0, lsl #16
>>> 1: mov     %R0, #high offset
>>>    adc     %R0, %R0, #0
>>>.pushsection .pv_table,"a"
>>>.long 0b, 1b
>>>.popsection
>>>
>>> The only problem is distinguishing the two mov instructions from each
>>
>> The #high offset could also use movw; it just saves two bytes in the
>> Thumb-2 scenario. We can store different imm16 values for high_offset
>> and low_offset, so that we can distinguish them in __fixup_a_pv_table().
>>
>> This will make the final implementation look clearer and more
>> consistent, especially for THUMB2.
>>
>> Let me try it.
>>
> 
> Hello Zhen Lei,
> 
> I am looking into this as well:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm-p2v-v2
> 
> Could you please test this version on your hardware?

OK, I will test it on my boards.

> 
> .
> 



Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary

2020-09-15 Thread Leizhen (ThunderTown)



On 2020/9/16 3:01, Russell King - ARM Linux admin wrote:
> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote:
>> Currently, only kernels where the base of physical memory is at a 16MiB
>> boundary are supported, because the add/sub instructions only contain an
>> 8-bit unrotated value. But we can use one more "add/sub" instruction to
>> handle bits 23-16. The performance will be slightly affected.
>>
>> Since most boards meet 16 MiB alignment, add a new configuration
>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y only
>> if you really need it.
>>
>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are
>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in
>> the whole head.S file. So choose it.
>>
>> Because the calculation of "y = x + __pv_offset[63:24]" has already been
>> done, we only need to calculate "y = y + __pv_offset[23:16]". That is why
>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub()
>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t"
>> (the y above).
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  arch/arm/Kconfig  | 18 +-
>>  arch/arm/include/asm/memory.h | 16 +---
>>  arch/arm/kernel/head.S| 25 +++--
>>  3 files changed, 49 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>> index e00d94b16658765..19fc2c746e2ce29 100644
>> --- a/arch/arm/Kconfig
>> +++ b/arch/arm/Kconfig
>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT
>>kernel in system memory.
>>  
>>This can only be used with non-XIP MMU kernels where the base
>> -  of physical memory is at a 16MB boundary.
>> +  of physical memory is at a 16MiB boundary.
>>  
>>Only disable this option if you know that you do not require
>>this feature (eg, building a kernel for a single machine) and
>>you need to shrink the kernel to the minimal size.
>>  
>> +config ARM_PATCH_PHYS_VIRT_RADICAL
>> +bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary"
>> +default n
> 
> Please drop the "default n" - this is the default anyway.

OK, I will remove it.

> 
>> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x)
>>   * in place where 'r' 32 bit operand is expected.
>>   */
>>  __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24);
>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL
>> +__pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16);
> 
> t is already unsigned long, so this cast is not necessary.

Oh, yes, yes, I copied from the above statement, but forgot to remove it.

> 
> I've been debating whether it would be better to use "movw" for this
> for ARMv7.  In other words:
> 
>   movw    tmp, #16-bit
>   adds    %Q0, %1, tmp, lsl #16
>   adc     %R0, %R0, #0
> 
> It would certainly be less instructions, but at the cost of an
> additional register - and we'd have to change the fixup code to
> know about movw.

With movw, it's one instruction fewer for a 64KiB boundary &&
(sizeof(phys_addr_t) == 8), no change for a 64KiB boundary &&
(sizeof(phys_addr_t) == 4), but one more instruction for a 16MiB boundary.

Also, 16MiB alignment is probably widely used, while 64KiB is rarely used.

So I'm inclined to keep the current revision.
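
For anyone following the arithmetic, here is a rough userspace sketch (not
kernel code) of why a 64KiB-aligned offset needs a second add/sub. It assumes
a 32-bit address and models each patched instruction as adding one 8-bit
field of the offset; the helper names imm8_at(), pv_add_16m() and
pv_add_64k() are made up for illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one add/sub immediate carries 8 bits of the offset. */
static uint32_t imm8_at(uint32_t offset, unsigned int shift)
{
	assert(shift == 16 || shift == 24);
	return offset & (0xffu << shift);
}

/* One stub: only bits 31..24 are applied (16MiB granularity). */
static uint32_t pv_add_16m(uint32_t x, uint32_t pv_offset)
{
	return x + imm8_at(pv_offset, 24);
}

/* Two stubs: bits 31..24 first, then bits 23..16 (64KiB granularity). */
static uint32_t pv_add_64k(uint32_t x, uint32_t pv_offset)
{
	uint32_t t = pv_add_16m(x, pv_offset);

	/* The second step takes "t" as both source and destination. */
	return t + imm8_at(pv_offset, 16);
}
```

With an offset whose bits 23..16 are zero, both helpers give the same result,
which is the 16MiB case that pays for one extra instruction without needing it.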

> 
> Thoughts?
> 



Re: [PATCH v2] dt-bindings: leds: Document commonly used LED triggers

2020-12-12 Thread Leizhen (ThunderTown)



On 2020/12/10 16:24, Manivannan Sadhasivam wrote:
> This commit documents the LED triggers used commonly in the SoCs. Not
> all triggers are documented as some of them are very application specific.
> Most of the triggers documented here are currently used in devicetrees
> of many SoCs.
> 
> While at it, let's also sort the triggers in ascending order.
> 
> Signed-off-by: Manivannan Sadhasivam 
> ---
> 
> Changes in v2:
> 
> * Added more triggers, fixed the regex
> * Sorted triggers in ascending order
> 
>  .../devicetree/bindings/leds/common.yaml  | 78 ++-
>  1 file changed, 60 insertions(+), 18 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/leds/common.yaml 
> b/Documentation/devicetree/bindings/leds/common.yaml
> index f1211e7045f1..3c2e2208c1da 100644
> --- a/Documentation/devicetree/bindings/leds/common.yaml
> +++ b/Documentation/devicetree/bindings/leds/common.yaml
> @@ -79,24 +79,66 @@ properties:
>the LED.
>  $ref: /schemas/types.yaml#definitions/string
>  
> -enum:
> -# LED will act as a back-light, controlled by the framebuffer system
> -  - backlight
> -# LED will turn on (but for leds-gpio see "default-state" property in
> -# Documentation/devicetree/bindings/leds/leds-gpio.yaml)
> -  - default-on
> -# LED "double" flashes at a load average based rate
> -  - heartbeat
> -# LED indicates disk activity
> -  - disk-activity
> -# LED indicates IDE disk activity (deprecated), in new 
> implementations
> -# use "disk-activity"
> -  - ide-disk
> -# LED flashes at a fixed, configurable rate
> -  - timer
> -# LED alters the brightness for the specified duration with one 
> software
> -# timer (requires "led-pattern" property)
> -  - pattern
> +oneOf:
> +  - items:
> +  - enum:
> +# LED indicates mic mute state
> +  - audio-micmute
> +# LED indicates audio mute state
> +  - audio-mute
> +# LED will act as a back-light, controlled by the 
> framebuffer system
> +  - backlight
> +# LED indicates bluetooth power state
> +  - bluetooth-power
> +# LED indicates activity of all CPUs
> +  - cpu
> +# LED will turn on (but for leds-gpio see "default-state" 
> property in
> +# Documentation/devicetree/bindings/leds/leds-gpio.yaml)
> +  - default-on
> +# LED indicates disk activity
> +  - disk-activity
> +# LED indicates disk read activity
> +  - disk-read
> +# LED indicates disk write activity
> +  - disk-write
> +# LED indicates camera flash state
> +  - flash
> +# LED "double" flashes at a load average based rate
> +  - heartbeat
> +# LED indicates IDE disk activity (deprecated), in new 
> implementations
> +# use "disk-activity"
> +  - ide-disk
> +# LED indicates MTD memory activity
> +  - mtd
> +# LED indicates NAND memory activity (deprecated),
> +# in new implementations use "mtd"
> +  - nand-disk
> +# No trigger assigned to the LED. This is the default mode
> +# if trigger is absent
> +  - none
> +# LED alters the brightness for the specified duration with 
> one software
> +# timer (requires "led-pattern" property)
> +  - pattern
> +# LED flashes at a fixed, configurable rate
> +  - timer
> +# LED indicates camera torch state
> +  - torch
> +# LED indicates USB gadget activity
> +  - usb-gadget
> +# LED indicates USB host activity
> +  - usb-host
> +  - items:
> +# LED indicates activity of [N]th CPU
> +  - pattern: "^cpu[0-9]{1,2}$"
> +  - items:
> +# LED indicates power status of [N]th Bluetooth HCI device
> +  - pattern: "^hci[0-9]{1,2}-power$"
> +  - items:
> +# LED indicates [N]th MMC storage activity
> +  - pattern: "^mmc[0-9]{1,2}$"
> +  - items:
> +# LED indicates [N]th WLAN Tx activity
> +  - pattern: "^phy[0-9]{1,2}tx$"

Only the following three triggers are not listed:
phy0rx
ir-power-click
ir-user-click

And the first one can easily be supported by:
-# LED indicates [N]th WLAN Tx activity
-  - pattern: "^phy[0-9]{1,2}tx$"
+# LED indicates [N]th WLAN Tx/Rx activity
+  - pattern: "^phy[0-9]{1,2}(tx|rx)$"
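
As a quick sanity check of the extended pattern, here is a small sketch using
the POSIX regcomp()/regexec() interface. The binding schema is evaluated with
a JSON Schema regex dialect rather than POSIX ERE, so this only assumes that
the two agree for these simple constructs; is_phy_trigger() is a made-up
helper name.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if 'name' matches the Tx/Rx pattern suggested above. */
static bool is_phy_trigger(const char *name)
{
	regex_t re;
	bool match;

	assert(name != NULL);

	/* REG_EXTENDED enables the {1,2} interval and the (tx|rx) alternation. */
	if (regcomp(&re, "^phy[0-9]{1,2}(tx|rx)$", REG_EXTENDED | REG_NOSUB))
		return false;

	match = regexec(&re, name, 0, NULL, 0) == 0;
	regfree(&re);
	return match;
}
```

In this model "phy0rx" and "phy12tx" are accepted, while "phy123tx" and the
two ir-* triggers are rejected, so those would still need their own entries.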

Tested-by: Zhen Lei 

>  
>led-pattern:
>  description: |
> 



Re: [PATCH v2] dt-bindings: leds: Document commonly used LED triggers

2020-12-12 Thread Leizhen (ThunderTown)



On 2020/12/13 10:39, Leizhen (ThunderTown) wrote:
> 
> 
> On 2020/12/10 16:24, Manivannan Sadhasivam wrote:
>> This commit documents the LED triggers used commonly in the SoCs. Not
>> all triggers are documented as some of them are very application specific.
>> Most of the triggers documented here are currently used in devicetrees
>> of many SoCs.
>>
>> While at it, let's also sort the triggers in ascending order.
>>
>> Signed-off-by: Manivannan Sadhasivam 
>> ---
>>
>> Changes in v2:
>>
>> * Added more triggers, fixed the regex
>> * Sorted triggers in ascending order
>>
>>  .../devicetree/bindings/leds/common.yaml  | 78 ++-
>>  1 file changed, 60 insertions(+), 18 deletions(-)
>>
>> diff --git a/Documentation/devicetree/bindings/leds/common.yaml 
>> b/Documentation/devicetree/bindings/leds/common.yaml
>> index f1211e7045f1..3c2e2208c1da 100644
>> --- a/Documentation/devicetree/bindings/leds/common.yaml
>> +++ b/Documentation/devicetree/bindings/leds/common.yaml
>> @@ -79,24 +79,66 @@ properties:
>>the LED.
>>  $ref: /schemas/types.yaml#definitions/string
>>  
>> -enum:
>> -# LED will act as a back-light, controlled by the framebuffer system
>> -  - backlight
>> -# LED will turn on (but for leds-gpio see "default-state" property 
>> in
>> -# Documentation/devicetree/bindings/leds/leds-gpio.yaml)
>> -  - default-on
>> -# LED "double" flashes at a load average based rate
>> -  - heartbeat
>> -# LED indicates disk activity
>> -  - disk-activity
>> -# LED indicates IDE disk activity (deprecated), in new 
>> implementations
>> -# use "disk-activity"
>> -  - ide-disk
>> -# LED flashes at a fixed, configurable rate
>> -  - timer
>> -# LED alters the brightness for the specified duration with one 
>> software
>> -# timer (requires "led-pattern" property)
>> -  - pattern
>> +oneOf:
>> +  - items:
>> +  - enum:
>> +# LED indicates mic mute state
>> +  - audio-micmute
>> +# LED indicates audio mute state
>> +  - audio-mute
>> +# LED will act as a back-light, controlled by the 
>> framebuffer system
>> +  - backlight
>> +# LED indicates bluetooth power state
>> +  - bluetooth-power
>> +# LED indicates activity of all CPUs
>> +  - cpu
>> +# LED will turn on (but for leds-gpio see "default-state" 
>> property in
>> +# Documentation/devicetree/bindings/leds/leds-gpio.yaml)
>> +  - default-on
>> +# LED indicates disk activity
>> +  - disk-activity
>> +# LED indicates disk read activity
>> +  - disk-read
>> +# LED indicates disk write activity
>> +  - disk-write
>> +# LED indicates camera flash state
>> +  - flash
>> +# LED "double" flashes at a load average based rate
>> +  - heartbeat
>> +# LED indicates IDE disk activity (deprecated), in new 
>> implementations
>> +# use "disk-activity"
>> +  - ide-disk
>> +# LED indicates MTD memory activity
>> +  - mtd
>> +# LED indicates NAND memory activity (deprecated),
>> +# in new implementations use "mtd"
>> +  - nand-disk
>> +# No trigger assigned to the LED. This is the default mode
>> +# if trigger is absent
>> +  - none
>> +# LED alters the brightness for the specified duration with 
>> one software
>> +# timer (requires "led-pattern" property)
>> +  - pattern
>> +# LED flashes at a fixed, configurable rate
>> +  - timer
>> +# LED indicates camera torch state
>> +  - torch
>> +# LED indicates USB gadget activity
>> +  - usb-gadget
>> +# LED indicates USB host activity
>> +  - usb-host
>> +  - items:
>> +# LED indicates activity of [N]th CPU
>> +  - pattern: &

Re: [PATCH 1/1] ARM: LPAE: use phys_addr_t instead of unsigned long in outercache hooks

2020-12-25 Thread Leizhen (ThunderTown)



On 2020/12/25 19:44, Zhen Lei wrote:
> The outercache of some Hisilicon SOCs supports physical addresses wider
> than 32 bits. The unsigned long datatype is not sufficient for mapping
> physical addresses >= 4GB. The commit ad6b9c9d78b9 ("ARM: 6671/1: LPAE:
> use phys_addr_t instead of unsigned long in outercache functions") has
> already modified the outercache functions, but the parameters of the
> outercache hooks were not changed. This patch uses phys_addr_t instead of
> unsigned long in the outercache hooks: inv_range, clean_range, flush_range.
> 
> To ensure that outercaches which do not support LPAE keep working properly,
> cast phys_addr_t to unsigned long by adding a middle-tier function.
> For example:
> -static void l2c220_inv_range(unsigned long start, unsigned long end)
> +static void __l2c220_inv_range(unsigned long start, unsigned long end)
>  {
>   ...
>  }
> +static void l2c220_inv_range(phys_addr_t start, phys_addr_t end)
> +{
> +  __l2c220_inv_range(start, end);
> +}
> 
> Note that the outercache functions have been doing this cast before this
> patch. So now, the cast is just moved to the middle-tier function.
> 
> No functional change.

This patch will affect outercache drivers that have not been merged into
the kernel. They would also need to update the datatype of their outercache
hooks.

Another, compatible solution is to add three new outercache hooks, as follows:

diff --git a/arch/arm/include/asm/outercache.h b/arch/arm/include/asm/outercache.h
index 3364637755e86aa..83344d0428fa5b6 100644
--- a/arch/arm/include/asm/outercache.h
+++ b/arch/arm/include/asm/outercache.h
@@ -17,6 +17,9 @@ struct outer_cache_fns {
 void (*inv_range)(unsigned long, unsigned long);
 void (*clean_range)(unsigned long, unsigned long);
 void (*flush_range)(unsigned long, unsigned long);
+  void (*lpae_inv_range)(phys_addr_t, phys_addr_t);
+  void (*lpae_clean_range)(phys_addr_t, phys_addr_t);
+  void (*lpae_flush_range)(phys_addr_t, phys_addr_t);
 void (*flush_all)(void);
 void (*disable)(void);
 #ifdef CONFIG_OUTER_CACHE_SYNC
@@ -41,6 +44,8 @@ static inline void outer_inv_range(phys_addr_t start, phys_addr_t end)
 {
 if (outer_cache.inv_range)
 outer_cache.inv_range(start, end);
+  else if (outer_cache.lpae_inv_range)
+  outer_cache.lpae_inv_range(start, end);
 }

 /**
@@ -52,6 +57,8 @@ static inline void outer_clean_range(phys_addr_t start, phys_addr_t end)
 {
 if (outer_cache.clean_range)
 outer_cache.clean_range(start, end);
+  else if (outer_cache.lpae_clean_range)
+  outer_cache.lpae_clean_range(start, end);
 }

 /**
@@ -63,6 +70,8 @@ static inline void outer_flush_range(phys_addr_t start, phys_addr_t end)
 {
 if (outer_cache.flush_range)
 outer_cache.flush_range(start, end);
+  else if (outer_cache.lpae_flush_range)
+  outer_cache.lpae_flush_range(start, end);
 }

 /**
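
To show the difference between the two hook flavours, here is a rough
userspace model of the dispatch in the diff above. It is only a sketch: the
32-bit ARM "unsigned long" is modelled with uint32_t (the type name
arm_ulong, the recording variables and the two driver functions are made up),
and only the inv_range pair is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* models LPAE: 64-bit physical addresses */
typedef uint32_t arm_ulong;	/* models "unsigned long" on 32-bit ARM */

struct outer_cache_fns {
	void (*inv_range)(arm_ulong, arm_ulong);		/* existing hook */
	void (*lpae_inv_range)(phys_addr_t, phys_addr_t);	/* proposed hook */
};

static struct outer_cache_fns outer_cache;
static phys_addr_t last_start, last_end;	/* what the driver actually saw */

/* A driver that only understands 32-bit addresses. */
static void legacy_inv_range(arm_ulong start, arm_ulong end)
{
	last_start = start;
	last_end = end;
}

/* A driver that handles the full physical address. */
static void wide_inv_range(phys_addr_t start, phys_addr_t end)
{
	last_start = start;
	last_end = end;
}

/* Same order as in the diff: existing hook first, then the LPAE hook. */
static void outer_inv_range(phys_addr_t start, phys_addr_t end)
{
	assert(start <= end);

	if (outer_cache.inv_range)
		outer_cache.inv_range((arm_ulong)start, (arm_ulong)end);
	else if (outer_cache.lpae_inv_range)
		outer_cache.lpae_inv_range(start, end);
}
```

In this model a driver registered through the existing hook sees addresses
above 4GB truncated to their low 32 bits, while one registered through the
lpae_ hook receives them unchanged, and unmodified drivers keep building.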



> 
> Signed-off-by: Zhen Lei 
> ---
>  arch/arm/include/asm/outercache.h |  6 +--
>  arch/arm/mm/cache-feroceon-l2.c   | 21 --
>  arch/arm/mm/cache-l2x0.c  | 83 
> ---
>  arch/arm/mm/cache-tauros2.c   | 21 --
>  arch/arm/mm/cache-uniphier.c  |  6 +--
>  arch/arm/mm/cache-xsc3l2.c| 21 --
>  6 files changed, 129 insertions(+), 29 deletions(-)
> 
> diff --git a/arch/arm/include/asm/outercache.h 
> b/arch/arm/include/asm/outercache.h
> index 3364637755e86aa..4cee1ea0c15449a 100644
> --- a/arch/arm/include/asm/outercache.h
> +++ b/arch/arm/include/asm/outercache.h
> @@ -14,9 +14,9 @@
>  struct l2x0_regs;
>  
>  struct outer_cache_fns {
> - void (*inv_range)(unsigned long, unsigned long);
> - void (*clean_range)(unsigned long, unsigned long);
> - void (*flush_range)(unsigned long, unsigned long);
> + void (*inv_range)(phys_addr_t, phys_addr_t);
> + void (*clean_range)(phys_addr_t, phys_addr_t);
> + void (*flush_range)(phys_addr_t, phys_addr_t);
>   void (*flush_all)(void);
>   void (*disable)(void);
>  #ifdef CONFIG_OUTER_CACHE_SYNC
> diff --git a/arch/arm/mm/cache-feroceon-l2.c b/arch/arm/mm/cache-feroceon-l2.c
> index 5c1b7a7b9af6300..ab1d8051bf832c9 100644
> --- a/arch/arm/mm/cache-feroceon-l2.c
> +++ b/arch/arm/mm/cache-feroceon-l2.c
> @@ -168,7 +168,7 @@ static unsigned long calc_range_end(unsigned long start, 
> unsigned long end)
>   return range_end;
>  }
>  
> -static void feroceon_l2_inv_range(unsigned long start, unsigned long end)
> +static void __feroceon_l2_inv_range(unsigned long start, unsigned long end)
>  {
>   /*
>* Clean and invalidate partial first cache line.
> @@ -198,7 +198,12 @@ static void feroceon_l2_inv_range(unsigned long start, 
> unsigned long end)
>   dsb();
>  }
>  
> -static void feroceon_l2_clean_range(unsigned long start, unsigned long end)
> +static void feroceon_l2_inv_range(phys_addr_t start, phys_addr_t end)
> +{
> + 

Re: [PATCH v2 1/1] ARM: dts: mmp2-olpc-xo-1-75: clear the warnings when make dtbs

2020-12-08 Thread Leizhen (ThunderTown)



On 2020/12/8 21:58, Arnd Bergmann wrote:
> On Mon, Dec 7, 2020 at 9:47 AM Zhen Lei  wrote:
>>
>> The check_spi_bus_bridge() in scripts/dtc/checks.c requires that a node
>> with the "spi-slave" property must have "#address-cells = <0>" and
>> "#size-cells = <0>". But currently both "#address-cells" and "#size-cells"
>> properties are deleted, so the corresponding default values are 2 and 1. As a
>> result, the check fails and the warnings below are displayed.
>>
>> arch/arm/boot/dts/mmp2.dtsi:472.23-480.6: Warning (spi_bus_bridge): \
>> /soc/apb@d400/spi@d4037000: incorrect #address-cells for SPI bus
>>   also defined at arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts:225.7-237.3
>> arch/arm/boot/dts/mmp2.dtsi:472.23-480.6: Warning (spi_bus_bridge): \
>> /soc/apb@d400/spi@d4037000: incorrect #size-cells for SPI bus
>>   also defined at arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts:225.7-237.3
>> arch/arm/boot/dts/mmp2-olpc-xo-1-75.dtb: Warning (spi_bus_reg): \
>> Failed prerequisite 'spi_bus_bridge'
>>
>> Because the value of "#size-cells" is already defined as zero in the node
>> "ssp3: spi@d4037000" in arch/arm/boot/dts/mmp2.dtsi, we only need to
>> explicitly add "#address-cells = <0>" and keep "#size-cells" unchanged.
>>
>> Signed-off-by: Zhen Lei 
> 
> Right, I already sent the same patch earlier.

Oh, sorry, I didn't know that. If you sent it earlier, please apply your patch!

> 
> Lubomir, can I apply this to the fixes branch?

This fix really should be considered for merging into v5.10.

> 
>>  arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts 
>> b/arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts
>> index adde62d6fce73b9..82da44dacba7172 100644
>> --- a/arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts
>> +++ b/arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts
>> @@ -224,7 +224,7 @@
>>
>>   {
>> /delete-property/ #address-cells;
>> -   /delete-property/ #size-cells;
>> +   #address-cells = <0>;
>> spi-slave;
>> status = "okay";
>> ready-gpios = < 125 GPIO_ACTIVE_HIGH>;
>> --
>> 1.8.3
>>
>>
> 
> .
> 



Re: [RESEND PATCH v3 1/4] iommu/iova: Add free_all_cpu_cached_iovas()

2020-12-09 Thread Leizhen (ThunderTown)
On 2020/11/17 18:25, John Garry wrote:
> Add a helper function to free the CPU rcache for all online CPUs.
> 
> There also exists a function of the same name in
> drivers/iommu/intel/iommu.c, but the parameters are different, and there
> should be no conflict.
> 
> Signed-off-by: John Garry 
> ---
>  drivers/iommu/iova.c | 13 +
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index 30d969a4c5fd..81b7399dd5e8 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -227,6 +227,14 @@ static int __alloc_and_insert_iova_range(struct 
> iova_domain *iovad,
>   return -ENOMEM;
>  }
>  
> +static void free_all_cpu_cached_iovas(struct iova_domain *iovad)
> +{
> + unsigned int cpu;
> +
> + for_each_online_cpu(cpu)
> + free_cpu_cached_iovas(cpu, iovad);
> +}
> +
>  static struct kmem_cache *iova_cache;
>  static unsigned int iova_cache_users;
>  static DEFINE_MUTEX(iova_cache_mutex);
> @@ -422,15 +430,12 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned 
> long size,
>  retry:
>   new_iova = alloc_iova(iovad, size, limit_pfn, true);
>   if (!new_iova) {
> - unsigned int cpu;
> -
>   if (!flush_rcache)
>   return 0;
>  
>   /* Try replenishing IOVAs by flushing rcache. */
>   flush_rcache = false;
> - for_each_online_cpu(cpu)
> - free_cpu_cached_iovas(cpu, iovad);
> + free_all_cpu_cached_iovas(iovad);
>   goto retry;
>   }
>  

Reviewed-by: Zhen Lei 

> 



Re: [RESEND PATCH v3 2/4] iommu/iova: Avoid double-negatives in magazine helpers

2020-12-09 Thread Leizhen (ThunderTown)



On 2020/11/17 18:25, John Garry wrote:
> A similar crash to the following could be observed if initial CPU rcache
> magazine allocations fail in init_iova_rcaches():
> 
> Unable to handle kernel NULL pointer dereference at virtual address 
> 
> Mem abort info:
>ESR = 0x9604
>EC = 0x25: DABT (current EL), IL = 32 bits
>SET = 0, FnV = 0
>EA = 0, S1PTW = 0
> Data abort info:
>ISV = 0, ISS = 0x0004
>CM = 0, WnR = 0
> [] user address but active_mm is swapper
> Internal error: Oops: 9604 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 11 PID: 696 Comm: irq/40-hisi_sas Not tainted 5.9.0-rc7-dirty #109
> Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 
> 03/15/2019
> Call trace:
>   free_iova_fast+0xfc/0x280
>   iommu_dma_free_iova+0x64/0x70
>   __iommu_dma_unmap+0x9c/0xf8
>   iommu_dma_unmap_sg+0xa8/0xc8
>   dma_unmap_sg_attrs+0x28/0x50
>   cq_thread_v3_hw+0x2dc/0x528
>   irq_thread_fn+0x2c/0xa0
>   irq_thread+0x130/0x1e0
>   kthread+0x154/0x158
>   ret_from_fork+0x10/0x34
> 
> Code: f9400060 f102001f 54000981 d421 (f9400043)
> 
>  ---[ end trace 4afcbdfc61b60467 ]---
> 
> The issue is that expression !iova_magazine_full(NULL) evaluates true; this
> falls over in __iova_rcache_insert() when we attempt to cache a mag
> and cpu_rcache->loaded == NULL:
> 
> if (!iova_magazine_full(cpu_rcache->loaded)) {
>   can_insert = true;
> ...
> 
> if (can_insert)
>   iova_magazine_push(cpu_rcache->loaded, iova_pfn);
> 
> As above, can_insert is evaluated true, which it shouldn't be, and we try
> to insert pfns in a NULL mag, which is not safe.
> 
> To avoid this, stop using double-negatives, like !iova_magazine_full() and
> !iova_magazine_empty(), and use positive tests, like
> iova_magazine_has_space() and iova_magazine_has_pfns(), respectively; these
> can safely deal with cpu_rcache->{loaded, prev} = NULL.
> 
> Signed-off-by: John Garry 
> ---
>  drivers/iommu/iova.c | 29 +
>  1 file changed, 17 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index 81b7399dd5e8..1f3f0f8b12e0 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -827,14 +827,18 @@ iova_magazine_free_pfns(struct iova_magazine *mag, 
> struct iova_domain *iovad)
>   mag->size = 0;
>  }
>  
> -static bool iova_magazine_full(struct iova_magazine *mag)
> +static bool iova_magazine_has_space(struct iova_magazine *mag)
>  {
> - return (mag && mag->size == IOVA_MAG_SIZE);
> + if (!mag)
> + return false;
> + return mag->size < IOVA_MAG_SIZE;
>  }
>  
> -static bool iova_magazine_empty(struct iova_magazine *mag)
> +static bool iova_magazine_has_pfns(struct iova_magazine *mag)
>  {
> - return (!mag || mag->size == 0);
> + if (!mag)
> + return false;
> + return mag->size;
>  }
>  
>  static unsigned long iova_magazine_pop(struct iova_magazine *mag,
> @@ -843,7 +847,7 @@ static unsigned long iova_magazine_pop(struct 
> iova_magazine *mag,
>   int i;
>   unsigned long pfn;
>  
> - BUG_ON(iova_magazine_empty(mag));
> + BUG_ON(!iova_magazine_has_pfns(mag));
>  
>   /* Only fall back to the rbtree if we have no suitable pfns at all */
>   for (i = mag->size - 1; mag->pfns[i] > limit_pfn; i--)
> @@ -859,7 +863,7 @@ static unsigned long iova_magazine_pop(struct 
> iova_magazine *mag,
>  
>  static void iova_magazine_push(struct iova_magazine *mag, unsigned long pfn)
>  {
> - BUG_ON(iova_magazine_full(mag));
> + BUG_ON(!iova_magazine_has_space(mag));
>  
>   mag->pfns[mag->size++] = pfn;
>  }
> @@ -905,9 +909,9 @@ static bool __iova_rcache_insert(struct iova_domain 
> *iovad,
>   cpu_rcache = raw_cpu_ptr(rcache->cpu_rcaches);
>   spin_lock_irqsave(&cpu_rcache->lock, flags);
>  
> - if (!iova_magazine_full(cpu_rcache->loaded)) {
> + if (iova_magazine_has_space(cpu_rcache->loaded)) {
>   can_insert = true;
> - } else if (!iova_magazine_full(cpu_rcache->prev)) {
> + } else if (iova_magazine_has_space(cpu_rcache->prev)) {
>   swap(cpu_rcache->prev, cpu_rcache->loaded);
>   can_insert = true;
>   } else {
> @@ -916,8 +920,9 @@ static bool __iova_rcache_insert(struct iova_domain 
> *iovad,
>   if (new_mag) {
>   spin_lock(&rcache->lock);
>   if (rcache->depot_size < MAX_GLOBAL_MAGS) {
> - rcache->depot[rcache->depot_size++] =
> - cpu_rcache->loaded;
> + if (cpu_rcache->loaded)

It looks like only this place needs to change. The short-circuit evaluation
of && and || ensures that mag->size is not accessed when mag is NULL.

static bool iova_magazine_full(struct iova_magazine *mag)
{
	return (mag && mag->size == IOVA_MAG_SIZE);
}

static bool iova_magazine_empty(struct iova_magazine *mag)
{
return (!mag || mag->size == 
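
For reference, the NULL case under discussion can be reproduced in a small
userspace model. This is only a sketch: the struct is cut down, the value of
IOVA_MAG_SIZE is assumed for the model, and null_magazine_demo() is a made-up
name. It shows that the old helper is safe to call with NULL but reports
"not full", whereas the positive test from the quoted patch reports "no
space".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IOVA_MAG_SIZE 128	/* assumed value for this model */

struct iova_magazine {
	unsigned long size;
	unsigned long pfns[IOVA_MAG_SIZE];
};

/* Old helper: && short-circuits, so NULL is never dereferenced. */
static bool iova_magazine_full(struct iova_magazine *mag)
{
	return mag && mag->size == IOVA_MAG_SIZE;
}

/* Positive test from the quoted patch: a NULL magazine has no space. */
static bool iova_magazine_has_space(struct iova_magazine *mag)
{
	if (!mag)
		return false;
	return mag->size < IOVA_MAG_SIZE;
}

static void null_magazine_demo(void)
{
	/* "!full(NULL)" is true, which is what lets can_insert become true. */
	assert(!iova_magazine_full(NULL));
	/* "has_space(NULL)" is false, so the insert path is not taken. */
	assert(!iova_magazine_has_space(NULL));
}
```

Either way the helpers themselves never touch mag->size for a NULL magazine;
the difference is only in what the caller concludes from the result.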

Re: [RESEND PATCH v3 3/4] iommu/iova: Flush CPU rcache for when a depot fills

2020-12-09 Thread Leizhen (ThunderTown)



On 2020/11/17 18:25, John Garry wrote:
> Leizhen reported some time ago that IOVA performance may degrade over time
> [0], but unfortunately his solution to fix this problem was not given
> attention.
> 
> To summarize, the issue is that as time goes by, the CPU rcache and depot
> rcache continue to grow. As such, IOVA RB tree access time also continues
> to grow.
> 
> At a certain point, a depot may become full, and also some CPU rcaches may
> also be full when inserting another IOVA is attempted. For this scenario,
> currently the "loaded" CPU rcache is freed and a new one is created. This
> freeing means that many IOVAs in the RB tree need to be freed, which
> makes IO throughput performance fall off a cliff in some storage scenarios:
> 
> Jobs: 12 (f=12): [] [0.0% done] [6314MB/0KB/0KB /s] [1616K/0/0 iops]
> Jobs: 12 (f=12): [] [0.0% done] [5669MB/0KB/0KB /s] [1451K/0/0 iops]
> Jobs: 12 (f=12): [] [0.0% done] [6031MB/0KB/0KB /s] [1544K/0/0 iops]
> Jobs: 12 (f=12): [] [0.0% done] [6673MB/0KB/0KB /s] [1708K/0/0 iops]
> Jobs: 12 (f=12): [] [0.0% done] [6705MB/0KB/0KB /s] [1717K/0/0 iops]
> Jobs: 12 (f=12): [] [0.0% done] [6031MB/0KB/0KB /s] [1544K/0/0 iops]
> Jobs: 12 (f=12): [] [0.0% done] [6761MB/0KB/0KB /s] [1731K/0/0 iops]
> Jobs: 12 (f=12): [] [0.0% done] [6705MB/0KB/0KB /s] [1717K/0/0 iops]
> Jobs: 12 (f=12): [] [0.0% done] [6685MB/0KB/0KB /s] [1711K/0/0 iops]
> Jobs: 12 (f=12): [] [0.0% done] [6178MB/0KB/0KB /s] [1582K/0/0 iops]
> Jobs: 12 (f=12): [] [0.0% done] [6731MB/0KB/0KB /s] [1723K/0/0 iops]
> Jobs: 12 (f=12): [] [0.0% done] [2387MB/0KB/0KB /s] [611K/0/0 iops]
> Jobs: 12 (f=12): [] [0.0% done] [2689MB/0KB/0KB /s] [688K/0/0 iops]
> Jobs: 12 (f=12): [] [0.0% done] [2278MB/0KB/0KB /s] [583K/0/0 iops]
> Jobs: 12 (f=12): [] [0.0% done] [1288MB/0KB/0KB /s] [330K/0/0 iops]
> Jobs: 12 (f=12): [] [0.0% done] [1632MB/0KB/0KB /s] [418K/0/0 iops]
> Jobs: 12 (f=12): [] [0.0% done] [1765MB/0KB/0KB /s] [452K/0/0 iops]
> 
> And continue in this fashion, without recovering. Note that in this
> example it was required to wait 16 hours for this to occur. Also note that
> IO throughput also gradually becomes more unstable leading up to
> this point.
> 
> This problem is only seen for non-strict mode. For strict mode, the rcaches
> stay quite compact.
> 
> As a solution to this issue, judge that the IOVA caches have grown too big
> when cached magazines need to be freed, and just flush all the CPU rcaches
> instead.
> 
> The depot rcaches, however, are not flushed, as they can be used to
> immediately replenish active CPUs.
> 
> In future, some IOVA compaction could be implemented to solve the
> instability issue, which I figure could be quite complex to implement.
> 
> [0] 
> https://lore.kernel.org/linux-iommu/20190815121104.29140-3-thunder.leiz...@huawei.com/
> 
> Analyzed-by: Zhen Lei 
> Reported-by: Xiang Chen 
> Signed-off-by: John Garry 
> ---
>  drivers/iommu/iova.c | 16 ++--
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index 1f3f0f8b12e0..386005055aca 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -901,7 +901,6 @@ static bool __iova_rcache_insert(struct iova_domain 
> *iovad,
>struct iova_rcache *rcache,
>unsigned long iova_pfn)
>  {
> - struct iova_magazine *mag_to_free = NULL;
>   struct iova_cpu_rcache *cpu_rcache;
>   bool can_insert = false;
>   unsigned long flags;
> @@ -923,13 +922,12 @@ static bool __iova_rcache_insert(struct iova_domain 
> *iovad,
>   if (cpu_rcache->loaded)
>   rcache->depot[rcache->depot_size++] =
>   cpu_rcache->loaded;
> - } else {
> - mag_to_free = cpu_rcache->loaded;
> + can_insert = true;
> + cpu_rcache->loaded = new_mag;
>   }
>   spin_unlock(&rcache->lock);
> -
> - cpu_rcache->loaded = new_mag;
> - can_insert = true;
> + if (!can_insert)
> + iova_magazine_free(new_mag);
>   }
>   }
>  
> @@ -938,10 +936,8 @@ static bool __iova_rcache_insert(struct iova_domain 
> *iovad,
>  
>   spin_unlock_irqrestore(&cpu_rcache->lock, flags);
>  
> - if (mag_to_free) {
> - iova_magazine_free_pfns(mag_to_free, iovad);
> - iova_magazine_free(mag_to_free);
mag_to_free has already been detached from cpu_rcache, which is why lock
protection is not required here.
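
A rough userspace sketch of that pattern, in case it helps: the magazine is
detached while the lock is held and freed afterwards. This is a model only.
The spinlock is replaced by a plain flag, and the struct and function names
(magazine, cpu_rcache_model, replace_loaded, free_detached) are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct magazine {
	unsigned long size;
};

struct cpu_rcache_model {
	bool locked;			/* stand-in for the spinlock */
	struct magazine *loaded;
};

/* Swap in new_mag under the lock and hand the old magazine to the caller. */
static struct magazine *replace_loaded(struct cpu_rcache_model *rc,
				       struct magazine *new_mag)
{
	struct magazine *mag_to_free;

	rc->locked = true;		/* spin_lock_irqsave() in the real code */
	mag_to_free = rc->loaded;	/* detach: no longer reachable via rc */
	rc->loaded = new_mag;
	rc->locked = false;		/* spin_unlock_irqrestore() */

	return mag_to_free;
}

/* Free a detached magazine; nobody else can reach it, so no lock is held. */
static void free_detached(struct cpu_rcache_model *rc, struct magazine *mag)
{
	assert(!rc->locked);
	assert(rc->loaded != mag);
	free(mag);
}
```

The point of the model is only that the pointer leaves the shared structure
before the lock is dropped, so the later free cannot race with other users of
the rcache.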

> - }
> + if 

Re: [RESEND PATCH v3 3/4] iommu/iova: Flush CPU rcache for when a depot fills

2020-12-09 Thread Leizhen (ThunderTown)



On 2020/12/9 19:22, John Garry wrote:
> On 09/12/2020 09:13, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2020/11/17 18:25, John Garry wrote:
>>> Leizhen reported some time ago that IOVA performance may degrade over time
>>> [0], but unfortunately his solution to fix this problem was not given
>>> attention.
>>>
>>> To summarize, the issue is that as time goes by, the CPU rcache and depot
>>> rcache continue to grow. As such, IOVA RB tree access time also continues
>>> to grow.
>>>
>>> At a certain point, a depot may become full, and also some CPU rcaches may
>>> also be full when inserting another IOVA is attempted. For this scenario,
>>> currently the "loaded" CPU rcache is freed and a new one is created. This
>>> freeing means that many IOVAs in the RB tree need to be freed, which
>>> makes IO throughput performance fall off a cliff in some storage scenarios:
>>>
>>> Jobs: 12 (f=12): [] [0.0% done] [6314MB/0KB/0KB /s] [1616K/0/0 iops]
>>> Jobs: 12 (f=12): [] [0.0% done] [5669MB/0KB/0KB /s] [1451K/0/0 iops]
>>> Jobs: 12 (f=12): [] [0.0% done] [6031MB/0KB/0KB /s] [1544K/0/0 iops]
>>> Jobs: 12 (f=12): [] [0.0% done] [6673MB/0KB/0KB /s] [1708K/0/0 iops]
>>> Jobs: 12 (f=12): [] [0.0% done] [6705MB/0KB/0KB /s] [1717K/0/0 iops]
>>> Jobs: 12 (f=12): [] [0.0% done] [6031MB/0KB/0KB /s] [1544K/0/0 iops]
>>> Jobs: 12 (f=12): [] [0.0% done] [6761MB/0KB/0KB /s] [1731K/0/0 iops]
>>> Jobs: 12 (f=12): [] [0.0% done] [6705MB/0KB/0KB /s] [1717K/0/0 iops]
>>> Jobs: 12 (f=12): [] [0.0% done] [6685MB/0KB/0KB /s] [1711K/0/0 iops]
>>> Jobs: 12 (f=12): [] [0.0% done] [6178MB/0KB/0KB /s] [1582K/0/0 iops]
>>> Jobs: 12 (f=12): [] [0.0% done] [6731MB/0KB/0KB /s] [1723K/0/0 iops]
>>> Jobs: 12 (f=12): [] [0.0% done] [2387MB/0KB/0KB /s] [611K/0/0 iops]
>>> Jobs: 12 (f=12): [] [0.0% done] [2689MB/0KB/0KB /s] [688K/0/0 iops]
>>> Jobs: 12 (f=12): [] [0.0% done] [2278MB/0KB/0KB /s] [583K/0/0 iops]
>>> Jobs: 12 (f=12): [] [0.0% done] [1288MB/0KB/0KB /s] [330K/0/0 iops]
>>> Jobs: 12 (f=12): [] [0.0% done] [1632MB/0KB/0KB /s] [418K/0/0 iops]
>>> Jobs: 12 (f=12): [] [0.0% done] [1765MB/0KB/0KB /s] [452K/0/0 iops]
>>>
>>> And it continues in this fashion, without recovering. Note that in this
>>> example it was required to wait 16 hours for this to occur. Also note that
>>> IO throughput gradually becomes more unstable leading up to
>>> this point.
>>>
>>> This problem is only seen for non-strict mode. For strict mode, the rcaches
>>> stay quite compact.
>>>
>>> As a solution to this issue, judge that the IOVA caches have grown too big
>>> when cached magazines need to be freed, and just flush all the CPU rcaches
>>> instead.
>>>
>>> The depot rcaches, however, are not flushed, as they can be used to
>>> immediately replenish active CPUs.
>>>
>>> In future, some IOVA compaction could be implemented to solve the
>>> instability issue, which I figure could be quite complex to implement.
>>>
>>> [0] 
>>> https://lore.kernel.org/linux-iommu/20190815121104.29140-3-thunder.leiz...@huawei.com/
>>>
>>> Analyzed-by: Zhen Lei 
>>> Reported-by: Xiang Chen 
>>> Signed-off-by: John Garry 
> 
> Thanks for having a look
> 
>>> ---
>>>   drivers/iommu/iova.c | 16 ++--
>>>   1 file changed, 6 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>>> index 1f3f0f8b12e0..386005055aca 100644
>>> --- a/drivers/iommu/iova.c
>>> +++ b/drivers/iommu/iova.c
>>> @@ -901,7 +901,6 @@ static bool __iova_rcache_insert(struct iova_domain 
>>> *iovad,
>>>    struct iova_rcache *rcache,
>>>    unsigned long iova_pfn)
>>>   {
>>> -    struct iova_magazine *mag_to_free = NULL;
>>>   struct iova_cpu_rcache *cpu_rcache;
>>>   bool can_insert = false;
>>>   unsigned long flags;
>>> @@ -923,13 +922,12 @@ static bool __iova_rcache_insert(str

Re: [RESEND PATCH v3 2/4] iommu/iova: Avoid double-negatives in magazine helpers

2020-12-09 Thread Leizhen (ThunderTown)



On 2020/12/9 19:39, John Garry wrote:
> On 09/12/2020 09:03, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2020/11/17 18:25, John Garry wrote:
>>> A similar crash to the following could be observed if initial CPU rcache
>>> magazine allocations fail in init_iova_rcaches():
>>>
>>> Unable to handle kernel NULL pointer dereference at virtual address 
>>> 
>>> Mem abort info:
>>>     ESR = 0x9604
>>>     EC = 0x25: DABT (current EL), IL = 32 bits
>>>     SET = 0, FnV = 0
>>>     EA = 0, S1PTW = 0
>>> Data abort info:
>>>     ISV = 0, ISS = 0x0004
>>>     CM = 0, WnR = 0
>>> [] user address but active_mm is swapper
>>> Internal error: Oops: 9604 [#1] PREEMPT SMP
>>> Modules linked in:
>>> CPU: 11 PID: 696 Comm: irq/40-hisi_sas Not tainted 5.9.0-rc7-dirty #109
>>> Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 
>>> 03/15/2019
>>> Call trace:
>>>    free_iova_fast+0xfc/0x280
>>>    iommu_dma_free_iova+0x64/0x70
>>>    __iommu_dma_unmap+0x9c/0xf8
>>>    iommu_dma_unmap_sg+0xa8/0xc8
>>>    dma_unmap_sg_attrs+0x28/0x50
>>>    cq_thread_v3_hw+0x2dc/0x528
>>>    irq_thread_fn+0x2c/0xa0
>>>    irq_thread+0x130/0x1e0
>>>    kthread+0x154/0x158
>>>    ret_from_fork+0x10/0x34
>>>
>>> Code: f9400060 f102001f 54000981 d421 (f9400043)
>>>
>>>   ---[ end trace 4afcbdfc61b60467 ]---
>>>
>>> The issue is that the expression !iova_magazine_full(NULL) evaluates true; this
>>> falls over in __iova_rcache_insert() when we attempt to cache a mag
>>> and cpu_rcache->loaded == NULL:
>>>
>>> if (!iova_magazine_full(cpu_rcache->loaded)) {
>>> can_insert = true;
>>> ...
>>>
>>> if (can_insert)
>>> iova_magazine_push(cpu_rcache->loaded, iova_pfn);
>>>
>>> As above, can_insert is evaluated true, which it shouldn't be, and we try
>>> to insert pfns in a NULL mag, which is not safe.
>>>
>>> To avoid this, stop using double-negatives, like !iova_magazine_full() and
>>> !iova_magazine_empty(), and use positive tests, like
>>> iova_magazine_has_space() and iova_magazine_has_pfns(), respectively; these
>>> can safely deal with cpu_rcache->{loaded, prev} = NULL.
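The NULL handling described above can be reproduced in user space. The following is only a minimal sketch, not the kernel code: struct mag, MAG_SIZE and the can_insert_*() wrappers are made-up stand-ins for struct iova_magazine, IOVA_MAG_SIZE and the check in __iova_rcache_insert(). It shows that the negated "full" test accepts a NULL magazine while the positive "has space" test rejects it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAG_SIZE 128 /* stand-in for IOVA_MAG_SIZE (value assumed for the model) */

/* Simplified stand-in for struct iova_magazine. */
struct mag {
	unsigned long size;
	unsigned long pfns[MAG_SIZE];
};

/* Old-style helper: a NULL magazine is reported as "not full". */
static bool mag_full(const struct mag *mag)
{
	return mag && mag->size == MAG_SIZE;
}

/* New-style helper: a NULL magazine has no space to push into. */
static bool mag_has_space(const struct mag *mag)
{
	if (!mag)
		return false;
	return mag->size < MAG_SIZE;
}

/* Double-negative check: true for NULL, which is the bug. */
static bool can_insert_old(const struct mag *loaded)
{
	return !mag_full(loaded);
}

/* Positive check: false for NULL, so no push into a NULL magazine. */
static bool can_insert_new(const struct mag *loaded)
{
	return mag_has_space(loaded);
}
```

For a non-NULL magazine both checks agree; they differ only when the magazine pointer is NULL, which is exactly the failed-allocation case in the commit message.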
>>>
>>> Signed-off-by: John Garry 
> 
> Thanks for checking here...
> 
>>> ---
>>>   drivers/iommu/iova.c | 29 +
>>>   1 file changed, 17 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>>> index 81b7399dd5e8..1f3f0f8b12e0 100644
>>> --- a/drivers/iommu/iova.c
>>> +++ b/drivers/iommu/iova.c
>>> @@ -827,14 +827,18 @@ iova_magazine_free_pfns(struct iova_magazine *mag, 
>>> struct iova_domain *iovad)
>>>   mag->size = 0;
>>>   }
>>>   -static bool iova_magazine_full(struct iova_magazine *mag)
>>> +static bool iova_magazine_has_space(struct iova_magazine *mag)
>>>   {
>>> -    return (mag && mag->size == IOVA_MAG_SIZE);
>>> +    if (!mag)
>>> +    return false;
>>> +    return mag->size < IOVA_MAG_SIZE;
>>>   }
>>>   -static bool iova_magazine_empty(struct iova_magazine *mag)
>>> +static bool iova_magazine_has_pfns(struct iova_magazine *mag)
>>>   {
>>> -    return (!mag || mag->size == 0);
>>> +    if (!mag)
>>> +    return false;
>>> +    return mag->size;
>>>   }
>>>     static unsigned long iova_magazine_pop(struct iova_magazine *mag,
>>> @@ -843,7 +847,7 @@ static unsigned long iova_magazine_pop(struct 
>>> iova_magazine *mag,
>>>   int i;
>>>   unsigned long pfn;
>>>   -    BUG_ON(iova_magazine_empty(mag));
>>> +    BUG_ON(!iova_magazine_has_pfns(mag));
>>>     /* Only fall back to the rbtree if we have no suitable pfns at all 
>>> */
>>>   for (i = mag->size - 1; mag->pfns[i] > limit_pfn; i--)
>>> @@ -859,7 +863,7 @@ static unsigned long iova_magazine_pop(struct 
>>> iova_magazine *mag,
>>>     static void iova_magazine_push(struct iova_magazine *mag, unsigned long 
>>> pfn)
>>>   {
>>> -    BUG_ON(iova_magazine_full(mag));
>>> +    BUG_ON(!iova_magazine_has_space(mag));
>>>     mag->pfns[mag->size++] = pfn;
>>>   }

Re: [PATCH 1/1] dt-bindings: leds: add onboard LED triggers of 96Boards

2020-12-09 Thread Leizhen (ThunderTown)



On 2020/12/10 11:31, Manivannan Sadhasivam wrote:
> Hi,
> 
> On Thu, Dec 10, 2020 at 11:12:03AM +0800, Zhen Lei wrote:
>> For all 96Boards, the following standard is used for onboard LEDs.
>>
>> green:user1  default-trigger: heartbeat
>> green:user2  default-trigger: mmc0/disk-activity(onboard-storage)
>> green:user3  default-trigger: mmc1 (SD-card)
>> green:user4  default-trigger: none, panic-indicator
>> yellow:wlan  default-trigger: phy0tx
>> blue:bt  default-trigger: hci0-power
>>
>> Link to 96Boards CE Specification: https://linaro.co/ce-specification
>>
> 
> This is just a board configuration and there is absolutely no need to document
> this in common LED binding. But if your intention is to document the missing
No, I don't think so. "Common" only means that the property linux,default-trigger
is common, not its values. This can be shown by a counterexample: none of
the triggers currently defined in common.yaml is used by 96Boards.

> triggers, then you should look at the patch I submitted long ago.

I'm just trying to eliminate the Hisilicon-related warnings that the YAML
check reported, so I didn't pay attention to other missing triggers.

> 
> https://lore.kernel.org/patchwork/patch/1146359/
> 
> Maybe I should resubmit it again in YAML format. (thanks for reminding me :P)

Yes, I hope that you will resubmit it. After all, these false positives are
entirely due to YAML's failure to list all triggers. The DTS itself is fine.

By the way, I copied the description of this patch from your patch:
953d9f390365 arm64: dts: rockchip: Add on-board LED support on rk3399-rock960

That's why I Cc'd you.

> 
> Thanks,
> Mani
> 
>> Signed-off-by: Zhen Lei 
>> Cc: Darshak Patel 
>> Cc: Manivannan Sadhasivam 
>> Cc: Shawn Guo 
>> Cc: Dong Aisheng 
>> Cc: Guodong Xu 
>> Cc: Wei Xu 
>> Cc: Linus Walleij 
>> Cc: Lad Prabhakar 
>> Cc: Marian-Cristian Rotariu 
>> Cc: Geert Uytterhoeven 
>> Cc: Heiko Stuebner 
>> ---
>>  Documentation/devicetree/bindings/leds/common.yaml | 10 ++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/leds/common.yaml 
>> b/Documentation/devicetree/bindings/leds/common.yaml
>> index f1211e7045f12f3..525752d6c5c84fd 100644
>> --- a/Documentation/devicetree/bindings/leds/common.yaml
>> +++ b/Documentation/devicetree/bindings/leds/common.yaml
>> @@ -97,6 +97,16 @@ properties:
>>  # LED alters the brightness for the specified duration with one 
>> software
>>  # timer (requires "led-pattern" property)
>>- pattern
>> +#For all 96Boards, Green, disk-activity(onboard-storage)
>> +  - mmc0
>> +#For all 96Boards, Green, SD-card
>> +  - mmc1
>> +#For all 96Boards, Green, panic-indicator
>> +  - none
>> +#For all 96Boards, Yellow, WiFi activity LED
>> +  - phy0tx
>> +#For all 96Boards, Blue, Bluetooth activity LED
>> +  - hci0-power
>>  
>>led-pattern:
>>  description: |
>> -- 
>> 1.8.3
>>
>>
> 
> .
> 



Re: [PATCH] dt-bindings: leds: Document commonly used LED triggers

2020-12-09 Thread Leizhen (ThunderTown)



On 2020/12/10 14:14, Manivannan Sadhasivam wrote:
> This commit documents the LED triggers used commonly in the SoCs. Not
> all triggers are documented as some of them are very application specific.
> Most of the triggers documented here are currently used in devicetrees
> of many SoCs.
> 
> Signed-off-by: Manivannan Sadhasivam 
> ---
>  .../devicetree/bindings/leds/common.yaml  | 72 ++-
>  1 file changed, 54 insertions(+), 18 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/leds/common.yaml 
> b/Documentation/devicetree/bindings/leds/common.yaml
> index f1211e7045f1..eee4eb7a4535 100644
> --- a/Documentation/devicetree/bindings/leds/common.yaml
> +++ b/Documentation/devicetree/bindings/leds/common.yaml
> @@ -79,24 +79,60 @@ properties:
>the LED.
>  $ref: /schemas/types.yaml#definitions/string
>  
> -enum:
> -# LED will act as a back-light, controlled by the framebuffer system
> -  - backlight
> -# LED will turn on (but for leds-gpio see "default-state" property in
> -# Documentation/devicetree/bindings/leds/leds-gpio.yaml)
> -  - default-on
> -# LED "double" flashes at a load average based rate
> -  - heartbeat
> -# LED indicates disk activity
> -  - disk-activity
> -# LED indicates IDE disk activity (deprecated), in new 
> implementations
> -# use "disk-activity"
> -  - ide-disk
> -# LED flashes at a fixed, configurable rate
> -  - timer
> -# LED alters the brightness for the specified duration with one 
> software
> -# timer (requires "led-pattern" property)
> -  - pattern
> +oneOf:
> +  - items:
> +  - enum:
> +# LED will act as a back-light, controlled by the 
> framebuffer system
> +  - backlight
> +# LED will turn on (but for leds-gpio see "default-state" 
> property in
> +# Documentation/devicetree/bindings/leds/leds-gpio.yaml)
> +  - default-on
> +# LED "double" flashes at a load average based rate
> +  - heartbeat
> +# LED indicates disk activity
> +  - disk-activity
> +# LED indicates IDE disk activity (deprecated), in new 
> implementations
> +# use "disk-activity"
> +  - ide-disk
> +# LED flashes at a fixed, configurable rate
> +  - timer
> +# LED alters the brightness for the specified duration with 
> one software
> +# timer (requires "led-pattern" property)
> +  - pattern
> +# LED indicates camera flash state
> +  - flash
> +# LED indicates camera torch state
> +  - torch
> +# LED indicates audio mute state
> +  - audio-mute
> +# LED indicates mic mute state
> +  - audio-micmute
> +# LED indicates bluetooth power state
> +  - bluetooth-power
> +# LED indicates USB gadget activity
> +  - usb-gadget
> +# LED indicates USB host activity
> +  - usb-host
> +# LED indicates MTD memory activity
> +  - mtd
> +# LED indicates NAND memory activity (deprecated),
> +# in new implementations use "mtd"
> +  - nand-disk
> +# LED indicates disk read activity
> +  - disk-read
> +# LED indicates disk write activity
> +  - disk-write
> +# No trigger assigned to the LED. This is the default mode
> +# if trigger is absent
> +  - none
> +# LED indicates activity of all CPUs
> +  - cpu
The triggers phy0tx and hci0-power are missing.

Since you've rewritten it, please consider sorting these property strings
in ascending alphabetical order.

> +  - items:
> +# LED indicates activity of [N]th CPU
> +  - pattern: "^cpu[0-9][0-9]$"
This should be "^cpu[0-9]{1,2}$"; otherwise it always requires two digits.

> +  - items:
> +# LED indicates [N]th MMC storage activity
> +  - pattern: '^mmc[0-9][0-9]$'
This should be '^mmc[0-9]{1,2}$'.

Why does the cpu pattern use "" while the mmc pattern uses ''? It's better to keep them consistent.
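
The difference between the two patterns can be checked quickly in user space. This is only a sketch: it uses POSIX extended regular expressions through regcomp()/regexec(), which is an assumption made for illustration, since the binding schema is evaluated by a different regex engine. For these simple patterns the anchors, the bracket expression and the {1,2} bound behave the same way. ere_match() is a helper name invented here.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if str matches the POSIX extended regular expression pattern. */
static bool ere_match(const char *pattern, const char *str)
{
	regex_t re;
	bool matched;

	/* REG_NOSUB: only a yes/no answer is needed, no capture groups. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;
	matched = regexec(&re, str, 0, NULL, 0) == 0;
	regfree(&re);
	return matched;
}
```

With this helper, ere_match("^cpu[0-9][0-9]$", "cpu0") is false because two digits are required, while ere_match("^cpu[0-9]{1,2}$", "cpu0") and ere_match("^cpu[0-9]{1,2}$", "cpu12") are both true.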

>  
>led-pattern:
>  description: |
> 



Re: [PATCH 1/1] device-dax: avoid an unnecessary check in alloc_dev_dax_range()

2020-12-17 Thread Leizhen (ThunderTown)



On 2020/12/18 11:10, Dan Williams wrote:
> On Fri, Nov 20, 2020 at 1:23 AM Zhen Lei  wrote:
>>
>> Swap the calling sequence of krealloc() and __request_region(), call the
>> latter first. In this way, the value of dev_dax->nr_range does not need to
>> be considered when __request_region() failed.
> 
> This looks ok, but I think I want to see another cleanup go in first
> before this to add a helper for trimming the last range off the set of
> ranges:
> 
> static void dev_dax_trim_range(struct dev_dax *dev_dax)
> {
> int i = dev_dax->nr_range - 1;
> struct range *range = &dev_dax->ranges[i].range;
> struct dax_region *dax_region = dev_dax->region;
> 
> dev_dbg(dev, "delete range[%d]: %#llx:%#llx\n", i,
> (unsigned long long)range->start,
> (unsigned long long)range->end);
> 
> __release_region(&dax_region->res, range->start, range_len(range));
> if (--dev_dax->nr_range == 0) {
> kfree(dev_dax->ranges);
> dev_dax->ranges = NULL;
> }
> }
> 
> Care to do a lead in patch with that cleanup, then do this one?

I don't mind! You can add the above helper first. After that, I'll update
and send this patch again.
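
For reference, the trimming logic of the helper can be modelled in user space. This is a minimal sketch under assumptions: struct dev_model and the model_*() names are invented for the illustration, free() stands in for kfree(), and the region release is only indicated by a comment. It shows the point of the helper: the ranges array is freed exactly when the last range is dropped, so no caller has to remember to do it.

```c
#include <assert.h>
#include <stdlib.h>

struct range {
	unsigned long long start, end;
};

/* Stand-in for the nr_range/ranges members of struct dev_dax. */
struct dev_model {
	struct range *ranges;
	int nr_range;
};

/* Drop the last range; free the array once no ranges remain. */
static void model_trim_range(struct dev_model *d)
{
	if (d->nr_range == 0)
		return; /* guard added for the model only */

	/* the kernel helper would release the region of the last range here */
	if (--d->nr_range == 0) {
		free(d->ranges);
		d->ranges = NULL;
	}
}

/* Releasing all ranges is then just repeated trimming. */
static void model_free_ranges(struct dev_model *d)
{
	while (d->nr_range)
		model_trim_range(d);
}
```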

> 
> I think that might also cleanup a memory leak report from Jane in
> addition to not needing the "goto" as well.
> 
> http://lore.kernel.org/r/c8a8a260-34c6-dbfc-1f19-25c23d01c...@oracle.com
> 
> .
> 



Re: [PATCH 1/1] device-dax: avoid an unnecessary check in alloc_dev_dax_range()

2020-12-17 Thread Leizhen (ThunderTown)



On 2020/11/20 17:22, Zhen Lei wrote:
> Swap the calling sequence of krealloc() and __request_region(), call the
> latter first. In this way, the value of dev_dax->nr_range does not need to
> be considered when __request_region() failed.
> 
> Signed-off-by: Zhen Lei 
> ---
>  drivers/dax/bus.c | 29 -
>  1 file changed, 12 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
> index 27513d311242..1efae11d947a 100644
> --- a/drivers/dax/bus.c
> +++ b/drivers/dax/bus.c
> @@ -763,23 +763,15 @@ static int alloc_dev_dax_range(struct dev_dax *dev_dax, 
> u64 start,
>   return 0;
>   }
>  
> - ranges = krealloc(dev_dax->ranges, sizeof(*ranges)
> - * (dev_dax->nr_range + 1), GFP_KERNEL);
> - if (!ranges)
> - return -ENOMEM;
> -
>   alloc = __request_region(res, start, size, dev_name(dev), 0);
> - if (!alloc) {
> - /*
> -  * If this was an empty set of ranges nothing else
> -  * will release @ranges, so do it now.
> -  */
> - if (!dev_dax->nr_range) {
> - kfree(ranges);
> - ranges = NULL;
> - }
> - dev_dax->ranges = ranges;
> + if (!alloc)
>   return -ENOMEM;
> +
> + ranges = krealloc(dev_dax->ranges, sizeof(*ranges)
> + * (dev_dax->nr_range + 1), GFP_KERNEL);
> + if (!ranges) {
> + rc = -ENOMEM;
> + goto err;

Hi, Dan Williams:
In fact, after adding the new helper dev_dax_trim_range(), we can call
__release_region() directly and return the error code here, replacing the goto.
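
A user-space sketch of that ordering, under assumptions: struct dev_model, model_request_region(), model_release_region() and the fail_realloc switch are invented stand-ins for the kernel objects and for __request_region()/__release_region(), and realloc() stands in for krealloc(). The region is requested first, so when the array cannot be grown the only cleanup needed is to release the region; nr_range and the old array are untouched.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

struct range {
	unsigned long long start, end;
};

struct dev_model {
	struct range *ranges;
	int nr_range;
	int regions_held; /* number of region reservations currently held */
};

/* Hypothetical stand-ins for __request_region()/__release_region(). */
static bool model_request_region(struct dev_model *d)
{
	d->regions_held++;
	return true;
}

static void model_release_region(struct dev_model *d)
{
	d->regions_held--;
}

static int model_alloc_range(struct dev_model *d, unsigned long long start,
			     unsigned long long size, bool fail_realloc)
{
	struct range *ranges;

	/* Step 1: reserve the region before touching the array. */
	if (!model_request_region(d))
		return -ENOMEM;

	/* Step 2: grow the array; fail_realloc simulates an allocation failure. */
	ranges = fail_realloc ? NULL :
		realloc(d->ranges, sizeof(*ranges) * (d->nr_range + 1));
	if (!ranges) {
		/* Nothing else changed yet, so undo the reservation and return. */
		model_release_region(d);
		return -ENOMEM;
	}

	ranges[d->nr_range].start = start;
	ranges[d->nr_range].end = start + size - 1;
	d->ranges = ranges;
	d->nr_range++;
	return 0;
}
```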

>   }
>  
>   for (i = 0; i < dev_dax->nr_range; i++)
> @@ -808,11 +800,14 @@ static int alloc_dev_dax_range(struct dev_dax *dev_dax, 
> u64 start,
>   dev_dbg(dev, "delete range[%d]: %pa:%pa\n", dev_dax->nr_range - 
> 1,
>   &alloc->start, &alloc->end);
>   dev_dax->nr_range--;
> - __release_region(res, alloc->start, resource_size(alloc));
> - return rc;
> + goto err;
>   }
>  
>   return 0;
> +
> +err:
> + __release_region(res, alloc->start, resource_size(alloc));
> + return rc;
>  }
>  
>  static int adjust_dev_dax_range(struct dev_dax *dev_dax, struct resource 
> *res, resource_size_t size)
> 



Re: [PATCH] device-dax: Fix range release

2020-12-18 Thread Leizhen (ThunderTown)



On 2020/12/19 10:41, Dan Williams wrote:
> There are multiple locations that open-code the release of the last
> range in a device-dax instance. Consolidate this into a new
> dev_dax_trim_range() helper.
> 
> This also addresses a kmemleak report:
> 
> # cat /sys/kernel/debug/kmemleak
> [..]
> unreferenced object 0x976bd46f6240 (size 64):
>comm "ndctl", pid 23556, jiffies 4299514316 (age 5406.733s)
>hex dump (first 32 bytes):
>  00 00 00 00 00 00 00 00 00 00 20 c3 37 00 00 00  .. .7...
>  ff ff ff 7f 38 00 00 00 00 00 00 00 00 00 00 00  8...
>backtrace:
>  [<064003cf>] __kmalloc_track_caller+0x136/0x379
>  [] krealloc+0x67/0x92
>  [] __alloc_dev_dax_range+0x73/0x25c
>  [<27d58626>] devm_create_dev_dax+0x27d/0x416
>  [<434abd43>] __dax_pmem_probe+0x1c9/0x1000 [dax_pmem_core]
>  [<83726c1c>] dax_pmem_probe+0x10/0x1f [dax_pmem]
>  [] nvdimm_bus_probe+0x9d/0x340 [libnvdimm]
>  [] really_probe+0x230/0x48d
>  [<6cabd38e>] driver_probe_device+0x122/0x13b
>  [<29c7b95a>] device_driver_attach+0x5b/0x60
>  [<53e5659b>] bind_store+0xb7/0xc3
>  [] drv_attr_store+0x27/0x31
>  [<949069c5>] sysfs_kf_write+0x4a/0x57
>  [<4a8b5adf>] kernfs_fop_write+0x150/0x1e5
>  [] __vfs_write+0x1b/0x34
>  [] vfs_write+0xd8/0x1d1
> 
> Reported-by: Jane Chu 
> Cc: Zhen Lei 
> Signed-off-by: Dan Williams 
> ---
>  drivers/dax/bus.c |   44 +---
>  1 file changed, 21 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
> index 9761cb40d4bb..720cd140209f 100644
> --- a/drivers/dax/bus.c
> +++ b/drivers/dax/bus.c
> @@ -367,19 +367,28 @@ void kill_dev_dax(struct dev_dax *dev_dax)
>  }
>  EXPORT_SYMBOL_GPL(kill_dev_dax);
>  
> -static void free_dev_dax_ranges(struct dev_dax *dev_dax)
> +static void trim_dev_dax_range(struct dev_dax *dev_dax)
>  {
> + int i = dev_dax->nr_range - 1;
> + struct range *range = &dev_dax->ranges[i].range;
>   struct dax_region *dax_region = dev_dax->region;
> - int i;
>  
>   device_lock_assert(dax_region->dev);
> - for (i = 0; i < dev_dax->nr_range; i++) {
> - struct range *range = &dev_dax->ranges[i].range;
> -
> - __release_region(&dax_region->res, range->start,
> - range_len(range));
> + dev_dbg(&dev_dax->dev, "delete range[%d]: %#llx:%#llx\n", i,
> + (unsigned long long)range->start,
> + (unsigned long long)range->end);
> +
> + __release_region(&dax_region->res, range->start, range_len(range));
> + if (--dev_dax->nr_range == 0) {
> + kfree(dev_dax->ranges);
> + dev_dax->ranges = NULL;
>   }
> - dev_dax->nr_range = 0;
> +}
> +
> +static void free_dev_dax_ranges(struct dev_dax *dev_dax)
> +{
> + while (dev_dax->nr_range)
It's better to use READ_ONCE() to read dev_dax->nr_range, to prevent the
compiler from optimizing the load.

> + trim_dev_dax_range(dev_dax);
>  }
>  
>  static void unregister_dev_dax(void *dev)
> @@ -804,15 +813,10 @@ static int alloc_dev_dax_range(struct dev_dax *dev_dax, 
> u64 start,
>   return 0;
>  
>   rc = devm_register_dax_mapping(dev_dax, dev_dax->nr_range - 1);
> - if (rc) {
> - dev_dbg(dev, "delete range[%d]: %pa:%pa\n", dev_dax->nr_range - 
> 1,
> - &alloc->start, &alloc->end);
> - dev_dax->nr_range--;
> - __release_region(res, alloc->start, resource_size(alloc));
> - return rc;
> - }
> + if (rc)
> + trim_dev_dax_range(dev_dax);
>  
> - return 0;
> + return rc;
>  }
>  
>  static int adjust_dev_dax_range(struct dev_dax *dev_dax, struct resource 
> *res, resource_size_t size)
> @@ -885,12 +889,7 @@ static int dev_dax_shrink(struct dev_dax *dev_dax, 
> resource_size_t size)
>   if (shrink >= range_len(range)) {
>   devm_release_action(dax_region->dev,
>   unregister_dax_mapping, &mapping->dev);
> - __release_region(&dax_region->res, range->start,
> - range_len(range));
> - dev_dax->nr_range--;
> - dev_dbg(dev, "delete range[%d]: %#llx:%#llx\n", i,
> - (unsigned long long) range->start,
> - (unsigned long long) range->end);
> + trim_dev_dax_range(dev_dax);
>   to_shrink -= shrink;
>   if (!to_shrink)
>   break;
> @@ -1267,7 +1266,6 @@ static void dev_dax_release(struct device *dev)
>   put_dax(dax_dev);
>   free_dev_dax_id(dev_dax);
>   

Re: [PATCH 1/1] ARM: LPAE: use phys_addr_t instead of unsigned long in outercache hooks

2020-12-28 Thread Leizhen (ThunderTown)



On 2020/12/26 20:13, Russell King - ARM Linux admin wrote:
> On Fri, Dec 25, 2020 at 07:44:58PM +0800, Zhen Lei wrote:
>> The outercache of some Hisilicon SOCs support physical addresses wider
>> than 32-bits. The unsigned long datatype is not sufficient for mapping
>> physical addresses >= 4GB. The commit ad6b9c9d78b9 ("ARM: 6671/1: LPAE:
>> use phys_addr_t instead of unsigned long in outercache functions") has
>> already modified the outercache functions. But the parameters of the
>> outercache hooks are not changed. This patch use phys_addr_t instead of
>> unsigned long in outercache hooks: inv_range, clean_range, flush_range.
>>
>> To ensure the outercache that does not support LPAE works properly, do
>> cast phys_addr_t to unsigned long by adding a middle-tier function.
> 
> Please don't do that. The cast can be done inside the L2 functions
> themselves without needing all these additional functions.

OK. At first, I wanted to adapt it like this:

-static void l2c220_inv_range(unsigned long start, unsigned long end)
+static void l2c220_inv_range(phys_addr_t lpae_start, phys_addr_t lpae_end)
 {
+  unsigned long start = lpae_start;
+  unsigned long end = lpae_end;


> 
> We probably ought to also add some protection against addresses > 4GB,
> although these are hot paths, so we don't want to add tests in these
> functions. Maybe instead checking whether the system has memory above
> 4GB while the L2 cache is being initialised would be a good idea?
> 

I'm sorry, I didn't quite understand what you meant. Currently, the
biggest problem is the compilation problem. long may be only 32 bits
wide, so a 64-bit physical address cannot be passed from the outercache
functions to the outercache hooks.
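
The width problem can be shown with a small user-space sketch. The typedefs are assumptions made for the illustration: model_phys_addr_t stands in for phys_addr_t with LPAE enabled and model_ulong_t for unsigned long on 32-bit ARM. fits_in_ulong() is a hypothetical helper that models the kind of "is there memory above 4GB" test mentioned above; it is not an existing kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t model_phys_addr_t; /* stand-in for phys_addr_t with LPAE */
typedef uint32_t model_ulong_t;     /* stand-in for unsigned long on 32-bit ARM */

/* Passing a 64-bit address through a 32-bit parameter keeps only the low 32 bits. */
static model_ulong_t narrow(model_phys_addr_t addr)
{
	return (model_ulong_t)addr;
}

/* True if the address survives the narrowing unchanged. */
static bool fits_in_ulong(model_phys_addr_t addr)
{
	return addr <= UINT32_MAX;
}
```

For example, narrow(0x100001000ULL) yields 0x1000, so a hook that takes unsigned long would operate on the wrong address for memory above 4GB.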



Re: [PATCH 1/1] ARM: LPAE: use phys_addr_t instead of unsigned long in outercache hooks

2020-12-28 Thread Leizhen (ThunderTown)



On 2020/12/26 20:15, Russell King - ARM Linux admin wrote:
> On Sat, Dec 26, 2020 at 10:18:08AM +0800, Leizhen (ThunderTown) wrote:
>> On 2020/12/25 19:44, Zhen Lei wrote:
>>> The outercache of some Hisilicon SOCs support physical addresses wider
>>> than 32-bits. The unsigned long datatype is not sufficient for mapping
>>> physical addresses >= 4GB. The commit ad6b9c9d78b9 ("ARM: 6671/1: LPAE:
>>> use phys_addr_t instead of unsigned long in outercache functions") has
>>> already modified the outercache functions. But the parameters of the
>>> outercache hooks are not changed. This patch use phys_addr_t instead of
>>> unsigned long in outercache hooks: inv_range, clean_range, flush_range.
>>>
>>> To ensure the outercache that does not support LPAE works properly, do
>>> cast phys_addr_t to unsigned long by adding a middle-tier function.
>>
>> This patch will impact the outercache drivers that have not been merged into
>> the kernel. They should also update the datatype of the outercache hooks.
> 
> This isn't much of a concern to mainline. If it's that big a problem
> for you, then please consider merging your code into mainline so that
> everyone can benefit from it.

All right, I got it.

> 



Re: [PATCH 1/1] ARM: LPAE: use phys_addr_t instead of unsigned long in outercache hooks

2020-12-28 Thread Leizhen (ThunderTown)



On 2020/12/28 15:00, Arnd Bergmann wrote:
> On Fri, Dec 25, 2020 at 12:48 PM Zhen Lei  wrote:
>>
>> The outercache of some Hisilicon SOCs support physical addresses wider
>> than 32-bits. The unsigned long datatype is not sufficient for mapping
>> physical addresses >= 4GB. The commit ad6b9c9d78b9 ("ARM: 6671/1: LPAE:
>> use phys_addr_t instead of unsigned long in outercache functions") has
>> already modified the outercache functions. But the parameters of the
>> outercache hooks are not changed. This patch use phys_addr_t instead of
>> unsigned long in outercache hooks: inv_range, clean_range, flush_range.
>>
>> To ensure the outercache that does not support LPAE works properly, do
>> cast phys_addr_t to unsigned long by adding a middle-tier function.
>> For example:
>> -static void l2c220_inv_range(unsigned long start, unsigned long end)
>> +static void __l2c220_inv_range(unsigned long start, unsigned long end)
>>  {
>> ...
>>  }
>> +static void l2c220_inv_range(phys_addr_t start, phys_addr_t end)
>> +{
>> +  __l2c220_inv_range(start, end);
>> +}
>>
>> Note that the outercache functions have been doing this cast before this
>> patch. So now, the cast is just moved to the middle-tier function.
>>
>> No functional change.
>>
>> Signed-off-by: Zhen Lei 
> 
> This looks reasonable in principle, but it would be helpful to
> understand better which SoCs are affected. In which way is
> this specific to Hisilicon implementations, and why would others
> not need this?

I answered at the end.

> 
> Wouldn't this also be needed by an Armada XP that supports
> more than 4GB of RAM but has an outer cache?

I don't know about the Armada XP environment.

> 
> I suppose those SoCs using off-the-shelf Arm cores are either
> pre-LPAE and cannot address memory above 4GB, or they do
> not need the outer_cache interfaces.

I think so.

> 
>> diff --git a/arch/arm/mm/cache-feroceon-l2.c 
>> b/arch/arm/mm/cache-feroceon-l2.c
>> index 5c1b7a7b9af6300..ab1d8051bf832c9 100644
>> --- a/arch/arm/mm/cache-feroceon-l2.c
>> +++ b/arch/arm/mm/cache-feroceon-l2.c
>> @@ -168,7 +168,7 @@ static unsigned long calc_range_end(unsigned long start, 
>> unsigned long end)
>> return range_end;
>>  }
>>
>> -static void feroceon_l2_inv_range(unsigned long start, unsigned long end)
>> +static void __feroceon_l2_inv_range(unsigned long start, unsigned long end)
>>  {
>> /*
>>  * Clean and invalidate partial first cache line.
>> @@ -198,7 +198,12 @@ static void feroceon_l2_inv_range(unsigned long start, 
>> unsigned long end)
>> dsb();
>>  }
>>
>> -static void feroceon_l2_clean_range(unsigned long start, unsigned long end)
>> +static void feroceon_l2_inv_range(phys_addr_t start, phys_addr_t end)
>> +{
>> +   __feroceon_l2_inv_range(start, end);
>> +}
>> +
> 
> What is this indirection for? It looks like you do this for all 
> implementations,
> so the actual address gets truncated here.

Because these environments either have only 32-bit physical addresses, or only
need to operate on the lower 32 bits of the physical address. But my environment
operates on 64-bit physical addresses while long is only 32 bits wide. So the
datatype of the outercache hooks needs to be changed.

 struct outer_cache_fns {
-   void (*inv_range)(unsigned long, unsigned long);
-   void (*clean_range)(unsigned long, unsigned long);
-   void (*flush_range)(unsigned long, unsigned long);
+   void (*inv_range)(phys_addr_t, phys_addr_t);
+   void (*clean_range)(phys_addr_t, phys_addr_t);
+   void (*flush_range)(phys_addr_t, phys_addr_t);
void (*flush_all)(void);

I added a middle-tier function for all implementations, just to ensure that the
above changes do not have side effects on them.
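
As a user-space sketch of this arrangement (all names are invented for the illustration, and the typedefs are stand-ins for phys_addr_t and a 32-bit unsigned long): the hook takes the wide type, a driver that only handles 32-bit addresses keeps its old function and gets a thin wrapper that narrows at the call, and a driver that handles wide addresses plugs in directly.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t model_phys_addr_t; /* stand-in for phys_addr_t with LPAE */
typedef uint32_t model_ulong_t;     /* stand-in for unsigned long on 32-bit ARM */

/* Hook table with the widened parameter type. */
struct model_outer_cache_fns {
	void (*inv_range)(model_phys_addr_t, model_phys_addr_t);
};

/* Recorded arguments, so the effect of each call can be observed. */
static model_ulong_t narrow_start, narrow_end;
static model_phys_addr_t wide_start, wide_end;

/* Existing 32-bit implementation, left unchanged. */
static void model_inv_range_32(model_ulong_t start, model_ulong_t end)
{
	narrow_start = start;
	narrow_end = end;
}

/* Middle-tier function: matches the hook type and narrows at the call. */
static void model_inv_range(model_phys_addr_t start, model_phys_addr_t end)
{
	model_inv_range_32((model_ulong_t)start, (model_ulong_t)end);
}

/* A driver that supports wide addresses uses the hook type directly. */
static void model_wide_inv_range(model_phys_addr_t start, model_phys_addr_t end)
{
	wide_start = start;
	wide_end = end;
}

static const struct model_outer_cache_fns model_cache_32 = {
	.inv_range = model_inv_range,
};

static const struct model_outer_cache_fns model_cache_wide = {
	.inv_range = model_wide_inv_range,
};
```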

> 
>Arnd
> 
> .
> 


