Re: [PATCH v14 07/13] KVM: VMX: Emulate reads and writes to CET MSRs

2021-01-28 Thread Yang Weijiang
On Thu, Jan 28, 2021 at 06:45:08PM +0100, Paolo Bonzini wrote:
> On 06/11/20 02:16, Yang Weijiang wrote:
> > 
> > +static bool cet_is_ssp_msr_accessible(struct kvm_vcpu *vcpu,
> > + struct msr_data *msr)
> > +{
> > +   u64 mask;
> > +
> > +   if (!kvm_cet_supported())
> > +   return false;
> > +
> > +   if (msr->host_initiated)
> > +   return true;
> > +
> > +   if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
> > +   return false;
> > +
> > +   if (msr->index == MSR_IA32_INT_SSP_TAB)
> > +   return false;
> 
> Shouldn't this return true?
>
Hi, Paolo,
Thanks for the feedback!
Yes, it should return true; I will fix it in the next release.
> Paolo
> 
> > +   mask = (msr->index == MSR_IA32_PL3_SSP) ? XFEATURE_MASK_CET_USER :
> > + XFEATURE_MASK_CET_KERNEL;
> > +   return !!(vcpu->arch.guest_supported_xss & mask);
> > +}
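
The access check discussed above can be modelled in user space. The following is a minimal sketch, assuming the fix agreed in this thread (the interrupt SSP table MSR becomes accessible once the guest sees SHSTK). Every `demo_`/`DEMO_` name and the bit positions are illustrative stand-ins, not the KVM definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the MSR indices and XSS feature bits. */
enum demo_msr { DEMO_MSR_PL0_SSP, DEMO_MSR_PL3_SSP, DEMO_MSR_INT_SSP_TAB };
#define DEMO_XSS_CET_USER   (1ULL << 11)
#define DEMO_XSS_CET_KERNEL (1ULL << 12)

struct demo_vcpu {
	bool host_cet_supported;      /* models kvm_cet_supported() */
	bool guest_has_shstk;         /* models guest_cpuid_has(SHSTK) */
	uint64_t guest_supported_xss; /* models vcpu->arch.guest_supported_xss */
};

static bool demo_ssp_msr_accessible(const struct demo_vcpu *vcpu,
				    enum demo_msr index, bool host_initiated)
{
	uint64_t mask;

	if (!vcpu->host_cet_supported)
		return false;

	/* Host-initiated accesses bypass the guest CPUID checks. */
	if (host_initiated)
		return true;

	if (!vcpu->guest_has_shstk)
		return false;

	/* The fix from the review: return true here, not false. */
	if (index == DEMO_MSR_INT_SSP_TAB)
		return true;

	mask = (index == DEMO_MSR_PL3_SSP) ? DEMO_XSS_CET_USER
					   : DEMO_XSS_CET_KERNEL;
	return (vcpu->guest_supported_xss & mask) != 0;
}
```

In this model, a guest that has SHSTK but only the user XSS bit can reach the table MSR and PL3_SSP, but not PL0_SSP.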


Re: [binfmt_elf] d97e11e25d: ltp.DS000.fail

2021-01-28 Thread Oliver Sang
On Tue, Jan 26, 2021 at 09:03:26AM +0100, Geert Uytterhoeven wrote:
> Hi Oliver,
> 
> On Tue, Jan 26, 2021 at 6:35 AM kernel test robot  
> wrote:
> > FYI, we noticed the following commit (built with gcc-9):
> >
> > commit: d97e11e25dd226c44257284f95494bb06d1ebf5a ("[PATCH v2] binfmt_elf: 
> > Fix fill_prstatus() call in fill_note_info()")
> > url: 
> > https://github.com/0day-ci/linux/commits/Geert-Uytterhoeven/binfmt_elf-Fix-fill_prstatus-call-in-fill_note_info/20210106-155236
> > base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 
> > e71ba9452f0b5b2e8dc8aa5445198cd9214a6a62
> 
> My patch (which you applied on top of v5.11-rc2) is a build fix for
> a commit that is not part of v5.11-rc2.  Hence the test run is invalid.

Sorry for the false report. We've fixed the problem. Thanks.

> 
> Gr{oetje,eeting}s,
> 
> Geert
> 
> -- 
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> ge...@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like 
> that.
> -- Linus Torvalds


[PATCH v2] kvfree_rcu: Release page cache under memory pressure

2021-01-28 Thread qiang . zhang
From: Zqiang 

Add an operation that frees each per-CPU krcp's existing page cache when
the system is under memory pressure.

Signed-off-by: Zqiang 
---
 kernel/rcu/tree.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index c1ae1e52f638..ec098910d80b 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3571,17 +3571,40 @@ void kvfree_call_rcu(struct rcu_head *head, 
rcu_callback_t func)
 }
 EXPORT_SYMBOL_GPL(kvfree_call_rcu);
 
+static int free_krc_page_cache(struct kfree_rcu_cpu *krcp)
+{
+   unsigned long flags;
+   struct kvfree_rcu_bulk_data *bnode;
+   int i;
+
+   for (i = 0; i < rcu_min_cached_objs; i++) {
+   raw_spin_lock_irqsave(&krcp->lock, flags);
+   bnode = get_cached_bnode(krcp);
+   raw_spin_unlock_irqrestore(&krcp->lock, flags);
+   if (!bnode)
+   break;
+   free_page((unsigned long)bnode);
+   }
+
+   return i;
+}
+
 static unsigned long
 kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
 {
int cpu;
unsigned long count = 0;
+   unsigned long flags;
 
/* Snapshot count of all CPUs */
for_each_possible_cpu(cpu) {
struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
 
count += READ_ONCE(krcp->count);
+
+   raw_spin_lock_irqsave(&krcp->lock, flags);
+   count += krcp->nr_bkv_objs;
+   raw_spin_unlock_irqrestore(&krcp->lock, flags);
}
 
return count;
@@ -3598,6 +3621,8 @@ kfree_rcu_shrink_scan(struct shrinker *shrink, struct 
shrink_control *sc)
struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
 
count = krcp->count;
+   count += free_krc_page_cache(krcp);
+
raw_spin_lock_irqsave(&krcp->lock, flags);
if (krcp->monitor_todo)
kfree_rcu_drain_unlock(krcp, flags);
-- 
2.17.1
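
The drain loop in this patch takes the lock only around each cache pop and frees the page outside it. The following is a minimal user-space sketch of that pattern, assuming a simple linked-list cache and a C11 `atomic_flag` spinlock in place of the kernel's raw spinlock. All `demo_` names are hypothetical and are not the kernel's data structures.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct demo_page {
	struct demo_page *next;
};

struct demo_cpu_cache {
	atomic_flag lock;       /* stands in for krcp->lock */
	struct demo_page *head; /* cached pages */
	int nr_cached;
};

static void demo_cache_init(struct demo_cpu_cache *c)
{
	atomic_flag_clear(&c->lock);
	c->head = NULL;
	c->nr_cached = 0;
}

static void demo_lock(struct demo_cpu_cache *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		; /* spin */
}

static void demo_unlock(struct demo_cpu_cache *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Add one freshly allocated page to the cache; returns 0 on success. */
static int demo_put_cached(struct demo_cpu_cache *c)
{
	struct demo_page *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	demo_lock(c);
	p->next = c->head;
	c->head = p;
	c->nr_cached++;
	demo_unlock(c);
	return 0;
}

/* Pop one cached page; the caller must hold the lock. */
static struct demo_page *demo_get_cached(struct demo_cpu_cache *c)
{
	struct demo_page *p = c->head;

	if (p) {
		c->head = p->next;
		c->nr_cached--;
	}
	return p;
}

/* Free up to max_objs cached pages and return how many were freed. */
static int demo_free_page_cache(struct demo_cpu_cache *c, int max_objs)
{
	struct demo_page *p;
	int i;

	for (i = 0; i < max_objs; i++) {
		demo_lock(c);
		p = demo_get_cached(c);
		demo_unlock(c);
		if (!p)
			break;
		free(p); /* freed outside the lock */
	}
	return i;
}
```

Holding the lock only for the pop keeps the critical section short, at the cost of re-acquiring the lock once per page.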



Re: riscv+KASAN does not boot

2021-01-28 Thread Alex Ghiti

Hi Dmitry,

On 1/18/21 10:43 AM, Dmitry Vyukov wrote:

On Mon, Jan 18, 2021 at 4:05 PM Dmitry Vyukov  wrote:


On Mon, Jan 18, 2021 at 3:53 PM Tobias Klauser  wrote:

On Thu, Jan 14, 2021 at 5:57 AM Palmer Dabbelt  wrote:


On Fri, 25 Dec 2020 09:13:23 PST (-0800), dvyu...@google.com wrote:

On Fri, Dec 25, 2020 at 5:58 PM Andreas Schwab  wrote:


On Dez 25 2020, Dmitry Vyukov wrote:


qemu-system-riscv64 \
-machine virt -bios default -smp 1 -m 2G \
-device virtio-blk-device,drive=hd0 \
-drive file=buildroot-riscv64.ext4,if=none,format=raw,id=hd0 \
-kernel arch/riscv/boot/Image \
-nographic \
-device virtio-rng-device,rng=rng0 -object
rng-random,filename=/dev/urandom,id=rng0 \
-netdev user,id=net0,host=10.0.2.10,hostfwd=tcp::10022-:22 -device
virtio-net-device,netdev=net0 \
-append "root=/dev/vda earlyprintk=serial console=ttyS0 oops=panic
panic_on_warn=1 panic=86400"


Do you get more output with earlycon=sbi?


Hi Andreas,

For defconfig+kvm_guest.config+ scripts/config -e KASAN -e
KASAN_INLINE it actually gave me more output:


OpenSBI v0.7
[OpenSBI ASCII-art banner]

Platform Name  : QEMU Virt Machine
Platform HART Features : RV64ACDFIMSU
Current Hart   : 0
Firmware Base  : 0x8000
Firmware Size  : 132 KB
Runtime SBI Version: 0.2

MIDELEG : 0x0222
MEDELEG : 0xb109
PMP0: 0x8000-0x8003 (A)
PMP1: 0x-0x (A,R,W,X)
[0.00] Linux version 5.10.0-01370-g71c5f03154ac
(dvyu...@dvyukov-desk.muc.corp.google.com) (riscv64-linux-gnu-gcc
(Debian 10.2.0-9) 10.2.0, GNU ld (GNU Binutils for Debian) 2.35.1) #17
SMP Fri Dec 25 18:10:12 CET 2020
[0.00] OF: fdt: Ignoring memory range 0x8000 - 0x8020
[0.00] earlycon: sbi0 at I/O port 0x0 (options '')
[0.00] printk: bootconsole [sbi0] enabled
[0.00] efi: UEFI not found.
[0.00] Zone ranges:
[0.00]   DMA32[mem 0x8020-0x]
[0.00]   Normal   empty
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x8020-0x]
[0.00] Initmem setup node 0 [mem 0x8020-0x]
[0.00] SBI specification v0.2 detected
[0.00] SBI implementation ID=0x1 Version=0x7
[0.00] SBI v0.2 TIME extension detected
[0.00] SBI v0.2 IPI extension detected
[0.00] SBI v0.2 RFENCE extension detected
[0.00] software IO TLB: mapped [mem
0xfa3f9000-0xfe3f9000] (64MB)
[0.00] Unable to handle kernel paging request at virtual
address dfc81004
[0.00] Oops [#1]
[0.00] Modules linked in:
[0.00] CPU: 0 PID: 0 Comm: swapper Not tainted
5.10.0-01370-g71c5f03154ac #17
[0.00] epc: ffe00042e3e4 ra : ffe000c0462c sp : ffe001603ea0
[0.00]  gp : ffe0016e3c60 tp : ffe00160cd40 t0 :
dfc81004
[0.00]  t1 : ffe000e0a838 t2 :  s0 :
ffe001603f50
[0.00]  s1 : ffe0016e50a8 a0 : dfc81004 a1 :

[0.00]  a2 : 0ffc a3 : dfc82000 a4 :

[0.00]  a5 : 3e8c6001 a6 : ffe000e0a820 a7 :
0900
[0.00]  s2 : dfc82000 s3 : dfc8 s4 :
0001
[0.00]  s5 : ffe0016e5108 s6 : f000 s7 :
dfc81004
[0.00]  s8 : 0080 s9 :  s10:
ffe07a119000
[0.00]  s11: ffc0 t3 : ffe0016eb908 t4 :
0001
[0.00]  t5 : ffc4001c150a t6 : ffe001603be8
[0.00] status: 0100 badaddr: dfc81004
cause: 000f
[0.00] random: get_random_bytes called from
oops_exit+0x30/0x58 with crng_init=0
[0.00] ---[ end trace  ]---
[0.00] Kernel panic - not syncing: Fatal exception
[0.00] ---[ end Kernel panic - not syncing: Fatal exception ]---


But I first tried with the kernel image I had in the dir; I think it
was this config (no KASAN):
https://gist.githubusercontent.com/dvyukov/b2b62beccf80493781ab03b41430e616/raw/62e673cff08a8a41656d2871b8a37f74b00f509f/gistfile1.txt

and earlycon=sbi did not change anything (no output after OpenSBI).
So potentially there are 2 different problems.


Thanks for reporting this.  Looks like I'd forgotten to add a kasan config to
my tests.  There's one in there now, and it's passing as of the fix that Nylon
posted.


I can boot the KASAN kernel now on riscv/fixes.

Next problem: I've got only to:

[   90.498967][T1] Run /sbin/init as 

Re: [PATCH v3 2/2] vfio/iommu_type1: Fix some sanity checks in detach group

2021-01-28 Thread Keqian Zhu



On 2021/1/28 7:46, Alex Williamson wrote:
> On Fri, 22 Jan 2021 17:26:35 +0800
> Keqian Zhu  wrote:
> 
>> vfio_sanity_check_pfn_list() is used to check whether pfn_list and
>> notifier are empty when removing the external domain, so it makes the
>> wrong assumption that only the external domain will use the pinning
>> interface.
>>
>> Now we apply the pfn_list check when a vfio_dma is removed and apply
>> the notifier check when all domains are removed.
>>
>> Fixes: a54eb55045ae ("vfio iommu type1: Add support for mediated devices")
>> Signed-off-by: Keqian Zhu 
>> ---
>>  drivers/vfio/vfio_iommu_type1.c | 33 ++---
>>  1 file changed, 10 insertions(+), 23 deletions(-)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c 
>> b/drivers/vfio/vfio_iommu_type1.c
>> index 161725395f2f..d8c10f508321 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -957,6 +957,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, 
>> struct vfio_dma *dma,
>>  
>>  static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
>>  {
>> +WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list));
>>  vfio_unmap_unpin(iommu, dma, true);
>>  vfio_unlink_dma(iommu, dma);
>>  put_task_struct(dma->task);
>> @@ -2250,23 +2251,6 @@ static void vfio_iommu_unmap_unpin_reaccount(struct 
>> vfio_iommu *iommu)
>>  }
>>  }
>>  
>> -static void vfio_sanity_check_pfn_list(struct vfio_iommu *iommu)
>> -{
>> -struct rb_node *n;
>> -
>> -n = rb_first(&iommu->dma_list);
>> -for (; n; n = rb_next(n)) {
>> -struct vfio_dma *dma;
>> -
>> -dma = rb_entry(n, struct vfio_dma, node);
>> -
>> -if (WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list)))
>> -break;
>> -}
>> -/* mdev vendor driver must unregister notifier */
>> -WARN_ON(iommu->notifier.head);
>> -}
>> -
>>  /*
>>   * Called when a domain is removed in detach. It is possible that
>>   * the removed domain decided the iova aperture window. Modify the
>> @@ -2366,10 +2350,10 @@ static void vfio_iommu_type1_detach_group(void 
>> *iommu_data,
>>  kfree(group);
>>  
>>  if (list_empty(&iommu->external_domain->group_list)) {
>> -vfio_sanity_check_pfn_list(iommu);
>> -
>> -if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu))
>> +if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
>> +WARN_ON(iommu->notifier.head);
>>  vfio_iommu_unmap_unpin_all(iommu);
>> +}
>>  
>>  kfree(iommu->external_domain);
>>  iommu->external_domain = NULL;
>> @@ -2403,10 +2387,12 @@ static void vfio_iommu_type1_detach_group(void 
>> *iommu_data,
>>   */
>>  if (list_empty(&domain->group_list)) {
>>  if (list_is_singular(&iommu->domain_list)) {
>> -if (!iommu->external_domain)
>> +if (!iommu->external_domain) {
>> +WARN_ON(iommu->notifier.head);
>>  vfio_iommu_unmap_unpin_all(iommu);
>> -else
>> +} else {
>>  vfio_iommu_unmap_unpin_reaccount(iommu);
>> +}
>>  }
>>  iommu_domain_free(domain->domain);
>>  list_del(&domain->next);
>> @@ -2488,9 +2474,10 @@ static void vfio_iommu_type1_release(void *iommu_data)
>>  struct vfio_iommu *iommu = iommu_data;
>>  struct vfio_domain *domain, *domain_tmp;
>>  
>> +WARN_ON(iommu->notifier.head);
> 
> I don't see that this does any harm, but isn't it actually redundant?
> It seems vfio-core only calls the iommu backend release function after
> removing all the groups, so the tests in _detach_group should catch all
> cases.  We're expecting the vfio bus/mdev driver to remove the notifier
> when a device is closed, which necessarily occurs before detaching the
> group.  Thanks,
Right. Devices of a specific group must be closed before detaching this group.
Detaching the last group has already checked this, so vfio_iommu_type1_release
doesn't need to do this check again.

Could you please queue this patch and delete this check btw? Thanks. ;-)

Keqian.

> 
> Alex
> 
>> +
>>  if (iommu->external_domain) {
>>  vfio_release_domain(iommu->external_domain, true);
>> -vfio_sanity_check_pfn_list(iommu);
>>  kfree(iommu->external_domain);
>>  }
>>  
> 
> .
> 


Re: [PATCH RFC v2 02/10] vringh: add 'iotlb_lock' to synchronize iotlb accesses

2021-01-28 Thread Jason Wang



On 2021/1/28 下午10:41, Stefano Garzarella wrote:

Usually iotlb accesses are synchronized with a spinlock.
Let's request it as a new parameter in vringh_set_iotlb() and
hold it when we navigate the iotlb in iotlb_translate() to avoid
race conditions with any new additions/deletions of ranges from
the iotlb.



The patch looks fine, but I wonder whether this is the best approach compared to
having the caller do the locking.


Thanks




Signed-off-by: Stefano Garzarella 
---
  include/linux/vringh.h   | 6 +-
  drivers/vdpa/vdpa_sim/vdpa_sim.c | 3 ++-
  drivers/vhost/vringh.c   | 9 -
  3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/include/linux/vringh.h b/include/linux/vringh.h
index 59bd50f99291..9c077863c8f6 100644
--- a/include/linux/vringh.h
+++ b/include/linux/vringh.h
@@ -46,6 +46,9 @@ struct vringh {
/* IOTLB for this vring */
struct vhost_iotlb *iotlb;
  
+	/* spinlock to synchronize IOTLB accesses */

+   spinlock_t *iotlb_lock;
+
/* The function to call to notify the guest about added buffers */
void (*notify)(struct vringh *);
  };
@@ -258,7 +261,8 @@ static inline __virtio64 cpu_to_vringh64(const struct 
vringh *vrh, u64 val)
  
  #if IS_REACHABLE(CONFIG_VHOST_IOTLB)
  
-void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb);

+void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb,
+ spinlock_t *iotlb_lock);
  
  int vringh_init_iotlb(struct vringh *vrh, u64 features,

  unsigned int num, bool weak_barriers,
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 2183a833fcf4..53238989713d 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -284,7 +284,8 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr 
*dev_attr)
goto err_iommu;
  
  	for (i = 0; i < dev_attr->nvqs; i++)

-   vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu);
+   vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu,
+&vdpasim->iommu_lock);
  
  	ret = iova_cache_get();

if (ret)
diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
index 85d85faba058..f68122705719 100644
--- a/drivers/vhost/vringh.c
+++ b/drivers/vhost/vringh.c
@@ -1074,6 +1074,8 @@ static int iotlb_translate(const struct vringh *vrh,
int ret = 0;
u64 s = 0;
  
+	spin_lock(vrh->iotlb_lock);

+
while (len > s) {
u64 size, pa, pfn;
  
@@ -1103,6 +1105,8 @@ static int iotlb_translate(const struct vringh *vrh,

++ret;
}
  
+	spin_unlock(vrh->iotlb_lock);

+
return ret;
  }
  
@@ -1262,10 +1266,13 @@ EXPORT_SYMBOL(vringh_init_iotlb);

   * vringh_set_iotlb - initialize a vringh for a ring with IOTLB.
   * @vrh: the vring
   * @iotlb: iotlb associated with this vring
+ * @iotlb_lock: spinlock to synchronize the iotlb accesses
   */
-void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb)
+void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb,
+ spinlock_t *iotlb_lock)
  {
vrh->iotlb = iotlb;
+   vrh->iotlb_lock = iotlb_lock;
  }
  EXPORT_SYMBOL(vringh_set_iotlb);
  




[PATCH] Revert "Revert "scsi: megaraid_sas: Simplify the calculation of variables

2021-01-28 Thread Jiapeng Chong
Fix the following coccicheck warnings:

./drivers/scsi/megaraid/megaraid_sas_base.c:8644:30-32: WARNING !A || A
&& B is equivalent to !A || B.

Reported-by: Abaci Robot 
Signed-off-by: Jiapeng Chong 
---
 drivers/scsi/megaraid/megaraid_sas_base.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c 
b/drivers/scsi/megaraid/megaraid_sas_base.c
index 63a4f48..dd26107 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -8641,8 +8641,7 @@ int megasas_update_device_list(struct megasas_instance 
*instance,
 
if (event_type & SCAN_VD_CHANNEL) {
if (!instance->requestorId ||
-   (instance->requestorId &&
-megasas_get_ld_vf_affiliation(instance, 0))) {
+megasas_get_ld_vf_affiliation(instance, 0)) {
dcmd_ret = megasas_ld_list_query(instance,

MR_LD_QUERY_TYPE_EXPOSED_TO_HOST);
if (dcmd_ret != DCMD_SUCCESS)
-- 
1.8.3.1
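
The equivalence that coccicheck reports, `!A || (A && B)` versus `!A || B`, can be confirmed by enumerating all four inputs. The following is a small sketch with illustrative function names. With C's short-circuit evaluation, both forms evaluate `B` only when `A` is true, so the affiliation query in the patch is still made only when `requestorId` is nonzero.

```c
#include <stdbool.h>

/* The condition as written before the patch. */
static bool original_form(bool a, bool b)
{
	return !a || (a && b);
}

/* The condition as written after the patch. */
static bool simplified_form(bool a, bool b)
{
	return !a || b;
}

/* Returns true if both forms agree on every combination of inputs. */
static bool forms_agree(void)
{
	for (int a = 0; a <= 1; a++)
		for (int b = 0; b <= 1; b++)
			if (original_form(a, b) != simplified_form(a, b))
				return false;
	return true;
}
```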



Re: [PATCH v7] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-28 Thread Sedat Dilek
On Fri, Jan 22, 2021 at 7:41 PM 'Nick Desaulniers' via Clang Built
Linux  wrote:
>
> On Fri, Jan 22, 2021 at 2:12 AM Bill Wendling  wrote:
> >
> > From: Sami Tolvanen 
> >
> > Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> > profile, the kernel is instrumented with PGO counters, a representative
> > workload is run, and the raw profile data is collected from
> > /sys/kernel/debug/pgo/profraw.
> >
> > The raw profile data must be processed by clang's "llvm-profdata" tool
> > before it can be used during recompilation:
> >
> >   $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> >   $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> >
> > Multiple raw profiles may be merged during this step.
> >
> > The data can now be used by the compiler:
> >
> >   $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
> >
> > This initial submission is restricted to x86, as that's the platform we
> > know works. This restriction can be lifted once other platforms have
> > been verified to work with PGO.
> >
> > Note that this method of profiling the kernel is clang-native, unlike
> > the clang support in kernel/gcov.
> >
> > [1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
> >
> > Signed-off-by: Sami Tolvanen 
> > Co-developed-by: Bill Wendling 
> > Signed-off-by: Bill Wendling 
> > Tested-by: Nick Desaulniers 
>
> Reviewed-by: Nick Desaulniers 
>
> Let's get this queued up, then start thinking about how we can follow
> up with improvements to docs, ergonomics of passing the profiling
> data, and nailing down which configs tickle any compiler bugs,
> boot failures, or hash mismatches.
>

Some comments:

[ hash mismatches ]

Observed identical warnings when doing a rebuild with GAS or Clang-IAS.

[ Importance of LLVM_IAS=1 working ]

Clang-LTO and Clang-CFI depend both on LLVM_IAS=1 (see for example
"kbuild: add support for Clang LTO").
Sooner or later we will deal with this issue (hope it is not a local problem).

- Sedat -

[1] 
https://github.com/samitolvanen/linux/commit/27da26bada87bde166f01cb1f61b88b727f83a84


> > ---
> > v2: - Added "__llvm_profile_instrument_memop" based on Nathan Chancellor's
> >   testing.
> > - Corrected documentation, re PGO flags when using LTO, based on Fangrui
> >   Song's comments.
> > v3: - Added change log section based on Sedat Dilek's comments.
> > v4: - Remove non-x86 Makefile changes and use "hweight64" instead of using our
> >   own popcount implementation, based on Nick Desaulniers's comment.
> > v5: - Correct padding calculation, discovered by Nathan Chancellor.
> > v6: - Add better documentation about the locking scheme and other things.
> > - Rename macros to better match the same macros in LLVM's source code.
> > v7: - Fix minor build failure reported by Sedat.
> > ---
> >  Documentation/dev-tools/index.rst |   1 +
> >  Documentation/dev-tools/pgo.rst   | 127 +
> >  MAINTAINERS   |   9 +
> >  Makefile  |   3 +
> >  arch/Kconfig  |   1 +
> >  arch/x86/Kconfig  |   1 +
> >  arch/x86/boot/Makefile|   1 +
> >  arch/x86/boot/compressed/Makefile |   1 +
> >  arch/x86/crypto/Makefile  |   4 +
> >  arch/x86/entry/vdso/Makefile  |   1 +
> >  arch/x86/kernel/vmlinux.lds.S |   2 +
> >  arch/x86/platform/efi/Makefile|   1 +
> >  arch/x86/purgatory/Makefile   |   1 +
> >  arch/x86/realmode/rm/Makefile |   1 +
> >  arch/x86/um/vdso/Makefile |   1 +
> >  drivers/firmware/efi/libstub/Makefile |   1 +
> >  include/asm-generic/vmlinux.lds.h |  44 +++
> >  kernel/Makefile   |   1 +
> >  kernel/pgo/Kconfig|  35 +++
> >  kernel/pgo/Makefile   |   5 +
> >  kernel/pgo/fs.c   | 389 ++
> >  kernel/pgo/instrument.c   | 189 +
> >  kernel/pgo/pgo.h  | 203 ++
> >  scripts/Makefile.lib  |  10 +
> >  24 files changed, 1032 insertions(+)
> >  create mode 100644 Documentation/dev-tools/pgo.rst
> >  create mode 100644 kernel/pgo/Kconfig
> >  create mode 100644 kernel/pgo/Makefile
> >  create mode 100644 kernel/pgo/fs.c
> >  create mode 100644 kernel/pgo/instrument.c
> >  create mode 100644 kernel/pgo/pgo.h
> >
> > diff --git a/Documentation/dev-tools/index.rst 
> > b/Documentation/dev-tools/index.rst
> > index f7809c7b1ba9..8d6418e85806 100644
> > --- a/Documentation/dev-tools/index.rst
> > +++ b/Documentation/dev-tools/index.rst
> > @@ -26,6 +26,7 @@ whole; patches welcome!
> > kgdb
> > kselftest
> > kunit/index
> > +   pgo
> >
> >
> >  .. only::  subproject and html
> > diff --git a/Documentation/dev-tools/pgo.rst 
> > b/Documentation/dev-tools/pgo.rst
> > new file mode 100644
> > index ..b7f11d8405b7
> > --- /dev/null
> > 

Re: [PATCH v12 6/8] drm/mediatek: enable dither function

2021-01-28 Thread Yongqiang Niu
On Fri, 2021-01-29 at 14:46 +0800, Hsin-Yi Wang wrote:
> On Fri, Jan 29, 2021 at 2:30 PM Yongqiang Niu
>  wrote:
> >
> > On Fri, 2021-01-29 at 14:24 +0800, Hsin-Yi Wang wrote:
> > > On Fri, Jan 29, 2021 at 9:33 AM CK Hu  wrote:
> > > >
> > > > Hi, Hsin-Yi:
> > > >
> > > > On Thu, 2021-01-28 at 19:23 +0800, Hsin-Yi Wang wrote:
> > > > > From: Yongqiang Niu 
> > > > >
> > > > > > For 5 or 6 bpc panels, we need to enable the dither function
> > > > > > to improve the display quality.
> > > > >
> > > > > Signed-off-by: Yongqiang Niu 
> > > > > Signed-off-by: Hsin-Yi Wang 
> > > > > ---
> > > > >  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 15 +--
> > > > >  1 file changed, 13 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c 
> > > > > b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > > > > index ac2cb25620357..6c8f246380a74 100644
> > > > > --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > > > > +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > > > > @@ -53,6 +53,7 @@
> > > > > >  #define DITHER_EN BIT(0)
> > > > > >  #define DISP_DITHER_CFG  0x0020
> > > > > >  #define DITHER_RELAY_MODE BIT(0)
> > > > > +#define DITHER_ENGINE_EN BIT(1)
> > > > >  #define DISP_DITHER_SIZE 0x0030
> > > > >
> > > > >  #define LUT_10BIT_MASK   0x03ff
> > > > > @@ -314,9 +315,19 @@ static void mtk_dither_config(struct device 
> > > > > *dev, unsigned int w,
> > > > > unsigned int bpc, struct cmdq_pkt 
> > > > > *cmdq_pkt)
> > > > >  {
> > > > >   struct mtk_ddp_comp_dev *priv = dev_get_drvdata(dev);
> > > > > + bool enable = (bpc == 5 || bpc == 6);
> > > >
> > > > > I strongly believe that the dither function in dither is identical to
> > > > > the one in gamma and od, and in mtk_dither_set_common(), 'bpc >=
> > > > > MTK_MIN_BPC' is valid, so I believe we need not limit bpc to 5 or 6.
> > > > > But we should consider the case where bpc is invalid in
> > > > > mtk_dither_set_common(). Gamma and od handle the invalid case in
> > > > > different ways. For gamma, dither defaults to relay mode, so an invalid
> > > > > bpc does nothing in mtk_dither_set_common() and results in relay mode.
> > > > > For od, relay mode is set first, then an invalid bpc does nothing in
> > > > > mtk_dither_set_common() and results in relay mode. I would like dither,
> > > > > gamma and od to handle an invalid bpc in the same way. One solution is
> > > > > to set relay mode in mtk_dither_set_common() for an invalid bpc.
> > > >
> > > > Regards,
> > > > CK
> > > >
> > >
> > > > I modified mtk_dither_config() as follows:
> > >
> > >
> > > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > > b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > > index ac2cb25620357..5b7fcedb9f9a8 100644
> > > --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > > +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > > @@ -53,6 +53,7 @@
> > >  #define DITHER_EN  BIT(0)
> > >  #define DISP_DITHER_CFG0x0020
> > >  #define DITHER_RELAY_MODE  BIT(0)
> > > +#define DITHER_ENGINE_EN   BIT(1)
> > >  #define DISP_DITHER_SIZE   0x0030
> > >
> > >  #define LUT_10BIT_MASK 0x03ff
> > > @@ -166,6 +167,8 @@ void mtk_dither_set_common(void __iomem *regs,
> > > struct cmdq_client_reg *cmdq_reg,
> > >   DITHER_ADD_LSHIFT_G(MTK_MAX_BPC - bpc),
> > >   cmdq_reg, regs, DISP_DITHER_16);
> > > mtk_ddp_write(cmdq_pkt, dither_en, cmdq_reg, regs, cfg);
> > > +   } else {
> > > +   mtk_ddp_write(cmdq_pkt, DITHER_RELAY_MODE, cmdq_reg, 
> > > regs, cfg);
> > > }
> > >  }
> > >
> > > @@ -315,8 +318,12 @@ static void mtk_dither_config(struct device *dev,
> > > unsigned int w,
> > >  {
> > > struct mtk_ddp_comp_dev *priv = dev_get_drvdata(dev);
> > >
> > > > -   mtk_ddp_write(cmdq_pkt, h << 16 | w, &priv->cmdq_reg,
> > > > priv->regs, DISP_DITHER_SIZE);
> > > > -   mtk_ddp_write(cmdq_pkt, DITHER_RELAY_MODE, &priv->cmdq_reg,
> > > > priv->regs, DISP_DITHER_CFG);
> > > > +   mtk_ddp_write(cmdq_pkt, h << 16 | w, &priv->cmdq_reg, priv->regs,
> > > > + DISP_DITHER_SIZE);
> > > > +   mtk_ddp_write(cmdq_pkt, DITHER_RELAY_MODE, &priv->cmdq_reg, 
> > > > priv->regs,
> > > > + DISP_DITHER_CFG);
> > > > +   mtk_dither_set_common(priv->regs, &priv->cmdq_reg, bpc, 
> > > > DISP_DITHER_CFG,
> > > > +  DITHER_ENGINE_EN, cmdq_pkt);
> > >  }
> > >
> > > So now, not only bpc==5 or 6, but all valid bpc, dither config will
> > > call mtk_dither_set_common() with the flag DITHER_ENGINE_EN(BIT(1)).
> > > od config will call mtk_dither_set_common() with the flag
> > > DISP_DITHERING(BIT(2)).
> > > Additionally for 8173, gamma config 

[PATCH 2/2] arm64/mm: Reorganize pfn_valid()

2021-01-28 Thread Anshuman Khandual
There are multiple instances of pfn_to_section_nr() and __pfn_to_section()
when CONFIG_SPARSEMEM is enabled. This can be just optimized if the memory
section is fetched earlier. Hence bifurcate pfn_valid() into two different
definitions depending on whether CONFIG_SPARSEMEM is enabled. Also replace
the open coded pfn <--> addr conversion with __[pfn|phys]_to_[phys|pfn]().
This does not cause any functional change.

Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Ard Biesheuvel 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual 
---
 arch/arm64/mm/init.c | 38 +++---
 1 file changed, 31 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 1141075e4d53..09adca90c57a 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -217,18 +217,25 @@ static void __init zone_sizes_init(unsigned long min, 
unsigned long max)
free_area_init(max_zone_pfns);
 }
 
+#ifdef CONFIG_SPARSEMEM
 int pfn_valid(unsigned long pfn)
 {
-   phys_addr_t addr = pfn << PAGE_SHIFT;
+   struct mem_section *ms = __pfn_to_section(pfn);
+   phys_addr_t addr = __pfn_to_phys(pfn);
 
-   if ((addr >> PAGE_SHIFT) != pfn)
+   /*
+* Ensure the upper PAGE_SHIFT bits are clear in the
+* pfn. Else it might lead to false positives when
+* some of the upper bits are set, but the lower bits
+* match a valid pfn.
+*/
+   if (__phys_to_pfn(addr) != pfn)
return 0;
 
-#ifdef CONFIG_SPARSEMEM
if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
return 0;
 
-   if (!valid_section(__pfn_to_section(pfn)))
+   if (!valid_section(ms))
return 0;
 
/*
@@ -240,11 +247,28 @@ int pfn_valid(unsigned long pfn)
 * memory sections covering all of hotplug memory including
 * both normal and ZONE_DEVICE based.
 */
-   if (!early_section(__pfn_to_section(pfn)))
-   return pfn_section_valid(__pfn_to_section(pfn), pfn);
-#endif
+   if (!early_section(ms))
+   return pfn_section_valid(ms, pfn);
+
return memblock_is_map_memory(addr);
 }
+#else
+int pfn_valid(unsigned long pfn)
+{
+   phys_addr_t addr = __pfn_to_phys(pfn);
+
+   /*
+* Ensure the upper PAGE_SHIFT bits are clear in the
+* pfn. Else it might lead to false positives when
+* some of the upper bits are set, but the lower bits
+* match a valid pfn.
+*/
+   if (__phys_to_pfn(addr) != pfn)
+   return 0;
+
+   return memblock_is_map_memory(addr);
+}
+#endif
 EXPORT_SYMBOL(pfn_valid);
 
 static phys_addr_t memory_limit = PHYS_ADDR_MAX;
-- 
2.20.1
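
The round-trip check described in the comment above (shift the pfn up to a physical address, shift back, and compare) can be tried outside the kernel. The following is a minimal sketch, assuming 64-bit addresses and 4 KiB pages. The `demo_` helpers are stand-ins for `__pfn_to_phys()` and `__phys_to_pfn()`, not the kernel macros themselves.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

static uint64_t demo_pfn_to_phys(uint64_t pfn)
{
	return pfn << DEMO_PAGE_SHIFT; /* upper PAGE_SHIFT bits are shifted out */
}

static uint64_t demo_phys_to_pfn(uint64_t addr)
{
	return addr >> DEMO_PAGE_SHIFT;
}

/*
 * True only when none of the upper DEMO_PAGE_SHIFT bits of the pfn are set,
 * i.e. when converting to an address and back loses nothing.
 */
static bool demo_pfn_round_trips(uint64_t pfn)
{
	return demo_phys_to_pfn(demo_pfn_to_phys(pfn)) == pfn;
}
```

A pfn such as `(1ULL << 63) | 0x80200` shares its low bits with the ordinary pfn `0x80200`, but it fails the round trip and is rejected before any section or memblock lookup.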



[PATCH 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory

2021-01-28 Thread Anshuman Khandual
pfn_valid() validates a pfn but basically it checks for a valid struct page
backing for that pfn. It should always return positive for memory ranges
backed with struct page mapping. But currently pfn_valid() fails for all
ZONE_DEVICE based memory types even though they have struct page mapping.

pfn_valid() asserts that there is a memblock entry for a given pfn without
MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
that they do not have memblock entries. Hence memblock_is_map_memory() will
invariably fail via memblock_search() for a ZONE_DEVICE based address. This
eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
into the system via memremap_pages() called from a driver, their respective
memory sections will not have SECTION_IS_EARLY set.

Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
for firmware reserved memory regions. memblock_is_map_memory() can just be
skipped as it's always going to be positive and that will be an optimization
for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
hotplugged memory too will not have SECTION_IS_EARLY set for their sections.

Skipping memblock_is_map_memory() for all non early memory sections would
fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
performance for normal hotplug memory as well.

Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Ard Biesheuvel 
Cc: Robin Murphy 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Fixes: 73b20c84d42d ("arm64: mm: implement pte_devmap support")
Signed-off-by: Anshuman Khandual 
---
 arch/arm64/mm/init.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 709d98fea90c..1141075e4d53 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -230,6 +230,18 @@ int pfn_valid(unsigned long pfn)
 
if (!valid_section(__pfn_to_section(pfn)))
return 0;
+
+   /*
+* ZONE_DEVICE memory does not have the memblock entries.
+* memblock_is_map_memory() check for ZONE_DEVICE based
+* addresses will always fail. Even the normal hotplugged
+* memory will never have MEMBLOCK_NOMAP flag set in their
+* memblock entries. Skip memblock search for all non early
+* memory sections covering all of hotplug memory including
+* both normal and ZONE_DEVICE based.
+*/
+   if (!early_section(__pfn_to_section(pfn)))
+   return pfn_section_valid(__pfn_to_section(pfn), pfn);
 #endif
return memblock_is_map_memory(addr);
 }
-- 
2.20.1
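
The decision order this patch introduces can be summarised as a small model. This is a hedged sketch, assuming the three kernel predicates reduce to booleans. `struct demo_section` and `demo_pfn_valid()` are illustrative names, and the memblock lookup is passed in as a precomputed flag.

```c
#include <stdbool.h>

struct demo_section {
	bool valid;              /* models valid_section() */
	bool early;              /* models early_section() (SECTION_IS_EARLY) */
	bool subsection_present; /* models pfn_section_valid() */
};

/*
 * in_mapped_memblock models memblock_is_map_memory(): true when the address
 * has a memblock entry without MEMBLOCK_NOMAP.
 */
static bool demo_pfn_valid(const struct demo_section *ms,
			   bool in_mapped_memblock)
{
	if (!ms->valid)
		return false;

	/* Hotplugged memory (normal or ZONE_DEVICE): skip the memblock search. */
	if (!ms->early)
		return ms->subsection_present;

	/* Boot memory: keep the existing memblock-based check. */
	return in_mapped_memblock;
}
```

In this model a ZONE_DEVICE range (valid, not early, no memblock entry) is reported as valid, while firmware-reserved boot memory marked NOMAP is still rejected.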



[PATCH] phy: cpcap-usb: Simplify bool conversion

2021-01-28 Thread Yang Li
Fix the following coccicheck warning:
./drivers/phy/motorola/phy-cpcap-usb.c:146:31-36: WARNING: conversion to
bool not needed here

Reported-by: Abaci Robot 
Signed-off-by: Yang Li 
---
 drivers/phy/motorola/phy-cpcap-usb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/phy/motorola/phy-cpcap-usb.c 
b/drivers/phy/motorola/phy-cpcap-usb.c
index 4728e2b..6ee478b 100644
--- a/drivers/phy/motorola/phy-cpcap-usb.c
+++ b/drivers/phy/motorola/phy-cpcap-usb.c
@@ -143,7 +143,7 @@ static bool cpcap_usb_vbus_valid(struct cpcap_phy_ddata 
*ddata)
 
error = iio_read_channel_processed(ddata->vbus, &value);
if (error >= 0)
-   return value > 3900 ? true : false;
+   return value > 3900;
 
dev_err(ddata->dev, "error reading VBUS: %i\n", error);
 
-- 
1.8.3.1
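
A comparison in C already yields 0 or 1, so wrapping it in `? true : false` adds nothing when the function returns `bool`. The following is a small sketch of the two spellings side by side. The 3900 threshold comes from the patch, while the function names and the millivolt unit are assumptions made for illustration.

```c
#include <stdbool.h>

#define DEMO_VBUS_VALID_THRESHOLD 3900 /* from the patch; unit assumed to be mV */

/* Spelling before the patch: redundant conversion to bool. */
static bool demo_vbus_valid_ternary(int value)
{
	return value > DEMO_VBUS_VALID_THRESHOLD ? true : false;
}

/* Spelling after the patch: the comparison result is returned directly. */
static bool demo_vbus_valid_direct(int value)
{
	return value > DEMO_VBUS_VALID_THRESHOLD;
}
```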



[PATCH 0/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory

2021-01-28 Thread Anshuman Khandual
This series fixes pfn_valid() for ZONE_DEVICE based memory and also improves
its performance for normal hotplug memory. While here, it also reorganizes
pfn_valid() on CONFIG_SPARSEMEM. This series is based on v5.11-rc5.

Question - should pfn_section_valid() be tested for both boot and non-boot
memory?

Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Ard Biesheuvel 
Cc: Mark Rutland 
Cc: James Morse 
Cc: Robin Murphy 
Cc: Jérôme Glisse 
Cc: Dan Williams 
Cc: David Hildenbrand 
Cc: Mike Rapoport 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org

Changes in V1:

- Test pfn_section_valid() for non boot memory

Changes in RFC:

https://lore.kernel.org/linux-arm-kernel/1608621144-4001-1-git-send-email-anshuman.khand...@arm.com/

Anshuman Khandual (2):
  arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
  arm64/mm: Reorganize pfn_valid()

 arch/arm64/mm/init.c | 46 +++-
 1 file changed, 41 insertions(+), 5 deletions(-)

-- 
2.20.1



[PATCH v4 6/8] drm/mediatek: add matrix bits private data for ccorr

2021-01-28 Thread Hsin-Yi Wang
From: Yongqiang Niu 

The ccorr matrix is 12 bits wide on mt8183 and 13 bits wide on mt8192.
Add a matrix_bits field to the ccorr private data so each SoC can state its width.

Signed-off-by: Yongqiang Niu 
Signed-off-by: Hsin-Yi Wang 
---
 drivers/gpu/drm/mediatek/mtk_disp_ccorr.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ccorr.c 
b/drivers/gpu/drm/mediatek/mtk_disp_ccorr.c
index 0c68090eb1e92..1c7163a12f3b1 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_ccorr.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_ccorr.c
@@ -31,8 +31,10 @@
 #define DISP_CCORR_COEF_3  0x008C
 #define DISP_CCORR_COEF_4  0x0090
 
+#define CCORR_MATRIX_BITS  12
+
 struct mtk_disp_ccorr_data {
-   u32 reserved;
+   u32 matrix_bits;
 };
 
 /**
@@ -116,6 +118,7 @@ void mtk_ccorr_ctm_set(struct device *dev, struct 
drm_crtc_state *state)
uint16_t coeffs[9] = { 0 };
int i;
struct cmdq_pkt *cmdq_pkt = NULL;
+   u32 matrix_bits;
 
if (!blob)
return;
@@ -123,8 +126,16 @@ void mtk_ccorr_ctm_set(struct device *dev, struct 
drm_crtc_state *state)
ctm = (struct drm_color_ctm *)blob->data;
input = ctm->matrix;
 
-   for (i = 0; i < ARRAY_SIZE(coeffs); i++)
+   if (ccorr->data)
+   matrix_bits = ccorr->data->matrix_bits;
+   else
+   matrix_bits = CCORR_MATRIX_BITS;
+
+   for (i = 0; i < ARRAY_SIZE(coeffs); i++) {
coeffs[i] = mtk_ctm_s31_32_to_s1_10(input[i]);
+   if (matrix_bits > CCORR_MATRIX_BITS)
+   coeffs[i] <<= (matrix_bits - CCORR_MATRIX_BITS);
+   }
 
mtk_ddp_write(cmdq_pkt, coeffs[0] << 16 | coeffs[1],
  &ccorr->cmdq_reg, ccorr->regs, DISP_CCORR_COEF_0);
@@ -205,8 +216,13 @@ static int mtk_disp_ccorr_remove(struct platform_device 
*pdev)
return 0;
 }
 
+static const struct mtk_disp_ccorr_data mt8183_ccorr_driver_data = {
+   .matrix_bits = CCORR_MATRIX_BITS,
+};
+
 static const struct of_device_id mtk_disp_ccorr_driver_dt_match[] = {
-   { .compatible = "mediatek,mt8183-disp-ccorr"},
+   { .compatible = "mediatek,mt8183-disp-ccorr",
+ .data = &mt8183_ccorr_driver_data},
{},
 };
 MODULE_DEVICE_TABLE(of, mtk_disp_ccorr_driver_dt_match);
-- 
2.30.0.365.g02bc693789-goog
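
To illustrate what the extra shift does, here is a user-space sketch of the
coefficient path: the S31.32 to S1.10 conversion from the ccorr code, followed
by the left shift for hardware with more than 12 matrix bits. The function
names s31_32_to_s1_10() and ccorr_coeff() are assumptions made for this
example, and plain C integer types replace the kernel's BIT()/GENMASK()
helpers; treat it as an approximation of the logic, not the driver code.

```c
#include <stdint.h>

#define CCORR_MATRIX_BITS 12 /* default width, as defined in the patch */

/* Converts a sign-magnitude S31.32 value to the S1.10 hardware format. */
static uint16_t s31_32_to_s1_10(uint64_t in)
{
	/* Bits 62..33: any of these set means the magnitude is 2.0 or more. */
	const uint64_t too_big = ((1ULL << 63) - 1) & ~((1ULL << 33) - 1);
	/* Bit 63 is the sign; it maps to bit 11 of the result. */
	uint16_t r = (in & (1ULL << 63)) ? 0x800 : 0;

	if (in & too_big)
		r |= 0x7ff;                          /* saturate the magnitude */
	else
		r |= (uint16_t)((in >> 22) & 0x7ff); /* keep the top 11 magnitude bits */

	return r;
}

/* Applies the per-SoC widening: wider matrices get the value shifted left. */
static uint16_t ccorr_coeff(uint64_t in, uint32_t matrix_bits)
{
	uint16_t c = s31_32_to_s1_10(in);

	if (matrix_bits > CCORR_MATRIX_BITS)
		c <<= (matrix_bits - CCORR_MATRIX_BITS);

	return c;
}
```

For the identity coefficient 1.0 (1ULL << 32) this yields 0x400 with 12
matrix bits and 0x800 with 13, i.e. the same value expressed with one more
fractional bit.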



[PATCH v4 7/8] soc: mediatek: add mtk mutex support for MT8192

2021-01-28 Thread Hsin-Yi Wang
From: Yongqiang Niu 

Add mtk mutex support for MT8192 SoC.

Signed-off-by: Yongqiang Niu 
Signed-off-by: Hsin-Yi Wang 
---
 drivers/soc/mediatek/mtk-mutex.c | 35 
 1 file changed, 35 insertions(+)

diff --git a/drivers/soc/mediatek/mtk-mutex.c b/drivers/soc/mediatek/mtk-mutex.c
index 718a41beb6afb..dfd9806d5a001 100644
--- a/drivers/soc/mediatek/mtk-mutex.c
+++ b/drivers/soc/mediatek/mtk-mutex.c
@@ -39,6 +39,18 @@
 #define MT8167_MUTEX_MOD_DISP_DITHER   15
 #define MT8167_MUTEX_MOD_DISP_UFOE 16
 
+#define MT8192_MUTEX_MOD_DISP_OVL0 0
+#define MT8192_MUTEX_MOD_DISP_OVL0_2L  1
+#define MT8192_MUTEX_MOD_DISP_RDMA02
+#define MT8192_MUTEX_MOD_DISP_COLOR0   4
+#define MT8192_MUTEX_MOD_DISP_CCORR0   5
+#define MT8192_MUTEX_MOD_DISP_AAL0 6
+#define MT8192_MUTEX_MOD_DISP_GAMMA0   7
+#define MT8192_MUTEX_MOD_DISP_POSTMASK08
+#define MT8192_MUTEX_MOD_DISP_DITHER0  9
+#define MT8192_MUTEX_MOD_DISP_OVL2_2L  16
+#define MT8192_MUTEX_MOD_DISP_RDMA417
+
 #define MT8183_MUTEX_MOD_DISP_RDMA00
 #define MT8183_MUTEX_MOD_DISP_RDMA11
 #define MT8183_MUTEX_MOD_DISP_OVL0 9
@@ -214,6 +226,20 @@ static const unsigned int 
mt8183_mutex_mod[DDP_COMPONENT_ID_MAX] = {
[DDP_COMPONENT_WDMA0] = MT8183_MUTEX_MOD_DISP_WDMA0,
 };
 
+static const unsigned int mt8192_mutex_mod[DDP_COMPONENT_ID_MAX] = {
+   [DDP_COMPONENT_AAL0] = MT8192_MUTEX_MOD_DISP_AAL0,
+   [DDP_COMPONENT_CCORR] = MT8192_MUTEX_MOD_DISP_CCORR0,
+   [DDP_COMPONENT_COLOR0] = MT8192_MUTEX_MOD_DISP_COLOR0,
+   [DDP_COMPONENT_DITHER] = MT8192_MUTEX_MOD_DISP_DITHER0,
+   [DDP_COMPONENT_GAMMA] = MT8192_MUTEX_MOD_DISP_GAMMA0,
+   [DDP_COMPONENT_POSTMASK0] = MT8192_MUTEX_MOD_DISP_POSTMASK0,
+   [DDP_COMPONENT_OVL0] = MT8192_MUTEX_MOD_DISP_OVL0,
+   [DDP_COMPONENT_OVL_2L0] = MT8192_MUTEX_MOD_DISP_OVL0_2L,
+   [DDP_COMPONENT_OVL_2L2] = MT8192_MUTEX_MOD_DISP_OVL2_2L,
+   [DDP_COMPONENT_RDMA0] = MT8192_MUTEX_MOD_DISP_RDMA0,
+   [DDP_COMPONENT_RDMA4] = MT8192_MUTEX_MOD_DISP_RDMA4,
+};
+
 static const unsigned int mt2712_mutex_sof[MUTEX_SOF_DSI3 + 1] = {
[MUTEX_SOF_SINGLE_MODE] = MUTEX_SOF_SINGLE_MODE,
[MUTEX_SOF_DSI0] = MUTEX_SOF_DSI0,
@@ -275,6 +301,13 @@ static const struct mtk_mutex_data 
mt8183_mutex_driver_data = {
.no_clk = true,
 };
 
+static const struct mtk_mutex_data mt8192_mutex_driver_data = {
+   .mutex_mod = mt8192_mutex_mod,
+   .mutex_sof = mt8183_mutex_sof,
+   .mutex_mod_reg = MT8183_MUTEX0_MOD0,
+   .mutex_sof_reg = MT8183_MUTEX0_SOF0,
+};
+
 struct mtk_mutex *mtk_mutex_get(struct device *dev)
 {
struct mtk_mutex_ctx *mtx = dev_get_drvdata(dev);
@@ -507,6 +540,8 @@ static const struct of_device_id mutex_driver_dt_match[] = {
  .data = _mutex_driver_data},
{ .compatible = "mediatek,mt8183-disp-mutex",
  .data = &mt8183_mutex_driver_data},
+   { .compatible = "mediatek,mt8192-disp-mutex",
+ .data = &mt8192_mutex_driver_data},
{},
 };
 MODULE_DEVICE_TABLE(of, mutex_driver_dt_match);
-- 
2.30.0.365.g02bc693789-goog



[PATCH v2 3/3] ARM: dts: sti: Introduce 4KOpen (stih418-b2264) board

2021-01-28 Thread Alain Volmat
4KOpen (B2264) is a board based on the STMicroelectronics STiH418 SoC:
  - 2GB DDR
  - HDMI
  - Ethernet 1000-BaseT
  - PCIe (mini PCIe connector)
  - MicroSD slot
  - USB2 and USB3 connectors
  - SATA
  - 40 pins GPIO header

Signed-off-by: Alain Volmat 
---
v2: fix bootargs (removal of console=)
removal of rng11 node, moved into stih418.dtsi

 arch/arm/boot/dts/Makefile  |   3 +-
 arch/arm/boot/dts/stih418-b2264.dts | 123 
 2 files changed, 125 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/boot/dts/stih418-b2264.dts

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index 3d1ea0b25168..5ad1b0854b66 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -1059,7 +1059,8 @@ dtb-$(CONFIG_ARCH_STI) += \
stih407-b2120.dtb \
stih410-b2120.dtb \
stih410-b2260.dtb \
-   stih418-b2199.dtb
+   stih418-b2199.dtb \
+   stih418-b2264.dtb
 dtb-$(CONFIG_ARCH_STM32) += \
stm32f429-disco.dtb \
stm32f469-disco.dtb \
diff --git a/arch/arm/boot/dts/stih418-b2264.dts 
b/arch/arm/boot/dts/stih418-b2264.dts
new file mode 100644
index ..b70a76d3faa2
--- /dev/null
+++ b/arch/arm/boot/dts/stih418-b2264.dts
@@ -0,0 +1,123 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2021 STMicroelectronics
+ * Author: Alain Volmat 
+ */
+/dts-v1/;
+#include "stih418.dtsi"
+#include 
+/ {
+   model = "STiH418 B2264";
+   compatible = "st,stih418-b2264", "st,stih418";
+
+   chosen {
+   bootargs = "clk_ignore_unused";
+   stdout-path = _serial0;
+   };
+
+   memory@4000 {
+   device_type = "memory";
+   reg = <0x4000 0xc000>;
+   };
+
+   cpus {
+   cpu@0 {
+   operating-points-v2 = <&cpu_opp_table>;
+   /* u-boot puts hpen in SBC dmem at 0xb8 offset */
+   cpu-release-addr = <0x94100b8>;
+   };
+   cpu@1 {
+   operating-points-v2 = <&cpu_opp_table>;
+   /* u-boot puts hpen in SBC dmem at 0xb8 offset */
+   cpu-release-addr = <0x94100b8>;
+   };
+   cpu@2 {
+   operating-points-v2 = <&cpu_opp_table>;
+   /* u-boot puts hpen in SBC dmem at 0xb8 offset */
+   cpu-release-addr = <0x94100b8>;
+   };
+   cpu@3 {
+   operating-points-v2 = <&cpu_opp_table>;
+   /* u-boot puts hpen in SBC dmem at 0xb8 offset */
+   cpu-release-addr = <0x94100b8>;
+   };
+   };
+
+   cpu_opp_table: opp_table {
+   compatible = "operating-points-v2";
+   opp-shared;
+
+   opp00 {
+   opp-hz = /bits/ 64 <5>;
+   opp-microvolt = <784000>;
+   };
+   opp01 {
+   opp-hz = /bits/ 64 <8>;
+   opp-microvolt = <784000>;
+   };
+   opp02 {
+   opp-hz = /bits/ 64 <12>;
+   opp-microvolt = <784000>;
+   };
+   opp03 {
+   opp-hz = /bits/ 64 <15>;
+   opp-microvolt = <784000>;
+   };
+   };
+
+   aliases {
+   ttyAS0 = _serial0;
+   ethernet0 = 
+   };
+};
+
+ {
+   status = "okay";
+};
+
+ {
+   phy-mode = "rgmii";
+   pinctrl-0 = <_rgmii1 _rgmii1_mdio_1>;
+   st,tx-retime-src = "clkgen";
+
+   snps,reset-gpio = < 7 0>;
+   snps,reset-active-low;
+   snps,reset-delays-us = <0 1 100>;
+
+   status = "okay";
+};
+
+_phy {
+   phy_port0: port@9b22000 {
+   st,sata-gen = <2>; /* SATA GEN3 */
+   st,osc-rdy;
+   };
+};
+
+ {
+   status = "okay";
+};
+
+ {
+   status = "okay";
+};
+
+ {
+   status = "okay";
+};
+
+ {
+   status = "okay";
+};
+
+_serial0 {
+   status = "okay";
+};
+
+ {
+   status = "okay";
+};
+
+_dwc3 {
+   status = "okay";
+};
-- 
2.17.1



[PATCH v2 0/3] Introduction of STiH418 based 4KOpen board

2021-01-28 Thread Alain Volmat
This series introduces the 4KOpen (stih418-b2264) board based
on the STiH418 SoC. Since it is the first board to use the spi-fsm
SPI NOR controller, available since stih407, the controller is
also added to the stih407-family DT.
It also contains a fix to the stih418 DT: rng11 is not
available on this platform and is thus disabled.

Alain Volmat (3):
  ARM: dts: sti: add the spinor controller node within stih407-family
  ARM: dts: sti: disable rng11 on the stih418 platform
  ARM: dts: sti: Introduce 4KOpen (stih418-b2264) board

 arch/arm/boot/dts/Makefile|   3 +-
 arch/arm/boot/dts/stih407-family.dtsi |  15 
 arch/arm/boot/dts/stih418-b2264.dts   | 123 ++
 arch/arm/boot/dts/stih418.dtsi|   4 +
 4 files changed, 144 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/boot/dts/stih418-b2264.dts



[PATCH v2 2/3] ARM: dts: sti: disable rng11 on the stih418 platform

2021-01-28 Thread Alain Volmat
The rng11 is not available on the STiH418, hence it is disabled in
stih418.dtsi.

Signed-off-by: Alain Volmat 
---
 arch/arm/boot/dts/stih418.dtsi | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm/boot/dts/stih418.dtsi b/arch/arm/boot/dts/stih418.dtsi
index a05e2278b448..39a249983496 100644
--- a/arch/arm/boot/dts/stih418.dtsi
+++ b/arch/arm/boot/dts/stih418.dtsi
@@ -27,6 +27,10 @@
};
 
soc {
+   rng11: rng@8a8a000 {
+   status = "disabled";
+   };
+
usb2_picophy1: phy2@0 {
compatible = "st,stih407-usb2-phy";
reg = <0 0>;
-- 
2.17.1



[PATCH v2 1/3] ARM: dts: sti: add the spinor controller node within stih407-family

2021-01-28 Thread Alain Volmat
The STiH407 family (and later versions STiH410/STiH418) embeds
a serial flash controller allowing fast access to SPI-NOR.
This commit adds the corresponding node, relying on the st-spi-fsm
driver.

Signed-off-by: Alain Volmat 
---
v2: commit log improvement

 arch/arm/boot/dts/stih407-family.dtsi | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/arch/arm/boot/dts/stih407-family.dtsi 
b/arch/arm/boot/dts/stih407-family.dtsi
index 23a1746f3baa..21f3347a91d6 100644
--- a/arch/arm/boot/dts/stih407-family.dtsi
+++ b/arch/arm/boot/dts/stih407-family.dtsi
@@ -616,6 +616,21 @@
st,lpc-mode = ;
};
 
+   spifsm: spifsm@9022000{
+   compatible = "st,spi-fsm";
+   reg = <0x9022000 0x1000>;
+   reg-names = "spi-fsm";
+   clocks = <_s_c0_flexgen CLK_FLASH_PROMIP>;
+   clock-names = "emi_clk";
+   pinctrl-names = "default";
+   pinctrl-0 = <_fsm>;
+   st,syscfg = <_core>;
+   st,boot-device-reg = <0x8c4>;
+   st,boot-device-spi = <0x68>;
+
+   status = "disabled";
+   };
+
sata0: sata@9b2 {
compatible = "st,ahci";
reg = <0x9b2 0x1000>;
-- 
2.17.1



[PATCH v4 5/8] drm/mediatek: separate ccorr module

2021-01-28 Thread Hsin-Yi Wang
From: Yongqiang Niu 

The ccorr CTM matrix bit width will be different on mt8192, so move ccorr
into its own module.

Signed-off-by: Yongqiang Niu 
Signed-off-by: Hsin-Yi Wang 
---
 drivers/gpu/drm/mediatek/Makefile   |   3 +-
 drivers/gpu/drm/mediatek/mtk_disp_ccorr.c   | 222 
 drivers/gpu/drm/mediatek/mtk_disp_drv.h |   9 +
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c |  95 +
 drivers/gpu/drm/mediatek/mtk_drm_drv.c  |   8 +-
 drivers/gpu/drm/mediatek/mtk_drm_drv.h  |   1 +
 6 files changed, 242 insertions(+), 96 deletions(-)
 create mode 100644 drivers/gpu/drm/mediatek/mtk_disp_ccorr.c

diff --git a/drivers/gpu/drm/mediatek/Makefile 
b/drivers/gpu/drm/mediatek/Makefile
index 13a0eafabf9c0..f119bef6d6e66 100644
--- a/drivers/gpu/drm/mediatek/Makefile
+++ b/drivers/gpu/drm/mediatek/Makefile
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 
-mediatek-drm-y := mtk_disp_color.o \
+mediatek-drm-y := mtk_disp_ccorr.o \
+ mtk_disp_color.o \
  mtk_disp_gamma.o \
  mtk_disp_ovl.o \
  mtk_disp_postmask.o \
diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ccorr.c 
b/drivers/gpu/drm/mediatek/mtk_disp_ccorr.c
new file mode 100644
index 0..0c68090eb1e92
--- /dev/null
+++ b/drivers/gpu/drm/mediatek/mtk_disp_ccorr.c
@@ -0,0 +1,222 @@
+/*
+ * SPDX-License-Identifier:
+ *
+ * Copyright (c) 2020 MediaTek Inc.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "mtk_disp_drv.h"
+#include "mtk_drm_crtc.h"
+#include "mtk_drm_ddp_comp.h"
+
+#define DISP_CCORR_EN  0x
+#define CCORR_EN   BIT(0)
+#define DISP_CCORR_CFG 0x0020
+#define CCORR_RELAY_MODE   BIT(0)
+#define CCORR_ENGINE_ENBIT(1)
+#define CCORR_GAMMA_OFFBIT(2)
+#define CCORR_WGAMUT_SRC_CLIP  BIT(3)
+#define DISP_CCORR_SIZE0x0030
+#define DISP_CCORR_COEF_0  0x0080
+#define DISP_CCORR_COEF_1  0x0084
+#define DISP_CCORR_COEF_2  0x0088
+#define DISP_CCORR_COEF_3  0x008C
+#define DISP_CCORR_COEF_4  0x0090
+
+struct mtk_disp_ccorr_data {
+   u32 reserved;
+};
+
+/**
+ * struct mtk_disp_ccorr - DISP_CCORR driver structure
+ * @ddp_comp - structure containing type enum and hardware resources
+ * @crtc - associated crtc to report irq events to
+ */
+struct mtk_disp_ccorr {
+   struct clk *clk;
+   void __iomem *regs;
+   struct cmdq_client_reg cmdq_reg;
+   const struct mtk_disp_ccorr_data*data;
+};
+
+int mtk_ccorr_clk_enable(struct device *dev)
+{
+   struct mtk_disp_ccorr *ccorr = dev_get_drvdata(dev);
+
+   return clk_prepare_enable(ccorr->clk);
+}
+
+void mtk_ccorr_clk_disable(struct device *dev)
+{
+   struct mtk_disp_ccorr *ccorr = dev_get_drvdata(dev);
+
+   clk_disable_unprepare(ccorr->clk);
+}
+
+void mtk_ccorr_config(struct device *dev, unsigned int w,
+unsigned int h, unsigned int vrefresh,
+unsigned int bpc, struct cmdq_pkt *cmdq_pkt)
+{
+   struct mtk_disp_ccorr *ccorr = dev_get_drvdata(dev);
+
+   mtk_ddp_write(cmdq_pkt, w << 16 | h, &ccorr->cmdq_reg, ccorr->regs,
+ DISP_CCORR_SIZE);
+   mtk_ddp_write(cmdq_pkt, CCORR_ENGINE_EN, &ccorr->cmdq_reg, ccorr->regs,
+ DISP_CCORR_CFG);
+}
+
+void mtk_ccorr_start(struct device *dev)
+{
+   struct mtk_disp_ccorr *ccorr = dev_get_drvdata(dev);
+
+   writel(CCORR_EN, ccorr->regs + DISP_CCORR_EN);
+}
+
+void mtk_ccorr_stop(struct device *dev)
+{
+   struct mtk_disp_ccorr *ccorr = dev_get_drvdata(dev);
+
+   writel_relaxed(0x0, ccorr->regs + DISP_CCORR_EN);
+}
+
+/* Converts a DRM S31.32 value to the HW S1.10 format. */
+static u16 mtk_ctm_s31_32_to_s1_10(u64 in)
+{
+   u16 r;
+
+   /* Sign bit. */
+   r = in & BIT_ULL(63) ? BIT(11) : 0;
+
+   if ((in & GENMASK_ULL(62, 33)) > 0) {
+   /* identity value 0x1 -> 0x400, */
+   /* if bigger this, set it to max 0x7ff. */
+   r |= GENMASK(10, 0);
+   } else {
+   /* take the 11 most important bits. */
+   r |= (in >> 22) & GENMASK(10, 0);
+   }
+
+   return r;
+}
+
+void mtk_ccorr_ctm_set(struct device *dev, struct drm_crtc_state *state)
+{
+   struct mtk_disp_ccorr *ccorr = dev_get_drvdata(dev);
+   struct drm_property_blob *blob = state->ctm;
+   struct drm_color_ctm *ctm;
+   const u64 *input;
+   uint16_t coeffs[9] = { 0 };
+   int i;
+   struct cmdq_pkt *cmdq_pkt = NULL;
+
+   if (!blob)
+   return;
+
+   ctm = (struct drm_color_ctm *)blob->data;
+   input = ctm->matrix;
+
+   for (i = 0; i < 

[PATCH v4 8/8] drm/mediatek: add support for mediatek SOC MT8192

2021-01-28 Thread Hsin-Yi Wang
From: Yongqiang Niu 

Add support for the MediaTek MT8192 SoC.

Signed-off-by: Yongqiang Niu 
Signed-off-by: Hsin-Yi Wang 
---
 drivers/gpu/drm/mediatek/mtk_disp_ccorr.c|  6 +++
 drivers/gpu/drm/mediatek/mtk_disp_ovl.c  | 20 ++
 drivers/gpu/drm/mediatek/mtk_disp_postmask.c |  1 +
 drivers/gpu/drm/mediatek/mtk_disp_rdma.c |  6 +++
 drivers/gpu/drm/mediatek/mtk_drm_drv.c   | 42 
 5 files changed, 75 insertions(+)

diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ccorr.c 
b/drivers/gpu/drm/mediatek/mtk_disp_ccorr.c
index 1c7163a12f3b1..43794192e82b1 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_ccorr.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_ccorr.c
@@ -220,9 +220,15 @@ static const struct mtk_disp_ccorr_data 
mt8183_ccorr_driver_data = {
.matrix_bits = CCORR_MATRIX_BITS,
 };
 
+static const struct mtk_disp_ccorr_data mt8192_ccorr_driver_data = {
+   .matrix_bits = 13,
+};
+
 static const struct of_device_id mtk_disp_ccorr_driver_dt_match[] = {
{ .compatible = "mediatek,mt8183-disp-ccorr",
  .data = &mt8183_ccorr_driver_data},
+   { .compatible = "mediatek,mt8192-disp-ccorr",
+ .data = &mt8192_ccorr_driver_data},
{},
 };
 MODULE_DEVICE_TABLE(of, mtk_disp_ccorr_driver_dt_match);
diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c 
b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
index 961f87f8d4d15..e266baae586c4 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
@@ -455,6 +455,22 @@ static const struct mtk_disp_ovl_data 
mt8183_ovl_2l_driver_data = {
.fmt_rgb565_is_0 = true,
 };
 
+static const struct mtk_disp_ovl_data mt8192_ovl_driver_data = {
+   .addr = DISP_REG_OVL_ADDR_MT8173,
+   .gmc_bits = 10,
+   .layer_nr = 4,
+   .fmt_rgb565_is_0 = true,
+   .smi_id_en = true,
+};
+
+static const struct mtk_disp_ovl_data mt8192_ovl_2l_driver_data = {
+   .addr = DISP_REG_OVL_ADDR_MT8173,
+   .gmc_bits = 10,
+   .layer_nr = 2,
+   .fmt_rgb565_is_0 = true,
+   .smi_id_en = true,
+};
+
 static const struct of_device_id mtk_disp_ovl_driver_dt_match[] = {
{ .compatible = "mediatek,mt2701-disp-ovl",
  .data = _ovl_driver_data},
@@ -464,6 +480,10 @@ static const struct of_device_id 
mtk_disp_ovl_driver_dt_match[] = {
  .data = _ovl_driver_data},
{ .compatible = "mediatek,mt8183-disp-ovl-2l",
  .data = &mt8183_ovl_2l_driver_data},
+   { .compatible = "mediatek,mt8192-disp-ovl",
+ .data = &mt8192_ovl_driver_data},
+   { .compatible = "mediatek,mt8192-disp-ovl-2l",
+ .data = &mt8192_ovl_2l_driver_data},
{},
 };
 MODULE_DEVICE_TABLE(of, mtk_disp_ovl_driver_dt_match);
diff --git a/drivers/gpu/drm/mediatek/mtk_disp_postmask.c 
b/drivers/gpu/drm/mediatek/mtk_disp_postmask.c
index d640cef9c15a4..62c8431a09605 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_postmask.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_postmask.c
@@ -146,6 +146,7 @@ static int mtk_disp_postmask_remove(struct platform_device 
*pdev)
 }
 
 static const struct of_device_id mtk_disp_postmask_driver_dt_match[] = {
+   { .compatible = "mediatek,mt8192-disp-postmask"},
{},
 };
 MODULE_DEVICE_TABLE(of, mtk_disp_postmask_driver_dt_match);
diff --git a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c 
b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
index 728aaadfea8cf..f123fc00a3935 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
@@ -355,6 +355,10 @@ static const struct mtk_disp_rdma_data 
mt8183_rdma_driver_data = {
.fifo_size = 5 * SZ_1K,
 };
 
+static const struct mtk_disp_rdma_data mt8192_rdma_driver_data = {
+   .fifo_size = 5 * SZ_1K,
+};
+
 static const struct of_device_id mtk_disp_rdma_driver_dt_match[] = {
{ .compatible = "mediatek,mt2701-disp-rdma",
  .data = _rdma_driver_data},
@@ -362,6 +366,8 @@ static const struct of_device_id 
mtk_disp_rdma_driver_dt_match[] = {
  .data = _rdma_driver_data},
{ .compatible = "mediatek,mt8183-disp-rdma",
  .data = &mt8183_rdma_driver_data},
+   { .compatible = "mediatek,mt8192-disp-rdma",
+ .data = &mt8192_rdma_driver_data},
{},
 };
 MODULE_DEVICE_TABLE(of, mtk_disp_rdma_driver_dt_match);
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c 
b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index 3da8996438dbc..6261d6bbe863e 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -147,6 +147,25 @@ static const enum mtk_ddp_comp_id mt8183_mtk_ddp_ext[] = {
DDP_COMPONENT_DPI0,
 };
 
+static const enum mtk_ddp_comp_id mt8192_mtk_ddp_main[] = {
+   DDP_COMPONENT_OVL0,
+   DDP_COMPONENT_OVL_2L0,
+   DDP_COMPONENT_RDMA0,
+   DDP_COMPONENT_COLOR0,
+   DDP_COMPONENT_CCORR,
+   DDP_COMPONENT_AAL0,
+   DDP_COMPONENT_GAMMA,
+   DDP_COMPONENT_POSTMASK0,
+   DDP_COMPONENT_DITHER,
+   DDP_COMPONENT_DSI0,
+};
+
+static const enum 

[PATCH v4 4/8] drm/mediatek: enable OVL_LAYER_SMI_ID_EN for multi-layer usecase

2021-01-28 Thread Hsin-Yi Wang
From: Yongqiang Niu 

Enable OVL_LAYER_SMI_ID_EN for the multi-layer use case. Without this patch,
ovl hangs when more than one layer is enabled.

Signed-off-by: Yongqiang Niu 
Signed-off-by: Hsin-Yi Wang 
---
 drivers/gpu/drm/mediatek/mtk_disp_ovl.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c 
b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
index da7e38a28759b..961f87f8d4d15 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
@@ -24,6 +24,7 @@
 #define DISP_REG_OVL_RST   0x0014
 #define DISP_REG_OVL_ROI_SIZE  0x0020
 #define DISP_REG_OVL_DATAPATH_CON  0x0024
+#define OVL_LAYER_SMI_ID_ENBIT(0)
 #define OVL_BGCLR_SEL_IN   BIT(2)
 #define DISP_REG_OVL_ROI_BGCLR 0x0028
 #define DISP_REG_OVL_SRC_CON   0x002c
@@ -62,6 +63,7 @@ struct mtk_disp_ovl_data {
unsigned int gmc_bits;
unsigned int layer_nr;
bool fmt_rgb565_is_0;
+   bool smi_id_en;
 };
 
 /**
@@ -134,6 +136,13 @@ void mtk_ovl_start(struct device *dev)
 {
struct mtk_disp_ovl *ovl = dev_get_drvdata(dev);
 
+   if (ovl->data->smi_id_en) {
+   unsigned int reg;
+
+   reg = readl(ovl->regs + DISP_REG_OVL_DATAPATH_CON);
+   reg = reg | OVL_LAYER_SMI_ID_EN;
+   writel_relaxed(reg, ovl->regs + DISP_REG_OVL_DATAPATH_CON);
+   }
writel_relaxed(0x1, ovl->regs + DISP_REG_OVL_EN);
 }
 
@@ -142,6 +151,14 @@ void mtk_ovl_stop(struct device *dev)
struct mtk_disp_ovl *ovl = dev_get_drvdata(dev);
 
writel_relaxed(0x0, ovl->regs + DISP_REG_OVL_EN);
+   if (ovl->data->smi_id_en) {
+   unsigned int reg;
+
+   reg = readl(ovl->regs + DISP_REG_OVL_DATAPATH_CON);
+   reg = reg & ~OVL_LAYER_SMI_ID_EN;
+   writel_relaxed(reg, ovl->regs + DISP_REG_OVL_DATAPATH_CON);
+   }
+
 }
 
 void mtk_ovl_config(struct device *dev, unsigned int w,
-- 
2.30.0.365.g02bc693789-goog
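
The start/stop hunks above are a read-modify-write of DISP_REG_OVL_DATAPATH_CON:
only the SMI ID enable bit changes, and other bits in the register (for
example OVL_BGCLR_SEL_IN) are left alone. A small user-space sketch of that
bit handling, with a plain integer standing in for the MMIO register and
hypothetical helper names:

```c
#include <stdint.h>

/* Bit positions as defined in mtk_disp_ovl.c. */
#define OVL_LAYER_SMI_ID_EN (1u << 0)
#define OVL_BGCLR_SEL_IN    (1u << 2)

/* Hypothetical helper: value to write back on start (set the bit). */
static uint32_t datapath_con_on_start(uint32_t reg)
{
	return reg | OVL_LAYER_SMI_ID_EN;
}

/* Hypothetical helper: value to write back on stop (clear the bit). */
static uint32_t datapath_con_on_stop(uint32_t reg)
{
	return reg & ~OVL_LAYER_SMI_ID_EN;
}
```

In the driver the value comes from readl() and goes back through
writel_relaxed(); the sketch only shows that unrelated bits survive both
operations.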



[PATCH v4 3/8] drm/mediatek: add component RDMA4

2021-01-28 Thread Hsin-Yi Wang
From: Yongqiang Niu 

This patch adds component RDMA4.

Signed-off-by: Yongqiang Niu 
Reviewed-by: Chun-Kuang Hu 
Signed-off-by: Hsin-Yi Wang 
---
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c 
b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
index 6c539783118dd..543cbfc9c5d85 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
@@ -496,6 +496,7 @@ static const struct mtk_ddp_comp_match 
mtk_ddp_matches[DDP_COMPONENT_ID_MAX] = {
[DDP_COMPONENT_RDMA0]   = { MTK_DISP_RDMA,  0, _rdma },
[DDP_COMPONENT_RDMA1]   = { MTK_DISP_RDMA,  1, _rdma },
[DDP_COMPONENT_RDMA2]   = { MTK_DISP_RDMA,  2, _rdma },
+   [DDP_COMPONENT_RDMA4]   = { MTK_DISP_RDMA,  4, _rdma },
[DDP_COMPONENT_UFOE]= { MTK_DISP_UFOE,  0, _ufoe },
[DDP_COMPONENT_WDMA0]   = { MTK_DISP_WDMA,  0, NULL },
[DDP_COMPONENT_WDMA1]   = { MTK_DISP_WDMA,  1, NULL },
-- 
2.30.0.365.g02bc693789-goog



[PATCH v4 0/8] drm/mediatek: add support for mediatek SOC MT8192

2021-01-28 Thread Hsin-Yi Wang
This series is based on kernel/git/chunkuang.hu/linux.git mediatek-drm-next.
It also depends on component support in mmsys [1]:
- [v4,06/10] soc: mediatek: mmsys: add component OVL_2L2
- [v4,07/10] soc: mediatek: mmsys: add component POSTMASK
- [v4,08/10] soc: mediatek: mmsys: add component RDMA4

[1] 
https://patchwork.kernel.org/project/linux-mediatek/patch/1609815993-22744-7-git-send-email-yongqiang@mediatek.com/


Changes since v3:
- change several functions to rebase onto mediatek-drm-next
- drop the pm runtime patches, as they are not related to mt8192 support
- fix review comments from v3

Changes since v2:
- fix review comments from v2
- add pm runtime for gamma and color
- move ddp path select patch to mmsys series
- remove some unneeded patches

Yongqiang Niu (8):
  drm/mediatek: add component OVL_2L2
  drm/mediatek: add component POSTMASK
  drm/mediatek: add component RDMA4
  drm/mediatek: enable OVL_LAYER_SMI_ID_EN for multi-layer usecase
  drm/mediatek: separate ccorr module
  drm/mediatek: add matrix bits private data for ccorr
  soc: mediatek: add mtk mutex support for MT8192
  drm/mediatek: add support for mediatek SOC MT8192

 drivers/gpu/drm/mediatek/Makefile|   4 +-
 drivers/gpu/drm/mediatek/mtk_disp_ccorr.c| 244 +++
 drivers/gpu/drm/mediatek/mtk_disp_drv.h  |  17 ++
 drivers/gpu/drm/mediatek/mtk_disp_ovl.c  |  37 +++
 drivers/gpu/drm/mediatek/mtk_disp_postmask.c | 162 
 drivers/gpu/drm/mediatek/mtk_disp_rdma.c |   6 +
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c  | 108 ++--
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h  |   1 +
 drivers/gpu/drm/mediatek/mtk_drm_drv.c   |  52 +++-
 drivers/gpu/drm/mediatek/mtk_drm_drv.h   |   2 +
 drivers/soc/mediatek/mtk-mutex.c |  35 +++
 11 files changed, 572 insertions(+), 96 deletions(-)
 create mode 100644 drivers/gpu/drm/mediatek/mtk_disp_ccorr.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_disp_postmask.c

-- 
2.30.0.365.g02bc693789-goog



[PATCH v4 2/8] drm/mediatek: add component POSTMASK

2021-01-28 Thread Hsin-Yi Wang
From: Yongqiang Niu 

This patch adds component POSTMASK.

Signed-off-by: Yongqiang Niu 
Signed-off-by: Hsin-Yi Wang 
---
 drivers/gpu/drm/mediatek/Makefile|   1 +
 drivers/gpu/drm/mediatek/mtk_disp_drv.h  |   8 +
 drivers/gpu/drm/mediatek/mtk_disp_postmask.c | 161 +++
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c  |  11 ++
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h  |   1 +
 drivers/gpu/drm/mediatek/mtk_drm_drv.c   |   4 +-
 drivers/gpu/drm/mediatek/mtk_drm_drv.h   |   1 +
 7 files changed, 186 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/mediatek/mtk_disp_postmask.c

diff --git a/drivers/gpu/drm/mediatek/Makefile 
b/drivers/gpu/drm/mediatek/Makefile
index b64674b944860..13a0eafabf9c0 100644
--- a/drivers/gpu/drm/mediatek/Makefile
+++ b/drivers/gpu/drm/mediatek/Makefile
@@ -3,6 +3,7 @@
 mediatek-drm-y := mtk_disp_color.o \
  mtk_disp_gamma.o \
  mtk_disp_ovl.o \
+ mtk_disp_postmask.o \
  mtk_disp_rdma.o \
  mtk_drm_crtc.o \
  mtk_drm_ddp_comp.o \
diff --git a/drivers/gpu/drm/mediatek/mtk_disp_drv.h 
b/drivers/gpu/drm/mediatek/mtk_disp_drv.h
index 02191010699f8..d74e85db3fcdf 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_drv.h
+++ b/drivers/gpu/drm/mediatek/mtk_disp_drv.h
@@ -37,6 +37,14 @@ void mtk_gamma_set_common(void __iomem *regs, struct 
drm_crtc_state *state);
 void mtk_gamma_start(struct device *dev);
 void mtk_gamma_stop(struct device *dev);
 
+int mtk_postmask_clk_enable(struct device *dev);
+void mtk_postmask_clk_disable(struct device *dev);
+void mtk_postmask_config(struct device *dev, unsigned int w,
+  unsigned int h, unsigned int vrefresh,
+  unsigned int bpc, struct cmdq_pkt *cmdq_pkt);
+void mtk_postmask_start(struct device *dev);
+void mtk_postmask_stop(struct device *dev);
+
 void mtk_ovl_bgclr_in_on(struct device *dev);
 void mtk_ovl_bgclr_in_off(struct device *dev);
 void mtk_ovl_bypass_shadow(struct device *dev);
diff --git a/drivers/gpu/drm/mediatek/mtk_disp_postmask.c 
b/drivers/gpu/drm/mediatek/mtk_disp_postmask.c
new file mode 100644
index 0..d640cef9c15a4
--- /dev/null
+++ b/drivers/gpu/drm/mediatek/mtk_disp_postmask.c
@@ -0,0 +1,161 @@
+/*
+ * SPDX-License-Identifier:
+ *
+ * Copyright (c) 2020 MediaTek Inc.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "mtk_disp_drv.h"
+#include "mtk_drm_crtc.h"
+#include "mtk_drm_ddp_comp.h"
+
+#define DISP_POSTMASK_EN   0x
+#define POSTMASK_ENBIT(0)
+#define DISP_POSTMASK_CFG  0x0020
+#define POSTMASK_RELAY_MODEBIT(0)
+#define DISP_POSTMASK_SIZE 0x0030
+
+struct mtk_disp_postmask_data {
+   u32 reserved;
+};
+
+/**
+ * struct mtk_disp_postmask - DISP_postmask driver structure
+ * @ddp_comp - structure containing type enum and hardware resources
+ * @crtc - associated crtc to report irq events to
+ */
+struct mtk_disp_postmask {
+   struct clk *clk;
+   void __iomem *regs;
+   struct cmdq_client_reg cmdq_reg;
+   const struct mtk_disp_postmask_data *data;
+};
+
+int mtk_postmask_clk_enable(struct device *dev)
+{
+   struct mtk_disp_postmask *postmask = dev_get_drvdata(dev);
+
+   return clk_prepare_enable(postmask->clk);
+}
+
+void mtk_postmask_clk_disable(struct device *dev)
+{
+   struct mtk_disp_postmask *postmask = dev_get_drvdata(dev);
+
+   clk_disable_unprepare(postmask->clk);
+}
+
+void mtk_postmask_config(struct device *dev, unsigned int w,
+unsigned int h, unsigned int vrefresh,
+unsigned int bpc, struct cmdq_pkt *cmdq_pkt)
+{
+   struct mtk_disp_postmask *postmask = dev_get_drvdata(dev);
+
+   mtk_ddp_write(cmdq_pkt, w << 16 | h, &postmask->cmdq_reg, postmask->regs,
+ DISP_POSTMASK_SIZE);
+   mtk_ddp_write(cmdq_pkt, POSTMASK_RELAY_MODE, &postmask->cmdq_reg,
+ postmask->regs, DISP_POSTMASK_CFG);
+}
+
+void mtk_postmask_start(struct device *dev)
+{
+   struct mtk_disp_postmask *postmask = dev_get_drvdata(dev);
+
+   writel(POSTMASK_EN, postmask->regs + DISP_POSTMASK_EN);
+}
+
+void mtk_postmask_stop(struct device *dev)
+{
+   struct mtk_disp_postmask *postmask = dev_get_drvdata(dev);
+
+   writel_relaxed(0x0, postmask->regs + DISP_POSTMASK_EN);
+}
+
+static int mtk_disp_postmask_bind(struct device *dev, struct device *master, 
void *data)
+{
+   return 0;
+}
+
+static void mtk_disp_postmask_unbind(struct device *dev, struct device *master,
+ void *data)
+{
+}
+
+static const struct component_ops mtk_disp_postmask_component_ops = {
+   .bind   = mtk_disp_postmask_bind,
+   .unbind = mtk_disp_postmask_unbind,
+};
+
+static int mtk_disp_postmask_probe(struct platform_device *pdev)
+{
+

[PATCH v4 1/8] drm/mediatek: add component OVL_2L2

2021-01-28 Thread Hsin-Yi Wang
From: Yongqiang Niu 

This patch adds component OVL_2L2.

Signed-off-by: Yongqiang Niu 
Reviewed-by: Chun-Kuang Hu 
Signed-off-by: Hsin-Yi Wang 
---
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c 
b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
index 5b7fcedb9f9a8..ccfaada998cf5 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
@@ -479,6 +479,7 @@ static const struct mtk_ddp_comp_match 
mtk_ddp_matches[DDP_COMPONENT_ID_MAX] = {
[DDP_COMPONENT_OVL1]= { MTK_DISP_OVL,   1, _ovl },
[DDP_COMPONENT_OVL_2L0] = { MTK_DISP_OVL_2L,0, _ovl },
[DDP_COMPONENT_OVL_2L1] = { MTK_DISP_OVL_2L,1, _ovl },
+   [DDP_COMPONENT_OVL_2L2] = { MTK_DISP_OVL_2L,2, _ovl },
[DDP_COMPONENT_PWM0]= { MTK_DISP_PWM,   0, NULL },
[DDP_COMPONENT_PWM1]= { MTK_DISP_PWM,   1, NULL },
[DDP_COMPONENT_PWM2]= { MTK_DISP_PWM,   2, NULL },
-- 
2.30.0.365.g02bc693789-goog



Re: linux-next: manual merge of the char-misc tree with the tty tree

2021-01-28 Thread Greg KH
On Fri, Jan 29, 2021 at 03:53:41PM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> Today's linux-next merge of the char-misc tree got conflicts in:
> 
>   drivers/tty/n_tracerouter.c
>   drivers/tty/n_tracesink.c
> 
> between commit:
> 
>   3b830a9c34d5 ("tty: convert tty_ldisc_ops 'read()' function to take a 
> kernel pointer")
> 
> from the tty tree and commit:
> 
>   8ba59e9dee31 ("misc: pti: Remove driver for deprecated platform")
> 
> from the char-misc tree.

Thanks, I knew this would happen :(


Re: [PATCH v2] xen-blkback: fix compatibility bug with single page rings

2021-01-28 Thread Jürgen Groß

On 29.01.21 07:20, Dongli Zhang wrote:



On 1/28/21 5:04 AM, Paul Durrant wrote:

From: Paul Durrant 

Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid
inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the
behaviour of xen-blkback when connecting to a frontend was:

- read 'ring-page-order'
- if not present then expect a single page ring specified by 'ring-ref'
- else expect a ring specified by 'ring-refX' where X is between 0 and
   1 << ring-page-order

This was correct behaviour, but was broken by the aforementioned commit to
become:

- read 'ring-page-order'
- if not present then expect a single page ring (i.e. ring-page-order = 0)
- expect a ring specified by 'ring-refX' where X is between 0 and
   1 << ring-page-order
- if that didn't work then see if there's a single page ring specified by
   'ring-ref'

This incorrect behaviour works most of the time but fails when a frontend
that sets 'ring-page-order' is unloaded and replaced by one that does not
because, instead of reading 'ring-ref', xen-blkback will read the stale
'ring-ref0' left around by the previous frontend and will try to map the wrong
grant reference.

This patch restores the original behaviour.

Fixes: 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid inconsistent xenstore 'ring-page-order' set by malicious blkfront")
Signed-off-by: Paul Durrant 
---
Cc: Konrad Rzeszutek Wilk 
Cc: "Roger Pau Monné" 
Cc: Jens Axboe 
Cc: Dongli Zhang 

v2:
  - Remove now-spurious error path special-case when nr_grefs == 1
---
  drivers/block/xen-blkback/common.h |  1 +
  drivers/block/xen-blkback/xenbus.c | 38 +-
  2 files changed, 17 insertions(+), 22 deletions(-)

diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
index b0c71d3a81a0..524a79f10de6 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -313,6 +313,7 @@ struct xen_blkif {
  
  	struct work_struct	free_work;

unsigned intnr_ring_pages;
+   boolmulti_ref;


Is it really necessary to introduce 'multi_ref' here or we may just re-use
'nr_ring_pages'?

According to blkfront code, 'ring-page-order' is set only when it is not zero,
that is, only when (info->nr_ring_pages > 1).


Did you look into all other OS's (Windows, OpenBSD, FreeBSD, NetBSD,
Solaris, Netware, other proprietary systems) implementations to verify
that claim?

I don't think so. So better safe than sorry.


Juergen
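
For readers following the thread, the original lookup rule described in the commit message can be sketched in user-space C. This is only an illustration and not the xen-blkback code: `struct kv`, `kv_get()`, `ring_ref_key()` and `SKETCH_MAX_RING_ORDER` are hypothetical names, and the xenstore directory is modelled as a plain array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a frontend's xenstore directory. */
struct kv {
	const char *key;
	const char *val;
};

/* Hypothetical upper bound on 'ring-page-order' for this sketch. */
#define SKETCH_MAX_RING_ORDER 4u

/* Return the value stored under key, or NULL if the key is absent. */
static const char *kv_get(const struct kv *store, size_t n, const char *key)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(store[i].key, key) == 0)
			return store[i].val;
	return NULL;
}

/*
 * Write the name of the key that holds grant reference idx into out.
 * 'ring-page-order' absent: single page ring, key is 'ring-ref'.
 * 'ring-page-order' present: key is 'ring-refX' with X < (1 << order).
 * Returns false when idx is out of range for the advertised ring size.
 */
static bool ring_ref_key(const struct kv *store, size_t n, unsigned int idx,
			 char *out, size_t outlen)
{
	const char *order_str = kv_get(store, n, "ring-page-order");
	unsigned long order;

	if (!order_str) {
		if (idx != 0)
			return false;
		snprintf(out, outlen, "ring-ref");
		return true;
	}

	order = strtoul(order_str, NULL, 10);
	if (order > SKETCH_MAX_RING_ORDER || idx >= (1u << order))
		return false;

	snprintf(out, outlen, "ring-ref%u", idx);
	return true;
}
```

With a directory that holds a stale 'ring-ref0' but no 'ring-page-order', this rule still selects 'ring-ref', which is the case the patch description says the broken behaviour got wrong.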



[PATCH v2 0/3] Introduction of STiH418 based 4KOpen board

2021-01-28 Thread Alain Volmat
This series introduces the 4KOpen (stih418-b2264) board based
on a stih418 soc. Since it is the first board to use the spi-fsm
SPI NOR controller available since stih407, the controller is
also added within the stih407-family DT.
It also contains a fix within the stih418 DT since the rng11 is not
available on this platform and is thus disabled.

Alain Volmat (3):
  ARM: dts: sti: add the spinor controller node within stih407-family
  ARM: dts: sti: disable rng11 on the stih418 platform
  ARM: dts: sti: Introduce 4KOpen (stih418-b2264) board

 arch/arm/boot/dts/Makefile|   3 +-
 arch/arm/boot/dts/stih407-family.dtsi |  15 
 arch/arm/boot/dts/stih418-b2264.dts   | 123 ++
 arch/arm/boot/dts/stih418.dtsi|   4 +
 4 files changed, 144 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/boot/dts/stih418-b2264.dts



[PATCH V7 5/6] of: unittest: Create overlay_common.dtsi and testcases_common.dtsi

2021-01-28 Thread Viresh Kumar
In order to build-test the same unit-test files using fdtoverlay tool,
move the device nodes from the existing overlay_base.dts and
testcases.dts files to .dtsi counterparts. The .dts files now
include the new .dtsi files, resulting in exactly the same behavior as
earlier.

The .dtsi files can now be reused for compile time tests using
fdtoverlay (will be done by a later commit).

This is required because the base files passed to fdtoverlay tool
shouldn't be overlays themselves (i.e. shouldn't have the /plugin/;
tag).

Note that this commit also moves "testcase-device2" node to
testcases.dts from tests-interrupts.dtsi, as this node has a deliberate
error in it and is only relevant for runtime testing done with
unittest.c.

Signed-off-by: Viresh Kumar 
---
 drivers/of/unittest-data/overlay_base.dts | 90 +-
 drivers/of/unittest-data/overlay_common.dtsi  | 91 +++
 drivers/of/unittest-data/testcases.dts| 18 ++--
 .../of/unittest-data/testcases_common.dtsi| 19 
 .../of/unittest-data/tests-interrupts.dtsi|  7 --
 5 files changed, 118 insertions(+), 107 deletions(-)
 create mode 100644 drivers/of/unittest-data/overlay_common.dtsi
 create mode 100644 drivers/of/unittest-data/testcases_common.dtsi

diff --git a/drivers/of/unittest-data/overlay_base.dts b/drivers/of/unittest-data/overlay_base.dts
index 99ab9d12d00b..ab9014589c5d 100644
--- a/drivers/of/unittest-data/overlay_base.dts
+++ b/drivers/of/unittest-data/overlay_base.dts
@@ -2,92 +2,4 @@
 /dts-v1/;
 /plugin/;
 
-/*
- * Base device tree that overlays will be applied against.
- *
- * Do not add any properties in node "/".
- * Do not add any nodes other than "/testcase-data-2" in node "/".
- * Do not add anything that would result in dtc creating node "/__fixups__".
- * dtc will create nodes "/__symbols__" and "/__local_fixups__".
- */
-
-/ {
-   testcase-data-2 {
-   #address-cells = <1>;
-   #size-cells = <1>;
-
-   electric_1: substation@100 {
-   compatible = "ot,big-volts-control";
-   reg = < 0x0100 0x100 >;
-   status = "disabled";
-
-   hvac_1: hvac-medium-1 {
-   compatible = "ot,hvac-medium";
-   heat-range = < 50 75 >;
-   cool-range = < 60 80 >;
-   };
-
-   spin_ctrl_1: motor-1 {
-   compatible = "ot,ferris-wheel-motor";
-   spin = "clockwise";
-   rpm_avail = < 50 >;
-   };
-
-   spin_ctrl_2: motor-8 {
-   compatible = "ot,roller-coaster-motor";
-   };
-   };
-
-   rides_1: fairway-1 {
-   #address-cells = <1>;
-   #size-cells = <1>;
-   compatible = "ot,rides";
-   status = "disabled";
-   orientation = < 127 >;
-
-   ride@100 {
-   #address-cells = <1>;
-   #size-cells = <1>;
-   compatible = "ot,roller-coaster";
-   reg = < 0x0100 0x100 >;
-   hvac-provider = < &hvac_1 >;
-   hvac-thermostat = < 29 > ;
-   hvac-zones = < 14 >;
-   hvac-zone-names = "operator";
-   spin-controller = < &spin_ctrl_2 5 &spin_ctrl_2 7 >;
-   spin-controller-names = "track_1", "track_2";
-   queues = < 2 >;
-
-   track@30 {
-   reg = < 0x0030 0x10 >;
-   };
-
-   track@40 {
-   reg = < 0x0040 0x10 >;
-   };
-
-   };
-   };
-
-   lights_1: lights@3 {
-   compatible = "ot,work-lights";
-   reg = < 0x0003 0x1000 >;
-   status = "disabled";
-   };
-
-   lights_2: lights@4 {
-   compatible = "ot,show-lights";
-   reg = < 0x0004 0x1000 >;
-   status = "disabled";
-   rate = < 13 138 >;
-   };
-
-   retail_1: vending@5 {
-   reg = < 0x0005 0x1000 >;
-   compatible = "ot,tickets";
-   status = "disabled";
-   };
-
-   };
-};
-
+#include "overlay_common.dtsi"
diff --git a/drivers/of/unittest-data/overlay_common.dtsi 

[PATCH V7 3/6] scripts: dtc: Remove the unused fdtdump.c file

2021-01-28 Thread Viresh Kumar
This was copied from the external DTC repository long ago and isn't used
anymore. Besides, the dtc tool can be used to generate the dts source
back from the dtb. Remove the unused fdtdump.c file.

Signed-off-by: Viresh Kumar 
---
 scripts/dtc/fdtdump.c | 163 --
 1 file changed, 163 deletions(-)
 delete mode 100644 scripts/dtc/fdtdump.c

diff --git a/scripts/dtc/fdtdump.c b/scripts/dtc/fdtdump.c
deleted file mode 100644
index 7d460a50b513..
--- a/scripts/dtc/fdtdump.c
+++ /dev/null
@@ -1,163 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * fdtdump.c - Contributed by Pantelis Antoniou 
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
-#include 
-
-#include "util.h"
-
-#define ALIGN(x, a)(((x) + ((a) - 1)) & ~((a) - 1))
-#define PALIGN(p, a)   ((void *)(ALIGN((unsigned long)(p), (a))))
-#define GET_CELL(p)(p += 4, *((const uint32_t *)(p-4)))
-
-static void print_data(const char *data, int len)
-{
-   int i;
-   const char *p = data;
-
-   /* no data, don't print */
-   if (len == 0)
-   return;
-
-   if (util_is_printable_string(data, len)) {
-   printf(" = \"%s\"", (const char *)data);
-   } else if ((len % 4) == 0) {
-   printf(" = <");
-   for (i = 0; i < len; i += 4)
-   printf("0x%08x%s", fdt32_to_cpu(GET_CELL(p)),
-  i < (len - 4) ? " " : "");
-   printf(">");
-   } else {
-   printf(" = [");
-   for (i = 0; i < len; i++)
-   printf("%02x%s", *p++, i < len - 1 ? " " : "");
-   printf("]");
-   }
-}
-
-static void dump_blob(void *blob)
-{
-   struct fdt_header *bph = blob;
-   uint32_t off_mem_rsvmap = fdt32_to_cpu(bph->off_mem_rsvmap);
-   uint32_t off_dt = fdt32_to_cpu(bph->off_dt_struct);
-   uint32_t off_str = fdt32_to_cpu(bph->off_dt_strings);
-   struct fdt_reserve_entry *p_rsvmap =
-   (struct fdt_reserve_entry *)((char *)blob + off_mem_rsvmap);
-   const char *p_struct = (const char *)blob + off_dt;
-   const char *p_strings = (const char *)blob + off_str;
-   uint32_t version = fdt32_to_cpu(bph->version);
-   uint32_t totalsize = fdt32_to_cpu(bph->totalsize);
-   uint32_t tag;
-   const char *p, *s, *t;
-   int depth, sz, shift;
-   int i;
-   uint64_t addr, size;
-
-   depth = 0;
-   shift = 4;
-
-   printf("/dts-v1/;\n");
-   printf("// magic:\t\t0x%x\n", fdt32_to_cpu(bph->magic));
-   printf("// totalsize:\t\t0x%x (%d)\n", totalsize, totalsize);
-   printf("// off_dt_struct:\t0x%x\n", off_dt);
-   printf("// off_dt_strings:\t0x%x\n", off_str);
-   printf("// off_mem_rsvmap:\t0x%x\n", off_mem_rsvmap);
-   printf("// version:\t\t%d\n", version);
-   printf("// last_comp_version:\t%d\n",
-  fdt32_to_cpu(bph->last_comp_version));
-   if (version >= 2)
-   printf("// boot_cpuid_phys:\t0x%x\n",
-  fdt32_to_cpu(bph->boot_cpuid_phys));
-
-   if (version >= 3)
-   printf("// size_dt_strings:\t0x%x\n",
-  fdt32_to_cpu(bph->size_dt_strings));
-   if (version >= 17)
-   printf("// size_dt_struct:\t0x%x\n",
-  fdt32_to_cpu(bph->size_dt_struct));
-   printf("\n");
-
-   for (i = 0; ; i++) {
-   addr = fdt64_to_cpu(p_rsvmap[i].address);
-   size = fdt64_to_cpu(p_rsvmap[i].size);
-   if (addr == 0 && size == 0)
-   break;
-
-   printf("/memreserve/ %llx %llx;\n",
-  (unsigned long long)addr, (unsigned long long)size);
-   }
-
-   p = p_struct;
-   while ((tag = fdt32_to_cpu(GET_CELL(p))) != FDT_END) {
-
-   /* printf("tag: 0x%08x (%d)\n", tag, p - p_struct); */
-
-   if (tag == FDT_BEGIN_NODE) {
-   s = p;
-   p = PALIGN(p + strlen(s) + 1, 4);
-
-   if (*s == '\0')
-   s = "/";
-
-   printf("%*s%s {\n", depth * shift, "", s);
-
-   depth++;
-   continue;
-   }
-
-   if (tag == FDT_END_NODE) {
-   depth--;
-
-   printf("%*s};\n", depth * shift, "");
-   continue;
-   }
-
-   if (tag == FDT_NOP) {
-   printf("%*s// [NOP]\n", depth * shift, "");
-   continue;
-   }
-
-   if (tag != FDT_PROP) {
-   fprintf(stderr, "%*s ** Unknown tag 0x%08x\n", depth * shift, "", tag);
-   break;
-   }
-   sz = fdt32_to_cpu(GET_CELL(p));
-   s = p_strings + fdt32_to_cpu(GET_CELL(p));
-   if 

[PATCH V7 6/6] of: unittest: Statically apply overlays using fdtoverlay

2021-01-28 Thread Viresh Kumar
Now that fdtoverlay is part of the kernel build, start using it to test
the unittest overlays we have by applying them statically. Create two new
base files static_base_1.dts and static_base_2.dts which include other
.dtsi files.

Some unittest overlays deliberately contain errors that unittest checks
for. These overlays will cause fdtoverlay to fail, and are thus not
included for static builds.

Signed-off-by: Viresh Kumar 
---
 drivers/of/unittest-data/Makefile  | 56 ++
 drivers/of/unittest-data/static_base_1.dts |  4 ++
 drivers/of/unittest-data/static_base_2.dts |  4 ++
 3 files changed, 64 insertions(+)
 create mode 100644 drivers/of/unittest-data/static_base_1.dts
 create mode 100644 drivers/of/unittest-data/static_base_2.dts

diff --git a/drivers/of/unittest-data/Makefile b/drivers/of/unittest-data/Makefile
index 009f4045c8e4..fc286224b2d1 100644
--- a/drivers/of/unittest-data/Makefile
+++ b/drivers/of/unittest-data/Makefile
@@ -34,7 +34,63 @@ DTC_FLAGS_overlay += -@
 DTC_FLAGS_overlay_bad_phandle += -@
 DTC_FLAGS_overlay_bad_symbol += -@
 DTC_FLAGS_overlay_base += -@
+DTC_FLAGS_static_base_1 += -@
+DTC_FLAGS_static_base_2 += -@
 DTC_FLAGS_testcases += -@
 
 # suppress warnings about intentional errors
 DTC_FLAGS_testcases += -Wno-interrupts_property
+
+# Apply overlays statically with fdtoverlay.  This is a build time test that
+# the overlays can be applied successfully by fdtoverlay.  This does not
+# guarantee that the overlays can be applied successfully at run time by
+# unittest, but it provides a bit of build time test coverage for those
+# who do not execute unittest.
+#
+# The overlays are applied on top of static_base_1.dtb and static_base_2.dtb to
+# create static_test_1.dtb and static_test_2.dtb.  If fdtoverlay detects an
+# error then the kernel build will fail.  static_test_1.dtb and
+# static_test_2.dtb are not consumed by unittest.
+#
+# Some unittest overlays deliberately contain errors that unittest checks for.
+# These overlays will cause fdtoverlay to fail, and are thus not included
+# in the static test:
+#overlay_bad_add_dup_node.dtb \
+#overlay_bad_add_dup_prop.dtb \
+#overlay_bad_phandle.dtb \
+#overlay_bad_symbol.dtb \
+
+apply_static_overlay_1 := overlay_0.dtb \
+ overlay_1.dtb \
+ overlay_2.dtb \
+ overlay_3.dtb \
+ overlay_4.dtb \
+ overlay_5.dtb \
+ overlay_6.dtb \
+ overlay_7.dtb \
+ overlay_8.dtb \
+ overlay_9.dtb \
+ overlay_10.dtb \
+ overlay_11.dtb \
+ overlay_12.dtb \
+ overlay_13.dtb \
+ overlay_15.dtb \
+ overlay_gpio_01.dtb \
+ overlay_gpio_02a.dtb \
+ overlay_gpio_02b.dtb \
+ overlay_gpio_03.dtb \
+ overlay_gpio_04a.dtb \
+ overlay_gpio_04b.dtb
+
+apply_static_overlay_2 := overlay.dtb
+
+quiet_cmd_fdtoverlay = FDTOVERLAY $@
+  cmd_fdtoverlay = $(objtree)/scripts/dtc/fdtoverlay -o $@ -i $^
+
+$(obj)/static_test_1.dtb: $(obj)/static_base_1.dtb $(addprefix $(obj)/,$(apply_static_overlay_1))
+   $(call if_changed,fdtoverlay)
+
+$(obj)/static_test_2.dtb: $(obj)/static_base_2.dtb $(addprefix $(obj)/,$(apply_static_overlay_2))
+   $(call if_changed,fdtoverlay)
+
+always-$(CONFIG_OF_OVERLAY) += static_test_1.dtb static_test_2.dtb
diff --git a/drivers/of/unittest-data/static_base_1.dts b/drivers/of/unittest-data/static_base_1.dts
new file mode 100644
index ..10556cb3f01f
--- /dev/null
+++ b/drivers/of/unittest-data/static_base_1.dts
@@ -0,0 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
+/dts-v1/;
+
+#include "testcases_common.dtsi"
diff --git a/drivers/of/unittest-data/static_base_2.dts b/drivers/of/unittest-data/static_base_2.dts
new file mode 100644
index ..b0ea9504d6f3
--- /dev/null
+++ b/drivers/of/unittest-data/static_base_2.dts
@@ -0,0 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
+/dts-v1/;
+
+#include "overlay_common.dtsi"
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH V7 2/6] scripts: dtc: Build fdtoverlay tool

2021-01-28 Thread Viresh Kumar
We will start building overlays for platforms soon in the kernel and
would need fdtoverlay going forward. Let's start building it.

The fdtoverlay program applies one or more overlay dtb blobs to a base
dtb blob. The kernel build system would later use fdtoverlay to generate
the overlaid blobs based on platform specific configurations.

Signed-off-by: Viresh Kumar 
---
 scripts/dtc/Makefile | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/scripts/dtc/Makefile b/scripts/dtc/Makefile
index 4852bf44e913..c8c21e0f2531 100644
--- a/scripts/dtc/Makefile
+++ b/scripts/dtc/Makefile
@@ -1,13 +1,19 @@
 # SPDX-License-Identifier: GPL-2.0
 # scripts/dtc makefile
 
-hostprogs-always-$(CONFIG_DTC) += dtc
+hostprogs-always-$(CONFIG_DTC) += dtc fdtoverlay
 hostprogs-always-$(CHECK_DT_BINDING)   += dtc
 
 dtc-objs   := dtc.o flattree.o fstree.o data.o livetree.o treesource.o \
   srcpos.o checks.o util.o
 dtc-objs   += dtc-lexer.lex.o dtc-parser.tab.o
 
+# The upstream project builds libfdt as a separate library.  We are choosing to
+# instead directly link the libfdt object files into fdtoverlay.
+libfdt-objs:= fdt.o fdt_ro.o fdt_wip.o fdt_sw.o fdt_rw.o fdt_strerror.o fdt_empty_tree.o fdt_addresses.o fdt_overlay.o
+libfdt = $(addprefix libfdt/,$(libfdt-objs))
+fdtoverlay-objs:= $(libfdt) fdtoverlay.o util.o
+
 # Source files need to get at the userspace version of libfdt_env.h to compile
 HOST_EXTRACFLAGS += -I $(srctree)/$(src)/libfdt
 
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH V7 4/6] kbuild: Add support to build overlays (%.dtbo)

2021-01-28 Thread Viresh Kumar
Add support for building DT overlays (%.dtbo). The overlay's source file
will have the usual extension, i.e. .dts, though the blob will have
.dtbo extension to distinguish it from normal blobs.

Acked-by: Masahiro Yamada 
Signed-off-by: Viresh Kumar 
---
 .gitignore   | 1 +
 Makefile | 5 -
 scripts/Makefile.dtbinst | 3 +++
 scripts/Makefile.lib | 5 +
 4 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/.gitignore b/.gitignore
index d01cda8e1177..bb65fa253e58 100644
--- a/.gitignore
+++ b/.gitignore
@@ -18,6 +18,7 @@
 *.c.[012]*.*
 *.dt.yaml
 *.dtb
+*.dtbo
 *.dtb.S
 *.dwo
 *.elf
diff --git a/Makefile b/Makefile
index e0af7a4a5598..d5bc67e523be 100644
--- a/Makefile
+++ b/Makefile
@@ -1337,6 +1337,9 @@ ifneq ($(dtstree),)
 %.dtb: include/config/kernel.release scripts_dtc
$(Q)$(MAKE) $(build)=$(dtstree) $(dtstree)/$@
 
+%.dtbo: include/config/kernel.release scripts_dtc
+   $(Q)$(MAKE) $(build)=$(dtstree) $(dtstree)/$@
+
 PHONY += dtbs dtbs_install dtbs_check
 dtbs: include/config/kernel.release scripts_dtc
$(Q)$(MAKE) $(build)=$(dtstree)
@@ -1816,7 +1819,7 @@ clean: $(clean-dirs)
@find $(if $(KBUILD_EXTMOD), $(KBUILD_EXTMOD), .) $(RCS_FIND_IGNORE) \
\( -name '*.[aios]' -o -name '*.ko' -o -name '.*.cmd' \
-o -name '*.ko.*' \
-   -o -name '*.dtb' -o -name '*.dtb.S' -o -name '*.dt.yaml' \
+   -o -name '*.dtb' -o -name '*.dtbo' -o -name '*.dtb.S' -o -name '*.dt.yaml' \
-o -name '*.dwo' -o -name '*.lst' \
-o -name '*.su' -o -name '*.mod' \
-o -name '.*.d' -o -name '.*.tmp' -o -name '*.mod.c' \
diff --git a/scripts/Makefile.dtbinst b/scripts/Makefile.dtbinst
index 50d580d77ae9..ba01f5ba2517 100644
--- a/scripts/Makefile.dtbinst
+++ b/scripts/Makefile.dtbinst
@@ -29,6 +29,9 @@ quiet_cmd_dtb_install = INSTALL $@
 $(dst)/%.dtb: $(obj)/%.dtb
$(call cmd,dtb_install)
 
+$(dst)/%.dtbo: $(obj)/%.dtbo
+   $(call cmd,dtb_install)
+
 PHONY += $(subdirs)
 $(subdirs):
$(Q)$(MAKE) $(dtbinst)=$@ dst=$(patsubst $(obj)/%,$(dst)/%,$@)
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 213677a5ed33..b00855b247e0 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -86,7 +86,9 @@ extra-$(CONFIG_OF_ALL_DTBS)   += $(dtb-)
 
 ifneq ($(CHECK_DTBS),)
 extra-y += $(patsubst %.dtb,%.dt.yaml, $(dtb-y))
+extra-y += $(patsubst %.dtbo,%.dt.yaml, $(dtb-y))
 extra-$(CONFIG_OF_ALL_DTBS) += $(patsubst %.dtb,%.dt.yaml, $(dtb-))
+extra-$(CONFIG_OF_ALL_DTBS) += $(patsubst %.dtbo,%.dt.yaml, $(dtb-))
 endif
 
 # Add subdir path
@@ -327,6 +329,9 @@ cmd_dtc = $(HOSTCC) -E $(dtc_cpp_flags) -x assembler-with-cpp -o $(dtc-tmp) $< ;
 $(obj)/%.dtb: $(src)/%.dts $(DTC) FORCE
$(call if_changed_dep,dtc)
 
+$(obj)/%.dtbo: $(src)/%.dts $(DTC) FORCE
+   $(call if_changed_dep,dtc)
+
 DT_CHECKER ?= dt-validate
 DT_BINDING_DIR := Documentation/devicetree/bindings
 # DT_TMP_SCHEMA may be overridden from Documentation/devicetree/bindings/Makefile
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH V7 0/6] dt: build overlays

2021-01-28 Thread Viresh Kumar
Hi,

This patchset makes necessary changes to the kernel to add support for
building overlays (%.dtbo) and the required fdtoverlay tool. This also
builds static_test.dtb using most of the existing overlay tests present
in drivers/of/unittest-data/ for better test coverage.

Note that in order for anyone to test this stuff, you need to manually
run the ./update-dtc-source.sh script once to fetch the necessary
changes from the external DTC project (i.e. fdtoverlay.c and this[1]
patch).

I have tested this patchset for static and runtime testing (on Hikey
board) and no issues were reported.

V7:
- Add a comment in scripts/dtc/Makefile
- Add Ack from Masahiro for patch 4/6.
- Drop word "merge" from commit log of 2/6.
- Split apply_static_overlay, static_test.dtb, and static_base.dts into
  two parts to handle overlay_base.dts and testcases.dts separately.

V6:
- Create separate rules for dtbo-s and separate entries in .gitignore in
  4/6 (Masahiro).
- A new file layout for handling all overlays for existing and new tests
  5/6 (Frank).
- Include overlay.dts as well now in 6/6 (Frank).

V5:

- Don't reuse DTC_SOURCE for fdtoverlay.c in patch 1/5 (Frank).

- Update .gitignore and scripts/Makefile.dtbinst, drop dtbo-y syntax and
  DTC_FLAGS += -@ in patch 4/5 (Masahiro).

- Remove the intermediate dtb, rename output to static_test.dtb, don't
  use overlay.dtb and overlay_base.dtb for static builds, improved
  layout/comments in Makefile for patch 5/5 (Frank).

--
Viresh

[1] https://github.com/dgibson/dtc/commit/163f0469bf2ed8b2fe5aa15bc796b93c70243ddc
[2] https://lore.kernel.org/lkml/74f8aa8f-ffab-3b0f-186f-31fb7395e...@gmail.com/

Viresh Kumar (6):
  scripts: dtc: Fetch fdtoverlay.c from external DTC project
  scripts: dtc: Build fdtoverlay tool
  scripts: dtc: Remove the unused fdtdump.c file
  kbuild: Add support to build overlays (%.dtbo)
  of: unittest: Create overlay_common.dtsi and testcases_common.dtsi
  of: unittest: Statically apply overlays using fdtoverlay

 .gitignore|   1 +
 Makefile  |   5 +-
 drivers/of/unittest-data/Makefile |  56 ++
 drivers/of/unittest-data/overlay_base.dts |  90 +-
 drivers/of/unittest-data/overlay_common.dtsi  |  91 ++
 drivers/of/unittest-data/static_base_1.dts|   4 +
 drivers/of/unittest-data/static_base_2.dts|   4 +
 drivers/of/unittest-data/testcases.dts|  18 +-
 .../of/unittest-data/testcases_common.dtsi|  19 ++
 .../of/unittest-data/tests-interrupts.dtsi|   7 -
 scripts/Makefile.dtbinst  |   3 +
 scripts/Makefile.lib  |   5 +
 scripts/dtc/Makefile  |   8 +-
 scripts/dtc/fdtdump.c | 163 --
 scripts/dtc/update-dtc-source.sh  |   3 +-
 15 files changed, 204 insertions(+), 273 deletions(-)
 create mode 100644 drivers/of/unittest-data/overlay_common.dtsi
 create mode 100644 drivers/of/unittest-data/static_base_1.dts
 create mode 100644 drivers/of/unittest-data/static_base_2.dts
 create mode 100644 drivers/of/unittest-data/testcases_common.dtsi
 delete mode 100644 scripts/dtc/fdtdump.c


base-commit: 6ee1d745b7c9fd573fba142a2efdad76a9f1cb04
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH V7 1/6] scripts: dtc: Fetch fdtoverlay.c from external DTC project

2021-01-28 Thread Viresh Kumar
We will start building overlays for platforms soon in the kernel and
would need the fdtoverlay tool going forward. Let's start fetching it.

Signed-off-by: Viresh Kumar 
---
 scripts/dtc/update-dtc-source.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scripts/dtc/update-dtc-source.sh b/scripts/dtc/update-dtc-source.sh
index bc704e2a6a4a..32ff17ffd089 100755
--- a/scripts/dtc/update-dtc-source.sh
+++ b/scripts/dtc/update-dtc-source.sh
@@ -37,6 +37,7 @@ DTC_SOURCE="checks.c data.c dtc.c dtc.h flattree.c fstree.c livetree.c srcpos.c
 LIBFDT_SOURCE="fdt.c fdt.h fdt_addresses.c fdt_empty_tree.c \
fdt_overlay.c fdt_ro.c fdt_rw.c fdt_strerror.c fdt_sw.c \
fdt_wip.c libfdt.h libfdt_env.h libfdt_internal.h"
+FDTOVERLAY_SOURCE=fdtoverlay.c
 
 get_last_dtc_version() {
git log --oneline scripts/dtc/ | grep 'upstream' | head -1 | sed -e 's/^.* \(.*\)/\1/'
@@ -54,7 +55,7 @@ dtc_log=$(git log --oneline ${last_dtc_ver}..)
 
 # Copy the files into the Linux tree
 cd $DTC_LINUX_PATH
-for f in $DTC_SOURCE; do
+for f in $DTC_SOURCE $FDTOVERLAY_SOURCE; do
cp ${DTC_UPSTREAM_PATH}/${f} ${f}
git add ${f}
 done
-- 
2.25.0.rc1.19.g042ed3e048af



Re: [PATCH v4 1/2] bio: limit bio max size

2021-01-28 Thread Ming Lei
On Fri, Jan 29, 2021 at 12:49:08PM +0900, Changheun Lee wrote:
> bio size can grow up to 4GB when multi-page bvec is enabled.
> but sometimes it would lead to inefficient behaviors.
> in case of large chunk direct I/O, - 32MB chunk read in user space -
> all pages for 32MB would be merged to a bio structure if the pages
> physical addresses are contiguous. it makes some delay to submit
> until merge complete. bio max size should be limited to a proper size.
> 
> When 32MB chunk read with direct I/O option is coming from userspace,
> kernel behavior is below now. it's timeline.
> 
>  | bio merge for 32MB. total 8,192 pages are merged.
>  | total elapsed time is over 2ms.
>  |-- ... --->|
>  | 8,192 pages merged a bio.
>  | at this time, first bio submit is done.
>  | 1 bio is split to 32 read request and issue.
>  |--->
>   |--->
>    |--->
>   ..
>     |--->
>      |--->|
>   total 19ms elapsed to complete 32MB read done from device. |
> 
> If bio max size is limited with 1MB, behavior is changed below.
> 
>  | bio merge for 1MB. 256 pages are merged for each bio.
>  | total 32 bio will be made.
>  | total elapsed time is over 2ms. it's same.
>  | but, first bio submit timing is fast. about 100us.
>  |--->|--->|--->|---> ... -->|--->|--->|--->|--->|
>   | 256 pages merged a bio.
>   | at this time, first bio submit is done.
>   | and 1 read request is issued for 1 bio.
>   |--->
>|--->
> |--->
>   ..
>  |--->
>   |--->|
> total 17ms elapsed to complete 32MB read done from device. |
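
The page and bio counts in the quoted timelines follow from simple division. A minimal sketch, assuming 4 KiB pages (the page size is not stated in the thread, and `SKETCH_PAGE_SIZE`, `pages_for()` and `bios_for()` are hypothetical names, not block-layer API):

```c
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096u

/* Number of pages needed to cover bytes, rounded up. */
static uint64_t pages_for(uint64_t bytes)
{
	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Number of bios needed when each bio is capped at bio_max_bytes (non-zero). */
static uint64_t bios_for(uint64_t bytes, uint64_t bio_max_bytes)
{
	return (bytes + bio_max_bytes - 1) / bio_max_bytes;
}
```

Under that assumption, a 32MB read covers 8,192 pages, a 1MB cap gives 256 pages per bio, and the read is carried by 32 bios, matching the numbers in the quoted description.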

Can you tell us whether enabling THP in your application avoids this issue? BTW,
you need to align the 32MB buffer to the huge page size. IMO, THP fits your
case perfectly.


Thanks,
Ming



Re: [PATCH v5 4/4] ARM: Add support for Hisilicon Kunpeng L3 cache controller

2021-01-28 Thread Leizhen (ThunderTown)



On 2021/1/28 22:24, Arnd Bergmann wrote:
> On Sat, Jan 16, 2021 at 4:27 AM Zhen Lei  wrote:
>> diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile
>> +
>> +static void l3cache_maint_common(u32 range, u32 op_type)
>> +{
>> +   u32 reg;
>> +
>> +   reg = readl(l3_ctrl_base + L3_MAINT_CTRL);
>> +   reg &= ~(L3_MAINT_RANGE_MASK | L3_MAINT_TYPE_MASK);
>> +   reg |= range | op_type;
>> +   reg |= L3_MAINT_STATUS_START;
>> +   writel(reg, l3_ctrl_base + L3_MAINT_CTRL);
> 
> Are there contents of L3_MAINT_CTRL that need to be preserved
> across calls and can not be inferred? A 'readl()' is often expensive,
> so it might be more efficient if you can avoid that.

Right, this readl() can be replaced with readl_relaxed(). Thanks.

I'll check and correct the readl() and writel() in other places.

> 
>> +static inline void l3cache_flush_all_nolock(void)
>> +{
>> +   l3cache_maint_common(L3_MAINT_RANGE_ALL, L3_MAINT_TYPE_FLUSH);
>> +}
>> +
>> +static void l3cache_flush_all(void)
>> +{
>> +   unsigned long flags;
>> +
>> +   spin_lock_irqsave(_lock, flags);
>> +   l3cache_flush_all_nolock();
>> +   spin_unlock_irqrestore(_lock, flags);
>> +}
> 
> I see that cache-l2x0 uses raw_spin_lock_irqsave() instead of
> spin_lock_irqsave(), to avoid preemption in the middle of a cache
> operation. This is probably a good idea here as well.

I don't think there's any essential difference between the two! I don't know
if the compiler or tool will do anything extra. I checked the git log of the
l2x0 driver and it used raw_spin_lock_irqsave() at the beginning. Maybe
there's a description in 2.6. Since you mentioned this potential risk, I'll
change it to raw_spin_lock_irqsave.

include/linux/spinlock.h:
static __always_inline raw_spinlock_t *spinlock_check(spinlock_t *lock)
{
return &lock->rlock;
}

#define spin_lock_irqsave(lock, flags)  \
do {\
raw_spin_lock_irqsave(spinlock_check(lock), flags); \
} while (0)

> 
> I also see that l2x0 uses readl_relaxed(), to avoid a deadlock
> in l2x0_cache_sync(). This may also be beneficial for performance
> reasons, so it might be helpful to compare performance
> overhead. On the other hand, readl()/writel() are usually the
> safe choice, as those avoid the need to argue over whether
> the relaxed versions are safe in all corner cases.
> 
>> +static int __init l3cache_init(void)
>> +{
>> +   u32 reg;
>> +   struct device_node *node;
>> +
>> +   node = of_find_matching_node(NULL, l3cache_ids);
>> +   if (!node)
>> +   return -ENODEV;
> 
> I think the initcall should return '0' to indicate success when running
> a kernel with this driver built-in on a platform that does not have
> this device.

I have added "depends on ARCH_KUNPENG50X" for this driver. But it's OK to
return 0.

> 
>> diff --git a/arch/arm/mm/cache-kunpeng-l3.h b/arch/arm/mm/cache-kunpeng-l3.h
>> new file mode 100644
>> index 000..9ef6a53e7d4db49
>> --- /dev/null
>> +++ b/arch/arm/mm/cache-kunpeng-l3.h
>> @@ -0,0 +1,30 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef __CACHE_KUNPENG_L3_H
>> +#define __CACHE_KUNPENG_L3_H
>> +
>> +#define L3_CACHE_LINE_SHITF 6
> 
> I would suggest moving the contents of the header file into the .c file,
> since there is only a single user of these macros.

Okay, I'll move it.

> 
>   Arnd
> 
> .
> 



Re: [PATCH 1/3] serial: 8250: Handle UART without interrupt on TEMT using em485

2021-01-28 Thread Jiri Slaby

On 29. 01. 21, 0:36, Eric Tremblay wrote:

The patch introduces the UART_CAP_TEMT capability, which is by default
assigned to all 8250 UARTs since the code assumes that the device has the
interrupt on TEMT.

In the case where the device does not support it, we calculate the
maximum time it could take for the transmitter to empty the
shift register. When we get the THRE interrupt but the TEMT bit is not
set, we start the timer and call __stop_tx again after the delay.
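
The worst-case time to empty the shift register is the time to send one full character frame at the current baud rate. The following is a user-space sketch of that arithmetic, not the code from the patch: `frame_bits()` and `frame_time_ns()` are hypothetical helpers, and expressing the delay in nanoseconds is an assumption.

```c
#include <stdint.h>

/*
 * Bits on the wire per character: one start bit, the data bits, one stop
 * bit, plus an optional second stop bit and an optional parity bit.
 */
static unsigned int frame_bits(unsigned int data_bits, int two_stop_bits,
			       int parity)
{
	unsigned int bits = 1 + data_bits + 1;

	if (two_stop_bits)
		bits++;
	if (parity)
		bits++;
	return bits;
}

/* Time to shift out one frame in nanoseconds, rounded up; baud must be non-zero. */
static uint64_t frame_time_ns(unsigned int bits, unsigned int baud)
{
	return ((uint64_t)bits * 1000000000ull + baud - 1) / baud;
}
```

For example, an 8N1 frame is 10 bits, so at 115200 baud the sketch gives a delay just under 87 microseconds.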

Signed-off-by: Eric Tremblay 
---
  drivers/tty/serial/8250/8250.h|  1 +
  drivers/tty/serial/8250/8250_bcm2835aux.c |  2 +-
  drivers/tty/serial/8250/8250_omap.c   |  2 +-
  drivers/tty/serial/8250/8250_port.c   | 89 ++-
  include/linux/serial_8250.h   |  2 +
  5 files changed, 93 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/serial/8250/8250.h b/drivers/tty/serial/8250/8250.h
index 52bb21205bb6..5361b761eed7 100644
--- a/drivers/tty/serial/8250/8250.h
+++ b/drivers/tty/serial/8250/8250.h
@@ -82,6 +82,7 @@ struct serial8250_config {
  #define UART_CAP_MINI (1 << 17) /* Mini UART on BCM283X family lacks:
 * STOP PARITY EPAR SPAR WLEN5 WLEN6
 */
+#define UART_CAP_TEMT  (1 << 18) /* UART have interrupt on TEMT */


What about the inversion _NOTEMT? You then set it only on uarts without 
TEMT and don't need to update every single driver.



diff --git a/drivers/tty/serial/8250/8250_bcm2835aux.c b/drivers/tty/serial/8250/8250_bcm2835aux.c
index fd95860cd661..354faebce885 100644
--- a/drivers/tty/serial/8250/8250_bcm2835aux.c
+++ b/drivers/tty/serial/8250/8250_bcm2835aux.c
@@ -91,7 +91,7 @@ static int bcm2835aux_serial_probe(struct platform_device *pdev)
return -ENOMEM;
  
  	/* initialize data */

-   up.capabilities = UART_CAP_FIFO | UART_CAP_MINI;
+   data->uart.capabilities = UART_CAP_FIFO | UART_CAP_MINI | UART_CAP_TEMT;


This change looks weird and undocumented. Why do you set data->uart 
suddenly?


Actually, does this build?


up.port.dev = &pdev->dev;
up.port.regshift = 2;
up.port.type = PORT_16550;
diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c
index 23e0decde33e..1c21ac68ff37 100644
--- a/drivers/tty/serial/8250/8250_omap.c
+++ b/drivers/tty/serial/8250/8250_omap.c
@@ -1294,7 +1294,7 @@ static int omap8250_probe(struct platform_device *pdev)
up.port.regshift = 2;
up.port.fifosize = 64;
up.tx_loadsz = 64;
-   up.capabilities = UART_CAP_FIFO;
+   up.capabilities = UART_CAP_FIFO | UART_CAP_TEMT;
  #ifdef CONFIG_PM
/*
 * Runtime PM is mostly transparent. However to do it right we need to a
diff --git a/drivers/tty/serial/8250/8250_port.c 
b/drivers/tty/serial/8250/8250_port.c
index b0af13074cd3..44a54406e4b4 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -558,8 +558,41 @@ static void serial8250_clear_fifos(struct uart_8250_port 
*p)
}
  }
  
+static inline void serial8250_em485_update_temt_delay(struct uart_8250_port *p,

+   unsigned int cflag, unsigned int baud)
+{
+   unsigned int bits;
+
+   if (!p->em485)
+   return;
+
+   /* byte size and parity */
+   switch (cflag & CSIZE) {
+   case CS5:
+   bits = 7;
+   break;
+   case CS6:
+   bits = 8;
+   break;
+   case CS7:
+   bits = 9;
+   break;
+   default:
+   bits = 10;
+   break; /* CS8 */
+   }
+
+   if (cflag & CSTOPB)
+   bits++;
+   if (cflag & PARENB)
+   bits++;
+
+   p->em485->no_temt_delay = bits*100/baud;
+}
+
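
For reference, a minimal userspace sketch of the frame-time arithmetic that the function above appears to be aiming for. It is not the driver code: frame_bits() and frame_time_ns() are hypothetical helpers, and nanoseconds are an assumed unit. Note that with unsigned integers the expression bits*100/baud is 0 whenever baud is larger than bits*100, e.g. 10*100/9600.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Bits on the wire per character: start bit + data bits + parity + stop bits. */
static unsigned int frame_bits(unsigned int data_bits, bool parity, bool two_stop)
{
	assert(data_bits >= 5 && data_bits <= 8);
	return 1 + data_bits + (parity ? 1 : 0) + (two_stop ? 2 : 1);
}

/* Time to shift out one character, in nanoseconds (rounded down). */
static uint64_t frame_time_ns(unsigned int bits, unsigned int baud)
{
	assert(baud > 0);
	return (bits * NSEC_PER_SEC) / baud;
}
```

For 8N1 this gives 10 bits per character, which is about 86.8 us at 115200 baud and about 1.04 ms at 9600 baud.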
  static enum hrtimer_restart serial8250_em485_handle_start_tx(struct hrtimer *t);
  static enum hrtimer_restart serial8250_em485_handle_stop_tx(struct hrtimer *t);
+static enum hrtimer_restart serial8250_em485_handle_no_temt(struct hrtimer *t);
  
  void serial8250_clear_and_reinit_fifos(struct uart_8250_port *p)

  {
@@ -618,6 +651,18 @@ static int serial8250_em485_init(struct uart_8250_port *p)
 HRTIMER_MODE_REL);
hrtimer_init(&p->em485->start_tx_timer, CLOCK_MONOTONIC,
 HRTIMER_MODE_REL);
+
+   if (!(p->capabilities & UART_CAP_TEMT)) {
+   struct tty_struct *tty = p->port.state->port.tty;


Is this safe? Don't you need a tty reference? Or maybe you need to pass 
the tty from the TIOCSRS485 ioctl to here.



+   serial8250_em485_update_temt_delay(p, tty->termios.c_cflag,
+  tty_get_baud_rate(tty));
+   hrtimer_init(&p->em485->no_temt_timer, CLOCK_MONOTONIC,
+HRTIMER_MODE_REL);
+   p->em485->no_temt_timer.function =
+   

Re: [PATCH v16 07/11] secretmem: use PMD-size pages to amortize direct map fragmentation

2021-01-28 Thread Mike Rapoport
On Thu, Jan 28, 2021 at 02:01:06PM +0100, Michal Hocko wrote:
> On Thu 28-01-21 11:22:59, Mike Rapoport wrote:
> 
> > And hugetlb pools may be also depleted by anybody by calling
> > mmap(MAP_HUGETLB) and there is no any limiting knob for this, while
> > secretmem has RLIMIT_MEMLOCK.
> 
> Yes it can fail. But it would fail at the mmap time when the reservation
> fails. Not during the #PF time which can be at any time.

It may fail at #PF time as well:

hugetlb_fault()
        hugetlb_no_page()
                ...
                alloc_huge_page()
                        alloc_gigantic_page()
                                cma_alloc()
                                        -ENOMEM;

 
> > That said, simply replacing VM_FAULT_OOM with VM_FAULT_SIGBUS makes
> > secretmem at least as controllable and robust as hugetlbfs even without
> > complex reservation at mmap() time.
> 
> Still sucks huge!
 
Any #PF can get -ENOMEM for whatever reason. Sucks huge indeed.

> > > > > So unless I am really misreading the code
> > > > > Nacked-by: Michal Hocko 
> > > > > 
> > > > > That doesn't mean I reject the whole idea. There are some details to
> > > > > sort out as mentioned elsewhere but you cannot really depend on
> > > > > pre-allocated pool which can fail at a fault time like that.
> > > > 
> > > > So, to do it similar to hugetlbfs (e.g., with CMA), there would have to be a
> > > > mechanism to actually try pre-reserving (e.g., from the CMA area), at which
> > > > point in time the pages would get moved to the secretmem pool, and a
> > > > mechanism for mmap() etc. to "reserve" from these secretmem pool, such that
> > > > there are guarantees at fault time?
> > > 
> > > yes, reserve at mmap time and use during the fault. But this all sounds
> > > like a self inflicted problem to me. Sure you can have a pre-allocated
> > > or more dynamic pool to reduce the direct mapping fragmentation but you
> > > can always fall back to regular allocations. In other words, have the pool
> > > as an optimization rather than a hard requirement. With a careful access
> > > control this sounds like a manageable solution to me.
> > 
> > I'd really wish we had this discussion for earlier spins of this series,
> > but since this didn't happen let's refresh the history a bit.
> 
> I am sorry but I am really fighting to find time to watch for all the
> moving targets...
> 
> > One of the major pushbacks on the first RFC [1] of the concept was about
> > the direct map fragmentation. I tried really hard to find data that shows
> > what is the performance difference with different page sizes in the direct
> > map and I didn't find anything.
> > 
> > So presuming that large pages do provide advantage the first implementation
> > of secretmem used PMD_ORDER allocations to amortise the effect of the
> > direct map fragmentation and then handed out 4k pages at each fault. In
> > addition there was an option to reserve a finite pool at boot time and
> > limit secretmem allocations only to that pool.
> > 
> > At some point David suggested to use CMA to improve overall flexibility
> > [3], so I switched secretmem to use CMA.
> > 
> > Now, with the data we have at hand (my benchmarks and Intel's report David
> > mentioned) I'm not even sure this whole pooling is required.
> 
> I would still like to understand whether that data is actually
> representative. With some underlying reasoning rather than I have run
> these XYZ benchmarks and numbers do not look terrible.

I would also very much like to see, for example, the reasoning for enabling 1GB
pages in the direct map beyond "because we can" (commits 00d1c5e05736
("x86: add gbpages switches") and ef9257668e31 ("x86: do kernel direct
mapping at boot using GB pages")).

The original Kconfig text for CONFIG_DIRECT_GBPAGES said

  Enable gigabyte pages support (if the CPU supports it). This can
  improve the kernel's performance a tiny bit by reducing TLB
  pressure.

So it is very interesting how tiny that bit was.
 
> > I like the idea to have a pool as an optimization rather than a hard
> > requirement but I don't see why it would need careful access control. As
> > the direct map fragmentation does not necessarily degrade performance
> > (and even sometimes it actually improves it) and even then the degradation
> > is small, trying a PMD_ORDER allocation for a pool and then falling back to
> > 4K page may be just fine.
> 
> Well, as soon as this is a scarce resource then an access control seems
> like a first thing to think of. Maybe it is not really necessary but
> then this should be really justified.

And what is the scarce resource here? If we consider lack of the direct
map fragmentation as this resource, there are enough measures secretmem
implements to limit user ability to fragment the direct map, as was already
discussed several times. Global limit, memcg and rlimit provide enough
access control already.
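
To make the "pool as an optimization rather than a hard requirement" idea concrete, here is a minimal userspace sketch. It is not the secretmem code: struct page_pool, pool_get_page() and the large_ok switch are hypothetical, malloc() stands in for the page allocator, and alignment and freeing are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SMALL_PAGE  4096u
#define LARGE_CHUNK (512u * SMALL_PAGE)	/* 2 MiB, stand-in for a PMD-size page */

struct page_pool {
	unsigned char *chunk;	/* current large chunk, NULL if none */
	size_t used;		/* bytes already handed out from chunk */
	bool large_ok;		/* hypothetical switch to simulate large-allocation failure */
};

/*
 * Hand out one small page. Prefer carving it from a large chunk; if no
 * large chunk can be obtained, fall back to a plain small allocation
 * instead of failing.
 */
static void *pool_get_page(struct page_pool *pool)
{
	assert(pool != NULL);

	/* Refill when there is no chunk yet or the current one is used up. */
	if (!pool->chunk || pool->used == LARGE_CHUNK) {
		pool->chunk = pool->large_ok ? malloc(LARGE_CHUNK) : NULL;
		pool->used = 0;
	}

	if (pool->chunk) {
		void *page = pool->chunk + pool->used;

		pool->used += SMALL_PAGE;
		return page;
	}

	/* Fallback path: the pool is only an optimization. */
	return malloc(SMALL_PAGE);
}
```

The point of the shape is that the caller cannot tell which path served the page; only the amount of fragmentation differs.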

-- 
Sincerely yours,
Mike.


Re: [PATCH] misc: bcm-vk: only support ttyVK if CONFIG_TTY is set

2021-01-28 Thread Randy Dunlap
On 1/28/21 10:04 PM, Scott Branden wrote:
> Correct compile issue if CONFIG_TTY is not set by
> only adding ttyVK devices if CONFIG_TTY is set.
> 
> Reported-by: Randy Dunlap 
> Signed-off-by: Scott Branden 
> ---
>  drivers/misc/bcm-vk/Makefile |  4 ++--
>  drivers/misc/bcm-vk/bcm_vk_dev.c | 13 +
>  2 files changed, 15 insertions(+), 2 deletions(-)

Acked-by: Randy Dunlap  # build-tested

Thanks.
-- 
~Randy



Re: [PATCH v2 3/3] arch/arm/configs: Enable VMSPLIT_2G in imx_v6_v7_defconfig

2021-01-28 Thread Shawn Guo
On Sun, Jan 17, 2021 at 10:03:01AM -0800, Alistair Francis wrote:
> The reMarkable2 requires VMSPLIT_2G, so lets set this in the
> imx_v6_v7_defconfig.

Hmm, why is VMSPLIT_2G required by reMarkable2?

Shawn

> 
> Signed-off-by: Alistair Francis 
> ---
>  arch/arm/configs/imx_v6_v7_defconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm/configs/imx_v6_v7_defconfig 
> b/arch/arm/configs/imx_v6_v7_defconfig
> index 55674cb1ffce..fa9229616106 100644
> --- a/arch/arm/configs/imx_v6_v7_defconfig
> +++ b/arch/arm/configs/imx_v6_v7_defconfig
> @@ -29,6 +29,7 @@ CONFIG_SOC_IMX7D=y
>  CONFIG_SOC_IMX7ULP=y
>  CONFIG_SOC_VF610=y
>  CONFIG_SMP=y
> +CONFIG_VMSPLIT_2G=y
>  CONFIG_ARM_PSCI=y
>  CONFIG_HIGHMEM=y
>  CONFIG_FORCE_MAX_ZONEORDER=14
> -- 
> 2.29.2
> 


Re: [PATCH 1/1] process_madvise.2: Add process_madvise man page

2021-01-28 Thread Suren Baghdasaryan
On Thu, Jan 28, 2021 at 12:31 PM Michael Kerrisk (man-pages)
 wrote:
>
> Hello Suren,
>
> On 1/28/21 7:40 PM, Suren Baghdasaryan wrote:
> > On Thu, Jan 28, 2021 at 4:24 AM Michael Kerrisk (man-pages)
> >  wrote:
> >>
> >> Hello Suren,
> >>
> >> Thank you for writing this page! Some comments below.
> >
> > Thanks for the review!
> > Couple questions below and I'll respin the new version once they are clarified.
>
> Okay. See below.
>
> >> On Wed, 20 Jan 2021 at 21:36, Suren Baghdasaryan  wrote:
> >>>
>
> [...]
>
> Thanks for all the acks. That lets me know that you saw what I said.
>
> >>> RETURN VALUE
> >>> On success, process_madvise() returns the number of bytes advised. This
> >>> return value may be less than the total number of requested bytes, if an
> >>> error occurred. The caller should check return value to determine whether
> >>> a partial advice occurred.
> >>
> >> So there are three return values possible,
> >
> > Ok, I think I see your point. How about this instead:
>
> Well, I'm glad you saw it, because I forgot to finish it. But yes,
> you understood what I forgot to say.
>
> > RETURN VALUE
> >  On success, process_madvise() returns the number of bytes advised. This
> >  return value may be less than the total number of requested bytes, if an
> >  error occurred after some iovec elements were already processed. The caller
> >  should check the return value to determine whether a partial advice occurred.
> >
> > On error, -1 is returned and errno is set appropriately.
>
> We recently standardized some wording here:
> s/appropriately/to indicate the error/.
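
A caller-side check matching the RETURN VALUE wording could look roughly like the sketch below. It only classifies a return value that has already been obtained and does not issue the system call itself; enum advise_outcome, iov_total_len() and classify_advise_result() are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

enum advise_outcome { ADVISE_ERROR, ADVISE_PARTIAL, ADVISE_COMPLETE };

/* Total number of bytes described by the iovec array. */
static size_t iov_total_len(const struct iovec *iov, size_t vlen)
{
	size_t total = 0;

	for (size_t i = 0; i < vlen; i++)
		total += iov[i].iov_len;
	return total;
}

/*
 * Interpret the value returned by process_madvise(): -1 means error
 * (errno is set), a short count means only part of the ranges was advised.
 */
static enum advise_outcome classify_advise_result(ssize_t ret,
						   const struct iovec *iov,
						   size_t vlen)
{
	assert(iov != NULL || vlen == 0);

	if (ret < 0)
		return ADVISE_ERROR;
	if ((size_t)ret < iov_total_len(iov, vlen))
		return ADVISE_PARTIAL;
	return ADVISE_COMPLETE;
}
```

On ADVISE_PARTIAL the caller would typically look at how many bytes were advised to find the iovec element where processing stopped.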
>
>
> >>> +.PP
> >>> +The pointer
> >>> +.I iovec
> >>> +points to an array of iovec structures, defined in
> >>
> >> "iovec" should be formatted as
> >>
> >> .I iovec
> >
> > I think it is formatted that way above. What am I missing?
>
> But also in "an array of iovec structures"...
>
> > BTW, where should I be using .I vs .IR? I was looking for an answer
> > but could not find it.
>
> .B / .I == bold/italic this line
> .BR / .IR == alternate bold/italic with normal (Roman) font.
>
> So:
> .I iovec
> .I iovec ,   # so that comma is not italic
> .BR process_madvise ()
> etc.
>
> [...]
>
> >>> +.I iovec
> >>> +if one of its elements points to an invalid memory
> >>> +region in the remote process. No further elements will be
> >>> +processed beyond that point.
> >>> +.PP
> >>> +Permission to provide a hint to external process is governed by a
> >>> +ptrace access mode
> >>> +.B PTRACE_MODE_READ_REALCREDS
> >>> +check; see
> >>> +.BR ptrace (2)
> >>> +and
> >>> +.B CAP_SYS_ADMIN
> >>> +capability that caller should have in order to affect performance
> >>> +of an external process.
> >>
> >> The preceding sentence is garbled. Missing words?
> >
> > Maybe I worded it incorrectly. What I need to say here is that the
> > caller should have both PTRACE_MODE_READ_REALCREDS credentials and
> > CAP_SYS_ADMIN capability. The first part I shamelessly copy/pasted
> > from https://man7.org/linux/man-pages/man2/process_vm_readv.2.html and
> > tried adding the second one to it, obviously unsuccessfully. Any
> > advice on how to fix that?
>
> I think you already got pretty close. How about:
>
> [[
> Permission to provide a hint to another process is governed by a
> ptrace access mode
> .B PTRACE_MODE_READ_REALCREDS
> check (see
> .BR ptrace (2));
> in addition, the caller must have the
> .B CAP_SYS_ADMIN
> capability.

In V2 I expanded this part a bit to explain why CAP_SYS_ADMIN is
needed. There were questions about that during the review of my patch
which adds this requirement
(https://lore.kernel.org/patchwork/patch/1363605), so I thought a
short explanation would be useful.

> ]]
>
> [...]
>
> >>> +.TP
> >>> +.B ESRCH
> >>> +No process with ID
> >>> +.I pidfd
> >>> +exists.
> >>
> >> Should this maybe be:
> >> [[
> >> The target process does not exist (i.e., it has terminated and
> >> been waited on).
> >> ]]
> >>
> >> See pidfd_send_signal(2).
> >
> > I "borrowed" mine from
> > https://man7.org/linux/man-pages/man2/process_vm_readv.2.html but
> > either one sounds good to me. Maybe for pidfd_send_signal the wording
> > about termination is more important. Anyway, it's up to you. Just let
> > me know which one to use.
>
> I think the pidfd_send_signal(2) wording fits better.
>
> [...]
>
> Thanks,
>
> Michael
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/


[PATCH v2 2/2] perf script: Support filtering by hex address

2021-01-28 Thread Jin Yao
perf script supports the '-S' or '--symbol' option to list only the
trace records in the given symbols. A symbol is typically given as a
name or a hex address. If it is a hex address, it has to be the start
address of a symbol.

It would be useful to be able to filter trace records by any hex
address (not only the start address of a symbol). So now we support
filtering trace records by more conditions, such as:
- symbol name
- start address of symbol
- any hexadecimal address
- address range

The comparison order is defined as:

1. symbol name comparison
2. symbol start address comparison.
3. any hexadecimal address comparison.
4. address range comparison.
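
The comparison order above can be sketched as follows. This is an illustration only, not the perf implementation: struct sample, struct filter and sample_matches() are hypothetical, and the range is assumed to be the half-open interval [addr, addr + range).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sample {
	const char *sym_name;	/* resolved symbol name, may be NULL */
	uint64_t sym_start;	/* start address of that symbol */
	uint64_t addr;		/* sampled address */
};

struct filter {
	const char *name;	/* symbol name to match, or NULL */
	bool has_addr;		/* true if a hex address was given */
	uint64_t addr;		/* the hex address */
	uint64_t range;		/* size of the address range, 0 if none */
};

static bool sample_matches(const struct filter *f, const struct sample *s)
{
	assert(f != NULL && s != NULL);

	/* 1. symbol name comparison */
	if (f->name && s->sym_name && strcmp(f->name, s->sym_name) == 0)
		return true;
	if (!f->has_addr)
		return false;
	/* 2. symbol start address comparison */
	if (f->addr == s->sym_start)
		return true;
	/* 3. any hexadecimal address comparison */
	if (f->addr == s->addr)
		return true;
	/* 4. address range comparison, assumed to be [addr, addr + range) */
	if (f->range && s->addr >= f->addr && s->addr - f->addr < f->range)
		return true;
	return false;
}
```
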

Let's see some examples:

root@kbl-ppc:~# ./perf script -S 0x9ca77308
perf 18123 [000] 6142863.075104:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [000] 6142863.075107:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [000] 6142863.075108: 10   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [001] 6142863.075156:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [001] 6142863.075158:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [001] 6142863.075159: 17   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [002] 6142863.075202:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [002] 6142863.075204:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [002] 6142863.075205: 16   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [003] 6142863.075250:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [003] 6142863.075252:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [003] 6142863.075253: 16   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])

This filters the trace records by hex address 0x9ca77308.

For ease of use, the hex address can also be given without the '0x'
prefix, e.g.
./perf script -S 9ca77308

It has the same effect.

We also support filtering trace records by an address range.

root@kbl-ppc:~# ./perf script -S 9ca77304 --addr-range 16
perf 18123 [000] 6142863.075104:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [000] 6142863.075107:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [000] 6142863.075108: 10   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [000] 6142863.075109:273   cycles:  
9ca7730a native_write_msr+0xa ([kernel.kallsyms])
perf 18123 [001] 6142863.075156:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [001] 6142863.075158:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [001] 6142863.075159: 17   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [001] 6142863.075160:456   cycles:  
9ca7730f native_write_msr+0xf ([kernel.kallsyms])
perf 18123 [002] 6142863.075202:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [002] 6142863.075204:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [002] 6142863.075205: 16   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [002] 6142863.075206:436   cycles:  
9ca7730f native_write_msr+0xf ([kernel.kallsyms])
perf 18123 [003] 6142863.075250:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [003] 6142863.075252:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [003] 6142863.075253: 16   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [003] 6142863.075254:436   cycles:  
9ca7730f native_write_msr+0xf ([kernel.kallsyms])
perf 18123 [004] 6142863.075299:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [004] 6142863.075301:  1   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [004] 6142863.075302: 16   cycles:  
9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [004] 6142863.075303:431   

[v3] PCI: Avoid unsync of LTR mechanism configuration

2021-01-28 Thread mingchuang.qiao
From: Mingchuang Qiao 

In bus scan flow, the "LTR Mechanism Enable" bit of DEVCTL2 register is
configured in pci_configure_ltr(). If device and bridge both support LTR
mechanism, the "LTR Mechanism Enable" bit of device and bridge will be
enabled in DEVCTL2 register. And pci_dev->ltr_path will be set as 1.

If PCIe link goes down when device resets, the "LTR Mechanism Enable" bit
of bridge will change to 0 according to PCIe r5.0, sec 7.5.3.16. However,
the pci_dev->ltr_path value of bridge is still 1.

In the following cases, check and re-configure the "LTR Mechanism Enable"
bit of the bridge so that it matches the bridge's ltr_path value:
   -before configuring device's LTR for hot-remove/hot-add
   -before restoring device's DEVCTL2 register when restoring device state
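
The out-of-sync state and the fix can be modelled in userspace roughly as below. This is a simplified illustration, not kernel code: struct model_dev and the model_*() helpers are hypothetical, and DEVCTL2_LTR_EN is a stand-in for PCI_EXP_DEVCTL2_LTR_EN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEVCTL2_LTR_EN 0x0400u	/* stand-in for PCI_EXP_DEVCTL2_LTR_EN */

struct model_dev {
	uint16_t devctl2;		/* modelled DEVCTL2 register */
	bool ltr_path;			/* software view: LTR enabled on this path */
	struct model_dev *upstream;	/* upstream bridge, NULL for a root port */
};

/* Link down: hardware clears the enable bit, software state is untouched. */
static void model_link_down(struct model_dev *bridge)
{
	bridge->devctl2 &= (uint16_t)~DEVCTL2_LTR_EN;
}

/*
 * Re-enable LTR on the upstream bridge if software believes it is enabled
 * but the register says otherwise. Returns true if a fix-up was needed.
 */
static bool model_reconfigure_bridge_ltr(struct model_dev *dev)
{
	struct model_dev *bridge;

	assert(dev != NULL);
	bridge = dev->upstream;
	if (bridge && bridge->ltr_path && !(bridge->devctl2 & DEVCTL2_LTR_EN)) {
		bridge->devctl2 |= DEVCTL2_LTR_EN;
		return true;
	}
	return false;
}
```
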

Signed-off-by: Mingchuang Qiao 
---
changes of v2
 -modify patch description
 -reconfigure bridge's LTR before restoring device DEVCTL2 register
changes of v3
 -call pci_reconfigure_bridge_ltr() in probe.c
---
 drivers/pci/pci.c   | 25 +
 drivers/pci/pci.h   |  1 +
 drivers/pci/probe.c | 13 ++---
 3 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b9fecc25d213..12b557c8f062 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1437,6 +1437,24 @@ static int pci_save_pcie_state(struct pci_dev *dev)
return 0;
 }
 
+void pci_reconfigure_bridge_ltr(struct pci_dev *dev)
+{
+#ifdef CONFIG_PCIEASPM
+   struct pci_dev *bridge;
+   u32 ctl;
+
+   bridge = pci_upstream_bridge(dev);
+   if (bridge && bridge->ltr_path) {
+   pcie_capability_read_dword(bridge, PCI_EXP_DEVCTL2, &ctl);
+   if (!(ctl & PCI_EXP_DEVCTL2_LTR_EN)) {
+   pci_dbg(bridge, "re-enabling LTR\n");
+   pcie_capability_set_word(bridge, PCI_EXP_DEVCTL2,
+PCI_EXP_DEVCTL2_LTR_EN);
+   }
+   }
+#endif
+}
+
 static void pci_restore_pcie_state(struct pci_dev *dev)
 {
int i = 0;
@@ -1447,6 +1465,13 @@ static void pci_restore_pcie_state(struct pci_dev *dev)
if (!save_state)
return;
 
+   /*
+* Downstream ports reset the LTR enable bit when link goes down.
+* Check and re-configure the bit here before restoring device.
+* PCIe r5.0, sec 7.5.3.16.
+*/
+   pci_reconfigure_bridge_ltr(dev);
+
cap = (u16 *)&save_state->cap.data[0];
pcie_capability_write_word(dev, PCI_EXP_DEVCTL, cap[i++]);
pcie_capability_write_word(dev, PCI_EXP_LNKCTL, cap[i++]);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 5c59365092fa..a660a01358c5 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -111,6 +111,7 @@ void pci_free_cap_save_buffers(struct pci_dev *dev);
 bool pci_bridge_d3_possible(struct pci_dev *dev);
 void pci_bridge_d3_update(struct pci_dev *dev);
 void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev);
+void pci_reconfigure_bridge_ltr(struct pci_dev *dev);
 
 static inline void pci_wakeup_event(struct pci_dev *dev)
 {
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 953f15abc850..fa6075093f3b 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2132,9 +2132,16 @@ static void pci_configure_ltr(struct pci_dev *dev)
 * Complex and all intermediate Switches indicate support for LTR.
 * PCIe r4.0, sec 6.18.
 */
-   if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
-   ((bridge = pci_upstream_bridge(dev)) &&
- bridge->ltr_path)) {
+   if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) {
+   pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
+PCI_EXP_DEVCTL2_LTR_EN);
+   dev->ltr_path = 1;
+   return;
+   }
+
+   bridge = pci_upstream_bridge(dev);
+   if (bridge && bridge->ltr_path) {
+   pci_reconfigure_bridge_ltr(dev);
pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
 PCI_EXP_DEVCTL2_LTR_EN);
dev->ltr_path = 1;
-- 
2.18.0


[PATCH v2 1/2] perf util: Change intlist int to unsigned long

2021-01-28 Thread Jin Yao
This lets intlist store addresses.

One potential problem is that it can no longer hold negative numbers.
But so far, there is no such use case.
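
With unsigned long entries, the node comparison can no longer be a plain subtraction returned as an int, since the difference may not fit or may wrap. A minimal standalone sketch of the three-way comparison follows; cmp_ulong() is a hypothetical helper, not a perf function.

```c
#include <assert.h>
#include <limits.h>

/*
 * Three-way comparison for unsigned long values: returns 1, -1 or 0.
 * Unlike "a - b" converted to int, this cannot overflow or truncate.
 */
static int cmp_ulong(unsigned long a, unsigned long b)
{
	return (a > b) - (a < b);
}
```
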

Signed-off-by: Jin Yao 
---
 v2:
   New in v2.

 tools/perf/util/intlist.c | 27 ---
 tools/perf/util/intlist.h | 10 +-
 tools/perf/util/probe-event.c |  2 +-
 3 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/intlist.c b/tools/perf/util/intlist.c
index 84e5304e151a..934092199f89 100644
--- a/tools/perf/util/intlist.c
+++ b/tools/perf/util/intlist.c
@@ -13,7 +13,7 @@
 static struct rb_node *intlist__node_new(struct rblist *rblist __maybe_unused,
 const void *entry)
 {
-   int i = (int)((long)entry);
+   unsigned long i = (unsigned long)entry;
struct rb_node *rc = NULL;
struct int_node *node = malloc(sizeof(*node));
 
@@ -41,15 +41,20 @@ static void intlist__node_delete(struct rblist *rblist 
__maybe_unused,
 
 static int intlist__node_cmp(struct rb_node *rb_node, const void *entry)
 {
-   int i = (int)((long)entry);
+   unsigned long i = (unsigned long)entry;
struct int_node *node = container_of(rb_node, struct int_node, rb_node);
 
-   return node->i - i;
+   if (node->i > i)
+   return 1;
+   else if (node->i < i)
+   return -1;
+
+   return 0;
 }
 
-int intlist__add(struct intlist *ilist, int i)
+int intlist__add(struct intlist *ilist, unsigned long i)
 {
-   return rblist__add_node(&ilist->rblist, (void *)((long)i));
+   return rblist__add_node(&ilist->rblist, (void *)i);
 }
 
 void intlist__remove(struct intlist *ilist, struct int_node *node)
@@ -58,7 +63,7 @@ void intlist__remove(struct intlist *ilist, struct int_node 
*node)
 }
 
 static struct int_node *__intlist__findnew(struct intlist *ilist,
-  int i, bool create)
+  unsigned long i, bool create)
 {
struct int_node *node = NULL;
struct rb_node *rb_node;
@@ -67,9 +72,9 @@ static struct int_node *__intlist__findnew(struct intlist 
*ilist,
return NULL;
 
if (create)
-   rb_node = rblist__findnew(&ilist->rblist, (void *)((long)i));
+   rb_node = rblist__findnew(&ilist->rblist, (void *)i);
else
-   rb_node = rblist__find(&ilist->rblist, (void *)((long)i));
+   rb_node = rblist__find(&ilist->rblist, (void *)i);
 
if (rb_node)
node = container_of(rb_node, struct int_node, rb_node);
@@ -77,12 +82,12 @@ static struct int_node *__intlist__findnew(struct intlist 
*ilist,
return node;
 }
 
-struct int_node *intlist__find(struct intlist *ilist, int i)
+struct int_node *intlist__find(struct intlist *ilist, unsigned long i)
 {
return __intlist__findnew(ilist, i, false);
 }
 
-struct int_node *intlist__findnew(struct intlist *ilist, int i)
+struct int_node *intlist__findnew(struct intlist *ilist, unsigned long i)
 {
return __intlist__findnew(ilist, i, true);
 }
@@ -93,7 +98,7 @@ static int intlist__parse_list(struct intlist *ilist, const 
char *s)
int err;
 
do {
-   long value = strtol(s, &sep, 10);
+   unsigned long value = strtol(s, &sep, 10);
err = -EINVAL;
if (*sep != ',' && *sep != '\0')
break;
diff --git a/tools/perf/util/intlist.h b/tools/perf/util/intlist.h
index 5c19ee001299..e336b174d0c7 100644
--- a/tools/perf/util/intlist.h
+++ b/tools/perf/util/intlist.h
@@ -9,7 +9,7 @@
 
 struct int_node {
struct rb_node rb_node;
-   int i;
+   unsigned long i;
void *priv;
 };
 
@@ -21,13 +21,13 @@ struct intlist *intlist__new(const char *slist);
 void intlist__delete(struct intlist *ilist);
 
 void intlist__remove(struct intlist *ilist, struct int_node *in);
-int intlist__add(struct intlist *ilist, int i);
+int intlist__add(struct intlist *ilist, unsigned long i);
 
 struct int_node *intlist__entry(const struct intlist *ilist, unsigned int idx);
-struct int_node *intlist__find(struct intlist *ilist, int i);
-struct int_node *intlist__findnew(struct intlist *ilist, int i);
+struct int_node *intlist__find(struct intlist *ilist, unsigned long i);
+struct int_node *intlist__findnew(struct intlist *ilist, unsigned long i);
 
-static inline bool intlist__has_entry(struct intlist *ilist, int i)
+static inline bool intlist__has_entry(struct intlist *ilist, unsigned long i)
 {
return intlist__find(ilist, i) != NULL;
 }
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 8eae2afff71a..137f19c5b686 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1074,7 +1074,7 @@ static int __show_line_range(struct line_range *lr, const 
char *module,
}
 
intlist__for_each_entry(ln, lr->line_list) {
-   for (; ln->i > l; l++) {
+   for (; ln->i > 

Re: [PATCH v2 1/1] mm/madvise: replace ptrace attach requirement for process_madvise

2021-01-28 Thread Suren Baghdasaryan
On Thu, Jan 28, 2021 at 11:51 AM Suren Baghdasaryan  wrote:
>
> On Tue, Jan 26, 2021 at 5:52 AM 'Michal Hocko' via kernel-team
>  wrote:
> >
> > On Wed 20-01-21 14:17:39, Jann Horn wrote:
> > > On Wed, Jan 13, 2021 at 3:22 PM Michal Hocko  wrote:
> > > > On Tue 12-01-21 09:51:24, Suren Baghdasaryan wrote:
> > > > > On Tue, Jan 12, 2021 at 9:45 AM Oleg Nesterov  wrote:
> > > > > >
> > > > > > On 01/12, Michal Hocko wrote:
> > > > > > >
> > > > > > > On Mon 11-01-21 09:06:22, Suren Baghdasaryan wrote:
> > > > > > >
> > > > > > > > > What we want is the ability for one process to influence another process
> > > > > > > > > in order to optimize performance across the entire system while leaving
> > > > > > > > > the security boundary intact.
> > > > > > > > > Replace PTRACE_MODE_ATTACH with a combination of PTRACE_MODE_READ
> > > > > > > > > and CAP_SYS_NICE. PTRACE_MODE_READ to prevent leaking ASLR metadata
> > > > > > > > > and CAP_SYS_NICE for influencing process performance.
> > > > > > >
> > > > > > > > I have to say that ptrace modes are rather obscure to me. So I cannot
> > > > > > > > really judge whether MODE_READ is sufficient. My understanding has
> > > > > > > > always been that this is required for RO access to the address space. But
> > > > > > > > this operation clearly has a visible side effect. Do we have any actual
> > > > > > > > documentation for the existing modes?
> > > > > > >
> > > > > > > I would be really curious to hear from Jann and Oleg (now Cced).
> > > > > >
> > > > > > > Can't comment, sorry. I never understood these security checks and never tried.
> > > > > > > IIUC only selinux/etc can treat ATTACH/READ differently and I have no idea what
> > > > > > > is the difference.
> > >
> > > Yama in particular only does its checks on ATTACH and ignores READ,
> > > that's the difference you're probably most likely to encounter on a
> > > normal desktop system, since some distros turn Yama on by default.
> > > Basically the idea there is that running "gdb -p $pid" or "strace -p
> > > $pid" as a normal user will usually fail, but reading /proc/$pid/maps
> > > still works; so you can see things like detailed memory usage
> > > information and such, but you're not supposed to be able to directly
> > > peek into a running SSH client and inject data into the existing SSH
> > > connection, or steal the cryptographic keys for the current
> > > connection, or something like that.
> > >
> > > > > I haven't seen a written explanation on ptrace modes but when I
> > > > > consulted Jann his explanation was:
> > > > >
> > > > > PTRACE_MODE_READ means you can inspect metadata about processes with
> > > > > the specified domain, across UID boundaries.
> > > > > PTRACE_MODE_ATTACH means you can fully impersonate processes with the
> > > > > specified domain, across UID boundaries.
> > > >
> > > > Maybe this would be a good start to document expectations. Some more
> > > > practical examples where the difference is visible would be great as
> > > > well.
> > >
> > > Before documenting the behavior, it would be a good idea to figure out
> > > what to do with perf_event_open(). That one's weird in that it only
> > > requires PTRACE_MODE_READ, but actually allows you to sample stuff
> > > like userspace stack and register contents (if perf_event_paranoid is
> > > 1 or 2). Maybe for SELinux things (and maybe also for Yama), there
> > > should be a level in between that allows fully inspecting the process
> > > (for purposes like profiling) but without the ability to corrupt its
> > > memory or registers or things like that. Or maybe perf_event_open()
> > > should just use the ATTACH mode.
> >
> > Thanks for the clarification. I still cannot say I would have a good
> > mental picture. Having something in Documentation/core-api/ sounds
> > really needed. Wrt to perf_event_open it sounds really odd it can do
> > more than other places restrict indeed. Something for the respective
> > maintainer but I strongly suspect people simply copy the pattern from
> > other places because the expected semantic is not really clear.
> >
>
> Sorry, back to the matters of this patch. Are there any actionable
> items for me to take care of before it can be accepted? The only
> request from Andrew to write a man page is being worked on at
> https://lore.kernel.org/linux-mm/20210120202337.1481402-1-sur...@google.com/
> and I'll follow up with the next version. I also CC'ed stable@ for
> this to be included into 5.10 per Andrew's request. That CC was lost
> at some point, so CC'ing again.
>
> I do not see anything else on this patch to fix. Please chime in if
> there are any more concerns, otherwise I would ask Andrew to take it
> into mm-tree and stable@ to apply it to 5.10.
> Thanks!

process_madvise man page V2 is posted at:
https://lore.kernel.org/linux-mm/20210129070340.566340-1-sur...@google.com/

>
>
> > --
> > 

linux-next: build warning after merge of the char-misc tree

2021-01-28 Thread Stephen Rothwell
Hi all,

After merging the char-misc tree, today's linux-next build (htmldocs)
produced this warning:

Documentation/driver-api/index.rst:14: WARNING: toctree contains reference to nonexisting document 'driver-api/pti_intel_mid'

Introduced by commit

  8ba59e9dee31 ("misc: pti: Remove driver for deprecated platform")

-- 
Cheers,
Stephen Rothwell




Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-28 Thread Hannes Reinecke

On 1/29/21 4:07 AM, Chao Leng wrote:



On 2021/1/29 9:42, Sagi Grimberg wrote:



You can't see exactly where it dies but I followed the assembly to
nvme_round_robin_path(). Maybe it's not the initial nvme_next_ns(head,
old) which returns NULL but nvme_next_ns() is returning NULL eventually
(list_next_or_null_rcu()).

So there is another bug that makes nvme_next_ns behave abnormally.
I reviewed the code around head->list and head->current_path and found
2 bugs that may cause this:
First, I already send the patch. see:
https://lore.kernel.org/linux-nvme/20210128033351.22116-1-lengc...@huawei.com/ 


Second, in nvme_ns_remove, list_del_rcu is called before
nvme_mpath_clear_current_path. This may cause "old" to be deleted from
the "head" while it is still in use. I'm not sure whether there is any
other consideration here; I will check it and try to fix it.


The reason why we first remove from head->list and only then clear
current_path is because the other way around there is no way
to guarantee that the ns won't be assigned as current_path
again (because it is in head->list).

ok, I see.


nvme_ns_remove fences the rest of the ns deletion by synchronizing
the srcu, so that the current_path clearance is guaranteed to be visible.

The list will be like this:
head->next = ns1;
ns1->next = head;
old->next = ns1;


Where is 'old' pointing to?


This may cause infinite loop in nvme_round_robin_path.
for (ns = nvme_next_ns(head, old);
 ns != old;
 ns = nvme_next_ns(head, ns))
The ns will always be ns1, so the loop never terminates.


No. nvme_next_ns() will return NULL.

Cheers,

Hannes
--
Dr. Hannes Reinecke                Kernel Storage Architect
h...@suse.de  +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


Re: [PATCH] crypto: octeontx2 - Add dependency on NET_VENDOR_MARVELL

2021-01-28 Thread Randy Dunlap
On 1/28/21 9:48 PM, Herbert Xu wrote:
> On Mon, Jan 25, 2021 at 09:41:12AM -0800, Randy Dunlap wrote:
>> on x86_64:
>>
>> ld: drivers/crypto/marvell/octeontx2/otx2_cptpf_main.o: in function 
>> `cptpf_flr_wq_handler':
>> otx2_cptpf_main.c:(.text+0x2b): undefined reference to 
>> `otx2_mbox_alloc_msg_rsp'
> 
> Thanks for the report.  The issue is that the crypto driver depends
> on code that sits under net so if that option is off then you'll end
> up with these errors.
> 
> ---8<---
> The crypto octeontx2 driver depends on the mbox code in the network
> tree.  It tries to select the MBOX Kconfig option but that option
> itself depends on many other options which are not selected, e.g.,
> CONFIG_NET_VENDOR_MARVELL.  It would be inappropriate to select them
> all as randomly prompting the user for network options which would
> otherwise be disabled just because a crypto driver has been enabled
> makes no sense.
> 
> This patch fixes this by adding a dependency on NET_VENDOR_MARVELL.
> This makes the crypto driver invisible if the network option is off.
> 
> If the crypto driver must be visible even without the network stack
> then the shared mbox code should be moved out of drivers/net.
> 
> Reported-by: Randy Dunlap 
> Reported-by: kernel test robot 
> Fixes: 5e8ce8334734 ("crypto: marvell - add Marvell OcteonTX2 CPT...")
> Signed-off-by: Herbert Xu 

Thanks, Herbert.

Acked-by: Randy Dunlap  # build-tested


> diff --git a/drivers/crypto/marvell/Kconfig b/drivers/crypto/marvell/Kconfig
> index 2efbd79180ce..a188ad1fadd3 100644
> --- a/drivers/crypto/marvell/Kconfig
> +++ b/drivers/crypto/marvell/Kconfig
> @@ -41,6 +41,7 @@ config CRYPTO_DEV_OCTEONTX2_CPT
>   depends on ARM64 || COMPILE_TEST
>   depends on PCI_MSI && 64BIT
>   depends on CRYPTO_LIB_AES
> + depends on NET_VENDOR_MARVELL
>   select OCTEONTX2_MBOX
>   select CRYPTO_DEV_MARVELL
>   select CRYPTO_SKCIPHER
> 


-- 
~Randy



linux-next: build warning after merge of the hwmon-staging tree

2021-01-28 Thread Stephen Rothwell
Hi all,

After merging the hwmon-staging tree, today's linux-next build (htmldocs)
produced this warning:

Documentation/hwmon/max16601.rst:94: WARNING: Malformed table.
Text in column margin in table line 39.

=== ===
in1_label   "vin1"
in1_input   VCORE input voltage.
in1_alarm   Input voltage alarm.

in2_label   "vout1"
in2_input   VCORE output voltage.
in2_alarm   Output voltage alarm.

curr1_label "iin1"
curr1_input VCORE input current, derived from duty cycle and output
current.
curr1_max   Maximum input current.
curr1_max_alarm Current high alarm.

curr[P+2]_label "iin1.P"
curr[P+2]_input VCORE phase P input current.

curr[N+2]_label "iin2"
curr[N+2]_input VCORE input current, derived from sensor element.
'N' is the number of enabled/populated phases.

curr[N+3]_label "iin3"
curr[N+3]_input VSA input current.

curr[N+4]_label "iout1"
curr[N+4]_input VCORE output current.
curr[N+4]_crit  Critical output current.
curr[N+4]_crit_alarm Output current critical alarm.
curr[N+4]_max   Maximum output current.
curr[N+4]_max_alarm Output current high alarm.

curr[N+P+5]_label   "iout1.P"
curr[N+P+5]_input   VCORE phase P output current.

curr[2*N+5]_label   "iout3"
curr[2*N+5]_input   VSA output current.
curr[2*N+5]_highest Historical maximum VSA output current.
curr[2*N+5]_reset_history
Write any value to reset curr21_highest.
curr[2*N+5]_crit Critical output current.
curr[2*N+5]_crit_alarm  Output current critical alarm.
curr[2*N+5]_max Maximum output current.
curr[2*N+5]_max_alarm   Output current high alarm.

power1_label "pin1"
power1_input Input power, derived from duty cycle and output current.
power1_alarm Input power alarm.

power2_label "pin2"
power2_input Input power, derived from input current sensor.

power3_label "pout"
power3_input Output power.

temp1_input VCORE temperature.
temp1_crit  Critical high temperature.
temp1_crit_alarm Chip temperature critical high alarm.
temp1_max   Maximum temperature.
temp1_max_alarm Chip temperature high alarm.

temp2_input TSENSE_0 temperature
temp3_input TSENSE_1 temperature
temp4_input TSENSE_2 temperature
temp5_input TSENSE_3 temperature

temp6_input VSA temperature.
temp6_crit  Critical high temperature.
temp6_crit_alarm Chip temperature critical high alarm.
temp6_max   Maximum temperature.
temp6_max_alarm Chip temperature high alarm.
=== ===
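
The attribute numbering in the table depends on N, the number of enabled/populated phases. As a rough illustration only, the index arithmetic can be written out as below. The function names are hypothetical and are not part of the max16601 driver; P is assumed to count from 0 to N-1, which is what makes the ranges in the table contiguous.

```c
/*
 * Sketch of the currX index arithmetic from the table above.
 * All names are made up for illustration; n is the number of
 * populated phases and p is a phase number in 0..n-1.
 */

/* curr[P+2]: VCORE phase P input current ("iin1.P") */
int curr_index_iin_phase(int p)
{
	return p + 2;
}

/* curr[N+2]: VCORE input current from the sensor element ("iin2") */
int curr_index_iin2(int n)
{
	return n + 2;
}

/* curr[N+4]: VCORE output current ("iout1") */
int curr_index_iout1(int n)
{
	return n + 4;
}

/* curr[N+P+5]: VCORE phase P output current ("iout1.P") */
int curr_index_iout_phase(int n, int p)
{
	return n + p + 5;
}

/* curr[2*N+5]: VSA output current ("iout3") */
int curr_index_iout3(int n)
{
	return 2 * n + 5;
}
```

With N = 8 the last helper gives 21, which is consistent with the "curr21_highest" name that appears in the reset_history row.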

Introduced by commit

  90b0f71d62df ("hwmon: (pmbus/max16601) Determine and use number of populated 
phases")

-- 
Cheers,
Stephen Rothwell




Re: [PATCH v16 07/11] secretmem: use PMD-size pages to amortize direct map fragmentation

2021-01-28 Thread Mike Rapoport
On Thu, Jan 28, 2021 at 07:28:57AM -0800, James Bottomley wrote:
> On Thu, 2021-01-28 at 14:01 +0100, Michal Hocko wrote:
> > On Thu 28-01-21 11:22:59, Mike Rapoport wrote:
> [...]
> > > One of the major pushbacks on the first RFC [1] of the concept was
> > > about the direct map fragmentation. I tried really hard to find
> > > data that shows what is the performance difference with different
> > > page sizes in the direct map and I didn't find anything.
> > > 
> > > So presuming that large pages do provide advantage the first
> > > implementation of secretmem used PMD_ORDER allocations to amortise
> > > the effect of the direct map fragmentation and then handed out 4k
> > > pages at each fault. In addition there was an option to reserve a
> > > finite pool at boot time and limit secretmem allocations only to
> > > that pool.
> > > 
> > > At some point David suggested to use CMA to improve overall
> > > flexibility [3], so I switched secretmem to use CMA.
> > > 
> > > Now, with the data we have at hand (my benchmarks and Intel's
> > > report David mentioned) I'm not even sure this whole pooling is
> > > required.
> > 
> > I would still like to understand whether that data is actually
> > representative. With some underlying reasoning rather than I have run
> > these XYZ benchmarks and numbers do not look terrible.
> 
> My theory, and the reason I made Mike run the benchmarks, is that our
> fear of TLB miss has been alleviated by CPU speculation advances over
> the years.  You can appreciate this if you think that both Intel and
> AMD have increased the number of levels in the page table to
> accommodate larger virtual memory size 5 instead of 3.  That increases
> the length of the page walk nearly 2x in a physical system and even
> more in a virtual system.  Unless this were massively optimized,
> systems would have slowed down significantly.  Using 2M pages only
> eliminates one level and 2G pages eliminates 2, so I theorized that
> actually fragmentation wouldn't be the significant problem we once
> thought it was and asked Mike to benchmark it.
> 
> The benchmarks show that indeed, it isn't a huge change in the data TLB
> miss time, I suspect because data is nicely continuous nowadays and the
> prediction that goes into the CPU optimizations quite easy.  ITLB
> fragmentation actually seems to be quite a bit worse, likely because we
> still don't have branch prediction down to an exact science.

Another thing is that normally useful work is done by userspace, so data
accesses are dominated by userspace and any change in dTLB miss rate for
kernel data accesses is only a small fraction of all misses.
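
To put rough numbers on the page-walk argument quoted above, here is a small sketch of the arithmetic. It is an assumption-laden model, not a measurement: it counts one memory reference per page-table level, ignores paging-structure caches entirely, and uses the usual two-dimensional walk count for nested paging. The function names are made up for illustration, and the 4 and 5 level figures are the x86-64 values.

```c
/*
 * Back-of-the-envelope page-walk cost model (illustrative only).
 * One memory reference per level, no TLB or walk caches considered.
 */

/* Native walk: one reference per page-table level. */
int native_walk_refs(int levels)
{
	return levels;
}

/*
 * Nested (two-dimensional) walk: every guest level, plus the final
 * guest-physical address, needs a full host walk, and each guest
 * level is itself one reference.
 */
int nested_walk_refs(int guest_levels, int host_levels)
{
	return (guest_levels + 1) * (host_levels + 1) - 1;
}

/*
 * Levels walked when the mapping ends early: skipped is 1 for a
 * PMD-size huge page and 2 for a PUD-size one.
 */
int levels_with_huge_page(int levels, int skipped)
{
	return levels - skipped;
}
```

Under this model a native walk goes from 4 to 5 references when moving to 5-level paging, and a nested walk goes from 24 to 35, while a PMD-size mapping only takes one level off the native walk.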

> James
> 
> 

-- 
Sincerely yours,
Mike.


[PATCH v2 1/1] process_madvise.2: Add process_madvise man page

2021-01-28 Thread Suren Baghdasaryan
Initial version of process_madvise(2) manual page. Initial text was
extracted from [1], amended after fix [2] and more details added using
man pages of madvise(2) and process_vm_readv(2) as examples. It also
includes the changes to required permission proposed in [3].

[1] https://lore.kernel.org/patchwork/patch/1297933/
[2] https://lkml.org/lkml/2020/12/8/1282
[3] 
https://patchwork.kernel.org/project/selinux/patch/2021070622.2613577-1-sur...@google.com/#23888311

Signed-off-by: Suren Baghdasaryan 
---
changes in v2:
- Changed description of MADV_COLD per Michal Hocko's suggestion
- Applied fixes suggested by Michael Kerrisk

NAME
process_madvise - give advice about use of memory to a process

SYNOPSIS
#include <sys/uio.h>

ssize_t process_madvise(int pidfd,
   const struct iovec *iovec,
   unsigned long vlen,
   int advice,
   unsigned int flags);

DESCRIPTION
The process_madvise() system call is used to give advice or directions
to the kernel about the address ranges of another process or of the
calling process. It provides the advice for the address ranges described
by iovec and vlen. The goal of such advice is to improve system
or application performance.

The pidfd argument is a PID file descriptor (see pidfd_open(2)) that
specifies the process to which the advice is to be applied.

The pointer iovec points to an array of iovec structures, defined in
<sys/uio.h> as:

struct iovec {
    void  *iov_base;    /* Starting address */
    size_t iov_len;     /* Number of bytes to transfer */
};

The iovec structure describes address ranges beginning at iov_base address
and with the size of iov_len bytes.

The vlen argument specifies the number of elements in the iovec array.

The advice argument is one of the values listed below.

  Linux-specific advice values
The following Linux-specific advice values have no counterparts in the
POSIX-specified posix_madvise(3), and may or may not have counterparts
in the madvise(2) interface available on other implementations.

MADV_COLD (since Linux 5.4.1)
Deactivate a given range of pages, which will make them a more probable
reclaim target should there be memory pressure. This is a non-destructive
operation. The advice might be ignored for some pages in
the range when it is not applicable.

MADV_PAGEOUT (since Linux 5.4.1)
Reclaim a given range of pages. This is done to free up memory occupied
by these pages. If a page is anonymous it will be swapped out. If a
page is file-backed and dirty it will be written back to the backing
storage. The advice might be ignored for some pages in the range when
it is not applicable.

The flags argument is reserved for future use; currently, this argument
must be specified as 0.

The value specified in the vlen argument must be less than or equal to
IOV_MAX (defined in <limits.h> or accessible via the call
sysconf(_SC_IOV_MAX)).

The vlen and iovec arguments are checked before applying any hints. If
the vlen is too big, or iovec is invalid, an error will be returned
immediately.

The hint might be applied to a part of iovec if one of its elements points
to an invalid memory region in the remote process. No further elements will
be processed beyond that point.

Permission to provide a hint to another process is governed by a ptrace
access mode PTRACE_MODE_READ_REALCREDS check (see ptrace(2)); in addition,
the caller must have the CAP_SYS_ADMIN capability due to performance
implications of applying the hint.

RETURN VALUE
On success, process_madvise() returns the number of bytes advised. This
return value may be less than the total number of requested bytes, if an
error occurred after some iovec elements were already processed. The caller
should check the return value to determine whether a partial advice
occurred.

On error, -1 is returned and errno is set to indicate the error.

ERRORS
EFAULT The memory described by iovec is outside the accessible address
   space of the process referred to by pidfd.
EINVAL flags is not 0.
EINVAL The sum of the iov_len values of iovec overflows a ssize_t value.
EINVAL vlen is too large.
ENOMEM Could not allocate memory for internal copies of the iovec
   structures.
EPERM The caller does not have permission to access the address space of
  the process pidfd.
ESRCH The target process does not exist (i.e., it has terminated and been
  waited on).
EBADF pidfd is not a valid PID file descriptor.

VERSIONS
This system call first appeared in Linux 5.10. Support for this system
call is optional, depending on the setting of the CONFIG_ADVISE_SYSCALLS
configuration option.
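
A minimal usage sketch follows, under several assumptions that are not part of the text above: the C library may not provide a wrapper, so the call goes through syscall(2); the fallback syscall number 440 and the fallback MADV_COLD value 20 are the generic Linux values and may differ on some architectures; sys_process_madvise(), iov_total_len() and advise_cold() are illustrative helper names, not an existing API. The sketch mirrors the documented behavior: it rejects an iovec whose total length would overflow ssize_t, and compares the return value with the requested total to detect partial advice.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef SYS_process_madvise
#define SYS_process_madvise 440	/* assumption: generic syscall number */
#endif
#ifndef MADV_COLD
#define MADV_COLD 20		/* assumption: value from the Linux uapi headers */
#endif

/* Thin wrapper around the raw system call (hypothetical helper name). */
static ssize_t sys_process_madvise(int pidfd, const struct iovec *iov,
				   unsigned long vlen, int advice,
				   unsigned int flags)
{
	return syscall(SYS_process_madvise, pidfd, iov, vlen, advice, flags);
}

/*
 * Sum the iov_len fields. Returns -1 if the total would not fit in
 * ssize_t, which is the case reported as EINVAL in ERRORS.
 */
ssize_t iov_total_len(const struct iovec *iov, unsigned long vlen)
{
	size_t total = 0;
	unsigned long i;

	for (i = 0; i < vlen; i++) {
		if (iov[i].iov_len > (size_t)SSIZE_MAX - total)
			return -1;
		total += iov[i].iov_len;
	}
	return (ssize_t)total;
}

/*
 * Hint that the given ranges of the target process are cold.
 * Returns 1 if every byte was advised, 0 on partial advice, and -1 on
 * error with errno set.
 */
int advise_cold(int pidfd, const struct iovec *iov, unsigned long vlen)
{
	ssize_t want = iov_total_len(iov, vlen);
	ssize_t got;

	if (want < 0) {
		errno = EINVAL;
		return -1;
	}
	got = sys_process_madvise(pidfd, iov, vlen, MADV_COLD, 0);
	if (got < 0)
		return -1;
	return got == want;
}
```

A caller would obtain pidfd from pidfd_open(2), fill the iovec array with ranges that are valid in the target process, and treat a 0 return from advise_cold() as a partial advice that stopped at the first invalid element.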

SEE ALSO
madvise(2), 

Re: [PATCH 5/5] arm64: dts: meson: add initial device-tree for ODROID-HC4

2021-01-28 Thread Christian Hewitt


> On 29 Jan 2021, at 10:51 am, Christian Hewitt  
> wrote:
> 
> ODROID-HC4 is a derivative of the C4 with minor differences:
> 
> - 128MB SPI-NOR flash

^ should be 16MB, I forgot to amend. I can send a v2 series if needed.

HC4:~ # dmesg | grep spi
[0.453235] spi-nor spi0.0: xt25f128b (16384 Kbytes)

Christian



Re: [PATCH] memory: tegra: Remove calls to dev_pm_opp_set_clkname()

2021-01-28 Thread Krzysztof Kozlowski
On Wed, Jan 27, 2021 at 03:46:22PM +0530, Viresh Kumar wrote:
> There is no point calling dev_pm_opp_set_clkname() with the "name"
> parameter set to NULL, this is already done by the OPP core at setup
> time and should work as it is.
> 
> Signed-off-by: Viresh Kumar 
> 
> ---
> V2: Update tegra124 as well.
> 
> Krzysztof, please take this through your tree, it doesn't have any
> dependency in the OPP tree.
> ---
>  drivers/memory/tegra/tegra124-emc.c | 13 ++---
>  drivers/memory/tegra/tegra20-emc.c  | 13 ++---
>  drivers/memory/tegra/tegra30-emc.c  | 13 ++---
>  3 files changed, 6 insertions(+), 33 deletions(-)

Thanks, applied.

Best regards,
Krzysztof



test

2021-01-28 Thread dzp
test


Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page

2021-01-28 Thread Muchun Song
On Fri, Jan 29, 2021 at 9:04 AM Mike Kravetz  wrote:
>
> On 1/28/21 4:37 AM, Muchun Song wrote:
> > On Wed, Jan 27, 2021 at 6:36 PM David Hildenbrand  wrote:
> >>
> >> On 26.01.21 16:56, David Hildenbrand wrote:
> >>> On 26.01.21 16:34, Oscar Salvador wrote:
>  On Tue, Jan 26, 2021 at 04:10:53PM +0100, David Hildenbrand wrote:
> > The real issue seems to be discarding the vmemmap on any memory that has
> > movability constraints - CMA and ZONE_MOVABLE; otherwise, as discussed, 
> > we
> > can reuse parts of the thingy we're freeing for the vmemmap. Not that it
> > would be ideal: that once-a-huge-page thing will never ever be a huge 
> > page
> > again - but if it helps with OOM in corner cases, sure.
> 
>  Yes, that is one way, but I am not sure how hard would it be to 
>  implement.
>  Plus the fact that as you pointed out, once that memory is used for 
>  vmemmap
>  array, we cannot use it again.
>  Actually, we would fragment the memory eventually?
> 
> > Possible simplification: don't perform the optimization for now with 
> > free
> > huge pages residing on ZONE_MOVABLE or CMA. Certainly not perfect: what
> > happens when migrating a huge page from ZONE_NORMAL to 
> > (ZONE_MOVABLE|CMA)?
> 
>  But if we do not allow those pages to be in ZONE_MOVABLE or CMA, there 
>  is no
>  point in migrating them, right?
> >>>
> >>> Well, memory unplug "could" still work and migrate them and
> >>> alloc_contig_range() "could in the future" still want to migrate them
> >>> (virtio-mem, gigantic pages, powernv memtrace). Especially, the latter
> >>> two don't work with ZONE_MOVABLE/CMA. But, I mean, it would be fair
> >>> enough to say "there are no guarantees for
> >>> alloc_contig_range()/offline_pages() with ZONE_NORMAL, so we can break
> >>> these use cases when a magic switch is flipped and make these pages
> >>> non-migratable anymore".
> >>>
> >>> I assume compaction doesn't care about huge pages either way, not sure
> >>> about numa balancing etc.
> >>>
> >>>
> >>> However, note that there is a fundamental issue with any approach that
> >>> allocates a significant amount of unmovable memory for user-space
> >>> purposes (excluding CMA allocations for unmovable stuff, CMA is
> >>> special): pairing it with ZONE_MOVABLE becomes very tricky as your user
> >>> space might just end up eating all kernel memory, although the system
> >>> still looks like there is plenty of free memory residing in
> >>> ZONE_MOVABLE. I mentioned that in the context of secretmem in a reduced
> >>> form as well.
> >>>
> >>> We theoretically have that issue with dynamic allocation of gigantic
> >>> pages, but it's something a user explicitly/rarely triggers and it can
> >>> be documented to cause problems well enough. We'll have the same issue
> >>> with GUP+ZONE_MOVABLE that Pavel is fixing right now - but GUP is
> >>> already known to be broken in various ways and that it has to be treated
> >>> in a special way. I'd like to limit the nasty corner cases.
> >>>
> >>> Of course, we could have smart rules like "don't online memory to
> >>> ZONE_MOVABLE automatically when the magic switch is active". That's just
> >>> ugly, but could work.
> >>>
> >>
> >> Extending on that, I just discovered that only x86-64, ppc64, and arm64
> >> really support hugepage migration.
> >>
> >> Maybe one approach with the "magic switch" really would be to disable
> >> hugepage migration completely in hugepage_migration_supported(), and
> >> consequently making hugepage_movable_supported() always return false.
> >>
> >> Huge pages would never get placed onto ZONE_MOVABLE/CMA and cannot be
> >> migrated. The problem I describe would apply (careful with using
> >> ZONE_MOVABLE), but well, it can at least be documented.
> >
> > Thanks for your explanation.
> >
> > All thinking seems to be introduced by encountering OOM. :-(
>
> Yes.  Or, I think about it as the problem of not being able to dissolve (free
> to buddy) a hugetlb page.  We can not dissolve because we can not allocate
> vmemmap for all subpages.
>
> > In order to move forward and free the hugepage. We should add some
> > restrictions below.
> >
> > 1. Only free the hugepage which is allocated from the ZONE_NORMAL.
> Corrected: Only vmemmap optimize hugepages in ZONE_NORMAL
>
> > 2. Disable hugepage migration when this feature is enabled.
>
> I am not sure if we want to fully disable migration.  I may be 
> misunderstanding
> but the thought was to prevent migration between some movability types.  It
> seems we should be able to migrate from ZONE_NORMAL to ZONE_NORMAL.
>
> Also, if we do allow huge pages without vmemmap optimization in MOVABLE or CMA
> then we should allow those to be migrated to NORMAL?  Or is there a reason why
> we should prevent that.
>
> > 3. Using GFP_ATOMIC to allocate vmemmap pages firstly (it can reduce
> >memory fragmentation), if it fails, we use part of the hugepage to

[PATCH 3/5] arm64: dts: meson: convert meson-sm1-odroid-c4 to dtsi

2021-01-28 Thread Christian Hewitt
Convert the ODROID-C4 dts to meson-sm1-odroid.dtsi and C4 board dts in
preparation for adding additional C4 family boards.

Signed-off-by: Christian Hewitt 
---
 .../boot/dts/amlogic/meson-sm1-odroid-c4.dts  | 427 +
 .../boot/dts/amlogic/meson-sm1-odroid.dtsi| 441 ++
 2 files changed, 442 insertions(+), 426 deletions(-)
 create mode 100644 arch/arm64/boot/dts/amlogic/meson-sm1-odroid.dtsi

diff --git a/arch/arm64/boot/dts/amlogic/meson-sm1-odroid-c4.dts 
b/arch/arm64/boot/dts/amlogic/meson-sm1-odroid-c4.dts
index eadd75e6e067..b2a4e823c1d8 100644
--- a/arch/arm64/boot/dts/amlogic/meson-sm1-odroid-c4.dts
+++ b/arch/arm64/boot/dts/amlogic/meson-sm1-odroid-c4.dts
@@ -5,34 +5,12 @@
 
 /dts-v1/;
 
-#include "meson-sm1.dtsi"
-#include 
-#include 
-#include 
+#include "meson-sm1-odroid.dtsi"
 
 / {
compatible = "hardkernel,odroid-c4", "amlogic,sm1";
model = "Hardkernel ODROID-C4";
 
-   aliases {
-   serial0 = _AO;
-   ethernet0 = 
-   };
-
-   chosen {
-   stdout-path = "serial0:115200n8";
-   };
-
-   memory@0 {
-   device_type = "memory";
-   reg = <0x0 0x0 0x0 0x4000>;
-   };
-
-   emmc_pwrseq: emmc-pwrseq {
-   compatible = "mmc-pwrseq-emmc";
-   reset-gpios = < BOOT_12 GPIO_ACTIVE_LOW>;
-   };
-
leds {
compatible = "gpio-leds";
 
@@ -45,96 +23,6 @@
};
};
 
-   tflash_vdd: regulator-tflash_vdd {
-   compatible = "regulator-fixed";
-
-   regulator-name = "TFLASH_VDD";
-   regulator-min-microvolt = <330>;
-   regulator-max-microvolt = <330>;
-
-   gpio = <_ao GPIOAO_3 GPIO_OPEN_DRAIN>;
-   enable-active-high;
-   regulator-always-on;
-   };
-
-   tf_io: gpio-regulator-tf_io {
-   compatible = "regulator-gpio";
-
-   regulator-name = "TF_IO";
-   regulator-min-microvolt = <180>;
-   regulator-max-microvolt = <330>;
-
-   gpios = <_ao GPIOAO_6 GPIO_ACTIVE_HIGH>;
-   gpios-states = <0>;
-
-   states = <330 0>,
-<180 1>;
-   };
-
-   flash_1v8: regulator-flash_1v8 {
-   compatible = "regulator-fixed";
-   regulator-name = "FLASH_1V8";
-   regulator-min-microvolt = <180>;
-   regulator-max-microvolt = <180>;
-   vin-supply = <_3v3>;
-   regulator-always-on;
-   };
-
-   main_12v: regulator-main_12v {
-   compatible = "regulator-fixed";
-   regulator-name = "12V";
-   regulator-min-microvolt = <1200>;
-   regulator-max-microvolt = <1200>;
-   regulator-always-on;
-   };
-
-   vcc_5v: regulator-vcc_5v {
-   compatible = "regulator-fixed";
-   regulator-name = "5V";
-   regulator-min-microvolt = <500>;
-   regulator-max-microvolt = <500>;
-   regulator-always-on;
-   vin-supply = <_12v>;
-   };
-
-   vcc_1v8: regulator-vcc_1v8 {
-   compatible = "regulator-fixed";
-   regulator-name = "VCC_1V8";
-   regulator-min-microvolt = <180>;
-   regulator-max-microvolt = <180>;
-   vin-supply = <_3v3>;
-   regulator-always-on;
-   };
-
-   vcc_3v3: regulator-vcc_3v3 {
-   compatible = "regulator-fixed";
-   regulator-name = "VCC_3V3";
-   regulator-min-microvolt = <330>;
-   regulator-max-microvolt = <330>;
-   vin-supply = <_3v3>;
-   regulator-always-on;
-   /* FIXME: actually controlled by VDDCPU_B_EN */
-   };
-
-   vddcpu: regulator-vddcpu {
-   /*
-* MP8756GD Regulator.
-*/
-   compatible = "pwm-regulator";
-
-   regulator-name = "VDDCPU";
-   regulator-min-microvolt = <721000>;
-   regulator-max-microvolt = <1022000>;
-
-   vin-supply = <_12v>;
-
-   pwms = <_AO_cd 1 1250 0>;
-   pwm-dutycycle-range = <100 0>;
-
-   regulator-boot-on;
-   regulator-always-on;
-   };
-
hub_5v: regulator-hub_5v {
compatible = "regulator-fixed";
regulator-name = "HUB_5V";
@@ -147,215 +35,12 @@
enable-active-high;
};
 
-   usb_pwr_en: regulator-usb_pwr_en {
-   compatible = "regulator-fixed";
-   regulator-name = "USB_PWR_EN";
-   regulator-min-microvolt = <500>;
-   regulator-max-microvolt = <500>;
-   vin-supply = <_5v>;
-
-   /* Connected to the microUSB port 

[PATCH 2/5] arm64: dts: meson: sort Amlogic dtb Makefile

2021-01-28 Thread Christian Hewitt
Sort the Makefile before adding new SM1 devices.

Signed-off-by: Christian Hewitt 
---
 arch/arm64/boot/dts/amlogic/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/amlogic/Makefile 
b/arch/arm64/boot/dts/amlogic/Makefile
index dce41cd3f347..f3c8a85fe987 100644
--- a/arch/arm64/boot/dts/amlogic/Makefile
+++ b/arch/arm64/boot/dts/amlogic/Makefile
@@ -45,7 +45,7 @@ dtb-$(CONFIG_ARCH_MESON) += meson-gxm-rbox-pro.dtb
 dtb-$(CONFIG_ARCH_MESON) += meson-gxm-s912-libretech-pc.dtb
 dtb-$(CONFIG_ARCH_MESON) += meson-gxm-vega-s96.dtb
 dtb-$(CONFIG_ARCH_MESON) += meson-gxm-wetek-core2.dtb
-dtb-$(CONFIG_ARCH_MESON) += meson-sm1-sei610.dtb
 dtb-$(CONFIG_ARCH_MESON) += meson-sm1-khadas-vim3l.dtb
 dtb-$(CONFIG_ARCH_MESON) += meson-sm1-odroid-c4.dtb
+dtb-$(CONFIG_ARCH_MESON) += meson-sm1-sei610.dtb
 dtb-$(CONFIG_ARCH_MESON) += meson-a1-ad401.dtb
-- 
2.17.1



[PATCH 5/5] arm64: dts: meson: add initial device-tree for ODROID-HC4

2021-01-28 Thread Christian Hewitt
ODROID-HC4 is a derivative of the C4 with minor differences:

- 128MB SPI-NOR flash
- 2x SATA ports via ASM1061 PCIe to SATA controller
- 7-pin header with SPI and I2C for 1-inch OLED display and RTC
- 1x USB 2.0 host port

Signed-off-by: Christian Hewitt 
---
 arch/arm64/boot/dts/amlogic/Makefile  |  1 +
 .../boot/dts/amlogic/meson-sm1-odroid-hc4.dts | 96 +++
 2 files changed, 97 insertions(+)
 create mode 100644 arch/arm64/boot/dts/amlogic/meson-sm1-odroid-hc4.dts

diff --git a/arch/arm64/boot/dts/amlogic/Makefile 
b/arch/arm64/boot/dts/amlogic/Makefile
index f3c8a85fe987..78a569d7fa20 100644
--- a/arch/arm64/boot/dts/amlogic/Makefile
+++ b/arch/arm64/boot/dts/amlogic/Makefile
@@ -47,5 +47,6 @@ dtb-$(CONFIG_ARCH_MESON) += meson-gxm-vega-s96.dtb
 dtb-$(CONFIG_ARCH_MESON) += meson-gxm-wetek-core2.dtb
 dtb-$(CONFIG_ARCH_MESON) += meson-sm1-khadas-vim3l.dtb
 dtb-$(CONFIG_ARCH_MESON) += meson-sm1-odroid-c4.dtb
+dtb-$(CONFIG_ARCH_MESON) += meson-sm1-odroid-hc4.dtb
 dtb-$(CONFIG_ARCH_MESON) += meson-sm1-sei610.dtb
 dtb-$(CONFIG_ARCH_MESON) += meson-a1-ad401.dtb
diff --git a/arch/arm64/boot/dts/amlogic/meson-sm1-odroid-hc4.dts 
b/arch/arm64/boot/dts/amlogic/meson-sm1-odroid-hc4.dts
new file mode 100644
index ..bf15700c4b15
--- /dev/null
+++ b/arch/arm64/boot/dts/amlogic/meson-sm1-odroid-hc4.dts
@@ -0,0 +1,96 @@
+// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+/*
+ * Copyright (c) 2020 Dongjin Kim 
+ */
+
+/dts-v1/;
+
+#include "meson-sm1-odroid.dtsi"
+
+/ {
+   compatible = "hardkernel,odroid-hc4", "amlogic,sm1";
+   model = "Hardkernel ODROID-HC4";
+
+   aliases {
+   rtc0 = 
+   rtc1 = 
+   };
+
+   fan0: pwm-fan {
+   compatible = "pwm-fan";
+   #cooling-cells = <2>;
+   cooling-min-state = <0>;
+   cooling-max-state = <3>;
+   cooling-levels = <0 120 170 220>;
+   pwms = <_cd 1 4 0>;
+   };
+
+   leds {
+   compatible = "gpio-leds";
+
+   led-blue {
+   color = ;
+   function = LED_FUNCTION_STATUS;
+   gpios = <_ao GPIOAO_11 GPIO_ACTIVE_HIGH>;
+   linux,default-trigger = "heartbeat";
+   panic-indicator;
+   };
+
+   led-red {
+   color = ;
+   function = LED_FUNCTION_POWER;
+   gpios = <_ao GPIOAO_7 GPIO_ACTIVE_HIGH>;
+   default-state = "on";
+   };
+   };
+
+   sound {
+   model = "ODROID-HC4";
+   };
+};
+
+_thermal {
+   cooling-maps {
+   map {
+   trip = <_passive>;
+   cooling-device = < THERMAL_NO_LIMIT 
THERMAL_NO_LIMIT>;
+   };
+   };
+};
+
+ {
+   linux,rc-map-name = "rc-odroid";
+};
+
+ {
+   status = "okay";
+   pinctrl-0 = <_sda_x_pins>, <_sck_x_pins>;
+   pinctrl-names = "default";
+
+   rtc: rtc@51 {
+   status = "okay";
+   compatible = "nxp,pcf8563";
+   reg = <0x51>;
+   wakeup-source;
+   };
+};
+
+ {
+   status = "okay";
+   reset-gpios = < GPIOH_4 GPIO_ACTIVE_LOW>;
+};
+
+_cd {
+   status = "okay";
+   pinctrl-names = "default";
+   pinctrl-0 = <_d_x6_pins>;
+};
+
+_emmc_c {
+   status = "disabled";
+};
+
+ {
+   phys = <_phy0>, <_phy1>;
+   phy-names = "usb2-phy0", "usb2-phy1";
+};
-- 
2.17.1



[PATCH 1/5] dt-bindings: arm: amlogic: sort SM1 bindings

2021-01-28 Thread Christian Hewitt
Sort the bindings before adding new SM1 devices.

Signed-off-by: Christian Hewitt 
---
 Documentation/devicetree/bindings/arm/amlogic.yaml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/arm/amlogic.yaml 
b/Documentation/devicetree/bindings/arm/amlogic.yaml
index 6bef60ddda64..b21ba8ba23dd 100644
--- a/Documentation/devicetree/bindings/arm/amlogic.yaml
+++ b/Documentation/devicetree/bindings/arm/amlogic.yaml
@@ -164,9 +164,9 @@ properties:
   - description: Boards with the Amlogic Meson SM1 S905X3/D3/Y3 SoC
 items:
   - enum:
-  - seirobotics,sei610
-  - khadas,vim3l
   - hardkernel,odroid-c4
+  - khadas,vim3l
+  - seirobotics,sei610
   - const: amlogic,sm1
 
   - description: Boards with the Amlogic Meson A1 A113L SoC
-- 
2.17.1



[PATCH 0/5] arm64: dts: meson: add support for ODROID-HC4

2021-01-28 Thread Christian Hewitt
This series fixes minor sort-order issues in the Amlogic bindings yaml and
dtb Makefile, then converts the existing ODROID-C4 dts into dtsi so we can
support its new sister product the ODROID-HC4.

I've also given the devices different audio card names. This is partly
cosmetic, but also because HC4 is HDMI-only while C4 can be used with
other i2c audio devices via an expansion connector so users may want to
use different alsa configs.

Patches to support the spifc chip are still being upstreamed [0] so this
will be addressed in a follow up. A WIP patch for the dts change can be
found in my amlogic-5.11.y dev branch [1].

For reference, here's dmesg from LibreELEC on 5.11-rc5 [2].

[0] 
https://patchwork.ozlabs.org/project/linux-mtd/patch/20201220224314.2659-1-andr...@rammhold.de/
[1] https://github.com/chewitt/linux/commits/amlogic-5.11.y
[2] http://ix.io/2NCi

Christian Hewitt (5):
  dt-bindings: arm: amlogic: sort SM1 bindings
  arm64: dts: meson: sort Amlogic dtb Makefile
  arm64: dts: meson: convert meson-sm1-odroid-c4 to dtsi
  dt-bindings: arm: amlogic: add ODROID-HC4 bindings
  arm64: dts: meson: add initial device-tree for ODROID-HC4

 .../devicetree/bindings/arm/amlogic.yaml  |   5 +-
 arch/arm64/boot/dts/amlogic/Makefile  |   3 +-
 .../boot/dts/amlogic/meson-sm1-odroid-c4.dts  | 427 +
 .../boot/dts/amlogic/meson-sm1-odroid-hc4.dts |  96 
 .../boot/dts/amlogic/meson-sm1-odroid.dtsi| 441 ++
 5 files changed, 543 insertions(+), 429 deletions(-)
 create mode 100644 arch/arm64/boot/dts/amlogic/meson-sm1-odroid-hc4.dts
 create mode 100644 arch/arm64/boot/dts/amlogic/meson-sm1-odroid.dtsi

-- 
2.17.1



[PATCH 4/5] dt-bindings: arm: amlogic: add ODROID-HC4 bindings

2021-01-28 Thread Christian Hewitt
Add the board bindings for the ODROID-HC4 device.

Signed-off-by: Christian Hewitt 
---
 Documentation/devicetree/bindings/arm/amlogic.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/arm/amlogic.yaml 
b/Documentation/devicetree/bindings/arm/amlogic.yaml
index b21ba8ba23dd..5f6769bf45bd 100644
--- a/Documentation/devicetree/bindings/arm/amlogic.yaml
+++ b/Documentation/devicetree/bindings/arm/amlogic.yaml
@@ -165,6 +165,7 @@ properties:
 items:
   - enum:
   - hardkernel,odroid-c4
+  - hardkernel,odroid-hc4
   - khadas,vim3l
   - seirobotics,sei610
   - const: amlogic,sm1
-- 
2.17.1



Re: [net-next PATCH v4 01/15] Documentation: ACPI: DSD: Document MDIO PHY

2021-01-28 Thread Calvin Johnson
On Thu, Jan 28, 2021 at 02:27:00PM +0100, Rafael J. Wysocki wrote:
> On Thu, Jan 28, 2021 at 2:12 PM Calvin Johnson
>  wrote:
> >
> > On Thu, Jan 28, 2021 at 01:00:40PM +0100, Rafael J. Wysocki wrote:
> > > On Thu, Jan 28, 2021 at 12:27 PM Calvin Johnson
> > >  wrote:
> > > >
> > > > Hi Rafael,
> > > >
> > > > Thanks for the review. I'll work on all the comments.
> > > >
> > > > On Fri, Jan 22, 2021 at 08:22:21PM +0100, Rafael J. Wysocki wrote:
> > > > > On Fri, Jan 22, 2021 at 4:43 PM Calvin Johnson
> > > > >  wrote:
> > > > > >
> > > > > > Introduce ACPI mechanism to get PHYs registered on a MDIO bus and
> > > > > > provide them to be connected to MAC.
> > > > > >
> > > > > > Describe properties "phy-handle" and "phy-mode".
> > > > > >
> > > > > > Signed-off-by: Calvin Johnson 
> > > > > > ---
> > > > > >
> > > > > > Changes in v4:
> > > > > > - More cleanup
> > > > >
> > > > > This looks much better that the previous versions IMV, some nits 
> > > > > below.
> > > > >
> > > > > > Changes in v3: None
> > > > > > Changes in v2:
> > > > > > - Updated with more description in document
> > > > > >
> > > > > >  Documentation/firmware-guide/acpi/dsd/phy.rst | 129 
> > > > > > ++
> > > > > >  1 file changed, 129 insertions(+)
> > > > > >  create mode 100644 Documentation/firmware-guide/acpi/dsd/phy.rst
> > > > > >
> > > > > > diff --git a/Documentation/firmware-guide/acpi/dsd/phy.rst 
> > > > > > b/Documentation/firmware-guide/acpi/dsd/phy.rst
> > > > > > new file mode 100644
> > > > > > > index 000000000000..76fca994bc99
> > > > > > --- /dev/null
> > > > > > +++ b/Documentation/firmware-guide/acpi/dsd/phy.rst
> > > > > > @@ -0,0 +1,129 @@
> > > > > > +.. SPDX-License-Identifier: GPL-2.0
> > > > > > +
> > > > > > +=
> > > > > > +MDIO bus and PHYs in ACPI
> > > > > > +=
> > > > > > +
> > > > > > +The PHYs on an MDIO bus [1] are probed and registered using
> > > > > > +fwnode_mdiobus_register_phy().
> > > > >
> > > > > Empty line here, please.
> > > > >
> > > > > > +Later, for connecting these PHYs to MAC, the PHYs registered on the
> > > > > > +MDIO bus have to be referenced.
> > > > > > +
> > > > > > +The UUID given below should be used as mentioned in the "Device 
> > > > > > Properties
> > > > > > +UUID For _DSD" [2] document.
> > > > > > +   - UUID: daffd814-6eba-4d8c-8a91-bc9bbf4aa301
> > > > >
> > > > > I would drop the above paragraph.
> > > > >
> > > > > > +
> > > > > > +This document introduces two _DSD properties that are to be used
> > > > > > +for PHYs on the MDIO bus.[3]
> > > > >
> > > > > I'd say "for connecting PHYs on the MDIO bus [3] to the MAC layer."
> > > > > above and add the following here:
> > > > >
> > > > > "These properties are defined in accordance with the "Device
> > > > > Properties UUID For _DSD" [2] document and the
> > > > > daffd814-6eba-4d8c-8a91-bc9bbf4aa301 UUID must be used in the Device
> > > > > Data Descriptors containing them."
> > > > >
> > > > > > +
> > > > > > +phy-handle
> > > > > > +--
> > > > > > +For each MAC node, a device property "phy-handle" is used to 
> > > > > > reference
> > > > > > +the PHY that is registered on an MDIO bus. This is mandatory for
> > > > > > +network interfaces that have PHYs connected to MAC via MDIO bus.
> > > > > > +
> > > > > > +During the MDIO bus driver initialization, PHYs on this bus are 
> > > > > > probed
> > > > > > +using the _ADR object as shown below and are registered on the 
> > > > > > MDIO bus.
> > > > >
> > > > > Do you want to mention the "reg" property here?  I think it would be
> > > > > useful to do that.
> > > >
> > > > No. I think we should adhere to _ADR in MDIO case. The "reg" property 
> > > > for ACPI
> > > > may be useful for other use cases that Andy is aware of.
> > >
> > > The code should reflect this, then.  I mean it sounds like you want to
> > > check the "reg" property only if this is a non-ACPI node.
> >
> > Right. For MDIO case, that is what is required.
> > "reg" for DT and "_ADR" for ACPI.
> >
> > However, Andy pointed out [1] that ACPI nodes can also hold a reg property
> > and therefore fwnode_get_id() needs to be capable of handling that
> > situation as well.
> 
> No, please don't confuse those two things.
> 
> Yes, ACPI nodes can also hold a "reg" property, but the meaning of it
> depends on the binding which is exactly my point: _ADR is not a
> fallback replacement for "reg" in general and it is not so for MDIO
> too.  The new function as proposed doesn't match the MDIO requirements
> and so it should not be used for MDIO.
> 
> For MDIO, the exact flow mentioned above needs to be implemented (and
> if someone wants to use it for their use case too, fine).
> 
> Otherwise the code wouldn't match the documentation.

In that case, is this good?

/**
 * fwnode_get_local_addr - Get the local address of fwnode.
 * @fwnode: firmware node
 * @addr: addr value contained in the fwnode
 *
 * For DT, retrieve the value of the "reg" 

Re: [PATCH v12 6/8] drm/mediatek: enable dither function

2021-01-28 Thread Hsin-Yi Wang
On Fri, Jan 29, 2021 at 2:30 PM Yongqiang Niu
 wrote:
>
> On Fri, 2021-01-29 at 14:24 +0800, Hsin-Yi Wang wrote:
> > On Fri, Jan 29, 2021 at 9:33 AM CK Hu  wrote:
> > >
> > > Hi, Hsin-Yi:
> > >
> > > On Thu, 2021-01-28 at 19:23 +0800, Hsin-Yi Wang wrote:
> > > > From: Yongqiang Niu 
> > > >
> > > > For a 5 or 6 bpc panel, we need to enable the dither function
> > > > to improve the display quality.
> > > >
> > > > Signed-off-by: Yongqiang Niu 
> > > > Signed-off-by: Hsin-Yi Wang 
> > > > ---
> > > >  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 15 +--
> > > >  1 file changed, 13 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c 
> > > > b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > > > index ac2cb25620357..6c8f246380a74 100644
> > > > --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > > > +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > > > @@ -53,6 +53,7 @@
> > > >  #define DITHER_EN                BIT(0)
> > > >  #define DISP_DITHER_CFG          0x0020
> > > >  #define DITHER_RELAY_MODE        BIT(0)
> > > > +#define DITHER_ENGINE_EN         BIT(1)
> > > >  #define DISP_DITHER_SIZE         0x0030
> > > >
> > > >  #define LUT_10BIT_MASK   0x03ff
> > > > @@ -314,9 +315,19 @@ static void mtk_dither_config(struct device *dev, 
> > > > unsigned int w,
> > > > unsigned int bpc, struct cmdq_pkt *cmdq_pkt)
> > > >  {
> > > >   struct mtk_ddp_comp_dev *priv = dev_get_drvdata(dev);
> > > > + bool enable = (bpc == 5 || bpc == 6);
> > >
> > > I strongly believe that dither function in dither is identical to the
> > > one in gamma and od, and in mtk_dither_set_common(), 'bpc >=
> > > MTK_MIN_BPC' is valid, so I believe we need not to limit bpc to 5 or 6.
> > > But we should consider the case that bpc is invalid in
> > > mtk_dither_set_common(). Invalid case in gamma and od use different way
> > > to process. For gamma, dither is default relay mode, so invalid bpc
> > > would do nothing in mtk_dither_set_common() and result in relay mode.
> > > For od, it is set to relay mode first, then invalid bpc would do nothing in
> > > mtk_dither_set_common() and result in relay mode. I would like dither,
> > > gamma and od to process invalid bpc in the same way. One solution is to
> > > set relay mode in mtk_dither_set_common() for invalid bpc.
> > >
> > > Regards,
> > > CK
> > >
> >
> > I modified mtk_dither_config() as follows:
> >
> >
> > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > index ac2cb25620357..5b7fcedb9f9a8 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > @@ -53,6 +53,7 @@
> >  #define DITHER_EN  BIT(0)
> >  #define DISP_DITHER_CFG    0x0020
> >  #define DITHER_RELAY_MODE  BIT(0)
> > +#define DITHER_ENGINE_EN   BIT(1)
> >  #define DISP_DITHER_SIZE   0x0030
> >
> >  #define LUT_10BIT_MASK 0x03ff
> > @@ -166,6 +167,8 @@ void mtk_dither_set_common(void __iomem *regs,
> > struct cmdq_client_reg *cmdq_reg,
> >   DITHER_ADD_LSHIFT_G(MTK_MAX_BPC - bpc),
> >   cmdq_reg, regs, DISP_DITHER_16);
> > mtk_ddp_write(cmdq_pkt, dither_en, cmdq_reg, regs, cfg);
> > +   } else {
> > +   mtk_ddp_write(cmdq_pkt, DITHER_RELAY_MODE, cmdq_reg, regs, 
> > cfg);
> > }
> >  }
> >
> > @@ -315,8 +318,12 @@ static void mtk_dither_config(struct device *dev,
> > unsigned int w,
> >  {
> > struct mtk_ddp_comp_dev *priv = dev_get_drvdata(dev);
> >
> > -   mtk_ddp_write(cmdq_pkt, h << 16 | w, &priv->cmdq_reg,
> > priv->regs, DISP_DITHER_SIZE);
> > -   mtk_ddp_write(cmdq_pkt, DITHER_RELAY_MODE, &priv->cmdq_reg,
> > priv->regs, DISP_DITHER_CFG);
> > +   mtk_ddp_write(cmdq_pkt, h << 16 | w, &priv->cmdq_reg, priv->regs,
> > + DISP_DITHER_SIZE);
> > +   mtk_ddp_write(cmdq_pkt, DITHER_RELAY_MODE, &priv->cmdq_reg,
> > priv->regs,
> > + DISP_DITHER_CFG);
> > +   mtk_dither_set_common(priv->regs, &priv->cmdq_reg, bpc,
> > DISP_DITHER_CFG,
> > +  DITHER_ENGINE_EN, cmdq_pkt);
> >  }
> >
> > So now, not only bpc==5 or 6, but all valid bpc, dither config will
> > call mtk_dither_set_common() with the flag DITHER_ENGINE_EN(BIT(1)).
> > od config will call mtk_dither_set_common() with the flag
> > DISP_DITHERING(BIT(2)).
> > Additionally for 8173, gamma config will call mtk_dither_set_common()
> > with the flag DISP_DITHERING (BIT(2))
> >
> > For invalid mode all of them will be DITHER_RELAY_MODE.
> >
> > Just to make sure that this follows the spec? thanks
> >
>
> for mt8173 gamma, there is no relay mode, only dither enable or not(bit

Re: [PATCH] arm64: dts: imx8mq: use_dt_domains for pci node

2021-01-28 Thread Shawn Guo
On Fri, Jan 15, 2021 at 11:26:57AM +0800, Peng Fan (OSS) wrote:
> From: Peng Fan 
> 
> We are using the Jailhouse Hypervisor, which has a virtual pci node that
> uses dt domains. So also use dt domains for the pci node; this avoids a
> conflict with the Jailhouse Hypervisor that triggers the following error:
>   pr_err("Inconsistent \"linux,pci-domain\" property in DT\n");
> 
> Reviewed-by: Richard Zhu 
> Signed-off-by: Peng Fan 

Applied, thanks.


[PATCH] staging: qlge/qlge_ethtool.c: strlcpy -> strscpy

2021-01-28 Thread Kumar Kartikeya Dwivedi
Fixes checkpatch warnings for usage of strlcpy.

Signed-off-by: Kumar Kartikeya Dwivedi 
---
 drivers/staging/qlge/qlge_ethtool.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/qlge/qlge_ethtool.c 
b/drivers/staging/qlge/qlge_ethtool.c
index a28f0254c..635d3338f 100644
--- a/drivers/staging/qlge/qlge_ethtool.c
+++ b/drivers/staging/qlge/qlge_ethtool.c
@@ -417,15 +417,15 @@ static void ql_get_drvinfo(struct net_device *ndev,
 {
struct ql_adapter *qdev = netdev_priv(ndev);
 
-   strlcpy(drvinfo->driver, qlge_driver_name, sizeof(drvinfo->driver));
-   strlcpy(drvinfo->version, qlge_driver_version,
+   strscpy(drvinfo->driver, qlge_driver_name, sizeof(drvinfo->driver));
+   strscpy(drvinfo->version, qlge_driver_version,
sizeof(drvinfo->version));
snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
 "v%d.%d.%d",
 (qdev->fw_rev_id & 0x00ff0000) >> 16,
 (qdev->fw_rev_id & 0x0000ff00) >> 8,
 (qdev->fw_rev_id & 0x000000ff));
-   strlcpy(drvinfo->bus_info, pci_name(qdev->pdev),
+   strscpy(drvinfo->bus_info, pci_name(qdev->pdev),
sizeof(drvinfo->bus_info));
 }
 
-- 
2.29.2



Re: [PATCH 4/6] lib: inline _find_next_bit() wrappers

2021-01-28 Thread Yury Norov
On Thu, Jan 21, 2021 at 2:27 AM Andy Shevchenko
 wrote:
>
> On Wed, Jan 20, 2021 at 04:06:28PM -0800, Yury Norov wrote:
> > lib/find_bit.c declares five single-line wrappers for _find_next_bit().
> > We may turn those wrappers to inline functions. It eliminates
> > unneeded function calls and opens room for compile-time optimizations.
>
> ...
>
> > --- a/include/asm-generic/bitops/le.h
> > +++ b/include/asm-generic/bitops/le.h
> > @@ -4,6 +4,7 @@
> >
> >  #include 
> >  #include 
> > +#include 
>
> I'm wondering if generic header inclusion should go before arch-dependent 
> ones.
>
> ...
>
> > -#ifndef find_next_bit
>
> > -#ifndef find_next_zero_bit
>
> > -#if !defined(find_next_and_bit)
>
> > -#ifndef find_next_zero_bit_le
>
> > -#ifndef find_next_bit_le
>
> Shouldn't you leave these in new wrappers as well?
>
> --
> With Best Regards,
> Andy Shevchenko

Could you please elaborate? Wrappers in find.h are protected, functions
in lib/find_bit.c too. Maybe I misunderstood you?..
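
To make the question about the #ifndef guards concrete, here is a rough userspace sketch of the pattern under discussion: an out-of-line worker plus single-line inline wrappers, each guarded so that an architecture could supply its own version. sketch_find_next() is a hypothetical, deliberately naive stand-in for _find_next_bit() (the kernel helper works a word at a time), and BITS_PER_LONG is redefined locally.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical worker: scan bit by bit, XOR-ing each word with 'invert'
 * so that one loop serves both the "next set bit" and the "next zero bit"
 * searches. Returns nbits when nothing is found.
 */
static unsigned long sketch_find_next(const unsigned long *addr,
				      unsigned long nbits,
				      unsigned long start,
				      unsigned long invert)
{
	unsigned long i;

	for (i = start; i < nbits; i++) {
		unsigned long word = addr[i / BITS_PER_LONG] ^ invert;

		if (word & (1UL << (i % BITS_PER_LONG)))
			return i;
	}
	return nbits;
}

/* An arch header could "#define find_next_bit find_next_bit" to override. */
#ifndef find_next_bit
static inline unsigned long find_next_bit(const unsigned long *addr,
					  unsigned long size,
					  unsigned long offset)
{
	return sketch_find_next(addr, size, offset, 0UL);
}
#endif

#ifndef find_next_zero_bit
static inline unsigned long find_next_zero_bit(const unsigned long *addr,
					       unsigned long size,
					       unsigned long offset)
{
	return sketch_find_next(addr, size, offset, ~0UL);
}
#endif
```

With the guard kept around each inline wrapper, an override only has to define the macro of the same name before this header is included.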


Re: [RFC PATCH v3 00/13] virtio/vsock: introduce SOCK_SEQPACKET support

2021-01-28 Thread Arseny Krasnov


On 28.01.2021 20:19, Stefano Garzarella wrote:
> Hi Arseny,
> I reviewed a part, tomorrow I hope to finish the other patches.
>
> Just a couple of comments in the TODOs below.
>
> On Mon, Jan 25, 2021 at 02:09:00PM +0300, Arseny Krasnov wrote:
>>  This patchset implements support of SOCK_SEQPACKET for virtio
>> transport.
>>  As SOCK_SEQPACKET guarantees to save record boundaries, so to
>> do it, new packet operation was added: it marks start of record (with
>> record length in header), such packet doesn't carry any data.  To send
>> record, packet with start marker is sent first, then all data is sent
>> as usual 'RW' packets. On receiver's side, length of record is known
>> from packet with start record marker. Now as packets of one socket
>> are not reordered neither on vsock nor on vhost transport layers, such
>> marker allows to restore original record on receiver's side. If user's
>> buffer is smaller than record length, then all out of size data is
>> dropped.
>>  Maximum length of datagram is not limited as in stream socket,
>> because same credit logic is used. Difference with stream socket is
>> that user is not woken up until whole record is received or error
>> occurred. Implementation also supports 'MSG_EOR' and 'MSG_TRUNC' flags.
>>  Tests also implemented.
>>
>> Arseny Krasnov (13):
>>  af_vsock: prepare for SOCK_SEQPACKET support
>>  af_vsock: prepare 'vsock_connectible_recvmsg()'
>>  af_vsock: implement SEQPACKET rx loop
>>  af_vsock: implement send logic for SOCK_SEQPACKET
>>  af_vsock: rest of SEQPACKET support
>>  af_vsock: update comments for stream sockets
>>  virtio/vsock: dequeue callback for SOCK_SEQPACKET
>>  virtio/vsock: fetch length for SEQPACKET record
>>  virtio/vsock: add SEQPACKET receive logic
>>  virtio/vsock: rest of SOCK_SEQPACKET support
>>  virtio/vsock: setup SEQPACKET ops for transport
>>  vhost/vsock: setup SEQPACKET ops for transport
>>  vsock_test: add SOCK_SEQPACKET tests
>>
>> drivers/vhost/vsock.c   |   7 +-
>> include/linux/virtio_vsock.h|  12 +
>> include/net/af_vsock.h  |   6 +
>> include/uapi/linux/virtio_vsock.h   |   9 +
>> net/vmw_vsock/af_vsock.c| 543 --
>> net/vmw_vsock/virtio_transport.c|   4 +
>> net/vmw_vsock/virtio_transport_common.c | 295 ++--
>> tools/testing/vsock/util.c  |  32 +-
>> tools/testing/vsock/util.h  |   3 +
>> tools/testing/vsock/vsock_test.c| 126 +
>> 10 files changed, 862 insertions(+), 175 deletions(-)
>>
>> TODO:
>> - Support for record integrity control. As transport could drop some
>>   packets, something like "record-id" and record end marker need to
>>   be implemented. Idea is that SEQ_BEGIN packet carries both record
>>   length and record id, end marker(let it be SEQ_END) carries only
>>   record id. To be sure that no one packet was lost, receiver checks
>>   length of data between SEQ_BEGIN and SEQ_END(it must be same with
>>   value in SEQ_BEGIN) and record ids of SEQ_BEGIN and SEQ_END(this
>>   means that both markers were not dropped). I think that easiest way
>>   to implement record id for SEQ_BEGIN is to reuse another field of
>>   packet header(SEQ_BEGIN already uses 'flags' as record length).For
>>   SEQ_END record id could be stored in 'flags'.
> I don't really like the idea of reusing the 'flags' field for this 
> purpose.
>
>> Another way to implement it, is to move metadata of both SEQ_END
>>   and SEQ_BEGIN to payload. But this approach has problem, because
>>   if we move something to payload, such payload is accounted by
>>   credit logic, which fragments payload, while payload with record
>>   length and id couldn't be fragmented. One way to overcome it is to
>>   ignore credit update for SEQ_BEGIN/SEQ_END packet.Another solution
>>   is to update 'stream_has_space()' function: current implementation
>>   return non-zero when at least 1 byte is allowed to use,but updated
>>   version will have extra argument, which is needed length. For 'RW'
>>   packet this argument is 1, for SEQ_BEGIN it is sizeof(record len +
>>   record id) and for SEQ_END it is sizeof(record id).
> Is the payload accounted by credit logic also if hdr.op is not 
> VIRTIO_VSOCK_OP_RW?

Yes, on send any packet with payload could be fragmented if
there is not enough space at the receiver. On receive, 'fwd_cnt' and
'buf_alloc' are updated with the header of every packet. Of course,
for every such case I've described I can add a check for an 'RW'
packet, to exclude the payload from credit accounting, but this is
a bunch of dumb checks.

>
> I think that we can define a specific header to put after the 
> virtio_vsock_hdr when hdr.op is SEQ_BEGIN or SEQ_END, and in this header 
> we can store the id and the length of the message.

I think that is better than using the payload and touching the credit logic.

>
>> - What to do, when server doesn't support SOCK_SEQPACKET. In current
>>   implementation RST is replied 
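
As a rough illustration of the framing described in the cover letter quoted above (a start marker carrying the record length, followed by ordinary 'RW' data packets), here is a small receiver-side sketch. Everything in it is hypothetical: struct sketch_pkt, the SKETCH_OP_* values and sketch_recv_record() are simplified stand-ins, not the virtio vsock packet layout or the code from the series. Returning the full record length when the user buffer is too small is an assumption modelled on MSG_TRUNC-like behaviour.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet kinds: a start-of-record marker and a data packet. */
enum sketch_op { SKETCH_OP_SEQ_BEGIN, SKETCH_OP_RW };

struct sketch_pkt {
	enum sketch_op op;
	uint32_t len;		/* SEQ_BEGIN: record length; RW: payload length */
	const uint8_t *data;	/* NULL for SEQ_BEGIN, which carries no data */
};

/*
 * Reassemble one record from an in-order packet sequence.
 * Copies at most buf_len bytes into buf and drops the rest.
 * Returns the full record length, or -1 if the sequence is malformed
 * or ends before the announced length has been received.
 */
static long sketch_recv_record(const struct sketch_pkt *pkts, size_t npkts,
			       uint8_t *buf, size_t buf_len)
{
	uint32_t record_len, seen = 0;
	size_t i, copied = 0;

	if (npkts == 0 || pkts[0].op != SKETCH_OP_SEQ_BEGIN)
		return -1;
	record_len = pkts[0].len;

	for (i = 1; i < npkts && seen < record_len; i++) {
		uint32_t n = pkts[i].len;
		size_t room = buf_len - copied;

		/* Data must be RW packets and must not overrun the record. */
		if (pkts[i].op != SKETCH_OP_RW || n > record_len - seen)
			return -1;

		if (room > n)
			room = n;
		if (room != 0)
			memcpy(buf + copied, pkts[i].data, room);
		copied += room;
		seen += n;
	}

	return seen == record_len ? (long)record_len : -1;
}
```

The sketch relies on packets of one socket arriving in order, as the cover letter states; it has no way to detect a dropped packet other than the length mismatch, which is the gap the "record id" TODO item is about.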

[PATCH v2] x86/vmemmap: Handle unpopulated sub-pmd ranges

2021-01-28 Thread Oscar Salvador
When the size of a struct page is not a power of 2, sections do
not span a PMD anymore and so when populating them some parts of the
PMD will remain unused.
Because of this, PMDs will be left behind when depopulating sections
since remove_pmd_table() thinks that those unused parts are still in
use.

Fix this by marking the unused parts with PAGE_UNUSED, so memchr_inv() will
do the right thing and will let us free the PMD when the last user of it
is gone.

This patch is based on a similar patch by David Hildenbrand:

https://lore.kernel.org/linux-mm/20200722094558.9828-9-da...@redhat.com/
https://lore.kernel.org/linux-mm/20200722094558.9828-10-da...@redhat.com/

Signed-off-by: Oscar Salvador 
---

 v1 -> v2:
 - Rename PAGE_INUSE to PAGE_UNUSED as it better describes what we do

---
 arch/x86/mm/init_64.c | 91 +--
 1 file changed, 79 insertions(+), 12 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index b5a3fa4033d3..dbb76160ed52 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -871,7 +871,72 @@ int arch_add_memory(int nid, u64 start, u64 size,
return add_pages(nid, start_pfn, nr_pages, params);
 }
 
-#define PAGE_INUSE 0xFD
+#define PAGE_UNUSED 0xFD
+
+/*
+ * The unused vmemmap range, which was not yet memset(PAGE_UNUSED) ranges
+ * from unused_pmd_start to next PMD_SIZE boundary.
+ */
+static unsigned long unused_pmd_start __meminitdata;
+
+static void __meminit vmemmap_flush_unused_pmd(void)
+{
+   if (!unused_pmd_start)
+   return;
+   /*
+* Clears (unused_pmd_start, PMD_END]
+*/
+   memset((void *)unused_pmd_start, PAGE_UNUSED,
+  ALIGN(unused_pmd_start, PMD_SIZE) - unused_pmd_start);
+   unused_pmd_start = 0;
+}
+
+/* Returns true if the PMD is completely unused and thus it can be freed */
+static bool __meminit vmemmap_unuse_sub_pmd(unsigned long addr, unsigned long end)
+{
+   unsigned long start = ALIGN_DOWN(addr, PMD_SIZE);
+
+   vmemmap_flush_unused_pmd();
+   memset((void *)addr, PAGE_UNUSED, end - addr);
+
+   return !memchr_inv((void *)start, PAGE_UNUSED, PMD_SIZE);
+}
+
+static void __meminit vmemmap_use_sub_pmd(unsigned long start, unsigned long end)
+{
+   /*
+* We only optimize if the new used range directly follows the
+* previously unused range (esp., when populating consecutive sections).
+*/
+   if (unused_pmd_start == start) {
+   if (likely(IS_ALIGNED(end, PMD_SIZE)))
+   unused_pmd_start = 0;
+   else
+   unused_pmd_start = end;
+   return;
+   }
+
+   vmemmap_flush_unused_pmd();
+}
+
+static void __meminit vmemmap_use_new_sub_pmd(unsigned long start, unsigned long end)
+{
+   vmemmap_flush_unused_pmd();
+
+   /*
+* Mark the unused parts of the new memmap range
+*/
+   if (!IS_ALIGNED(start, PMD_SIZE))
+   memset((void *)ALIGN_DOWN(start, PMD_SIZE), PAGE_UNUSED,
+  start - ALIGN_DOWN(start, PMD_SIZE));
+   /*
+* We want to avoid memset(PAGE_UNUSED) when populating the vmemmap of
+* consecutive sections. Remember for the last added PMD the last
+* unused range in the populated PMD.
+*/
+   if (!IS_ALIGNED(end, PMD_SIZE))
+   unused_pmd_start = end;
+}
 
 static void __meminit free_pagetable(struct page *page, int order)
 {
@@ -1008,10 +1073,10 @@ remove_pte_table(pte_t *pte_start, unsigned long addr, 
unsigned long end,
 * with 0xFD, and remove the page when it is wholly
 * filled with 0xFD.
 */
-   memset((void *)addr, PAGE_INUSE, next - addr);
+   memset((void *)addr, PAGE_UNUSED, next - addr);
 
page_addr = page_address(pte_page(*pte));
-   if (!memchr_inv(page_addr, PAGE_INUSE, PAGE_SIZE)) {
+   if (!memchr_inv(page_addr, PAGE_UNUSED, PAGE_SIZE)) {
free_pagetable(pte_page(*pte), 0);
 
spin_lock(&init_mm.page_table_lock);
@@ -1034,7 +1099,6 @@ remove_pmd_table(pmd_t *pmd_start, unsigned long addr, 
unsigned long end,
unsigned long next, pages = 0;
pte_t *pte_base;
pmd_t *pmd;
-   void *page_addr;
 
pmd = pmd_start + pmd_index(addr);
for (; addr < end; addr = next, pmd++) {
@@ -1055,12 +1119,10 @@ remove_pmd_table(pmd_t *pmd_start, unsigned long addr, 
unsigned long end,
spin_unlock(&init_mm.page_table_lock);
pages++;
} else {
-   /* If here, we are freeing vmemmap pages. */
-   memset((void *)addr, PAGE_INUSE, next - addr);
-
-   page_addr = page_address(pmd_page(*pmd));
- 
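
The idea in the commit message above (fill unused ranges with a marker byte, then free the backing page once the whole range consists of that marker) can be tried out in userspace. The sketch below is only an illustration under simplifying assumptions: SKETCH_PMD_SIZE is a tiny stand-in for the real 2MB PMD_SIZE, sketch_memchr_inv() is a naive re-implementation of the kernel's memchr_inv(), and sketch_unuse_range() is a hypothetical helper, not a function from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PMD_SIZE    4096u	/* stand-in; PMD_SIZE is 2MB on x86-64 */
#define SKETCH_PAGE_UNUSED 0xFD		/* same marker value as in the patch */

/* Like memchr_inv(): return the first byte that differs from c, or NULL. */
static const void *sketch_memchr_inv(const void *p, int c, size_t n)
{
	const unsigned char *b = p;
	size_t i;

	for (i = 0; i < n; i++)
		if (b[i] != (unsigned char)c)
			return &b[i];
	return NULL;
}

/*
 * Mark [off, off + len) of one PMD-sized block as unused.
 * Returns true when the whole block now consists of the marker,
 * i.e. when the block could be freed.
 */
static bool sketch_unuse_range(unsigned char *pmd, size_t off, size_t len)
{
	memset(pmd + off, SKETCH_PAGE_UNUSED, len);
	return sketch_memchr_inv(pmd, SKETCH_PAGE_UNUSED,
				 SKETCH_PMD_SIZE) == NULL;
}
```

If the tail of the block that was never populated is not pre-filled with the marker at populate time, sketch_unuse_range() can never return true for that block, which is the leak the patch describes.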

Re: [PATCH] parser: Fix kernel-doc markups

2021-01-28 Thread Randy Dunlap
On 1/28/21 9:00 PM, bingjingc wrote:
> From: BingJing Chang 
> 
> Fix existing issues at the kernel-doc markups
> 
> Signed-off-by: BingJing Chang 
> ---
>  lib/parser.c | 22 +++---
>  1 file changed, 11 insertions(+), 11 deletions(-)
> 

Reviewed-by: Randy Dunlap 

thanks for the kernel-doc patch.

-- 
~Randy
netiquette: https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH V6 5/6] of: unittest: Create overlay_common.dtsi and testcases_common.dtsi

2021-01-28 Thread Frank Rowand
Hi Viresh,

Second attempt, I think the first reply did not properly send.

On 1/26/21 11:56 PM, Viresh Kumar wrote:
> On 22-01-21, 16:20, Viresh Kumar wrote:
>> In order to build-test the same unit-test files using fdtoverlay tool,
>> move the device nodes from the existing overlay_base.dts and
>> testcases_common.dts files to .dtsi files. The .dts files now include
>> the new .dtsi files, resulting in exactly the same behavior as earlier.
>>
>> The .dtsi files can now be reused for compile time tests using
>> fdtoverlay (will be done in a later patch).
>>
>> This is required because the base files passed to fdtoverlay tool
>> shouldn't be overlays themselves (i.e. shouldn't have the /plugin/;
>> tag).
>>
>> Signed-off-by: Viresh Kumar 
>> ---
>>  drivers/of/unittest-data/overlay_base.dts | 90 +-
>>  drivers/of/unittest-data/overlay_common.dtsi  | 91 +++
>>  drivers/of/unittest-data/testcases.dts| 17 +---
>>  .../of/unittest-data/testcases_common.dtsi| 18 
>>  4 files changed, 111 insertions(+), 105 deletions(-)
>>  create mode 100644 drivers/of/unittest-data/overlay_common.dtsi
>>  create mode 100644 drivers/of/unittest-data/testcases_common.dtsi
> 
> Frank,
> 
> As I mentioned in the cover-letter, I get a build warning right now:
> 
> drivers/of/unittest-data/tests-interrupts.dtsi:20.5-28: Warning 
> (interrupts_property): /testcase-data/testcase-device2:#interrupt-cells: size 
> is (4), expected multiple of 8

Thanks for catching that.

> 
> I think I need to add below diff to this patch to fix this warning, will that
> be okay ?

In my first reply, I said "nope", or something to that effect.  Upon
reflection, it looks like the below diff will fix the problem.  This is
based on source code inspection and building with the diff applied.

I did not successfully boot my target (I have some issues to resolve after
updating the OS on my development host), so I have not verified that
unittest is not impacted.

-Frank

> 
> diff --git a/drivers/of/unittest-data/testcases.dts 
> b/drivers/of/unittest-data/testcases.dts
> index 185125085784..04b9e7bb30d9 100644
> --- a/drivers/of/unittest-data/testcases.dts
> +++ b/drivers/of/unittest-data/testcases.dts
> @@ -3,3 +3,14 @@
>  /plugin/;
>  
>  #include "testcases_common.dtsi"
> +
> +/ {
> +   testcase-data {
> +   testcase-device2 {
> +   compatible = "testcase-device";
> +   interrupt-parent = <&test_intc2>;
> +   interrupts = <1>; /* invalid specifier - too short */
> +   };
> +   };
> +
> +};
> diff --git a/drivers/of/unittest-data/tests-interrupts.dtsi 
> b/drivers/of/unittest-data/tests-interrupts.dtsi
> index ec175e800725..0e5914611107 100644
> --- a/drivers/of/unittest-data/tests-interrupts.dtsi
> +++ b/drivers/of/unittest-data/tests-interrupts.dtsi
> @@ -61,12 +61,5 @@ testcase-device1 {
> interrupt-parent = <&test_intc0>;
> interrupts = <1>;
> };
> -
> -   testcase-device2 {
> -   compatible = "testcase-device";
> -   interrupt-parent = <&test_intc2>;
> -   interrupts = <1>; /* invalid specifier - too short */
> -   };
> };
> -
>  };
> 



[PATCH] gpu/drm: Parse all ext. blocks in EDID

2021-01-28 Thread Ching-shih.Li
From: Louis Li 

[Why] EDID parser cannot correctly parse EDID which includes
multiple extension blocks with the same tag (e.g. two ext. blocks with
Tag = 0x02 are included in the EDID defined in test case HF1-66, HDMI 2.0 CTS),
since it parses only the first target ext. block. This causes the CTS to fail.

[How]
Original parser searches ext. block from HEAD of EDID,
and always return the first target ext. block.
Solution is to find all ext. blocks and pass start address
of each ext. block to parser to handle.

By this change, no matter how ext. block is placed in EDID, all
target ext. blocks are handled.

Tested-by: mika.hsu 
Signed-off-by: Louis Li 
---
 drivers/gpu/drm/drm_edid.c | 52 --
 1 file changed, 50 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 14c6a4bb32ea..adcb04516b41 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -4160,9 +4160,8 @@ static void drm_parse_y420cmdb_bitmap(struct 
drm_connector *connector,
 }
 
 static int
-add_cea_modes(struct drm_connector *connector, struct edid *edid)
+handle_cea_modes(struct drm_connector *connector, const u8 *cea)
 {
-   const u8 *cea = drm_find_cea_extension(edid);
const u8 *db, *hdmi = NULL, *video = NULL;
u8 dbl, hdmi_len, video_len = 0;
int modes = 0;
@@ -4206,6 +4205,55 @@ add_cea_modes(struct drm_connector *connector, struct 
edid *edid)
return modes;
 }
 
+static int
+add_cea_modes(struct drm_connector *connector, struct edid *edid)
+{
+   const u8 *cea = NULL;
+   u8 *edid_ext = NULL;
+   int modes = 0;
+   int block_index = 0;
+
+   /*
+* Based on HDMI(2.x) CTS: HF1-66 (Iter 06), two blocks with same
+* tag (Tag = 0x02: CTA 861) are included in EDID. Ori. solution
+* checks for all additional blocks, BUT it always checks from
+* HEAD. Result is only 1st CTA 861 can be found and checked.
+* Therefore, any following CTA 861 block is never found
+* to handle. The modified method is to check each additional
+* block by pointing to the start address of that block, instead
+* of finding from HEAD of EDID.
+*
+* TODO: EDID parser may need re-designed, since ori. parser can't
+* correctly parse multiple same ext. blocks (Tag = 0x02 in this
+* case), since it finds and parse the 1st target ext. block only.
+* 1. Ori. method is not flexible to work with EDID like HF1-66.
+* 2. Ori. method is not efficient: a block may be checked many times.
+* 3. Ori. method does not support new features, e.g. Ext. BLK MAP.
+* etc...
+*/
+   for (block_index = 0; block_index < edid->extensions; block_index++) {
+   edid_ext = (((u8 *)edid) + (EDID_LENGTH * (block_index + 1)));
+
+   if (edid_ext[0] == CEA_EXT) {
+   cea = ((const u8 *)edid_ext);
+   modes += handle_cea_modes(connector, cea);
+   }
+   }
+
+   /*
+* If no Video Data extension block, go check DisplayID block,
+* because CEA block may be embedded in DisplayID block.
+*/
+   if (!cea) {
+   cea = drm_find_cea_extension(edid);
+
+   if (cea)
+   modes += handle_cea_modes(connector, cea);
+   }
+
+   return modes;
+}
+
 static void fixup_detailed_cea_mode_clock(struct drm_display_mode *mode)
 {
const struct drm_display_mode *cea_mode;
-- 
2.25.1
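
The block walk proposed in the patch above can be sketched outside the kernel as follows. This is a simplified illustration, not the DRM code: the SKETCH_* constants mirror EDID_LENGTH (128) and CEA_EXT (0x02), the extension count is read from byte 126 of the base block, and sketch_for_each_cea_block() with its callback type are hypothetical names. Unlike the patch, the sketch also bounds-checks against the buffer size.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_EDID_LENGTH 128	/* size of the base block and of each extension */
#define SKETCH_CEA_EXT     0x02	/* extension tag of a CTA-861 block */

/* Hypothetical per-block handler; returns e.g. the number of modes added. */
typedef int (*sketch_block_fn)(const uint8_t *block);

/* Trivial handler used for demonstration: counts one per block visited. */
static int sketch_count_one(const uint8_t *block)
{
	(void)block;
	return 1;
}

/*
 * Visit every extension block whose tag is CTA-861, instead of stopping
 * at the first match, and sum up what the handler returns.
 */
static int sketch_for_each_cea_block(const uint8_t *edid, size_t edid_size,
				     sketch_block_fn fn)
{
	size_t ext_count, i;
	int total = 0;

	if (edid_size < SKETCH_EDID_LENGTH)
		return 0;

	ext_count = edid[126];	/* extension block count in the base block */

	for (i = 0; i < ext_count; i++) {
		size_t off = SKETCH_EDID_LENGTH * (i + 1);

		if (off + SKETCH_EDID_LENGTH > edid_size)
			break;	/* announced block lies outside the buffer */

		if (edid[off] == SKETCH_CEA_EXT)
			total += fn(edid + off);
	}

	return total;
}
```

For an EDID with three extensions tagged 0x02, 0x70 and 0x02, the handler runs for the first and the third block, which is the case the HF1-66 test exercises.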



Re: kprobes broken since 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()")

2021-01-28 Thread Nikolay Borisov



On 29.01.21 at 3:34, Alexei Starovoitov wrote:
> On Thu, Jan 28, 2021 at 07:24:14PM +0100, Peter Zijlstra wrote:
>> On Thu, Jan 28, 2021 at 06:45:56PM +0200, Nikolay Borisov wrote:
>>> it would be placed on the __fentry__ (and not endbr64) hence it works.
>>> So perhaps a workaround outside of bpf could essentially detect this
>>> scenario and adjust the probe to be on the __fentry__ and not preceding
>>> instruction if it's detected to be endbr64 ?
>>
>> Arguably the fentry handler should also set the nmi context, it can,
>> after all, interrupt pretty much any other context by construction.
> 
> But that doesn't make it a real nmi.
> nmi can actually interrupt anything. Whereas kprobe via int3 cannot
> do nokprobe and noinstr sections. The exposure is a lot different.
> ftrace is even more contained. It's only at the start of the functions.
> It's even smaller subset of places than kprobes.
> So ftrace < kprobe < nmi.
> Grouping them all into nmi doesn't make sense to me.
> That bpf breaking change came unnoticed mostly because people use
> kprobes in the beginning of the functions only, but there are cases
> where kprobes are in the middle too. People just didn't upgrade kernels
> fast enough to notice.

nit: slight correction - I observed this while I was putting kprobes at the
beginning of the function but __fentry__ wasn't the first thing in the
function's code. The reason why people haven't observed it is that
everyone is running with retpolines enabled, which disables CFI/CET.

> imo appropriate solution would be to have some distinction between
> ftrace < kprobe_via_int3 < nmi, so that bpf side can react accordingly.
> That code in trace_call_bpf:
>   if (in_nmi()) /* not supported yet */
> was necessary in the past. Now we have all sorts of protections built in.
> So it's probably ok to just drop it, but I think adding
> called_via_ftrace vs called_via_kprobe_int3 vs called_via_nmi
> is more appropriate solution long term.
> 


Re: [f2fs-dev] [PATCH] f2fs: fix to avoid inconsistent quota data

2021-01-28 Thread Chao Yu

On 2021/1/28 17:02, Chao Yu wrote:

From: Yi Chen 

Occasionally, fsck may detect corrupted quota data:

Info: checkpoint state = 45 :  crc compacted_summary unmount
[QUOTA WARNING] Usage inconsistent for ID 0:actual (1543036928, 762) != 
expected (1543032832, 762)
[ASSERT] (fsck_chk_quota_files:1986)  --> Quota file is missing or invalid 
quota file content found.
[QUOTA WARNING] Usage inconsistent for ID 0:actual (1352478720, 344) != 
expected (1352474624, 344)
[ASSERT] (fsck_chk_quota_files:1986)  --> Quota file is missing or invalid 
quota file content found.

[FSCK] Unreachable nat entries[Ok..] [0x0]
[FSCK] SIT valid block bitmap checking[Ok..]
[FSCK] Hard link checking for regular file[Ok..] [0x0]
[FSCK] valid_block_count matching with CP [Ok..] [0xdf299]
[FSCK] valid_node_count matcing with CP (de lookup)   [Ok..] [0x2b01]
[FSCK] valid_node_count matcing with CP (nat lookup)  [Ok..] [0x2b01]
[FSCK] valid_inode_count matched with CP  [Ok..] [0x2665]
[FSCK] free segment_count matched with CP [Ok..] [0xcb04]
[FSCK] next block offset is free  [Ok..]
[FSCK] fixing SIT types
[FSCK] other corrupted bugs   [Fail]

The root cause is:
If we open file w/ readonly flag, disk quota info won't be initialized
for this file, however, following mmap() will force to convert inline
inode via f2fs_convert_inline_inode(), which may increase block usage
for this inode w/o updating quota data, it causes inconsistent disk quota
info.

The issue will happen in following stack:
open(file, O_RDONLY)
mmap(file)
- f2fs_convert_inline_inode
  - f2fs_convert_inline_page
   - f2fs_reserve_block
- f2fs_reserve_new_block
 - f2fs_reserve_new_blocks
  - f2fs_i_blocks_write
   - dquot_claim_block
inode->i_blocks increases, but dqb_curspace keeps its old value because the
dquots are NULL.

To fix this issue, let's call dquot_initialize() anyway in both
f2fs_truncate() and f2fs_convert_inline_inode() functions to avoid potential
inconsistent quota data issue.

Fixes: 0abd675e97e6 ("f2fs: support plain user/group quota")
Signed-off-by: Daiyue Zhang 
Signed-off-by: Dehe Gu 
Signed-off-by: Junchao Jiang 
Signed-off-by: Ge Qiu 
Signed-off-by: Yi Chen 


[Chao Yu: clean up commit message a bit]
Reviewed-by: Chao Yu 

Thanks,


Re: [PATCH v12 6/8] drm/mediatek: enable dither function

2021-01-28 Thread Yongqiang Niu
On Fri, 2021-01-29 at 14:24 +0800, Hsin-Yi Wang wrote:
> On Fri, Jan 29, 2021 at 9:33 AM CK Hu  wrote:
> >
> > Hi, Hsin-Yi:
> >
> > On Thu, 2021-01-28 at 19:23 +0800, Hsin-Yi Wang wrote:
> > > From: Yongqiang Niu 
> > >
> > > For a 5 or 6 bpc panel, we need to enable the dither function
> > > to improve the display quality.
> > >
> > > Signed-off-by: Yongqiang Niu 
> > > Signed-off-by: Hsin-Yi Wang 
> > > ---
> > >  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 15 +--
> > >  1 file changed, 13 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c 
> > > b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > > index ac2cb25620357..6c8f246380a74 100644
> > > --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > > +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > > @@ -53,6 +53,7 @@
> > > >  #define DITHER_EN                BIT(0)
> > > >  #define DISP_DITHER_CFG          0x0020
> > > >  #define DITHER_RELAY_MODE        BIT(0)
> > > +#define DITHER_ENGINE_EN BIT(1)
> > >  #define DISP_DITHER_SIZE 0x0030
> > >
> > >  #define LUT_10BIT_MASK   0x03ff
> > > @@ -314,9 +315,19 @@ static void mtk_dither_config(struct device *dev, 
> > > unsigned int w,
> > > unsigned int bpc, struct cmdq_pkt *cmdq_pkt)
> > >  {
> > >   struct mtk_ddp_comp_dev *priv = dev_get_drvdata(dev);
> > > + bool enable = (bpc == 5 || bpc == 6);
> >
> > I strongly believe that dither function in dither is identical to the
> > one in gamma and od, and in mtk_dither_set_common(), 'bpc >=
> > MTK_MIN_BPC' is valid, so I believe we need not to limit bpc to 5 or 6.
> > But we should consider the case that bpc is invalid in
> > mtk_dither_set_common(). Invalid case in gamma and od use different way
> > to process. For gamma, dither is default relay mode, so invalid bpc
> > would do nothing in mtk_dither_set_common() and result in relay mode.
> > > For od, it set to relay mode first, then invalid bpc would do nothing in
> > mtk_dither_set_common() and result in relay mode. I would like dither,
> > gamma and od to process invalid bpc in the same way. One solution is to
> > set relay mode in mtk_dither_set_common() for invalid bpc.
> >
> > Regards,
> > CK
> >
> 
> I modified mtk_dither_config() as follows:
> 
> 
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> index ac2cb25620357..5b7fcedb9f9a8 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> @@ -53,6 +53,7 @@
>  #define DITHER_EN  BIT(0)
>  #define DISP_DITHER_CFG    0x0020
>  #define DITHER_RELAY_MODE  BIT(0)
> +#define DITHER_ENGINE_EN   BIT(1)
>  #define DISP_DITHER_SIZE   0x0030
> 
>  #define LUT_10BIT_MASK 0x03ff
> @@ -166,6 +167,8 @@ void mtk_dither_set_common(void __iomem *regs,
> struct cmdq_client_reg *cmdq_reg,
>   DITHER_ADD_LSHIFT_G(MTK_MAX_BPC - bpc),
>   cmdq_reg, regs, DISP_DITHER_16);
> mtk_ddp_write(cmdq_pkt, dither_en, cmdq_reg, regs, cfg);
> +   } else {
> +   mtk_ddp_write(cmdq_pkt, DITHER_RELAY_MODE, cmdq_reg, regs, 
> cfg);
> }
>  }
> 
> @@ -315,8 +318,12 @@ static void mtk_dither_config(struct device *dev,
> unsigned int w,
>  {
> struct mtk_ddp_comp_dev *priv = dev_get_drvdata(dev);
> 
> -   mtk_ddp_write(cmdq_pkt, h << 16 | w, &priv->cmdq_reg,
> priv->regs, DISP_DITHER_SIZE);
> -   mtk_ddp_write(cmdq_pkt, DITHER_RELAY_MODE, &priv->cmdq_reg,
> priv->regs, DISP_DITHER_CFG);
> +   mtk_ddp_write(cmdq_pkt, h << 16 | w, &priv->cmdq_reg, priv->regs,
> + DISP_DITHER_SIZE);
> +   mtk_ddp_write(cmdq_pkt, DITHER_RELAY_MODE, &priv->cmdq_reg, priv->regs,
> + DISP_DITHER_CFG);
> +   mtk_dither_set_common(priv->regs, &priv->cmdq_reg, bpc, DISP_DITHER_CFG,
> +  DITHER_ENGINE_EN, cmdq_pkt);
>  }
> 
> So now, not only bpc==5 or 6, but all valid bpc, dither config will
> call mtk_dither_set_common() with the flag DITHER_ENGINE_EN(BIT(1)).
> od config will call mtk_dither_set_common() with the flag
> DISP_DITHERING(BIT(2)).
> Additionally for 8173, gamma config will call mtk_dither_set_common()
> with the flag DISP_DITHERING (BIT(2))
> 
> For invalid mode all of them will be DITHER_RELAY_MODE.
> 
> Just to make sure that this follows the spec? thanks
> 

For mt8173 gamma, there is no relay mode, only a dither enable bit (bit 2).
For mt8183 dither, bit 1 is dither enable and bit 0 is relay mode.
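
The bit assignments discussed in this thread can be summarised in a small
sketch. It is only an illustration, not driver code: dither_cfg() is a
hypothetical helper, the value of MTK_MIN_BPC is an assumption (the thread
names the macro but not its value), and the "no relay bit on mt8173 gamma"
case is modelled simply as writing 0 for an invalid bpc.

```python
DITHER_RELAY_MODE = 1 << 0   # BIT(0), mt8183 dither only
DITHER_ENGINE_EN = 1 << 1    # BIT(1), mt8183 dither enable
DISP_DITHERING = 1 << 2      # BIT(2), od / mt8173 gamma dither enable

# Assumed lower bound; the thread only refers to MTK_MIN_BPC by name.
MTK_MIN_BPC = 3


def dither_cfg(bpc: int, enable_flag: int, relay_flag: int = 0) -> int:
    """Return the CFG value: the enable flag for a valid bpc, else the
    block's relay flag (0 when the block has no relay mode)."""
    if bpc >= MTK_MIN_BPC:
        return enable_flag
    return relay_flag


# mt8183 dither block: valid bpc enables the engine, invalid bpc relays.
mt8183_valid = dither_cfg(6, DITHER_ENGINE_EN, DITHER_RELAY_MODE)
mt8183_invalid = dither_cfg(0, DITHER_ENGINE_EN, DITHER_RELAY_MODE)

# mt8173 gamma block: only the dither enable bit exists.
mt8173_valid = dither_cfg(8, DISP_DITHERING)
mt8173_invalid = dither_cfg(0, DISP_DITHERING)
```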


> > >
> > > > - mtk_ddp_write(cmdq_pkt, h << 16 | w, &priv->cmdq_reg, priv->regs, 
> > > > DISP_DITHER_SIZE);
> > > > - mtk_ddp_write(cmdq_pkt, DITHER_RELAY_MODE, &priv->cmdq_reg, 
> > > priv->regs, 

Re: [PATCH] kunit: make kunit_tool accept optional path to .kunitconfig fragment

2021-01-28 Thread David Gow
On Sat, Jan 23, 2021 at 8:17 AM Daniel Latypov  wrote:
>
> Currently running tests via KUnit tool means tweaking a .kunitconfig
> file, which you'd keep around locally and never commit.
> This change makes it so users can pass in a path to a kunitconfig.
>
> One of the imagined use cases is having kunitconfig fragments in-tree
> to formalize interesting sets of tests for features/subsystems, e.g.
>   $ ./tools/testing/kunit/kunit.py run fs/ext4/kunitconfig
>
> For now, this hypothetical fs/ext4/kunitconfig would contain
>   CONFIG_KUNIT=y
>   CONFIG_EXT4_FS=y
>   CONFIG_EXT4_KUNIT_TESTS=y
>
> At the moment, it's not hard to manually whip up this file, but as more
> and more tests get added, this will get tedious.
>
> It also opens the door to documenting how to run all the tests relevant
> to a specific subsystem or feature as a simple one-liner.
>
> This can be seen as an analogue to tools/testing/selftests/*/config
> But in the case of KUnit, the tests live in the same directory as the
> code-under-test, so it feels more natural to allow the kunitconfig
> fragments to live anywhere. (Though, people could create a separate
> directory if wanted; this patch imposes no restrictions on the path).
>
> Signed-off-by: Daniel Latypov 
> ---

Really glad this is finally happening. I tried it out, and it seemed
to work pretty well.

I was wondering whether a positional argument like this was best, or
whether it'd be better to have an explicitly named argument
(--kunitconfig=path). Thinking about it though, I'm quite happy with
having this as-is: the only real other contender for a coveted
positional argument spot would've been the name of a test or test
suite (e.g., kunit.py run ext4_inode_test), and that's not really
possible with the kunit_tool architecture as-is.

One other comment below (should this work for kunit.py config?),
otherwise it looks good.

-- David

>  tools/testing/kunit/kunit.py   |  9 ++---
>  tools/testing/kunit/kunit_kernel.py| 12 
>  tools/testing/kunit/kunit_tool_test.py | 25 +
>  3 files changed, 39 insertions(+), 7 deletions(-)
>
> diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py
> index e808a47c839b..3204a23bd16e 100755
> --- a/tools/testing/kunit/kunit.py
> +++ b/tools/testing/kunit/kunit.py
> @@ -188,6 +188,9 @@ def add_build_opts(parser) -> None:
> help='As in the make command, "Specifies  the 
> number of '
> 'jobs (commands) to run simultaneously."',
> type=int, default=8, metavar='jobs')
> +   parser.add_argument('kunitconfig',
> +help='Path to Kconfig fragment that enables 
> KUnit tests',
> +type=str, nargs='?', metavar='kunitconfig')
>

Should this maybe be in add_common_opts()? I'd assume that we want
kunit.py config to accept this custom kunitconfig path as well.
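
To make the suggestion concrete, here is a minimal standalone argparse
sketch of an optional positional argument shared by several subcommands.
It is not the kunit_tool source: build_parser() and the '--build_dir'
default are illustrative assumptions, and only the 'kunitconfig' argument
definition mirrors the patch.

```python
import argparse


def add_common_opts(parser: argparse.ArgumentParser) -> None:
    """Options shared by every subcommand in this sketch."""
    parser.add_argument('--build_dir', type=str, default='.kunit',
                        metavar='build_dir')
    # nargs='?' makes the positional optional; it is None when omitted.
    parser.add_argument('kunitconfig', type=str, nargs='?',
                        metavar='kunitconfig',
                        help='Path to Kconfig fragment that enables KUnit tests')


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose run/config/build subcommands share the options."""
    parser = argparse.ArgumentParser(prog='kunit.py')
    subparsers = parser.add_subparsers(dest='subcommand')
    for name in ('run', 'config', 'build'):
        add_common_opts(subparsers.add_parser(name))
    return parser


run_args = build_parser().parse_args(['run', 'fs/ext4/kunitconfig'])
config_args = build_parser().parse_args(['config', 'fs/ext4/kunitconfig'])
default_args = build_parser().parse_args(['config'])
```

With the argument registered in the shared helper, 'config' accepts the
same path as 'run', and omitting it leaves the attribute as None so the
caller can fall back to the default .kunitconfig.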

>  def add_exec_opts(parser) -> None:
> parser.add_argument('--timeout',
> @@ -256,7 +259,7 @@ def main(argv, linux=None):
> os.mkdir(cli_args.build_dir)
>
> if not linux:
> -   linux = 
> kunit_kernel.LinuxSourceTree(cli_args.build_dir)
> +   linux = 
> kunit_kernel.LinuxSourceTree(cli_args.build_dir, 
> kunitconfig_path=cli_args.kunitconfig)
>
> request = KunitRequest(cli_args.raw_output,
>cli_args.timeout,
> @@ -274,7 +277,7 @@ def main(argv, linux=None):
> os.mkdir(cli_args.build_dir)
>
> if not linux:
> -   linux = 
> kunit_kernel.LinuxSourceTree(cli_args.build_dir)
> +   linux = 
> kunit_kernel.LinuxSourceTree(cli_args.build_dir, 
> kunitconfig_path=cli_args.kunitconfig)
>
> request = KunitConfigRequest(cli_args.build_dir,
>  cli_args.make_options)
> @@ -286,7 +289,7 @@ def main(argv, linux=None):
> sys.exit(1)
> elif cli_args.subcommand == 'build':
> if not linux:
> -   linux = 
> kunit_kernel.LinuxSourceTree(cli_args.build_dir)
> +   linux = 
> kunit_kernel.LinuxSourceTree(cli_args.build_dir, 
> kunitconfig_path=cli_args.kunitconfig)
>
> request = KunitBuildRequest(cli_args.jobs,
> cli_args.build_dir,
> diff --git a/tools/testing/kunit/kunit_kernel.py 
> b/tools/testing/kunit/kunit_kernel.py
> index 2076a5a2d060..0b461663e7d9 100644
> --- a/tools/testing/kunit/kunit_kernel.py
> +++ b/tools/testing/kunit/kunit_kernel.py
> @@ -123,7 +123,7 @@ def get_outfile_path(build_dir) -> str:
>  class LinuxSourceTree(object):
> """Represents a Linux kernel source tree with KUnit tests."""
>
> -   def __init__(self, build_dir: str, load_config=True, 
> 

Re: [PATCH v3 2/3] scsi: ufs: Fix a race condition btw task management request send and compl

2021-01-28 Thread Can Guo

On 2021-01-29 14:06, Can Guo wrote:

On 2021-01-29 11:20, Bart Van Assche wrote:

On 1/27/21 8:16 PM, Can Guo wrote:
ufshcd_compl_tm() looks for all 0 bits in the REG_UTP_TASK_REQ_DOOR_BELL
and call complete() for each req who has the req->end_io_data set. There
can be a race condition btw tmc send/compl, because the req->end_io_data is
set, in __ufshcd_issue_tm_cmd(), without host lock protection, so it is
possible that when ufshcd_compl_tm() checks the req->end_io_data, it is set
but the corresponding tag has not been set in REG_UTP_TASK_REQ_DOOR_BELL.
Thus, ufshcd_tmc_handler() may wrongly complete TMRs which have not been
sent out. Fix it by protecting req->end_io_data with host lock, and let
ufshcd_compl_tm() only handle those tm cmds which have been completed
instead of looking for 0 bits in the REG_UTP_TASK_REQ_DOOR_BELL.


I don't know any other block driver that needs locking to protect races
between submission and completion context. Can the block layer timeout
mechanism be used instead of the mechanism introduced by this patch,
e.g. by using blk_execute_rq_nowait() to submit requests? That would
allow to reuse the existing mechanism in the block layer core to handle
races between request completion and timeout handling.


This patch is not introducing any new mechanism, it is fixing the
usage of completion (req->end_io_data = c) introduced by commit
69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate
and free TMFs"). If you have better idea to get it fixed once for
all, we are glad to take your change to get it fixed asap.

Regards,

Can Guo.



On second thought, actually the 1st fix alone is enough to eliminate the
race condition. Because blk_mq_tagset_busy_iter() only iterates over all
requests which are not in IDLE state, if blk_mq_start_request() is called
within the protection of host spin lock, ufshcd_compl_tm() shall not run
into the scenario where req->end_io_data is set but REG_UTP_TASK_REQ_DOOR_BELL
has not been set. What do you think?

Thanks,

Can Guo.



Thanks,

Bart.


[PATCH RESEND v2 09/10] xfs: Implement ->corrupted_range() for XFS

2021-01-28 Thread Shiyang Ruan
This function is used to handle errors which may cause data loss in the
filesystem, such as a memory failure in fsdax mode.

In XFS, it requires the "rmapbt" feature in order to query for the files or
metadata associated with the corrupted data.  Then we could call fs
recovery functions to try to repair the corrupted data (not implemented
in this patchset).

After that, the memory failure handler also needs to notify the processes
that are using those files.

Only the data device is supported.  The realtime device is not supported for now.

Signed-off-by: Shiyang Ruan 
---
 fs/xfs/xfs_fsops.c |   5 +++
 fs/xfs/xfs_mount.h |   1 +
 fs/xfs/xfs_super.c | 109 +
 3 files changed, 115 insertions(+)

diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 959ce91a3755..f03901a5c673 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -498,6 +498,11 @@ xfs_do_force_shutdown(
 "Corruption of in-memory data detected.  Shutting down filesystem");
if (XFS_ERRLEVEL_HIGH <= xfs_error_level)
xfs_stack_trace();
+   } else if (flags & SHUTDOWN_CORRUPT_META) {
+   xfs_alert_tag(mp, XFS_PTAG_SHUTDOWN_CORRUPT,
+"Corruption of on-disk metadata detected.  Shutting down filesystem");
+   if (XFS_ERRLEVEL_HIGH <= xfs_error_level)
+   xfs_stack_trace();
} else if (logerror) {
xfs_alert_tag(mp, XFS_PTAG_SHUTDOWN_LOGERROR,
"Log I/O Error Detected. Shutting down filesystem");
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index dfa429b77ee2..8f0df67ffcc1 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -274,6 +274,7 @@ void xfs_do_force_shutdown(struct xfs_mount *mp, int flags, 
char *fname,
 #define SHUTDOWN_LOG_IO_ERROR  0x0002  /* write attempt to the log failed */
 #define SHUTDOWN_FORCE_UMOUNT  0x0004  /* shutdown from a forced unmount */
 #define SHUTDOWN_CORRUPT_INCORE 0x0008  /* corrupt in-memory data structures */
+#define SHUTDOWN_CORRUPT_META  0x0010  /* corrupt metadata on device */
 
 /*
  * Flags for xfs_mountfs
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 813be879a5e5..93093fe0ee8a 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -35,6 +35,11 @@
 #include "xfs_refcount_item.h"
 #include "xfs_bmap_item.h"
 #include "xfs_reflink.h"
+#include "xfs_alloc.h"
+#include "xfs_rmap.h"
+#include "xfs_rmap_btree.h"
+#include "xfs_rtalloc.h"
+#include "xfs_bit.h"
 
 #include 
 #include 
@@ -1105,6 +1110,109 @@ xfs_fs_free_cached_objects(
return xfs_reclaim_inodes_nr(XFS_M(sb), sc->nr_to_scan);
 }
 
+static int
+xfs_corrupt_helper(
+   struct xfs_btree_cur    *cur,
+   struct xfs_rmap_irec    *rec,
+   void                    *data)
+{
+   struct xfs_inode        *ip;
+   struct address_space    *mapping;
+   int                     rc = 0;
+   int                     *flags = data;
+
+   if (XFS_RMAP_NON_INODE_OWNER(rec->rm_owner) ||
+   (rec->rm_flags & (XFS_RMAP_ATTR_FORK | XFS_RMAP_BMBT_BLOCK))) {
+   // TODO check and try to fix metadata
+   rc = -EFSCORRUPTED;
+   } else {
+   /*
+* Get files that incore, filter out others that are not in use.
+*/
+   rc = xfs_iget(cur->bc_mp, cur->bc_tp, rec->rm_owner,
+ XFS_IGET_INCORE, 0, &ip);
+   if (rc || !ip)
+   return rc;
+   if (!VFS_I(ip)->i_mapping)
+   goto out;
+
+   mapping = VFS_I(ip)->i_mapping;
+   if (IS_DAX(VFS_I(ip)))
+   rc = mf_dax_mapping_kill_procs(mapping, rec->rm_offset,
+  *flags);
+   else
+   mapping_set_error(mapping, -EIO);
+
+   // TODO try to fix data
+out:
+   xfs_irele(ip);
+   }
+
+   return rc;
+}
+
+static int
+xfs_fs_corrupted_range(
+   struct super_block      *sb,
+   struct block_device     *bdev,
+   loff_t                  offset,
+   size_t                  len,
+   void                    *data)
+{
+   struct xfs_mount        *mp = XFS_M(sb);
+   struct xfs_trans        *tp = NULL;
+   struct xfs_btree_cur    *cur = NULL;
+   struct xfs_rmap_irec    rmap_low, rmap_high;
+   struct xfs_buf          *agf_bp = NULL;
+   xfs_fsblock_t           fsbno = XFS_B_TO_FSB(mp, offset);
+   xfs_filblks_t           bcnt = XFS_B_TO_FSB(mp, len);
+   xfs_agnumber_t          agno = XFS_FSB_TO_AGNO(mp, fsbno);
+   xfs_agblock_t           agbno = XFS_FSB_TO_AGBNO(mp, fsbno);
+   int                     error = 0;
+
+   if (mp->m_rtdev_targp && mp->m_rtdev_targp->bt_bdev == bdev) {
+   xfs_warn(mp, "corrupted_range support not available for 
realtime 

[PATCH RESEND v2 07/10] dm: Introduce ->rmap() to find bdev offset

2021-01-28 Thread Shiyang Ruan
A pmem device could be a target of a mapped device.  In order to obtain the
superblock on the mapped device, we introduce ->rmap() to translate an offset
on the target device to an offset on the mapped device.

Currently, it is implemented only for the linear target, where the
translation is easy.  Other targets will be supported in the future.  However,
some targets may not support it because of their non-linear mapping.
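
As an illustration of why the linear case is easy, the forward and reverse
translations are plain offset arithmetic. The sketch below is a userspace
model and not the patch's linear_rmap(): the LinearTarget class and its
field names (begin, length, start) are assumptions chosen to resemble a
linear target, and all values are in sectors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearTarget:
    """Toy model of one linear target (all values in sectors)."""
    begin: int   # first sector of this target on the mapped device
    length: int  # number of sectors covered by this target
    start: int   # first sector used on the underlying device


def linear_map(t: LinearTarget, md_sector: int) -> int:
    """Forward mapping: mapped-device sector -> underlying-device sector."""
    if not t.begin <= md_sector < t.begin + t.length:
        raise ValueError("sector outside this target")
    return t.start + (md_sector - t.begin)


def linear_rmap(t: LinearTarget, dev_sector: int) -> int:
    """Reverse mapping: underlying-device sector -> mapped-device sector."""
    if not t.start <= dev_sector < t.start + t.length:
        raise ValueError("sector not covered by this target")
    return t.begin + (dev_sector - t.start)


target = LinearTarget(begin=2048, length=4096, start=100)
dev_sector = linear_map(target, 2058)        # sector on the pmem device
md_sector = linear_rmap(target, dev_sector)  # back on the mapped device
```

A striped or otherwise non-linear target has no such single subtraction,
which is why the commit message leaves those targets for later.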

Signed-off-by: Shiyang Ruan 
---
 drivers/md/dm-linear.c| 20 
 include/linux/device-mapper.h |  5 +
 2 files changed, 25 insertions(+)

diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c
index 00774b5d7668..90fdb4700afd 100644
--- a/drivers/md/dm-linear.c
+++ b/drivers/md/dm-linear.c
@@ -5,6 +5,7 @@
  */
 
 #include "dm.h"
+#include "dm-core.h"
 #include 
 #include 
 #include 
@@ -119,6 +120,24 @@ static void linear_status(struct dm_target *ti, 
status_type_t type,
}
 }
 
+static int linear_rmap(struct dm_target *ti, sector_t offset,
+  rmap_callout_fn fn, void *data)
+{
+   struct linear_c *lc = (struct linear_c *) ti->private;
+   struct mapped_device *md = ti->table->md;
+   struct block_device *bdev;
+   sector_t disk_sect = offset - dm_target_offset(ti, lc->start);
+   int rc = -ENODEV;
+
+   bdev = bdget_disk_sector(md->disk, offset);
+   if (!bdev)
+   return rc;
+
+   rc = fn(ti, bdev, disk_sect, data);
+   bdput(bdev);
+   return rc;
+}
+
 static int linear_prepare_ioctl(struct dm_target *ti, struct block_device 
**bdev)
 {
struct linear_c *lc = (struct linear_c *) ti->private;
@@ -238,6 +257,7 @@ static struct target_type linear_target = {
.ctr= linear_ctr,
.dtr= linear_dtr,
.map= linear_map,
+   .rmap   = linear_rmap,
.status = linear_status,
.prepare_ioctl = linear_prepare_ioctl,
.iterate_devices = linear_iterate_devices,
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 61a66fb8ebb3..c5cd1009a08d 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -58,6 +58,10 @@ typedef void (*dm_dtr_fn) (struct dm_target *ti);
  * = 2: The target wants to push back the io
  */
 typedef int (*dm_map_fn) (struct dm_target *ti, struct bio *bio);
+typedef int (*rmap_callout_fn) (struct dm_target *ti, struct block_device 
*bdev,
+   sector_t sect, void *data);
+typedef int (*dm_rmap_fn) (struct dm_target *ti, sector_t offset,
+  rmap_callout_fn fn, void *data);
 typedef int (*dm_clone_and_map_request_fn) (struct dm_target *ti,
struct request *rq,
union map_info *map_context,
@@ -175,6 +179,7 @@ struct target_type {
dm_ctr_fn ctr;
dm_dtr_fn dtr;
dm_map_fn map;
+   dm_rmap_fn rmap;
dm_clone_and_map_request_fn clone_and_map_rq;
dm_release_clone_request_fn release_clone_rq;
dm_endio_fn end_io;
-- 
2.30.0





[PATCH RESEND v2 08/10] md: Implement ->corrupted_range()

2021-01-28 Thread Shiyang Ruan
With the support of ->rmap(), it is possible to obtain the superblock on
a mapped device.

If a pmem device is used as one target of mapped device, we cannot
obtain its superblock directly.  With the help of SYSFS, the mapped
device can be found on the target devices.  So, we iterate the
bdev->bd_holder_disks to obtain its mapped device.

Signed-off-by: Shiyang Ruan 
---
 drivers/md/dm.c   | 61 +++
 drivers/nvdimm/pmem.c | 11 +++-
 fs/block_dev.c| 42 -
 include/linux/genhd.h |  2 ++
 4 files changed, 107 insertions(+), 9 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 7bac564f3faa..31b0c340b695 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -507,6 +507,66 @@ static int dm_blk_report_zones(struct gendisk *disk, 
sector_t sector,
 #define dm_blk_report_zones             NULL
 #endif /* CONFIG_BLK_DEV_ZONED */
 
+struct corrupted_hit_info {
+   struct block_device *bdev;
+   sector_t offset;
+};
+
+static int dm_blk_corrupted_hit(struct dm_target *ti, struct dm_dev *dev,
+   sector_t start, sector_t count, void *data)
+{
+   struct corrupted_hit_info *bc = data;
+
+   return bc->bdev == (void *)dev->bdev &&
+   (start <= bc->offset && bc->offset < start + count);
+}
+
+struct corrupted_do_info {
+   size_t length;
+   void *data;
+};
+
+static int dm_blk_corrupted_do(struct dm_target *ti, struct block_device *bdev,
+  sector_t disk_sect, void *data)
+{
+   struct corrupted_do_info *bc = data;
+   loff_t disk_off = to_bytes(disk_sect);
+   loff_t bdev_off = to_bytes(disk_sect - get_start_sect(bdev));
+
+   return bd_corrupted_range(bdev, disk_off, bdev_off, bc->length, 
bc->data);
+}
+
+static int dm_blk_corrupted_range(struct gendisk *disk,
+ struct block_device *target_bdev,
+ loff_t target_offset, size_t len, void *data)
+{
+   struct mapped_device *md = disk->private_data;
+   struct dm_table *map;
+   struct dm_target *ti;
+   sector_t target_sect = to_sector(target_offset);
+   struct corrupted_hit_info hi = {target_bdev, target_sect};
+   struct corrupted_do_info di = {len, data};
+   int srcu_idx, i, rc = -ENODEV;
+
+   map = dm_get_live_table(md, &srcu_idx);
+   if (!map)
+   return rc;
+
+   for (i = 0; i < dm_table_get_num_targets(map); i++) {
+   ti = dm_table_get_target(map, i);
+   if (!(ti->type->iterate_devices && ti->type->rmap))
+   continue;
+   if (!ti->type->iterate_devices(ti, dm_blk_corrupted_hit, &hi))
+   continue;
+
+   rc = ti->type->rmap(ti, target_sect, dm_blk_corrupted_do, &di);
+   break;
+   }
+
+   dm_put_live_table(md, srcu_idx);
+   return rc;
+}
+
 static int dm_prepare_ioctl(struct mapped_device *md, int *srcu_idx,
struct block_device **bdev)
 {
@@ -3062,6 +3122,7 @@ static const struct block_device_operations dm_blk_dops = 
{
.getgeo = dm_blk_getgeo,
.report_zones = dm_blk_report_zones,
.pr_ops = &dm_pr_ops,
+   .corrupted_range = dm_blk_corrupted_range,
.owner = THIS_MODULE
 };
 
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 501959947d48..3d9f4ccbbd9e 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -256,21 +256,16 @@ static int pmem_rw_page(struct block_device *bdev, 
sector_t sector,
 static int pmem_corrupted_range(struct gendisk *disk, struct block_device 
*bdev,
loff_t disk_offset, size_t len, void *data)
 {
-   struct super_block *sb;
loff_t bdev_offset;
sector_t disk_sector = disk_offset >> SECTOR_SHIFT;
-   int rc = 0;
+   int rc = -ENODEV;
 
bdev = bdget_disk_sector(disk, disk_sector);
if (!bdev)
-   return -ENODEV;
+   return rc;
 
bdev_offset = (disk_sector - get_start_sect(bdev)) << SECTOR_SHIFT;
-   sb = get_super(bdev);
-   if (sb && sb->s_op->corrupted_range) {
-   rc = sb->s_op->corrupted_range(sb, bdev, bdev_offset, len, 
data);
-   drop_super(sb);
-   }
+   rc = bd_corrupted_range(bdev, bdev_offset, bdev_offset, len, data);
 
bdput(bdev);
return rc;
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 3b8963e228a1..3cc2b2911e3a 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1079,6 +1079,27 @@ struct bd_holder_disk {
int refcnt;
 };
 
+static int bd_disk_holder_corrupted_range(struct block_device *bdev, loff_t 
off,
+ size_t len, void *data)
+{
+   struct bd_holder_disk *holder;
+   struct gendisk *disk;
+   int rc = 0;
+
+   if (list_empty(&(bdev->bd_holder_disks)))
+   

[PATCH RESEND v2 10/10] fs/dax: Remove useless functions

2021-01-28 Thread Shiyang Ruan
Since owner tracking is triggered by the pmem device, these functions are
useless, so remove them.

Signed-off-by: Shiyang Ruan 
---
 fs/dax.c | 46 --
 1 file changed, 46 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index c64c3a0e76a6..e20a5df03eec 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -323,48 +323,6 @@ static unsigned long dax_end_pfn(void *entry)
for (pfn = dax_to_pfn(entry); \
pfn < dax_end_pfn(entry); pfn++)
 
-/*
- * TODO: for reflink+dax we need a way to associate a single page with
- * multiple address_space instances at different linear_page_index()
- * offsets.
- */
-static void dax_associate_entry(void *entry, struct address_space *mapping,
-   struct vm_area_struct *vma, unsigned long address)
-{
-   unsigned long size = dax_entry_size(entry), pfn, index;
-   int i = 0;
-
-   if (IS_ENABLED(CONFIG_FS_DAX_LIMITED))
-   return;
-
-   index = linear_page_index(vma, address & ~(size - 1));
-   for_each_mapped_pfn(entry, pfn) {
-   struct page *page = pfn_to_page(pfn);
-
-   WARN_ON_ONCE(page->mapping);
-   page->mapping = mapping;
-   page->index = index + i++;
-   }
-}
-
-static void dax_disassociate_entry(void *entry, struct address_space *mapping,
-   bool trunc)
-{
-   unsigned long pfn;
-
-   if (IS_ENABLED(CONFIG_FS_DAX_LIMITED))
-   return;
-
-   for_each_mapped_pfn(entry, pfn) {
-   struct page *page = pfn_to_page(pfn);
-
-   WARN_ON_ONCE(trunc && page_ref_count(page) > 1);
-   WARN_ON_ONCE(page->mapping && page->mapping != mapping);
-   page->mapping = NULL;
-   page->index = 0;
-   }
-}
-
 static struct page *dax_busy_page(void *entry)
 {
unsigned long pfn;
@@ -543,7 +501,6 @@ static void *grab_mapping_entry(struct xa_state *xas,
xas_lock_irq(xas);
}
 
-   dax_disassociate_entry(entry, mapping, false);
xas_store(xas, NULL);   /* undo the PMD join */
dax_wake_entry(xas, entry, true);
mapping->nrexceptional--;
@@ -680,7 +637,6 @@ static int __dax_invalidate_entry(struct address_space 
*mapping,
(xas_get_mark(&xas, PAGECACHE_TAG_DIRTY) ||
 xas_get_mark(&xas, PAGECACHE_TAG_TOWRITE)))
goto out;
-   dax_disassociate_entry(entry, mapping, trunc);
xas_store(&xas, NULL);
mapping->nrexceptional--;
ret = 1;
@@ -774,8 +730,6 @@ static void *dax_insert_entry(struct xa_state *xas,
if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) {
void *old;
 
-   dax_disassociate_entry(entry, mapping, false);
-   dax_associate_entry(new_entry, mapping, vmf->vma, vmf->address);
/*
 * Only swap our new entry into the page cache if the current
 * entry is a zero page or an empty entry.  If a normal PTE or
-- 
2.30.0





[PATCH RESEND v2 04/10] mm, fsdax: Refactor memory-failure handler for dax mapping

2021-01-28 Thread Shiyang Ruan
The current memory_failure_dev_pagemap() can only handle a single-mapped
dax page in fsdax mode.  A dax page could be mapped by multiple files
and offsets if we let the reflink feature and fsdax mode work together.  So,
we refactor the current implementation to support handling memory failure on
each file and offset.

Signed-off-by: Shiyang Ruan 
---
 fs/dax.c| 21 ++
 include/linux/dax.h |  1 +
 include/linux/mm.h  |  9 +
 mm/memory-failure.c | 98 ++---
 4 files changed, 105 insertions(+), 24 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 26d5dcd2d69e..c64c3a0e76a6 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -378,6 +378,27 @@ static struct page *dax_busy_page(void *entry)
return NULL;
 }
 
+/*
+ * dax_load_pfn - Load pfn of the DAX entry corresponding to a page
+ * @mapping: The file whose entry we want to load
+ * @index:   The offset where the DAX entry located in
+ *
+ * Return:   pfn of the DAX entry
+ */
+unsigned long dax_load_pfn(struct address_space *mapping, unsigned long index)
+{
+   XA_STATE(xas, &mapping->i_pages, index);
+   void *entry;
+   unsigned long pfn;
+
+   xas_lock_irq(&xas);
+   entry = xas_load(&xas);
+   pfn = dax_to_pfn(entry);
+   xas_unlock_irq(&xas);
+
+   return pfn;
+}
+
 /*
  * dax_lock_mapping_entry - Lock the DAX entry corresponding to a page
  * @page: The page whose entry we want to lock
diff --git a/include/linux/dax.h b/include/linux/dax.h
index b52f084aa643..89e56ceeffc7 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -150,6 +150,7 @@ int dax_writeback_mapping_range(struct address_space 
*mapping,
 
 struct page *dax_layout_busy_page(struct address_space *mapping);
 struct page *dax_layout_busy_page_range(struct address_space *mapping, loff_t 
start, loff_t end);
+unsigned long dax_load_pfn(struct address_space *mapping, unsigned long index);
 dax_entry_t dax_lock_page(struct page *page);
 void dax_unlock_page(struct page *page, dax_entry_t cookie);
 #else
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ecdf8a8cd6ae..ab52bc633d84 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1157,6 +1157,14 @@ static inline bool is_device_private_page(const struct 
page *page)
page->pgmap->type == MEMORY_DEVICE_PRIVATE;
 }
 
+static inline bool is_device_fsdax_page(const struct page *page)
+{
+   return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
+   IS_ENABLED(CONFIG_FS_DAX) &&
+   is_zone_device_page(page) &&
+   page->pgmap->type == MEMORY_DEVICE_FS_DAX;
+}
+
 static inline bool is_pci_p2pdma_page(const struct page *page)
 {
return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
@@ -3045,6 +3053,7 @@ enum mf_flags {
MF_MUST_KILL = 1 << 2,
MF_SOFT_OFFLINE = 1 << 3,
 };
+extern int mf_dax_mapping_kill_procs(struct address_space *mapping, pgoff_t 
index, int flags);
 extern int memory_failure(unsigned long pfn, int flags);
 extern void memory_failure_queue(unsigned long pfn, int flags);
 extern void memory_failure_queue_kick(int cpu);
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index e9481632fcd1..158fe0c8e602 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -56,6 +56,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "internal.h"
 #include "ras/ras_event.h"
 
@@ -120,6 +121,13 @@ static int hwpoison_filter_dev(struct page *p)
if (PageSlab(p))
return -EINVAL;
 
+   if (pfn_valid(page_to_pfn(p))) {
+   if (is_device_fsdax_page(p))
+   return 0;
+   else
+   return -EINVAL;
+   }
+
mapping = page_mapping(p);
if (mapping == NULL || mapping->host == NULL)
return -EINVAL;
@@ -286,10 +294,9 @@ void shake_page(struct page *p, int access)
 }
 EXPORT_SYMBOL_GPL(shake_page);
 
-static unsigned long dev_pagemap_mapping_shift(struct page *page,
-   struct vm_area_struct *vma)
+static unsigned long dev_pagemap_mapping_shift(struct vm_area_struct *vma,
+  unsigned long address)
 {
-   unsigned long address = vma_address(page, vma);
pgd_t *pgd;
p4d_t *p4d;
pud_t *pud;
@@ -329,9 +336,8 @@ static unsigned long dev_pagemap_mapping_shift(struct page 
*page,
  * Schedule a process for later kill.
  * Uses GFP_ATOMIC allocations to avoid potential recursions in the VM.
  */
-static void add_to_kill(struct task_struct *tsk, struct page *p,
-  struct vm_area_struct *vma,
-  struct list_head *to_kill)
+static void add_to_kill(struct task_struct *tsk, struct page *p, pgoff_t pgoff,
+   struct vm_area_struct *vma, struct list_head *to_kill)
 {
struct to_kill *tk;
 
@@ -342,9 +348,12 @@ static void add_to_kill(struct task_struct *tsk, struct 
page *p,
}
 
tk->addr = page_address_in_vma(p, vma);
-   if 

[PATCH RESEND v2 05/10] mm, pmem: Implement ->memory_failure() in pmem driver

2021-01-28 Thread Shiyang Ruan
Call ->memory_failure(), which is implemented by the pmem driver, in
order to finally notify the filesystem to handle the corrupted data.  The
handler which collects and kills processes is moved into
mf_dax_mapping_kill_procs(), which will be called by the filesystem.

Keep the old handler as a fallback in case the driver/device/filesystem
does not support ->memory_failure()/->corrupted_range().
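
The patch derives the disk offset passed to ->corrupted_range() from the
failing pfn as PFN_PHYS(pfn) - phys_addr - data_offset. The arithmetic can
be checked with a short sketch; the PAGE_SHIFT value and the example
addresses are assumptions for illustration only, and pmem_disk_offset() is
a hypothetical helper rather than a kernel function.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages


def pfn_phys(pfn: int) -> int:
    """Physical address of the first byte of page frame `pfn`."""
    return pfn << PAGE_SHIFT


def pmem_disk_offset(pfn: int, phys_addr: int, data_offset: int) -> int:
    """Byte offset of the failing page within the pmem disk's data area."""
    return pfn_phys(pfn) - phys_addr - data_offset


# Hypothetical region: starts at 4 GiB with a 2 MiB metadata area up front.
phys_addr = 0x1_0000_0000
data_offset = 0x20_0000

# Failing page: the fourth data page of the device.
failing_pfn = (phys_addr + data_offset + 3 * 4096) >> PAGE_SHIFT
disk_offset = pmem_disk_offset(failing_pfn, phys_addr, data_offset)
```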

Signed-off-by: Shiyang Ruan 
---
 drivers/nvdimm/pmem.c |  25 +++
 mm/memory-failure.c   | 102 +-
 2 files changed, 86 insertions(+), 41 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 875076b0ea6c..c9e4fb38f94a 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -363,9 +363,34 @@ static void pmem_release_disk(void *__pmem)
put_disk(pmem->disk);
 }
 
+static int pmem_pagemap_memory_failure(struct dev_pagemap *pgmap,
+   unsigned long pfn, int flags)
+{
+   struct pmem_device *pdev;
+   struct gendisk *disk;
+   loff_t disk_offset;
+   int rc = 0;
+   unsigned long size = page_size(pfn_to_page(pfn));
+
+   pdev = container_of(pgmap, struct pmem_device, pgmap);
+   disk = pdev->disk;
+   if (!disk)
+   return -ENXIO;
+
+   disk_offset = PFN_PHYS(pfn) - pdev->phys_addr - pdev->data_offset;
+   if (disk->fops->corrupted_range) {
+   rc = disk->fops->corrupted_range(disk, NULL, disk_offset, size, 
&flags);
+   if (rc == -ENODEV)
+   rc = -ENXIO;
+   } else
+   rc = -EOPNOTSUPP;
+   return rc;
+}
+
 static const struct dev_pagemap_ops fsdax_pagemap_ops = {
.kill   = pmem_pagemap_kill,
.cleanup= pmem_pagemap_cleanup,
+   .memory_failure = pmem_pagemap_memory_failure,
 };
 
 static int pmem_attach_disk(struct device *dev,
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 158fe0c8e602..670e29cd263e 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1219,6 +1219,54 @@ static int try_to_split_thp_page(struct page *page, 
const char *msg)
return 0;
 }
 
+int mf_generic_kill_procs(unsigned long long pfn, int flags)
+{
+   struct page *page = pfn_to_page(pfn);
+   const bool unmap_success = true;
+   unsigned long size = 0;
+   struct to_kill *tk;
+   LIST_HEAD(to_kill);
+   loff_t start;
+   dax_entry_t cookie;
+
+   /*
+* Prevent the inode from being freed while we are interrogating
+* the address_space, typically this would be handled by
+* lock_page(), but dax pages do not use the page lock. This
+* also prevents changes to the mapping of this pfn until
+* poison signaling is complete.
+*/
+   cookie = dax_lock_page(page);
+   if (!cookie)
+   return -EBUSY;
+   /*
+* Unlike System-RAM there is no possibility to swap in a
+* different physical page at a given virtual address, so all
+* userspace consumption of ZONE_DEVICE memory necessitates
+* SIGBUS (i.e. MF_MUST_KILL)
+*/
+   flags |= MF_ACTION_REQUIRED | MF_MUST_KILL;
+   collect_procs(page, &to_kill, flags & MF_ACTION_REQUIRED);
+
+   list_for_each_entry(tk, &to_kill, nd)
+   if (tk->size_shift)
+   size = max(size, 1UL << tk->size_shift);
+   if (size) {
+   /*
+* Unmap the largest mapping to avoid breaking up
+* device-dax mappings which are constant size. The
+* actual size of the mapping being torn down is
+* communicated in siginfo, see kill_proc()
+*/
+   start = (page->index << PAGE_SHIFT) & ~(size - 1);
+   unmap_mapping_range(page->mapping, start, start + size, 0);
+   }
+   kill_procs(&to_kill, flags & MF_MUST_KILL, !unmap_success, pfn, flags);
+
+   dax_unlock_page(page, cookie);
+   return 0;
+}
+
 int mf_dax_mapping_kill_procs(struct address_space *mapping, pgoff_t index, 
int flags)
 {
const bool unmap_success = true;
@@ -1343,13 +1391,7 @@ static int memory_failure_dev_pagemap(unsigned long pfn, 
int flags,
struct dev_pagemap *pgmap)
 {
struct page *page = pfn_to_page(pfn);
-   const bool unmap_success = true;
-   unsigned long size = 0;
-   struct to_kill *tk;
-   LIST_HEAD(to_kill);
int rc = -EBUSY;
-   loff_t start;
-   dax_entry_t cookie;
 
if (flags & MF_COUNT_INCREASED)
/*
@@ -1357,20 +1399,9 @@ static int memory_failure_dev_pagemap(unsigned long pfn, 
int flags,
 */
put_page(page);
 
-   /*
-* Prevent the inode from being freed while we are interrogating
-* the address_space, typically this would be handled by
-* lock_page(), but dax pages do not use the page lock. This
-* also 

Re: [RFC PATCH v3 03/13] af_vsock: implement SEQPACKET rx loop

2021-01-28 Thread Arseny Krasnov


On 28.01.2021 19:55, Stefano Garzarella wrote:
> On Mon, Jan 25, 2021 at 02:12:36PM +0300, Arseny Krasnov wrote:
>> This adds receive loop for SEQPACKET. It looks like receive loop for
>> STREAM, but there are a few differences:
>> 1) It doesn't call notify callbacks.
>> 2) It doesn't care about 'SO_SNDLOWAT' and 'SO_RCVLOWAT' values, because
>>   there is no sense for these values in SEQPACKET case.
>> 3) It waits until whole record is received or error is found during
>>   receiving.
>> 4) It processes and sets 'MSG_TRUNC' flag.
>>
>> So to avoid extra conditions for two types of socket inside one loop, two
>> independent functions were created.
>>
>> Signed-off-by: Arseny Krasnov 
>> ---
>> include/net/af_vsock.h   |   5 ++
>> net/vmw_vsock/af_vsock.c | 102 ++-
>> 2 files changed, 106 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
>> index b1c717286993..46073842d489 100644
>> --- a/include/net/af_vsock.h
>> +++ b/include/net/af_vsock.h
>> @@ -135,6 +135,11 @@ struct vsock_transport {
>>  bool (*stream_is_active)(struct vsock_sock *);
>>  bool (*stream_allow)(u32 cid, u32 port);
>>
>> +/* SEQ_PACKET. */
>> +size_t (*seqpacket_seq_get_len)(struct vsock_sock *);
>> +ssize_t (*seqpacket_dequeue)(struct vsock_sock *, struct msghdr *,
>> + size_t len, int flags);
>> +
>>  /* Notification. */
>>  int (*notify_poll_in)(struct vsock_sock *, size_t, bool *);
>>  int (*notify_poll_out)(struct vsock_sock *, size_t, bool *);
>> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>> index 524df8fc84cd..3b266880b7c8 100644
>> --- a/net/vmw_vsock/af_vsock.c
>> +++ b/net/vmw_vsock/af_vsock.c
>> @@ -2006,7 +2006,107 @@ static int __vsock_stream_recvmsg(struct sock *sk, 
>> struct msghdr *msg,
>> static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg,
>>   size_t len, int flags)
>> {
>> -return -1;
>> +const struct vsock_transport *transport;
>> +const struct iovec *orig_iov;
>> +unsigned long orig_nr_segs;
>> +ssize_t dequeued_total = 0;
>> +struct vsock_sock *vsk;
>> +size_t record_len;
>> +long timeout;
>> +int err = 0;
>> +DEFINE_WAIT(wait);
>> +
>> +vsk = vsock_sk(sk);
>> +transport = vsk->transport;
>> +
>> +timeout = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
>> +msg->msg_flags &= ~MSG_EOR;
> Maybe add a comment about why we need to clear MSG_EOR.
>
>> +orig_nr_segs = msg->msg_iter.nr_segs;
>> +orig_iov = msg->msg_iter.iov;
>> +
>> +while (1) {
>> +ssize_t dequeued;
>> +s64 ready;
>> +
>> +prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
>> +ready = vsock_stream_has_data(vsk);
>> +
>> +if (ready == 0) {
>> +if (vsock_wait_data(sk, &wait, timeout, NULL, 0)) {
>> +/* In case of any loop break(timeout, signal
>> + * interrupt or shutdown), we report user that
>> + * nothing was copied.
>> + */
>> +dequeued_total = 0;
>> +break;
>> +}
>> +continue;
>> +}
>> +
>> +finish_wait(sk_sleep(sk), &wait);
>> +
>> +if (ready < 0) {
>> +err = -ENOMEM;
>> +goto out;
>> +}
>> +
>> +if (dequeued_total == 0) {
>> +record_len =
>> +transport->seqpacket_seq_get_len(vsk);
>> +
>> +if (record_len == 0)
>> +continue;
>> +}
>> +
>> +/* 'msg_iter.count' is number of unused bytes in iov.
>> + * On every copy to iov iterator it is decremented at
>> + * size of data.
>> + */
>> +dequeued = transport->seqpacket_dequeue(vsk, msg,
>> +msg->msg_iter.count, flags);
>  ^
>  Is this needed or 'msg' can be 
>  used in the transport?
Yes, right
>> +
>> +if (dequeued < 0) {
>> +dequeued_total = 0;
>> +
>> +if (dequeued == -EAGAIN) {
>> +iov_iter_init(&msg->msg_iter, READ,
>> +  orig_iov, orig_nr_segs,
>> +  len);
>> +msg->msg_flags &= ~MSG_EOR;
>> +continue;
> Why we need to reset MSG_EOR here?

Because if the previous attempt to receive the record failed but MSG_EOR
was already set, we clear it for the next attempt to get the record.

>
>> +}
>> +
>> +   

[PATCH RESEND v2 06/10] pmem: Implement ->corrupted_range() for pmem driver

2021-01-28 Thread Shiyang Ruan
Obtain the superblock of a pmem disk, and call filesystem's
->corrupted_range() to handle the corrupted data.

Signed-off-by: Shiyang Ruan 
---
 block/genhd.c |  6 ++
 drivers/nvdimm/pmem.c | 24 
 include/linux/genhd.h |  1 +
 3 files changed, 31 insertions(+)

diff --git a/block/genhd.c b/block/genhd.c
index 419548e92d82..fd7cf03b65a8 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -936,6 +936,12 @@ struct block_device *bdget_disk(struct gendisk *disk, int 
partno)
return bdev;
 }
 
+struct block_device *bdget_disk_sector(struct gendisk *disk, sector_t sector)
+{
+   return disk_map_sector_rcu(disk, sector);
+}
+EXPORT_SYMBOL(bdget_disk_sector);
+
 /*
  * print a full list of all partitions - intended for places where the root
  * filesystem can't be mounted and thus to give the victim some idea of what
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index c9e4fb38f94a..501959947d48 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -253,6 +253,29 @@ static int pmem_rw_page(struct block_device *bdev, 
sector_t sector,
return blk_status_to_errno(rc);
 }
 
+static int pmem_corrupted_range(struct gendisk *disk, struct block_device 
*bdev,
+   loff_t disk_offset, size_t len, void *data)
+{
+   struct super_block *sb;
+   loff_t bdev_offset;
+   sector_t disk_sector = disk_offset >> SECTOR_SHIFT;
+   int rc = 0;
+
+   bdev = bdget_disk_sector(disk, disk_sector);
+   if (!bdev)
+   return -ENODEV;
+
+   bdev_offset = (disk_sector - get_start_sect(bdev)) << SECTOR_SHIFT;
+   sb = get_super(bdev);
+   if (sb && sb->s_op->corrupted_range) {
+   rc = sb->s_op->corrupted_range(sb, bdev, bdev_offset, len, 
data);
+   drop_super(sb);
+   }
+
+   bdput(bdev);
+   return rc;
+}
+
 /* see "strong" declaration in tools/testing/nvdimm/pmem-dax.c */
 __weak long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
long nr_pages, void **kaddr, pfn_t *pfn)
@@ -281,6 +304,7 @@ static const struct block_device_operations pmem_fops = {
.owner =THIS_MODULE,
.submit_bio =   pmem_submit_bio,
.rw_page =  pmem_rw_page,
+   .corrupted_range =  pmem_corrupted_range,
 };
 
 static int pmem_dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 809aaa32d53c..4da480798955 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -248,6 +248,7 @@ static inline void add_disk_no_queue_reg(struct gendisk 
*disk)
 
 extern void del_gendisk(struct gendisk *gp);
 extern struct block_device *bdget_disk(struct gendisk *disk, int partno);
+extern struct block_device *bdget_disk_sector(struct gendisk *disk, sector_t 
sector);
 
 extern void set_disk_ro(struct gendisk *disk, int flag);
 
-- 
2.30.0
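
The offset conversion in pmem_corrupted_range() above can be tried out in
userspace. The following is only a sketch under stated assumptions:
disk_to_bdev_offset() and its parameters are hypothetical names, and the
partition start is passed in directly instead of being read with
get_start_sect(). It mirrors the shift arithmetic of the patch, which means
the sub-sector bits of the disk offset are dropped.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors, as in the kernel */

/*
 * Convert a byte offset on the whole disk into a byte offset inside a
 * partition that starts at sector part_start_sect (hypothetical helper).
 * The offset is first rounded down to a sector boundary, exactly like the
 * ">> SECTOR_SHIFT" followed by "<< SECTOR_SHIFT" in the patch.
 */
static int64_t disk_to_bdev_offset(int64_t disk_offset, uint64_t part_start_sect)
{
	uint64_t disk_sector = (uint64_t)disk_offset >> SECTOR_SHIFT;

	return (int64_t)((disk_sector - part_start_sect) << SECTOR_SHIFT);
}
```

For a partition starting at sector 2048 (1 MiB), a disk offset of
1 MiB + 4 KiB maps to a partition offset of 4 KiB.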





[PATCH RESEND v2 00/10] fsdax: introduce fs query to support reflink

2021-01-28 Thread Shiyang Ruan
This patchset aims to support shared page tracking for fsdax.

Resend V2:
  - Cc dm-devel instead of linux-raid

Change from V1:
  - Add the old memory-failure handler back for rolling back
  - Add callback in MD's ->rmap() to support multiple mapping of dm device
  - Add judgement for CONFIG_SYSFS
  - Add pfn_valid() judgement in hwpoison_filter()
  - Rebased to v5.11-rc5

Change from RFC v3:
  - Do not lock dax entry in memory failure handler
  - Add a helper function for corrupted_range
  - Add restrictions in xfs code
  - Fix code style
  - remove the useless association and lock in fsdax

Change from RFC v2:
  - Adjust the order of patches
  - Divide the infrastructure and the drivers that use it
  - Rebased to v5.10

Change from RFC v1:
  - Introduce ->block_lost() for block device
  - Support mapped device
  - Add 'not available' warning for realtime device in XFS
  - Rebased to v5.10-rc1

This patchset moves owner tracking from dax_associate_entry() to the pmem
device driver by introducing a ->memory_failure() interface in struct
dev_pagemap_ops.  This interface is called by memory_failure() in mm and
implemented by the pmem device.  The pmem device then calls its
->corrupted_range() to find the filesystem in which the corrupted data is
located, and calls the filesystem handler to track the files or metadata
associated with this page.  Finally, we are able to try to fix the corrupted
data in the filesystem and do other necessary processing, such as killing
the processes that are using the affected files.

The call trace is like this:
memory_failure()
 pgmap->ops->memory_failure()      => pmem_pagemap_memory_failure()
  gendisk->fops->corrupted_range() => - pmem_corrupted_range()
                                      - md_blk_corrupted_range()
   sb->s_op->corrupted_range()     => xfs_fs_corrupted_range()
    xfs_rmap_query_range()
     xfs_corrupt_helper()
      * corrupted on metadata
          try to recover data, call xfs_force_shutdown()
      * corrupted on file data
          try to recover data, call mf_dax_mapping_kill_procs()
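
The layering in the call trace above can be modelled in plain userspace C.
This is a sketch only, not the kernel code: struct pgmap_model,
struct disk_model and struct sb_model are hypothetical stand-ins for
dev_pagemap, gendisk and super_block, and each level simply forwards the
corrupted range to the next level if a handler is installed. The error
codes for a missing disk or a missing handler follow the pmem patch in
this series.

```c
#include <stddef.h>
#include <errno.h>

/* Hypothetical userspace stand-ins for the kernel structures named above. */
struct sb_model {
	int (*corrupted_range)(struct sb_model *sb, long long off, size_t len);
	int hits;	/* how many times the filesystem was notified */
};

struct disk_model {
	int (*corrupted_range)(struct disk_model *disk, long long off, size_t len);
	struct sb_model *sb;	/* filesystem mounted on the disk, if any */
};

struct pgmap_model {
	int (*memory_failure)(struct pgmap_model *pgmap, long long off, size_t len);
	struct disk_model *disk;
};

/* Filesystem layer: record that it was told about the bad range. */
static int fs_corrupted_range(struct sb_model *sb, long long off, size_t len)
{
	(void)off;
	(void)len;
	sb->hits++;
	return 0;
}

/* Block layer: forward to the filesystem if one is present and has a handler. */
static int disk_corrupted_range(struct disk_model *disk, long long off, size_t len)
{
	if (!disk->sb || !disk->sb->corrupted_range)
		return 0;
	return disk->sb->corrupted_range(disk->sb, off, len);
}

/* Device layer: forward to the disk, or report that nobody can handle it. */
static int dev_memory_failure(struct pgmap_model *pgmap, long long off, size_t len)
{
	struct disk_model *disk = pgmap->disk;

	if (!disk)
		return -ENXIO;
	if (!disk->corrupted_range)
		return -EOPNOTSUPP;
	return disk->corrupted_range(disk, off, len);
}
```

Wiring the three structures together and calling pgmap->memory_failure()
walks the same device -> block -> filesystem path as the trace.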

The fsdax & reflink support for XFS is not contained in this patchset.

(Rebased on v5.11-rc5)

Shiyang Ruan (10):
  pagemap: Introduce ->memory_failure()
  blk: Introduce ->corrupted_range() for block device
  fs: Introduce ->corrupted_range() for superblock
  mm, fsdax: Refactor memory-failure handler for dax mapping
  mm, pmem: Implement ->memory_failure() in pmem driver
  pmem: Implement ->corrupted_range() for pmem driver
  dm: Introduce ->rmap() to find bdev offset
  md: Implement ->corrupted_range()
  xfs: Implement ->corrupted_range() for XFS
  fs/dax: Remove useless functions

 block/genhd.c |   6 ++
 drivers/md/dm-linear.c|  20 
 drivers/md/dm.c   |  61 +++
 drivers/nvdimm/pmem.c |  44 
 fs/block_dev.c|  42 +++-
 fs/dax.c  |  63 ---
 fs/xfs/xfs_fsops.c|   5 +
 fs/xfs/xfs_mount.h|   1 +
 fs/xfs/xfs_super.c| 109 +++
 include/linux/blkdev.h|   2 +
 include/linux/dax.h   |   1 +
 include/linux/device-mapper.h |   5 +
 include/linux/fs.h|   2 +
 include/linux/genhd.h |   3 +
 include/linux/memremap.h  |   8 ++
 include/linux/mm.h|   9 ++
 mm/memory-failure.c   | 190 +++---
 17 files changed, 466 insertions(+), 105 deletions(-)

-- 
2.30.0





[PATCH RESEND v2 01/10] pagemap: Introduce ->memory_failure()

2021-01-28 Thread Shiyang Ruan
When a memory failure occurs, we call this function, which is implemented
by each kind of device.  For the fsdax case, the pmem device driver
implements it.  The pmem device driver finds the block device on which
the error page is located and tries to get the filesystem on this block
device.  Finally, it calls the filesystem handler to deal with the error.
The filesystem will try to recover the corrupted data if possible.

Signed-off-by: Shiyang Ruan 
---
 include/linux/memremap.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 79c49e7f5c30..0bcf2b1e20bd 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -87,6 +87,14 @@ struct dev_pagemap_ops {
 * the page back to a CPU accessible page.
 */
vm_fault_t (*migrate_to_ram)(struct vm_fault *vmf);
+
+   /*
+* Handle the memory failure happens on one page.  Notify the processes
+* who are using this page, and try to recover the data on this page
+* if necessary.
+*/
+   int (*memory_failure)(struct dev_pagemap *pgmap, unsigned long pfn,
+ int flags);
 };
 
 #define PGMAP_ALTMAP_VALID (1 << 0)
-- 
2.30.0





[PATCH RESEND v2 03/10] fs: Introduce ->corrupted_range() for superblock

2021-01-28 Thread Shiyang Ruan
Memory failures that occur in fsdax mode will finally be handled in the
filesystem.  We introduce this interface to find the files or metadata
affected by the corrupted range, and to try to recover the corrupted data
if possible.

Signed-off-by: Shiyang Ruan 
---
 include/linux/fs.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index fd47deea7c17..4cc9ff9caa87 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1963,6 +1963,8 @@ struct super_operations {
  struct shrink_control *);
long (*free_cached_objects)(struct super_block *,
struct shrink_control *);
+   int (*corrupted_range)(struct super_block *sb, struct block_device 
*bdev,
+  loff_t offset, size_t len, void *data);
 };
 
 /*
-- 
2.30.0





[PATCH RESEND v2 02/10] blk: Introduce ->corrupted_range() for block device

2021-01-28 Thread Shiyang Ruan
In fsdax mode, the memory failure happens on a block device, so an
interface for block devices is needed.  Each kind of block device can
handle the memory failure in its own way.

Signed-off-by: Shiyang Ruan 
---
 include/linux/blkdev.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f94ee3089e01..e0f5585aa06f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1867,6 +1867,8 @@ struct block_device_operations {
int (*report_zones)(struct gendisk *, sector_t sector,
unsigned int nr_zones, report_zones_cb cb, void *data);
char *(*devnode)(struct gendisk *disk, umode_t *mode);
+   int (*corrupted_range)(struct gendisk *disk, struct block_device *bdev,
+  loff_t offset, size_t len, void *data);
struct module *owner;
const struct pr_ops *pr_ops;
 };
-- 
2.30.0





Re: [PATCH -next] ACPI: tables: Mark acpi_init_fpdt with static keyword

2021-01-28 Thread Zhang Rui
Hi, Wei,

Thanks for the patch.

Given that there are a couple of things that need to be fixed in the
original patch, I'd prefer to refresh the patch with all the fixes included:

https://patchwork.kernel.org/project/linux-acpi/patch/20210129061548.13448-1-rui.zh...@intel.com/

what do you think?

thanks,
rui

On Thu, 2021-01-28 at 19:31 +0800, Zou Wei wrote:
> Fix the following sparse warning:
> 
> drivers/acpi/acpi_fpdt.c:230:6: warning: symbol 'acpi_init_fpdt' was
> not declared. Should it be static?
> 
> Signed-off-by: Zou Wei 
> ---
>  drivers/acpi/acpi_fpdt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/acpi_fpdt.c b/drivers/acpi/acpi_fpdt.c
> index b810811..968f9cc 100644
> --- a/drivers/acpi/acpi_fpdt.c
> +++ b/drivers/acpi/acpi_fpdt.c
> @@ -227,7 +227,7 @@ static int fpdt_process_subtable(u64 address, u32
> subtable_type)
>   return 0;
>  }
>  
> -void acpi_init_fpdt(void)
> +static void acpi_init_fpdt(void)
>  {
>   acpi_status status;
>   struct acpi_table_header *header;



Re: [PATCH v12 6/8] drm/mediatek: enable dither function

2021-01-28 Thread Hsin-Yi Wang
On Fri, Jan 29, 2021 at 9:33 AM CK Hu  wrote:
>
> Hi, Hsin-Yi:
>
> On Thu, 2021-01-28 at 19:23 +0800, Hsin-Yi Wang wrote:
> > From: Yongqiang Niu 
> >
> > for 5 or 6 bpc panel, we need enable dither function
> > to improve the display quality
> >
> > Signed-off-by: Yongqiang Niu 
> > Signed-off-by: Hsin-Yi Wang 
> > ---
> >  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 15 +--
> >  1 file changed, 13 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c 
> > b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > index ac2cb25620357..6c8f246380a74 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > @@ -53,6 +53,7 @@
> >  #define DITHER_ENBIT(0)
> >  #define DISP_DITHER_CFG  0x0020
> >  #define DITHER_RELAY_MODEBIT(0)
> > +#define DITHER_ENGINE_EN BIT(1)
> >  #define DISP_DITHER_SIZE 0x0030
> >
> >  #define LUT_10BIT_MASK   0x03ff
> > @@ -314,9 +315,19 @@ static void mtk_dither_config(struct device *dev, 
> > unsigned int w,
> > unsigned int bpc, struct cmdq_pkt *cmdq_pkt)
> >  {
> >   struct mtk_ddp_comp_dev *priv = dev_get_drvdata(dev);
> > + bool enable = (bpc == 5 || bpc == 6);
>
> I strongly believe that dither function in dither is identical to the
> one in gamma and od, and in mtk_dither_set_common(), 'bpc >=
> MTK_MIN_BPC' is valid, so I believe we need not to limit bpc to 5 or 6.
> But we should consider the case that bpc is invalid in
> mtk_dither_set_common(). Invalid case in gamma and od use different way
> to process. For gamma, dither is default relay mode, so invalid bpc
> would do nothing in mtk_dither_set_common() and result in relay mode.
> For od, it set to relay mode first, them invalid bpc would do nothing in
> mtk_dither_set_common() and result in relay mode. I would like dither,
> gamma and od to process invalid bpc in the same way. One solution is to
> set relay mode in mtk_dither_set_common() for invalid bpc.
>
> Regards,
> CK
>

I modified mtk_dither_config() as follows:


diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
index ac2cb25620357..5b7fcedb9f9a8 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
@@ -53,6 +53,7 @@
 #define DITHER_EN  BIT(0)
 #define DISP_DITHER_CFG0x0020
 #define DITHER_RELAY_MODE  BIT(0)
+#define DITHER_ENGINE_EN   BIT(1)
 #define DISP_DITHER_SIZE   0x0030

 #define LUT_10BIT_MASK 0x03ff
@@ -166,6 +167,8 @@ void mtk_dither_set_common(void __iomem *regs,
struct cmdq_client_reg *cmdq_reg,
  DITHER_ADD_LSHIFT_G(MTK_MAX_BPC - bpc),
  cmdq_reg, regs, DISP_DITHER_16);
mtk_ddp_write(cmdq_pkt, dither_en, cmdq_reg, regs, cfg);
+   } else {
+   mtk_ddp_write(cmdq_pkt, DITHER_RELAY_MODE, cmdq_reg, regs, cfg);
}
 }

@@ -315,8 +318,12 @@ static void mtk_dither_config(struct device *dev,
unsigned int w,
 {
struct mtk_ddp_comp_dev *priv = dev_get_drvdata(dev);

-   mtk_ddp_write(cmdq_pkt, h << 16 | w, &priv->cmdq_reg,
priv->regs, DISP_DITHER_SIZE);
-   mtk_ddp_write(cmdq_pkt, DITHER_RELAY_MODE, &priv->cmdq_reg,
priv->regs, DISP_DITHER_CFG);
+   mtk_ddp_write(cmdq_pkt, h << 16 | w, &priv->cmdq_reg, priv->regs,
+ DISP_DITHER_SIZE);
+   mtk_ddp_write(cmdq_pkt, DITHER_RELAY_MODE, &priv->cmdq_reg, priv->regs,
+ DISP_DITHER_CFG);
+   mtk_dither_set_common(priv->regs, &priv->cmdq_reg, bpc, DISP_DITHER_CFG,
+  DITHER_ENGINE_EN, cmdq_pkt);
 }

So now, for every valid bpc (not only bpc == 5 or 6), the dither config
calls mtk_dither_set_common() with the flag DITHER_ENGINE_EN (BIT(1)).
The od config calls mtk_dither_set_common() with the flag
DISP_DITHERING (BIT(2)).
Additionally, for 8173, the gamma config calls mtk_dither_set_common()
with the flag DISP_DITHERING (BIT(2)).

For an invalid bpc, all of them end up in DITHER_RELAY_MODE.
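
The selection described here can be summarised in a small userspace sketch.
It is only an illustration under stated assumptions: dither_cfg_value() is a
hypothetical helper, the limits SKETCH_MIN_BPC and SKETCH_MAX_BPC are
assumed values (the thread only names MTK_MIN_BPC and MTK_MAX_BPC), and the
upper-bound check is my assumption as well. The bit positions are the ones
quoted in this thread.

```c
#include <stdint.h>

#define BIT(n)            (1u << (n))
#define DITHER_RELAY_MODE BIT(0)
#define DITHER_ENGINE_EN  BIT(1)
#define DISP_DITHERING    BIT(2)

/* Assumed limits for illustration; not taken from this thread. */
#define SKETCH_MIN_BPC 3
#define SKETCH_MAX_BPC 10

/*
 * Return the value that ends up in the dither config register:
 * the component-specific enable flag for a valid bpc, relay mode otherwise.
 */
static uint32_t dither_cfg_value(unsigned int bpc, uint32_t enable_flag)
{
	if (bpc >= SKETCH_MIN_BPC && bpc <= SKETCH_MAX_BPC)
		return enable_flag;
	return DITHER_RELAY_MODE;
}
```

With these assumptions, the dither component would pass DITHER_ENGINE_EN,
and od (plus gamma on 8173) would pass DISP_DITHERING.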

I just want to make sure that this follows the spec. Thanks.

> >
> > - mtk_ddp_write(cmdq_pkt, h << 16 | w, &priv->cmdq_reg, priv->regs, 
> > DISP_DITHER_SIZE);
> > - mtk_ddp_write(cmdq_pkt, DITHER_RELAY_MODE, &priv->cmdq_reg, 
> > priv->regs, DISP_DITHER_CFG);
> > + if (enable) {
> > + mtk_dither_set_common(priv->regs, &priv->cmdq_reg, bpc,
> > +   DISP_DITHER_CFG, DITHER_ENGINE_EN,
> > +   cmdq_pkt);
> > + } else {
> > + mtk_ddp_write(cmdq_pkt, DITHER_RELAY_MODE, &priv->cmdq_reg,
> > +   priv->regs, DISP_DITHER_CFG);
> > + }
> > +
> > + 

Re: [PATCH] x86: Disable CET instrumentation in the kernel

2021-01-28 Thread Nikolay Borisov



On 28.01.21 at 23:52, Josh Poimboeuf wrote:
> 
> With retpolines disabled, some configurations of GCC will add Intel CET
> instrumentation to the kernel by default.  That breaks certain tracing
> scenarios by adding a superfluous ENDBR64 instruction before the fentry
> call, for functions which can be called indirectly.
> 
> CET instrumentation isn't currently necessary in the kernel, as CET is
> only supported in user space.  Disable it unconditionally.
> 
> Reported-by: Nikolay Borisov 
> Signed-off-by: Josh Poimboeuf 

Tested-by: Nikolay Borisov 
Reviewed-by: Nikolay Borisov 


Re: [PATCH -next] acpi: fpdt: drop errant comma in pr_info()

2021-01-28 Thread Zhang Rui
Hi, Randy,

Thanks for the patch, a similar patch has been posted earlier, but I
forgot to cc linux-acpi mailing list.
https://marc.info/?l=linux-next&m=161172750710666&w=2

Now given that there are a couple of fixes needed for the original
patch, I just refreshed the original patch to include all the fixes.

https://patchwork.kernel.org/project/linux-acpi/patch/20210129061548.13448-1-rui.zh...@intel.com/

thanks,
rui

On Thu, 2021-01-28 at 15:25 -0800, Randy Dunlap wrote:
> Drop a mistaken comma in the pr_info() args to prevent the
> build warning.
> 
> ../drivers/acpi/acpi_fpdt.c: In function 'acpi_init_fpdt':
> ../include/linux/kern_levels.h:5:18: warning: too many arguments for
> format [-Wformat-extra-args]
> ../drivers/acpi/acpi_fpdt.c:255:4: note: in expansion of macro
> 'pr_info'
> pr_info(FW_BUG, "Invalid subtable type %d found.\n",
> 
> Fixes: 208757d71098 ("ACPI: tables: introduce support for FPDT
> table")
> Signed-off-by: Randy Dunlap 
> Cc: "Rafael J. Wysocki" 
> Cc: Rafael J. Wysocki 
> Cc: Len Brown 
> Cc: linux-a...@vger.kernel.org
> Cc: Zhang Rui 
> ---
>  drivers/acpi/acpi_fpdt.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- linux-next-20210128.orig/drivers/acpi/acpi_fpdt.c
> +++ linux-next-20210128/drivers/acpi/acpi_fpdt.c
> @@ -252,7 +252,7 @@ void acpi_init_fpdt(void)
> subtable->type);
>   break;
>   default:
> - pr_info(FW_BUG, "Invalid subtable type %d
> found.\n",
> + pr_info(FW_BUG "Invalid subtable type %d
> found.\n",
>  subtable->type);
>   return;
>   }
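
The warning quoted above comes from how C joins adjacent string literals:
without the comma, the prefix macro and the message form a single format
string; with the comma, the prefix alone becomes the format and everything
after it is an extra argument. A minimal userspace sketch, assuming a
stand-in macro SKETCH_FW_BUG (its text is an assumption, not copied from the
kernel headers) and a hypothetical helper format_subtable_warning():

```c
#include <stdio.h>
#include <stddef.h>

/* Stand-in for the kernel's FW_BUG prefix; the exact text is an assumption. */
#define SKETCH_FW_BUG "[Firmware Bug]: "

/* Format the message into buf; returns what snprintf() returns. */
static int format_subtable_warning(char *buf, size_t len, int type)
{
	/* No comma after the macro: the two literals form one format string. */
	return snprintf(buf, len,
			SKETCH_FW_BUG "Invalid subtable type %d found.\n", type);
}
```

Writing `SKETCH_FW_BUG, "Invalid ..."` instead would make the prefix the whole
format string, which is what -Wformat-extra-args complains about.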



Re: [PATCH v2] xen-blkback: fix compatibility bug with single page rings

2021-01-28 Thread Dongli Zhang



On 1/28/21 5:04 AM, Paul Durrant wrote:
> From: Paul Durrant 
> 
> Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid
> inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the
> behaviour of xen-blkback when connecting to a frontend was:
> 
> - read 'ring-page-order'
> - if not present then expect a single page ring specified by 'ring-ref'
> - else expect a ring specified by 'ring-refX' where X is between 0 and
>   1 << ring-page-order
> 
> This was correct behaviour, but was broken by the aforementioned commit to
> become:
> 
> - read 'ring-page-order'
> - if not present then expect a single page ring (i.e. ring-page-order = 0)
> - expect a ring specified by 'ring-refX' where X is between 0 and
>   1 << ring-page-order
> - if that didn't work then see if there's a single page ring specified by
>   'ring-ref'
> 
> This incorrect behaviour works most of the time but fails when a frontend
> that sets 'ring-page-order' is unloaded and replaced by one that does not
> because, instead of reading 'ring-ref', xen-blkback will read the stale
> 'ring-ref0' left around by the previous frontend and will try to map the
> wrong grant reference.
> 
> This patch restores the original behaviour.
> 
> Fixes: 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid 
> inconsistent xenstore 'ring-page-order' set by malicious blkfront")
> Signed-off-by: Paul Durrant 
> ---
> Cc: Konrad Rzeszutek Wilk 
> Cc: "Roger Pau Monné" 
> Cc: Jens Axboe 
> Cc: Dongli Zhang 
> 
> v2:
>  - Remove now-spurious error path special-case when nr_grefs == 1
> ---
>  drivers/block/xen-blkback/common.h |  1 +
>  drivers/block/xen-blkback/xenbus.c | 38 +-
>  2 files changed, 17 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/block/xen-blkback/common.h 
> b/drivers/block/xen-blkback/common.h
> index b0c71d3a81a0..524a79f10de6 100644
> --- a/drivers/block/xen-blkback/common.h
> +++ b/drivers/block/xen-blkback/common.h
> @@ -313,6 +313,7 @@ struct xen_blkif {
>  
>   struct work_struct  free_work;
>   unsigned intnr_ring_pages;
> + boolmulti_ref;

Is it really necessary to introduce 'multi_ref' here, or can we just re-use
'nr_ring_pages'?

According to blkfront code, 'ring-page-order' is set only when it is not zero,
that is, only when (info->nr_ring_pages > 1).

        if (info->nr_ring_pages > 1) {
                err = xenbus_printf(xbt, dev->nodename, "ring-page-order",
                                    "%u", ring_page_order);
                if (err) {
                        message = "writing ring-page-order";
                        goto abort_transaction;
                }
        }

Therefore, can we assume 'ring-page-order' can never be 0? Once we have
'ring-page-order' set, it should be >= 1 and we should read from "ring-ref%u"?

If the specification allows 'ring-page-order' to be zero with "ring-ref%u"
available, we should introduce 'multi_ref'.
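
To make the two naming schemes concrete, here is a small userspace sketch of
the key selection that the quoted commit message describes. It is an
illustration only: ring_ref_key() and has_page_order are hypothetical names,
and the function just builds the xenstore key string without touching
xenbus.

```c
#include <stdio.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Build the xenstore key for grant reference number i.
 * has_page_order says whether the frontend wrote 'ring-page-order'.
 * Returns the key length, or -1 if i is invalid for a single-page ring.
 */
static int ring_ref_key(char *buf, size_t len, bool has_page_order,
			unsigned int i)
{
	if (has_page_order)
		return snprintf(buf, len, "ring-ref%u", i);
	if (i != 0)
		return -1;	/* a single-page ring has exactly one reference */
	return snprintf(buf, len, "ring-ref");
}
```

If 'ring-page-order' = 0 together with "ring-ref0" is a legal combination,
the presence of the key, rather than the page count, has to drive this
choice.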

Thank you very much!

Dongli Zhang


>   /* All rings for this device. */
>   struct xen_blkif_ring   *rings;
>   unsigned intnr_rings;
> diff --git a/drivers/block/xen-blkback/xenbus.c 
> b/drivers/block/xen-blkback/xenbus.c
> index 9860d4842f36..6c5e9373e91c 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -998,14 +998,17 @@ static int read_per_ring_refs(struct xen_blkif_ring 
> *ring, const char *dir)
>   for (i = 0; i < nr_grefs; i++) {
>   char ring_ref_name[RINGREF_NAME_LEN];
>  
> - snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", i);
> + if (blkif->multi_ref)
> + snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", 
> i);
> + else {
> + WARN_ON(i != 0);
> + snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref");
> + }
> +
>   err = xenbus_scanf(XBT_NIL, dir, ring_ref_name,
>  "%u", &ring_ref[i]);
>  
>   if (err != 1) {
> - if (nr_grefs == 1)
> - break;
> -
>   err = -EINVAL;
>   xenbus_dev_fatal(dev, err, "reading %s/%s",
>dir, ring_ref_name);
> @@ -1013,18 +1016,6 @@ static int read_per_ring_refs(struct xen_blkif_ring 
> *ring, const char *dir)
>   }
>   }
>  
> - if (err != 1) {
> - WARN_ON(nr_grefs != 1);
> -
> - err = xenbus_scanf(XBT_NIL, dir, "ring-ref", "%u",
> -&ring_ref[0]);
> - if (err != 1) {
> - err = -EINVAL;
> - xenbus_dev_fatal(dev, err, "reading %s/ring-ref", dir);
> - return err;
> - }
> - }
> -
>   err = -ENOMEM;
>   for (i = 0; i < nr_grefs * 

Re: [workqueue] d5bff968ea: WARNING:at_kernel/workqueue.c:#process_one_work

2021-01-28 Thread Xing Zhengjun




On 1/29/2021 2:08 AM, Paul E. McKenney wrote:

On Thu, Jan 28, 2021 at 05:09:05PM +0800, Hillf Danton wrote:

On Thu, 28 Jan 2021 15:52:40 +0800 Xing Zhengjun wrote:


[ . . . ]


I tested the patch 4 times; no warning appeared in the kernel log.


Thank you so much Zhengjun!

And the overall brain dump so far is

1/ before and after d5bff968ea, changing the allowed ptr at online time
is the key to quiesce the warning in process_one_work().

2/ marking pcpu before changing aptr in rebind_workers() is mandatory in
regards to cutting the risk of triggering such a warning.

3/ we cannot maintain such an order without quiescing the 508 warning for
kworkers. And we have a couple of excuses to do so, a) the number of
allowed CPUs is no longer checked in is_per_cpu_kthread() instead of
PF_NO_SETAFFINITY, b) there is always a followup act to change the aptr
in order to fix the number of aCPUs.

4/ same order is maintained also at rescue time.


Just out of curiosity, does this test still fail on current mainline?

Thanx, Paul

I tested mainline v5.11-rc5 and it has no issue. The issue occurs only with
d5bff968ea, which is in
https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git 
dev.2021.01.11b.


--
Zhengjun Xing


Re: [PATCH -next] acpi: fpdt: drop errant comma in pr_info()

2021-01-28 Thread Zhang Rui
On Thu, 2021-01-28 at 15:56 -0800, Joe Perches wrote:
> On Thu, 2021-01-28 at 15:25 -0800, Randy Dunlap wrote:
> > Drop a mistaken comma in the pr_info() args to prevent the
> > build warning.
> > 
> > ../drivers/acpi/acpi_fpdt.c: In function 'acpi_init_fpdt':
> > ../include/linux/kern_levels.h:5:18: warning: too many arguments
> > for format [-Wformat-extra-args]
> > ../drivers/acpi/acpi_fpdt.c:255:4: note: in expansion of macro
> > 'pr_info'
> > pr_info(FW_BUG, "Invalid subtable type %d found.\n",
> 
> []
> > --- linux-next-20210128.orig/drivers/acpi/acpi_fpdt.c
> > +++ linux-next-20210128/drivers/acpi/acpi_fpdt.c
> > @@ -252,7 +252,7 @@ void acpi_init_fpdt(void)
> >   subtable->type);
> > break;
> > default:
> > -   pr_info(FW_BUG, "Invalid subtable type %d
> > found.\n",
> > +   pr_info(FW_BUG "Invalid subtable type %d
> > found.\n",
> >subtable->type);
> 
> Another question would be why is the pr_info when all the other
> FW_BUG uses in this file are pr_err
> 
Here, this FW_BUG just means an unrecognized subtable is found, and it
should not affect the other subtables that are already supported by
this driver. So that's why we didn't use pr_err.
In fact, I've just posted a V2 patch, 
https://patchwork.kernel.org/project/linux-acpi/patch/20210129061548.13448-1-rui.zh...@intel.com/
and I prefer to continue processing even if this FW_BUG is detected.

> One would think it's at least a defect of some time.
> I would think it should at least be pr_notice or pr_warn

I'm also okay with pr_notice/pr_warn here.
This FW_BUG should be really rare.

thanks,
rui
> 
> Documentation/admin-guide/kernel-parameters.txt-  1 (KERN_ALERT)    action must be taken immediately
> Documentation/admin-guide/kernel-parameters.txt-  2 (KERN_CRIT)     critical conditions
> Documentation/admin-guide/kernel-parameters.txt-  3 (KERN_ERR)      error conditions
> Documentation/admin-guide/kernel-parameters.txt-  4 (KERN_WARNING)  warning conditions
> Documentation/admin-guide/kernel-parameters.txt-  5 (KERN_NOTICE)   normal but significant condition
> Documentation/admin-guide/kernel-parameters.txt:  6 (KERN_INFO)     informational
> Documentation/admin-guide/kernel-parameters.txt-  7 (KERN_DEBUG)    debug-level messages
> 
> 



Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page

2021-01-28 Thread Muchun Song
On Fri, Jan 29, 2021 at 6:29 AM Oscar Salvador  wrote:
>
> On Wed, Jan 27, 2021 at 11:36:15AM +0100, David Hildenbrand wrote:
> > Extending on that, I just discovered that only x86-64, ppc64, and arm64
> > really support hugepage migration.
> >
> > Maybe one approach with the "magic switch" really would be to disable
> > hugepage migration completely in hugepage_migration_supported(), and
> > consequently making hugepage_movable_supported() always return false.
>
> Ok, so migration would not work for these pages, and since they would
> lie in !ZONE_MOVABLE there is no guarantee we can unplug the memory.
> Well, we really cannot unplug it unless the hugepage is not used
> (it can be dissolved at least).
>
> Now to the allocation-when-freeing.
> Current implementation uses GFP_ATOMIC(or wants to use) + forever loop.
> One of the problems I see with GFP_ATOMIC is that gives you access
> to memory reserves, but there are more users using those reserves.
> Then, worst-scenario case we need to allocate 16MB order-0 pages
> to free up 1GB hugepage, so the question would be whether reserves
> really scale to 16MB + more users accessing reserves.
>
> As I said, if anything I would go for an optimistic allocation-try
> , if we fail just refuse to shrink the pool.
> User can always try to shrink it later again via /sys interface.

Yeah. It seems that this is the easy way to move on.
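
For reference, the 16MB figure quoted above can be reproduced with simple
arithmetic. This is a sketch with assumed inputs: vmemmap_bytes() is a
hypothetical helper, and the 4 KiB base page size and 64-byte struct page
are typical x86-64 values that are not stated in this thread. It gives an
upper bound, since it counts every struct page of the huge page.

```c
#include <stdint.h>

/*
 * Bytes of vmemmap (struct page array) that describe one huge page:
 * one struct page per base page contained in the huge page.
 */
static uint64_t vmemmap_bytes(uint64_t hugepage_size, uint64_t base_page_size,
			      uint64_t struct_page_size)
{
	return hugepage_size / base_page_size * struct_page_size;
}
```

A 1 GiB huge page holds 262144 base pages, and 262144 * 64 bytes is 16 MiB,
i.e. up to 4096 order-0 pages to allocate when freeing it.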

Thanks.

>
> Since hugepages would no longer be in ZONE_MOVABLE/CMA and are not
> expected to be migratable, is that ok?
>
> Using the hugepage for the vmemmap array was brought up several times,
> but that would imply fragmenting memory over time.
>
> All in all seems to be overly complicated (I might be wrong).
>
>
> > Huge pages would never get placed onto ZONE_MOVABLE/CMA and cannot be
> > migrated. The problem I describe would apply (careful with using
> > ZONE_MOVABLE), but well, it can at least be documented.
>
> I am not a page allocator expert, but cannot the allocation fall back
> to ZONE_MOVABLE under memory shortage on other zones?
>
>
> --
> Oscar Salvador
> SUSE L3
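
The 16MB figure quoted above can be checked with a little arithmetic. What follows is only a minimal sketch: it assumes 4KB base pages and a 64-byte struct page (typical for x86-64, but neither value is stated in this thread), and the helper names are hypothetical. It gives an upper bound on the vmemmap backing one huge page and does not account for any vmemmap pages the series might keep mapped.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, typical for x86-64 but not stated in the thread. */
#define BASE_PAGE_SIZE   4096ULL   /* bytes per order-0 page */
#define STRUCT_PAGE_SIZE 64ULL     /* bytes per struct page */

/*
 * Bytes of vmemmap (struct page array) describing one huge page of
 * hugepage_size bytes. Hypothetical helper, for illustration only.
 */
static uint64_t vmemmap_bytes(uint64_t hugepage_size)
{
	assert(hugepage_size % BASE_PAGE_SIZE == 0);
	return (hugepage_size / BASE_PAGE_SIZE) * STRUCT_PAGE_SIZE;
}

/* Number of order-0 pages needed to hold that vmemmap, rounded up. */
static uint64_t vmemmap_pages(uint64_t hugepage_size)
{
	return (vmemmap_bytes(hugepage_size) + BASE_PAGE_SIZE - 1) / BASE_PAGE_SIZE;
}

/* Usage: the 1GB case discussed in the thread. */
static void vmemmap_example(void)
{
	assert(vmemmap_bytes(1ULL << 30) == (16ULL << 20)); /* 16MB */
	assert(vmemmap_pages(1ULL << 30) == 4096);          /* order-0 pages */
}
```

Under these assumptions a 1GB huge page needs up to 4096 order-0 allocations to rebuild its vmemmap, which is the scale the concern about memory reserves refers to.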


[PATCH 4/6] platform/chrome: cros_ec_typec: Report SOP' PD revision from status

2021-01-28 Thread Benson Leung
cros_typec_handle_sop_prime_disc now takes the PD revision provided
by the EC_CMD_TYPEC_STATUS command response for the SOP'.

Attach the properly formatted pd_revision to the cable desc before
registering the cable.

Signed-off-by: Benson Leung 
---
 drivers/platform/chrome/cros_ec_typec.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/platform/chrome/cros_ec_typec.c b/drivers/platform/chrome/cros_ec_typec.c
index e724a5eaef1c..30600e9454e1 100644
--- a/drivers/platform/chrome/cros_ec_typec.c
+++ b/drivers/platform/chrome/cros_ec_typec.c
@@ -748,7 +748,7 @@ static void cros_typec_parse_pd_identity(struct usb_pd_identity *id,
id->vdo[i - 3] = disc->discovery_vdo[i];
 }
 
-static int cros_typec_handle_sop_prime_disc(struct cros_typec_data *typec, int port_num)
+static int cros_typec_handle_sop_prime_disc(struct cros_typec_data *typec, int port_num, u16 pd_revision)
 {
struct cros_typec_port *port = typec->ports[port_num];
struct ec_response_typec_discovery *disc = port->disc_data;
@@ -794,6 +794,7 @@ static int cros_typec_handle_sop_prime_disc(struct cros_typec_data *typec, int p
}
 
c_desc.identity = &port->c_identity;
+   c_desc.pd_revision = pd_revision;
 
port->cable = typec_register_cable(port->port, &c_desc);
if (IS_ERR(port->cable)) {
@@ -893,7 +894,11 @@ static void cros_typec_handle_status(struct cros_typec_data *typec, int port_num
 
if (resp.events & PD_STATUS_EVENT_SOP_PRIME_DISC_DONE &&
!typec->ports[port_num]->sop_prime_disc_done) {
-   ret = cros_typec_handle_sop_prime_disc(typec, port_num);
+   u16 sop_prime_revision;
+
+   /* Convert BCD to the format preferred by the TypeC framework */
+   sop_prime_revision = (le16_to_cpu(resp.sop_prime_revision) & 0xff00) >> 4;
+   ret = cros_typec_handle_sop_prime_disc(typec, port_num, sop_prime_revision);
if (ret < 0)
dev_err(typec->dev, "Couldn't parse SOP' Disc data, port: %d\n", port_num);
else
-- 
2.30.0.365.g02bc693789-goog
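
For reference, the mask-and-shift in the last hunk can be exercised on its own in user space. This is a minimal sketch, not the driver code: the helper name is hypothetical, the le16_to_cpu() step is left out (the input is taken as already being in CPU byte order), and the sample inputs 0x3000 and 0x3100 are assumed values chosen for illustration, not something this patch states about the EC.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Keep the upper byte of the EC-reported revision and shift it down by
 * one nibble, mirroring "(x & 0xff00) >> 4" from the patch.
 * Hypothetical helper, for illustration only.
 */
static uint16_t ec_rev_to_typec_rev(uint16_t ec_rev)
{
	return (uint16_t)((ec_rev & 0xff00) >> 4);
}

/* Usage with assumed sample inputs. */
static void ec_rev_example(void)
{
	assert(ec_rev_to_typec_rev(0x3000) == 0x0300);
	assert(ec_rev_to_typec_rev(0x3100) == 0x0310);
	/* The low byte is discarded by the mask. */
	assert(ec_rev_to_typec_rev(0x20ff) == 0x0200);
}
```

With those inputs the result lands in the same "0300H" style layout that patch 1/6 describes for usb_power_delivery_revision.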



[PATCH 1/6] usb: typec: Standardize PD Revision format with Type-C Revision

2021-01-28 Thread Benson Leung
The Type-C Revision was in a specific BCD format "0120H" for 1.2.
USB PD revision numbers follow a similar pattern with "0300H" for 3.0.

Standardize the sysfs format for usb_power_delivery_revision
to align with the BCD format used for usb_typec_revision.

Example values:
- "2.0": USB Power Delivery Release 2.0
- "3.0": USB Power Delivery Release 3.0
- "3.1": USB Power Delivery Release 3.1

Signed-off-by: Benson Leung 
---
 Documentation/ABI/testing/sysfs-class-typec | 7 ++-
 drivers/usb/typec/class.c   | 3 ++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-class-typec b/Documentation/ABI/testing/sysfs-class-typec
index 8eab41e79ce6..b61480535fdc 100644
--- a/Documentation/ABI/testing/sysfs-class-typec
+++ b/Documentation/ABI/testing/sysfs-class-typec
@@ -105,7 +105,12 @@ Date:  April 2017
 Contact:   Heikki Krogerus 
 Description:
Revision number of the supported USB Power Delivery
-   specification, or 0 when USB Power Delivery is not supported.
+   specification, or 0.0 when USB Power Delivery is not supported.
+
+   Example values:
+   - "2.0": USB Power Delivery Release 2.0
+   - "3.0": USB Power Delivery Release 3.0
+   - "3.1": USB Power Delivery Release 3.1
 
 What:  /sys/class/typec/<port>/usb_typec_revision
 Date:  April 2017
diff --git a/drivers/usb/typec/class.c b/drivers/usb/typec/class.c
index 8f77669f9cf4..4f60ee7ba76a 100644
--- a/drivers/usb/typec/class.c
+++ b/drivers/usb/typec/class.c
@@ -1500,8 +1500,9 @@ static ssize_t usb_power_delivery_revision_show(struct device *dev,
char *buf)
 {
struct typec_port *p = to_typec_port(dev);
+   u16 rev = p->cap->pd_revision;
 
-   return sprintf(buf, "%d\n", (p->cap->pd_revision >> 8) & 0xff);
+   return sprintf(buf, "%d.%d\n", (rev >> 8) & 0xff, (rev >> 4) & 0xf);
 }
 static DEVICE_ATTR_RO(usb_power_delivery_revision);
 
-- 
2.30.0.365.g02bc693789-goog
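
To see how the new format string maps BCD values to the documented example strings, the expression from the hunk above can be tried outside the kernel. This is a minimal user-space sketch: the helper name and the use of snprintf() with an explicit buffer length are assumptions made for illustration, while the shifts and masks are copied from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a BCD revision such as 0x0310 as "3.1\n", using the same
 * shifts and masks as the patched show() function.
 * Hypothetical helper, for illustration only.
 */
static int pd_revision_format(char *buf, size_t len, uint16_t rev)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%d.%d\n", (rev >> 8) & 0xff, (rev >> 4) & 0xf);
}

/* Usage: 0x0310 is formatted as "3.1" followed by a newline. */
static void pd_revision_example(void)
{
	char buf[16];

	assert(pd_revision_format(buf, sizeof(buf), 0x0310) == 4);
	assert(buf[0] == '3' && buf[1] == '.' && buf[2] == '1' && buf[3] == '\n');
}
```

A value of 0 is formatted as "0.0", matching the updated ABI text for ports without USB Power Delivery support.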


