date:20200709

Re: [PATCH RFC v8 02/11] vhost: use batched get_vq_desc version

2020-07-09 Thread Michael S. Tsirkin

On Fri, Jul 10, 2020 at 07:39:26AM +0200, Eugenio Perez Martin wrote:
> > > How about playing with the batch size? Make it a mod parameter instead
> > > of the hard coded 64, and measure for all values 1 to 64 ...
> >
> >
> > Right, according to the test result, 64 seems to be too aggressive in
> > the case of TX.
> >
> 
> Got it, thanks both!

In particular, I wonder whether with batch size 1
we get the same performance as without batching
(which would indicate 64 is too aggressive)
or not (which would indicate one of the code changes
affects performance in an unexpected way).

-- 
MST

[v2 PATCH] usb: xhci-mtk: fix the failure of bandwidth allocation

2020-07-09 Thread Chunfeng Yun

The wMaxPacketSize field of an endpoint descriptor may be zero
as the default value in an alternate interface, and such endpoints
are not actually selected when a stream starts, so skip them when
trying to allocate bandwidth.
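
As a rough illustration of the check described above, here is a user-space sketch rather than the driver code. The struct and function names are hypothetical stand-ins; the only assumption carried over from the kernel is that the packet size occupies bits 10:0 of wMaxPacketSize, which is what usb_endpoint_maxp() extracts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 10:0 of wMaxPacketSize carry the maximum packet size. */
#define MAXP_MASK 0x07ffu

/* Hypothetical stand-in for the kernel's endpoint descriptor. */
struct ep_desc {
	uint16_t wMaxPacketSize; /* assumed to be in CPU byte order already */
};

/* Extract the packet size, ignoring the high-bandwidth multiplier bits. */
static unsigned int ep_maxp(const struct ep_desc *desc)
{
	return desc->wMaxPacketSize & MAXP_MASK;
}

/* An endpoint with a zero packet size is skipped by the scheduler. */
static bool wants_bandwidth(const struct ep_desc *desc)
{
	return ep_maxp(desc) != 0;
}
```

A descriptor whose only set bits are in the multiplier field (for example 0x0800) still has a zero packet size in this model and is skipped as well.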

Cc: stable 
Fixes: 0cbd4b34cda9 ("xhci: mediatek: support MTK xHCI host controller")
Signed-off-by: Chunfeng Yun 
---
V2: add Fixes suggested by Nicolas
---
 drivers/usb/host/xhci-mtk-sch.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/usb/host/xhci-mtk-sch.c b/drivers/usb/host/xhci-mtk-sch.c
index fea..45c54d56 100644
--- a/drivers/usb/host/xhci-mtk-sch.c
+++ b/drivers/usb/host/xhci-mtk-sch.c
@@ -557,6 +557,10 @@ static bool need_bw_sch(struct usb_host_endpoint *ep,
 	if (is_fs_or_ls(speed) && !has_tt)
 		return false;
 
+	/* skip endpoint with zero maxpkt */
+	if (usb_endpoint_maxp(&ep->desc) == 0)
+		return false;
+
 	return true;
 }
 
-- 
1.9.1

Re: [PATCH] usb: xhci-mtk: fix the failure of bandwidth allocation

2020-07-09 Thread Chunfeng Yun

On Fri, 2020-07-10 at 11:14 +0800, Nicolas Boichat wrote:
> On Fri, Jul 10, 2020 at 10:30 AM Chunfeng Yun  
> wrote:
> >
> > The wMaxPacketSize field of endpoint descriptor may be zero
> > as default value in alternate interface, and they are not
> > actually selected when start stream, so skip them when try to
> > allocate bandwidth.
> >
> > Cc: stable 
> > Signed-off-by: Chunfeng Yun 
> 
> Add this?
> Fixes: 0cbd4b34cda9dfd ("xhci: mediatek: support MTK xHCI host controller")
Ok, thanks

> 
> > ---
> >  drivers/usb/host/xhci-mtk-sch.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/usb/host/xhci-mtk-sch.c 
> > b/drivers/usb/host/xhci-mtk-sch.c
> > index fea..45c54d56 100644
> > --- a/drivers/usb/host/xhci-mtk-sch.c
> > +++ b/drivers/usb/host/xhci-mtk-sch.c
> > @@ -557,6 +557,10 @@ static bool need_bw_sch(struct usb_host_endpoint *ep,
> >  	if (is_fs_or_ls(speed) && !has_tt)
> >  		return false;
> >
> > +	/* skip endpoint with zero maxpkt */
> > +	if (usb_endpoint_maxp(&ep->desc) == 0)
> > +		return false;
> > +
> >  	return true;
> >  }
> >
> > --
> > 1.9.1

fbconsole needs more parameter validations.

2020-07-09 Thread Tetsuo Handa

Hello.

While trying to debug 
https://syzkaller.appspot.com/bug?extid=017265e8553724e514e8 ,
I noticed that a crash can happen without opening /dev/ttyXX .

For example, a driver which syzbot is reporting accepts a screen with
var.xres = var.yres = 0 (and a crash is not visible until trying to write to
/dev/ttyXX ), while the driver for the VMware environment which I'm using
(dmesg says "fbcon: svgadrmfb (fb0) is primary device") rejects a screen with
var.xres = var.yres = 0. However, specifying var.xres = var.yres = 1 as in the
reproducer below causes a crash in my VMware environment.

--
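#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/fb.h>

int main(int argc, char *argv[])
{
	const int fd = open("/dev/fb0", O_ACCMODE);
	struct fb_var_screeninfo var = { };
	/* Fetch the current mode, then shrink the visible area to 1x1. */
	ioctl(fd, FBIOGET_VSCREENINFO, &var);
	var.xres = var.yres = 1;
	ioctl(fd, FBIOPUT_VSCREENINFO, &var);
	return 0;
}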
--

--
[   20.10] BUG: unable to handle page fault for address: b80500d7b000
[   20.102225] #PF: supervisor write access in kernel mode
[   20.102226] #PF: error_code(0x0002) - not-present page
[   20.102227] PGD 13a48c067 P4D 13a48c067 PUD 13a48d067 PMD 132525067 PTE 0
[   20.102230] Oops: 0002 [#1] SMP
[   20.102232] CPU: 3 PID: 2786 Comm: a.out Not tainted 5.8.0-rc4+ #749
[   20.102233] Hardware name: VMware, Inc. VMware Virtual Platform/440BX 
Desktop Reference Platform, BIOS 6.00 02/27/2020
[   20.102237] RIP: 0010:bitfill_aligned+0x87/0x120 [cfbfillrect]
[   20.102238] Code: c3 45 85 db 0f 85 85 00 00 00 44 89 c0 31 d2 41 f7 f1 89 
c2 83 f8 07 76 41 8d 48 f8 c1 e9 03 48 83 c1 01 48 c1 e1 06 48 01 f1 <48> 89 3e 
48 89 7e 08 48 89 7e 10 48 89 7e 18 48 89 7e 20 48 89 7e
[   20.102239] RSP: 0018:b805012939a8 EFLAGS: 00010206
[   20.102240] RAX: 03fffe70 RBX: 9c20 RCX: b80520982000
[   20.102241] RDX: 03fffe70 RSI: b80500d7b000 RDI: 
[   20.102242] RBP: b805012939b8 R08: 9c20 R09: b80500d7aff8
[   20.102242] R10:  R11:  R12: 
[   20.102243] R13: 976734c0c000 R14:  R15: b80500982c80
[   20.102244] FS:  7f0c9589e740() GS:97673aec() 
knlGS:
[   20.102265] CS:  0010 DS:  ES:  CR0: 80050033
[   20.102265] CR2: b80500d7b000 CR3: 000136cdf004 CR4: 001606e0
[   20.102277] Call Trace:
[   20.102281]  cfb_fillrect+0x159/0x340 [cfbfillrect]
[   20.102385]  ? __mutex_unlock_slowpath+0x158/0x2d0
[   20.102493]  ? cfb_fillrect+0x340/0x340 [cfbfillrect]
[   20.102747]  vmw_fb_fillrect+0x12/0x30 [vmwgfx]
[   20.102755]  bit_clear_margins+0x92/0xf0 [fb]
[   20.102760]  fbcon_clear_margins+0x4c/0x50 [fb]
[   20.102763]  fbcon_switch+0x321/0x570 [fb]
[   20.102771]  redraw_screen+0xe0/0x250
[   20.102775]  fbcon_modechanged+0x164/0x1b0 [fb]
[   20.102779]  fbcon_update_vcs+0x15/0x20 [fb]
[   20.102781]  fb_set_var+0x364/0x3c0 [fb]
[   20.102817]  do_fb_ioctl+0x2ff/0x3f0 [fb]
[   20.102894]  ? find_held_lock+0x35/0xa0
[   20.103126]  ? __audit_syscall_entry+0xd8/0x120
[   20.103135]  ? kfree+0x25a/0x2b0
[   20.103139]  fb_ioctl+0x2e/0x40 [fb]
[   20.103141]  ksys_ioctl+0x86/0xc0
[   20.103144]  ? do_syscall_64+0x20/0xa0
[   20.103146]  __x64_sys_ioctl+0x15/0x20
[   20.103148]  do_syscall_64+0x54/0xa0
[   20.103151]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   20.103152] RIP: 0033:0x7f0c953b8307
[   20.103153] Code: Bad RIP value.
[   20.103154] RSP: 002b:7ffecbdce0f8 EFLAGS: 0246 ORIG_RAX: 
0010
[   20.103155] RAX: ffda RBX: 0003 RCX: 7f0c953b8307
[   20.103156] RDX: 7ffecbdce100 RSI: 4601 RDI: 0003
[   20.103156] RBP:  R08: 7f0c9568be80 R09: 
[   20.103157] R10: 7ffecbdcdb60 R11: 0246 R12: 004004f2
[   20.103158] R13: 7ffecbdce280 R14:  R15: 
[   20.103162] Modules linked in: mousedev rapl evdev input_leds led_class 
mac_hid psmouse pcspkr xt_tcpudp ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 
ipt_REJECT nf_reject_ipv4 xt_conntrack sg ebtable_nat af_packet ip6table_nat 
ip6table_mangle ip6table_raw iptable_nat nf_nat iptable_mangle iptable_raw 
nf_conntrack rtc_cmos nf_defrag_ipv4 ip_set nfnetlink ebtable_filter ebtables 
ip6table_filter ip6_tables iptable_filter bpfilter i2c_piix4 vmw_vmci ac 
intel_agp button intel_gtt ip_tables x_tables ata_generic pata_acpi serio_raw 
atkbd libps2 vmwgfx drm_kms_helper cfbfillrect syscopyarea cfbimgblt 
sysfillrect sysimgblt fb_sys_fops cfbcopyarea fb fbdev ttm drm i2c_core ahci 
drm_panel_orientation_quirks libahci backlight e1000 agpgart ata_piix libata 
i8042 serio unix ipv6 nf_defrag_ipv6
[   20.103194] CR2: b80500d7b000
[   20.103196] ---[ end trace b2348f839f6524f9 ]---
[   20.103198] RIP: 0010:bitfill_aligned+0x87/0x120 [cfbfillrect]
[   20.103200] Code: c3 45 85 db 0f 85 85 00

Re: [PATCH 1/5] lib: Add a generic version of devmem_is_allowed()

2020-07-09 Thread Nick Kossifidis

On 2020-07-10 08:38, Christoph Hellwig wrote:
> On Thu, Jul 09, 2020 at 11:49:21PM +0300, Mike Rapoport wrote:
> > > +#ifndef CONFIG_GENERIC_DEVMEM_IS_ALLOWED
> > > +extern int devmem_is_allowed(unsigned long pfn);
> > > +#endif
>
> Nit: no need for the extern here.
>
> > > +config GENERIC_LIB_DEVMEM_IS_ALLOWED
> > > +	bool
> > > +	select ARCH_HAS_DEVMEM_IS_ALLOWED
> >
> > This seems to work the other way around from the usual Kconfig chains.
> > In the most cases ARCH_HAS_SOMETHING selects GENERIC_SOMETHING.
> >
> > I believe nicer way would be to make
> >
> > config STRICT_DEVMEM
> > 	bool "Filter access to /dev/mem"
> > 	depends on MMU && DEVMEM
> > 	depends on ARCH_HAS_DEVMEM_IS_ALLOWED || GENERIC_LIB_DEVMEM_IS_ALLOWED
> >
> > config GENERIC_LIB_DEVMEM_IS_ALLOWED
> > 	bool
> >
> > and then s/select ARCH_HAS_DEVMEM_IS_ALLOWED/select GENERIC_LIB_DEVMEM_IS_ALLOWED/
> > in the arch Kconfigs and drop ARCH_HAS_DEVMEM_IS_ALLOWED in the end.
>
> To take a step back:  Is there any reason to not just always
> STRICT_DEVMEM? Maybe for a few architectures that don't currently
> support a strict /dev/mem the generic version isn't quite correct, but
> someone selecting the option and finding the issue is the best way to
> figure that out..

During prototyping / testing having full access to all physical memory 
through /dev/mem is very useful. We should have it enabled by default 
but leave the config option there so that users / developers can disable 
it if needed IMHO.

[PATCH] vt: Reject zero-sized screen buffer size.

2020-07-09 Thread Tetsuo Handa

syzbot is reporting a general protection fault in do_con_write() [1]. The
fault is caused by vc->vc_screenbuf == ZERO_SIZE_PTR, which is caused by
vc->vc_screenbuf_size == 0, which in turn is caused by
vc->vc_cols == vc->vc_rows == vc->vc_size_row == 0 being passed to the
ioctl(FBIOPUT_VSCREENINFO) request on /dev/fb0 . On such a console,
gotoxy(vc, 0, 0) from reset_terminal() from vc_init() from vc_allocate()
causes vc->vc_pos == 0x10000000e due to
((unsigned long) ZERO_SIZE_PTR) + -1U * 0 + (-1U << 1).
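
The arithmetic in the last expression can be checked outside the kernel. The sketch below is only a model: it assumes a 64-bit unsigned long and ZERO_SIZE_PTR being address 16 (as the kernel's slab header defines it), and the function name is made up for illustration.

```c
#include <stdint.h>

/* ZERO_SIZE_PTR is ((void *)16) in the kernel; modelled as an integer here. */
#define ZERO_SIZE_PTR_VALUE UINT64_C(16)

/* Evaluate origin + y * size_row + (x << 1) for a 0x0 console. */
static uint64_t bogus_vc_pos(void)
{
	uint32_t size_row = 0;                        /* vc_size_row == 0 */
	uint32_t row_term = (uint32_t)-1 * size_row;  /* -1U * 0 == 0 */
	uint32_t col_term = (uint32_t)-1 << 1;        /* -1U << 1 == 0xfffffffe */

	/* The 32-bit terms are widened before being added to the pointer value. */
	return ZERO_SIZE_PTR_VALUE + row_term + col_term;
}
```

With these assumptions the sum is 16 + 0 + 0xfffffffe, that is 0x10000000e, which is not a valid kernel address.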

I don't think that a console with 0 columns and/or 0 rows makes sense, and
I think that we can reject such bogus arguments in fb_set_var() from
ioctl(FBIOPUT_VSCREENINFO). Regardless, I think that it is safer to also
check for ZERO_SIZE_PTR when allocating vc->vc_screenbuf from vc_allocate()
 from con_install() from tty_init_dev() from tty_open().
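
To show why a plain NULL test is not enough here, the following user-space sketch models the allocator's behaviour for zero-byte requests. model_kzalloc() and screenbuf_usable() are hypothetical helpers; the two macros mirror the kernel's definitions, except that uintptr_t is used instead of unsigned long.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ZERO_SIZE_PTR ((void *)16)
#define ZERO_OR_NULL_PTR(x) ((uintptr_t)(x) <= (uintptr_t)ZERO_SIZE_PTR)

/*
 * Hypothetical model of kzalloc(): a zero-byte request returns
 * ZERO_SIZE_PTR, and a NULL backing pointer models allocation failure.
 */
static void *model_kzalloc(size_t size, void *backing)
{
	if (size == 0)
		return ZERO_SIZE_PTR;
	return backing;
}

/* The stricter check: rejects both NULL and ZERO_SIZE_PTR. */
static bool screenbuf_usable(void *buf)
{
	return !ZERO_OR_NULL_PTR(buf);
}
```

In this model a zero-sized request yields a non-NULL pointer, so `if (!buf)` would let it through, while the ZERO_OR_NULL_PTR() test catches it.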

[1] https://syzkaller.appspot.com/bug?extid=017265e8553724e514e8

Reported-by: syzbot 
Signed-off-by: Tetsuo Handa 
---
 drivers/tty/vt/vt.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 48a8199f7845..8497e9206607 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -1126,7 +1126,7 @@ int vc_allocate(unsigned int currcons) /* return 0 on success */
con_set_default_unimap(vc);
 
vc->vc_screenbuf = kzalloc(vc->vc_screenbuf_size, GFP_KERNEL);
-   if (!vc->vc_screenbuf)
+   if (ZERO_OR_NULL_PTR(vc->vc_screenbuf))
goto err_free;
 
/* If no drivers have overridden us and the user didn't pass a
@@ -1212,7 +1212,7 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
if (new_cols == vc->vc_cols && new_rows == vc->vc_rows)
return 0;
 
-   if (new_screen_size > KMALLOC_MAX_SIZE)
+   if (new_screen_size > KMALLOC_MAX_SIZE || !new_screen_size)
return -EINVAL;
newscreen = kzalloc(new_screen_size, GFP_USER);
if (!newscreen)
@@ -3393,6 +3393,7 @@ static int __init con_init(void)
INIT_WORK(&vc_cons[currcons].SAK_work, vc_SAK);
tty_port_init(&vc->port);
visual_init(vc, currcons, 1);
+   /* Assuming vc->vc_screenbuf_size is sane here, for this is __init code. */
vc->vc_screenbuf = kzalloc(vc->vc_screenbuf_size, GFP_NOWAIT);
vc_init(vc, vc->vc_rows, vc->vc_cols,
currcons || !vc->vc_sw->con_save_screen);
-- 
2.18.4

Re: [PATCH 1/5] lib: Add a generic version of devmem_is_allowed()

2020-07-09 Thread Christoph Hellwig

On Fri, Jul 10, 2020 at 08:48:17AM +0300, Nick Kossifidis wrote:
> On 2020-07-10 08:38, Christoph Hellwig wrote:
> > On Thu, Jul 09, 2020 at 11:49:21PM +0300, Mike Rapoport wrote:
> > > > +#ifndef CONFIG_GENERIC_DEVMEM_IS_ALLOWED
> > > > +extern int devmem_is_allowed(unsigned long pfn);
> > > > +#endif
> > 
> > Nit: no need for the extern here.
> > 
> > > > +config GENERIC_LIB_DEVMEM_IS_ALLOWED
> > > > +   bool
> > > > +   select ARCH_HAS_DEVMEM_IS_ALLOWED
> > > 
> > > This seems to work the other way around from the usual Kconfig chains.
> > > In the most cases ARCH_HAS_SOMETHING selects GENERIC_SOMETHING.
> > > 
> > > I believe nicer way would be to make
> > > 
> > > config STRICT_DEVMEM
> > >   bool "Filter access to /dev/mem"
> > >   depends on MMU && DEVMEM
> > >   depends on ARCH_HAS_DEVMEM_IS_ALLOWED ||
> > > GENERIC_LIB_DEVMEM_IS_ALLOWED
> > > 
> > > config GENERIC_LIB_DEVMEM_IS_ALLOWED
> > >   bool
> > > 
> > > and then s/select ARCH_HAS_DEVMEM_IS_ALLOWED/select
> > > GENERIC_LIB_DEVMEM_IS_ALLOWED/
> > > in the arch Kconfigs and drop ARCH_HAS_DEVMEM_IS_ALLOWED in the end.
> > 
> > To take a step back:  Is there any reason to not just always
> > STRICT_DEVMEM? Maybe for a few architectures that don't currently
> > support a strict /dev/mem the generic version isn't quite correct, but
> > someone selecting the option and finding the issue is the best way to
> > figure that out..
> > 
> 
> During prototyping / testing having full access to all physical memory
> through /dev/mem is very useful. We should have it enabled by default but
> leave the config option there so that users / developers can disable it if
> needed IMHO.

I did not suggest to take the config option away.  Just to
unconditionally allow enabling the option on all architectures.

Re: [PATCH] ARM: dts: prima: Align L2 cache-controller nodename with dtschema

2020-07-09 Thread Barry Song

On Fri, Jun 26, 2020 at 8:06 PM Krzysztof Kozlowski wrote:
>
> Fix dtschema validator warnings like:
> l2-cache-controller@8004: $nodename:0:
> 'l2-cache-controller@8004' does not match 
> '^(cache-controller|cpu)(@[0-9a-f,]+)*$'
>
> Signed-off-by: Krzysztof Kozlowski 

Acked-by: Barry Song 

> ---
>  arch/arm/boot/dts/prima2.dtsi | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm/boot/dts/prima2.dtsi b/arch/arm/boot/dts/prima2.dtsi
> index 9c7b46b90c3c..7d3d93c22ed9 100644
> --- a/arch/arm/boot/dts/prima2.dtsi
> +++ b/arch/arm/boot/dts/prima2.dtsi
> @@ -50,7 +50,7 @@
> #size-cells = <1>;
> ranges = <0x4000 0x4000 0x8000>;
>
> -   l2-cache-controller@8004 {
> +   cache-controller@8004 {
> compatible = "arm,pl310-cache";
> reg = <0x8004 0x1000>;
> interrupts = <59>;
> --
> 2.17.1
>

RE: [PATCH v3 4/4] iommu/vt-d: Add page response ops support

2020-07-09 Thread Tian, Kevin

> From: Lu Baolu 
> Sent: Friday, July 10, 2020 1:37 PM
> 
> Hi Kevin,
> 
> On 2020/7/10 10:42, Tian, Kevin wrote:
> >> From: Lu Baolu 
> >> Sent: Thursday, July 9, 2020 3:06 PM
> >>
> >> After page requests are handled, software must respond to the device
> >> which raised the page request with the result. This is done through
> >> the iommu ops.page_response if the request was reported to outside of
> >> vendor iommu driver through iommu_report_device_fault(). This adds
> the
> >> VT-d implementation of page_response ops.
> >>
> >> Co-developed-by: Jacob Pan 
> >> Signed-off-by: Jacob Pan 
> >> Co-developed-by: Liu Yi L 
> >> Signed-off-by: Liu Yi L 
> >> Signed-off-by: Lu Baolu 
> >> ---
> >>   drivers/iommu/intel/iommu.c |   1 +
> >>   drivers/iommu/intel/svm.c   | 100
> >> 
> >>   include/linux/intel-iommu.h |   3 ++
> >>   3 files changed, 104 insertions(+)
> >>
> >> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> >> index 4a6b6960fc32..98390a6d8113 100644
> >> --- a/drivers/iommu/intel/iommu.c
> >> +++ b/drivers/iommu/intel/iommu.c
> >> @@ -6057,6 +6057,7 @@ const struct iommu_ops intel_iommu_ops = {
> >>.sva_bind   = intel_svm_bind,
> >>.sva_unbind = intel_svm_unbind,
> >>.sva_get_pasid  = intel_svm_get_pasid,
> >> +  .page_response  = intel_svm_page_response,
> >>   #endif
> >>   };
> >>
> >> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> >> index d24e71bac8db..839d2af377b6 100644
> >> --- a/drivers/iommu/intel/svm.c
> >> +++ b/drivers/iommu/intel/svm.c
> >> @@ -1082,3 +1082,103 @@ int intel_svm_get_pasid(struct iommu_sva
> *sva)
> >>
> >>return pasid;
> >>   }
> >> +
> >> +int intel_svm_page_response(struct device *dev,
> >> +  struct iommu_fault_event *evt,
> >> +  struct iommu_page_response *msg)
> >> +{
> >> +  struct iommu_fault_page_request *prm;
> >> +  struct intel_svm_dev *sdev = NULL;
> >> +  struct intel_svm *svm = NULL;
> >> +  struct intel_iommu *iommu;
> >> +  bool private_present;
> >> +  bool pasid_present;
> >> +  bool last_page;
> >> +  u8 bus, devfn;
> >> +  int ret = 0;
> >> +  u16 sid;
> >> +
> >> +  if (!dev || !dev_is_pci(dev))
> >> +  return -ENODEV;
> >> +
> >> +  iommu = device_to_iommu(dev, &bus, &devfn);
> >> +  if (!iommu)
> >> +  return -ENODEV;
> >> +
> >> +  if (!msg || !evt)
> >> +  return -EINVAL;
> >> +
> >> +  mutex_lock(&pasid_mutex);
> >> +
> >> +  prm = &evt->fault.prm;
> >> +  sid = PCI_DEVID(bus, devfn);
> >> +  pasid_present = prm->flags &
> >> IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
> >> +  private_present = prm->flags &
> >> IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA;
> >> +  last_page = prm->flags &
> >> IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;
> >> +
> >> +  if (pasid_present) {
> >> +  if (prm->pasid == 0 || prm->pasid >= PASID_MAX) {
> >> +  ret = -EINVAL;
> >> +  goto out;
> >> +  }
> >> +
> >> +  ret = pasid_to_svm_sdev(dev, prm->pasid, &svm, &sdev);
> >> +  if (ret || !sdev) {
> >> +  ret = -ENODEV;
> >> +  goto out;
> >> +  }
> >> +
> >> +  /*
> >> +   * For responses from userspace, need to make sure that the
> >> +   * pasid has been bound to its mm.
> >> +  */
> >> +  if (svm->flags & SVM_FLAG_GUEST_MODE) {
> >> +  struct mm_struct *mm;
> >> +
> >> +  mm = get_task_mm(current);
> >> +  if (!mm) {
> >> +  ret = -EINVAL;
> >> +  goto out;
> >> +  }
> >> +
> >> +  if (mm != svm->mm) {
> >> +  ret = -ENODEV;
> >> +  mmput(mm);
> >> +  goto out;
> >> +  }
> >> +
> >> +  mmput(mm);
> >> +  }
> >> +  } else {
> >> +  pr_err_ratelimited("Invalid page response: no pasid\n");
> >> +  ret = -EINVAL;
> >> +  goto out;
> >
> > check pasid=0 first, then no need to indent so many lines above.
> 
> Yes.
> 
> >
> >> +  }
> >> +
> >> +  /*
> >> +   * Per VT-d spec. v3.0 ch7.7, system software must respond
> >> +   * with page group response if private data is present (PDP)
> >> +   * or last page in group (LPIG) bit is set. This is an
> >> +   * additional VT-d requirement beyond PCI ATS spec.
> >> +   */
> >
> > What is the behavior if system software doesn't follow the requirement?
> > en... maybe the question is really about whether the information in prm
> > comes from userspace or from internally-recorded info in iommu core.
> > The former cannot be trusted. The latter one is OK.
> 
> We require a page response when reporting such event. The upper layer
> (IOMMU core or VFIO) will be implemented with a timer, if userspace
> doesn't respond in time, the timer will get expired and a FAILURE
> response

Beginning 5.8rc1 kernel USB headsets (ASUS ROG Delta and HyperX Cloud Orbit S) play sound as if in slow-motion.

2020-07-09 Thread Mikhail Gavrilov

Beginning with 5.8-rc1 (git 69119673bd50), kernel USB headsets (ASUS ROG
Delta and HyperX Cloud Orbit S) play sound as if in slow motion.

In 5.8-rc4 (git dcde237b9b0e) this is still not fixed.
Bisecting is problematic because rc1 also has another issue,
https://lkml.org/lkml/2020/6/22/21 , which completely breaks the sound
subsystem.
If anyone can tell me how to fix https://lkml.org/lkml/2020/6/22/21 , I can
bisect this issue.

--
Best Regards,
Mike Gavrilov.

[rcu:dev.2020.07.09a] BUILD SUCCESS f40a25f3eefe71e02df81089e3331eb271ff55be

2020-07-09 Thread kernel test robot

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git  dev.2020.07.09a
branch HEAD: f40a25f3eefe71e02df81089e3331eb271ff55be  squash! scftorture: Add smp_call_function_single() memory-ordering checks

elapsed time: 722m

configs tested: 106
configs skipped: 7

The following configs have been built successfully.
More configs may be tested in the coming days.

arm         defconfig
arm         allyesconfig
arm         allmodconfig
arm         allnoconfig
arm64       allyesconfig
arm64       allmodconfig
arm64       allnoconfig
arm64       defconfig
mips        rt305x_defconfig
sh          defconfig
nios2       alldefconfig
m68k        apollo_defconfig
mips        omega2p_defconfig
powerpc     mpc512x_defconfig
arm         spear6xx_defconfig
m68k        m5249evb_defconfig
arm         multi_v7_defconfig
powerpc     holly_defconfig
x86_64      defconfig
arm         ebsa110_defconfig
powerpc     tqm8xx_defconfig
powerpc     ep88xc_defconfig
arm         at91_dt_defconfig
m68k        hp300_defconfig
powerpc     mpc866_ads_defconfig
powerpc64   defconfig
um          i386_defconfig
openrisc    allyesconfig
sh          edosk7705_defconfig
i386        allnoconfig
i386        allyesconfig
i386        defconfig
i386        debian-10.3
ia64        allmodconfig
ia64        defconfig
ia64        allnoconfig
ia64        allyesconfig
m68k        allmodconfig
m68k        allnoconfig
m68k        sun3_defconfig
m68k        defconfig
m68k        allyesconfig
nds32       defconfig
nds32       allnoconfig
csky        allyesconfig
csky        defconfig
alpha       defconfig
alpha       allyesconfig
xtensa      allyesconfig
h8300       allyesconfig
h8300       allmodconfig
xtensa      defconfig
arc         defconfig
arc         allyesconfig
sh          allmodconfig
sh          allnoconfig
microblaze  allnoconfig
nios2       defconfig
nios2       allyesconfig
openrisc    defconfig
c6x         allyesconfig
c6x         allnoconfig
mips        allyesconfig
mips        allnoconfig
mips        allmodconfig
parisc      allnoconfig
parisc      defconfig
parisc      allyesconfig
parisc      allmodconfig
powerpc     defconfig
powerpc     allyesconfig
powerpc     rhel-kconfig
powerpc     allmodconfig
powerpc     allnoconfig
i386        randconfig-a002-20200709
i386        randconfig-a001-20200709
i386        randconfig-a006-20200709
i386        randconfig-a005-20200709
i386        randconfig-a004-20200709
i386        randconfig-a003-20200709
riscv       allyesconfig
riscv       allnoconfig
riscv       defconfig
riscv       allmodconfig
s390        allyesconfig
s390        allnoconfig
s390        allmodconfig
s390        defconfig
sparc       allyesconfig
sparc       defconfig
sparc64     defconfig
sparc64     allnoconfig
sparc64     allyesconfig
sparc64     allmodconfig
um          allmodconfig
um          allyesconfig
um          allnoconfig
um          defconfig
x86_64      rhel-7.6
x86_64      rhel-7.6-kselftests

[rcu:dev.2020.07.08a] BUILD SUCCESS cb2d297b305225704e58759373684f365ae103d4

2020-07-09 Thread kernel test robot

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git  dev.2020.07.08a
branch HEAD: cb2d297b305225704e58759373684f365ae103d4  rcu: Fix kerneldoc comments in rcuupdate.h

elapsed time: 722m

configs tested: 106
configs skipped: 7

The following configs have been built successfully.
More configs may be tested in the coming days.

arm         defconfig
arm         allyesconfig
arm         allmodconfig
arm         allnoconfig
arm64       allyesconfig
arm64       allmodconfig
arm64       allnoconfig
arm64       defconfig
mips        rt305x_defconfig
sh          defconfig
nios2       alldefconfig
m68k        apollo_defconfig
mips        omega2p_defconfig
powerpc     mpc512x_defconfig
arm         spear6xx_defconfig
m68k        m5249evb_defconfig
arm         multi_v7_defconfig
powerpc     holly_defconfig
x86_64      defconfig
arm         ebsa110_defconfig
powerpc     tqm8xx_defconfig
powerpc     ep88xc_defconfig
arm         at91_dt_defconfig
m68k        hp300_defconfig
powerpc     mpc866_ads_defconfig
powerpc64   defconfig
um          i386_defconfig
openrisc    allyesconfig
sh          edosk7705_defconfig
i386        allnoconfig
i386        allyesconfig
i386        defconfig
i386        debian-10.3
ia64        allmodconfig
ia64        defconfig
ia64        allnoconfig
ia64        allyesconfig
m68k        allmodconfig
m68k        allnoconfig
m68k        sun3_defconfig
m68k        defconfig
m68k        allyesconfig
nds32       defconfig
nds32       allnoconfig
csky        allyesconfig
csky        defconfig
alpha       defconfig
alpha       allyesconfig
xtensa      allyesconfig
h8300       allyesconfig
h8300       allmodconfig
xtensa      defconfig
arc         defconfig
arc         allyesconfig
sh          allmodconfig
sh          allnoconfig
microblaze  allnoconfig
nios2       defconfig
nios2       allyesconfig
openrisc    defconfig
c6x         allyesconfig
c6x         allnoconfig
mips        allyesconfig
mips        allnoconfig
mips        allmodconfig
parisc      allnoconfig
parisc      defconfig
parisc      allyesconfig
parisc      allmodconfig
powerpc     defconfig
powerpc     allyesconfig
powerpc     rhel-kconfig
powerpc     allmodconfig
powerpc     allnoconfig
i386        randconfig-a002-20200709
i386        randconfig-a001-20200709
i386        randconfig-a006-20200709
i386        randconfig-a005-20200709
i386        randconfig-a004-20200709
i386        randconfig-a003-20200709
riscv       allyesconfig
riscv       allnoconfig
riscv       defconfig
riscv       allmodconfig
s390        allyesconfig
s390        allnoconfig
s390        allmodconfig
s390        defconfig
sparc       allyesconfig
sparc       defconfig
sparc64     defconfig
sparc64     allnoconfig
sparc64     allyesconfig
sparc64     allmodconfig
um          allmodconfig
um          allyesconfig
um          allnoconfig
um          defconfig
x86_64      rhel-7.6
x86_64      rhel-7.6-kselftests
x86_64

Re: [linux-sunxi] [PATCH 01/16] ASoC: sun4i-i2s: Add support for H6 I2S

2020-07-09 Thread Samuel Holland

On 7/4/20 6:38 AM, Clément Péron wrote:
> From: Jernej Skrabec 
> 
> H6 I2S is very similar to that in H3, except it supports up to 16
> channels.
> 
> Signed-off-by: Jernej Skrabec 
> Signed-off-by: Marcus Cooper 
> Signed-off-by: Clément Péron 
> ---
>  sound/soc/sunxi/sun4i-i2s.c | 227 
>  1 file changed, 227 insertions(+)
> 
> diff --git a/sound/soc/sunxi/sun4i-i2s.c b/sound/soc/sunxi/sun4i-i2s.c
> index d0a8d5810c0a..9690389cb68e 100644
> --- a/sound/soc/sunxi/sun4i-i2s.c
> +++ b/sound/soc/sunxi/sun4i-i2s.c
> @@ -124,6 +124,21 @@
>  #define SUN8I_I2S_RX_CHAN_SEL_REG	0x54
>  #define SUN8I_I2S_RX_CHAN_MAP_REG	0x58
>  
> +/* Defines required for sun50i-h6 support */
> +#define SUN50I_H6_I2S_TX_CHAN_SEL_OFFSET_MASK	GENMASK(21, 20)
> +#define SUN50I_H6_I2S_TX_CHAN_SEL_OFFSET(offset)	((offset) << 20)
> +#define SUN50I_H6_I2S_TX_CHAN_SEL_MASK		GENMASK(19, 16)
> +#define SUN50I_H6_I2S_TX_CHAN_SEL(chan)		((chan - 1) << 16)
> +#define SUN50I_H6_I2S_TX_CHAN_EN_MASK		GENMASK(15, 0)
> +#define SUN50I_H6_I2S_TX_CHAN_EN(num_chan)	(((1 << num_chan) - 1))
> +
> +#define SUN50I_H6_I2S_TX_CHAN_MAP0_REG	0x44
> +#define SUN50I_H6_I2S_TX_CHAN_MAP1_REG	0x48
> +
> +#define SUN50I_H6_I2S_RX_CHAN_SEL_REG	0x64
> +#define SUN50I_H6_I2S_RX_CHAN_MAP0_REG	0x68
> +#define SUN50I_H6_I2S_RX_CHAN_MAP1_REG	0x6C
> +
> +
>  struct sun4i_i2s;
>  
>  /**
> @@ -466,6 +481,65 @@ static int sun8i_i2s_set_chan_cfg(const struct sun4i_i2s *i2s,
>   return 0;
>  }
>  
> +static int sun50i_i2s_set_chan_cfg(const struct sun4i_i2s *i2s,
> +const struct snd_pcm_hw_params *params)
> +{
> + unsigned int channels = params_channels(params);
> + unsigned int slots = channels;
> + unsigned int lrck_period;
> +
> + if (i2s->slots)
> + slots = i2s->slots;
> +
> + /* Map the channels for playback and capture */
> + regmap_write(i2s->regmap, SUN50I_H6_I2S_TX_CHAN_MAP1_REG, 0x76543210);
> + regmap_write(i2s->regmap, SUN50I_H6_I2S_RX_CHAN_MAP1_REG, 0x76543210);
> +
> + /* Configure the channels */
> + regmap_update_bits(i2s->regmap, SUN8I_I2S_TX_CHAN_SEL_REG,
> +SUN50I_H6_I2S_TX_CHAN_SEL_MASK,
> +SUN50I_H6_I2S_TX_CHAN_SEL(channels));
> + regmap_update_bits(i2s->regmap, SUN50I_H6_I2S_RX_CHAN_SEL_REG,
> +SUN50I_H6_I2S_TX_CHAN_SEL_MASK,
> +SUN50I_H6_I2S_TX_CHAN_SEL(channels));
> +
> + regmap_update_bits(i2s->regmap, SUN8I_I2S_CHAN_CFG_REG,
> +SUN8I_I2S_CHAN_CFG_TX_SLOT_NUM_MASK,
> +SUN8I_I2S_CHAN_CFG_TX_SLOT_NUM(channels));
> + regmap_update_bits(i2s->regmap, SUN8I_I2S_CHAN_CFG_REG,
> +SUN8I_I2S_CHAN_CFG_RX_SLOT_NUM_MASK,
> +SUN8I_I2S_CHAN_CFG_RX_SLOT_NUM(channels));
> +
> + switch (i2s->format & SND_SOC_DAIFMT_FORMAT_MASK) {
> + case SND_SOC_DAIFMT_DSP_A:
> + case SND_SOC_DAIFMT_DSP_B:
> + case SND_SOC_DAIFMT_LEFT_J:
> + case SND_SOC_DAIFMT_RIGHT_J:

According to the manual, LEFT_J and RIGHT_J should use the same calculation as
I2S, not the one for PCM/DSP.

> + lrck_period = params_physical_width(params) * slots;
> + break;
> +
> + case SND_SOC_DAIFMT_I2S:
> + lrck_period = params_physical_width(params);
> + break;
> +
> + default:
> + return -EINVAL;
> + }
> +
> + if (i2s->slot_width)
> + lrck_period = i2s->slot_width;
> +
> + regmap_update_bits(i2s->regmap, SUN4I_I2S_FMT0_REG,
> +SUN8I_I2S_FMT0_LRCK_PERIOD_MASK,
> +SUN8I_I2S_FMT0_LRCK_PERIOD(lrck_period));

From the description in the manual, this looks off by one. The number of BCLKs
per LRCK is LRCK_PERIOD + 1.
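
Putting the two remarks above together, the calculation might look like the sketch below. It is a standalone illustration, not the driver: the enum and both function names are made up here, and the format grouping and the minus-one register encoding follow these review comments rather than an independent reading of the manual.

```c
/* Hypothetical format tags standing in for the SND_SOC_DAIFMT_* values. */
enum dai_fmt { FMT_I2S, FMT_LEFT_J, FMT_RIGHT_J, FMT_DSP_A, FMT_DSP_B };

/* Number of BCLKs per LRCK period, or -1 for an unknown format. */
static int lrck_bclks(enum dai_fmt fmt, int width, int slots)
{
	switch (fmt) {
	case FMT_I2S:
	case FMT_LEFT_J:
	case FMT_RIGHT_J:
		/* Justified formats use the same calculation as I2S. */
		return width;
	case FMT_DSP_A:
	case FMT_DSP_B:
		/* PCM/DSP modes span all slots in one period. */
		return width * slots;
	}
	return -1;
}

/* Register field value: the hardware counts LRCK_PERIOD + 1 BCLKs. */
static int lrck_period_field(enum dai_fmt fmt, int width, int slots)
{
	int bclks = lrck_bclks(fmt, width, slots);

	return bclks > 0 ? bclks - 1 : -1;
}
```

For example, 16-bit stereo in LEFT_J would give 16 BCLKs and a field value of 15 under these assumptions, while DSP_A would give 32 and 31.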

> +
> + regmap_update_bits(i2s->regmap, SUN8I_I2S_TX_CHAN_SEL_REG,
> +SUN50I_H6_I2S_TX_CHAN_EN_MASK,
> +SUN50I_H6_I2S_TX_CHAN_EN(channels));
> +
> + return 0;
> +}
> +
>  static int sun4i_i2s_hw_params(struct snd_pcm_substream *substream,
>  struct snd_pcm_hw_params *params,
>  struct snd_soc_dai *dai)
> @@ -691,6 +765,108 @@ static int sun8i_i2s_set_soc_fmt(const struct sun4i_i2s *i2s,
>   return 0;
>  }
>  
> +static int sun50i_i2s_set_soc_fmt(const struct sun4i_i2s *i2s,
> +   unsigned int fmt)
> +{
> + u32 mode, val;
> + u8 offset;
> +
> + /*
> +  * DAI clock polarity
> +  *
> +  * The setup for LRCK contradicts the datasheet, but under a
> +  * scope it's clear that the LRCK polarity is reversed
> +  * compared to the expected polarity on the bus.
> +  */

This comment makes us sound a lot more confident than I think we actually are.

Regards,
Samuel

Re: [linux-sunxi] [PATCH 02/16] ASoC: sun4i-i2s: Adjust LRCLK width

2020-07-09 Thread Samuel Holland

On 7/4/20 6:38 AM, Clément Péron wrote:
> From: Marcus Cooper 
> 
> Some codecs such as i2s based HDMI audio and the Pine64 DAC require
> a different amount of bit clocks per frame than what is calculated
> by the sample width. Use the values obtained by the tdm slot bindings
> to adjust the LRCLK width accordingly.
> 
> Signed-off-by: Marcus Cooper 
> Signed-off-by: Clément Péron 
> Acked-by: Maxime Ripard 
> ---
>  sound/soc/sunxi/sun4i-i2s.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/sound/soc/sunxi/sun4i-i2s.c b/sound/soc/sunxi/sun4i-i2s.c
> index 9690389cb68e..8bae97efea30 100644
> --- a/sound/soc/sunxi/sun4i-i2s.c
> +++ b/sound/soc/sunxi/sun4i-i2s.c
> @@ -470,6 +470,9 @@ static int sun8i_i2s_set_chan_cfg(const struct sun4i_i2s *i2s,
>   return -EINVAL;
>   }
>  
> + if (i2s->slot_width)
> + lrck_period = i2s->slot_width;
> +
>   regmap_update_bits(i2s->regmap, SUN4I_I2S_FMT0_REG,
>  SUN8I_I2S_FMT0_LRCK_PERIOD_MASK,
>  SUN8I_I2S_FMT0_LRCK_PERIOD(lrck_period));
> 

It looks like the existing code would have the same problem, that this should be
lrck_period - 1 according to the manual (I checked H3).

Regards,
Samuel

Re: [linux-sunxi] [PATCH 05/16] ASoc: sun4i-i2s: Add 20 and 24 bit support

2020-07-09 Thread Samuel Holland

On 7/4/20 6:38 AM, Clément Péron wrote:
> From: Marcus Cooper 
> 
> Extend the functionality of the driver to include support of 20 and
> 24 bits per sample.
> 
> Signed-off-by: Marcus Cooper 
> Signed-off-by: Clément Péron 
> ---
>  sound/soc/sunxi/sun4i-i2s.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/sound/soc/sunxi/sun4i-i2s.c b/sound/soc/sunxi/sun4i-i2s.c
> index f78167e152ce..bc7f9343bc7a 100644
> --- a/sound/soc/sunxi/sun4i-i2s.c
> +++ b/sound/soc/sunxi/sun4i-i2s.c
> @@ -577,6 +577,9 @@ static int sun4i_i2s_hw_params(struct snd_pcm_substream *substream,
>   case 16:
>   width = DMA_SLAVE_BUSWIDTH_2_BYTES;
>   break;
> + case 32:
> + width = DMA_SLAVE_BUSWIDTH_4_BYTES;
> + break;

This breaks the sun4i variants, because sun4i_i2s_get_wss returns 4 for a 32 bit
width, but it needs to return 3.
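
A small sketch of the mismatch, under stated assumptions: wss_linear() assumes the existing helper computes (width - 16) / 4, which is consistent with the value 4 mentioned above, and wss_sun4i() assumes the sun4i field simply enumerates the supported widths in order. Both function names are hypothetical, and only the 32-bit result is taken from this review.

```c
/* Assumed shape of the current helper: linear in the sample width. */
static int wss_linear(int width)
{
	if (width < 16 || width > 32 || width % 4)
		return -1;
	return (width - 16) / 4;
}

/* Assumed sun4i encoding: an ordered table in which 32 bits maps to 3. */
static int wss_sun4i(int width)
{
	switch (width) {
	case 16:
		return 0;
	case 20:
		return 1;
	case 24:
		return 2;
	case 32:
		return 3;
	default:
		return -1;
	}
}
```

Under these assumptions the two helpers agree for 16, 20, and 24 bits and differ only at 32 bits, which is the case the new DMA width exposes.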

As a side note, I wonder why we use the physical width (the spacing between
samples in RAM) to drive the slot width. S24_LE takes up 4 bytes per sample in
RAM, which we need for DMA. But I don't see why we would want to transmit the
padding over the wire. I would expect it to be transmitted the same as S24_3LE
(which has no padding). It did not matter before, because the only supported
format had no padding.

Regards,
Samuel

>   default:
>   dev_err(dai->dev, "Unsupported physical sample width: %d\n",
>   params_physical_width(params));
> @@ -1063,6 +1066,10 @@ static int sun4i_i2s_dai_probe(struct snd_soc_dai *dai)
>   return 0;
>  }
>  
> +#define SUN4I_FORMATS	(SNDRV_PCM_FMTBIT_S16_LE | \
> +  SNDRV_PCM_FMTBIT_S20_LE | \
> +  SNDRV_PCM_FMTBIT_S24_LE)
> +
>  static struct snd_soc_dai_driver sun4i_i2s_dai = {
>   .probe = sun4i_i2s_dai_probe,
>   .capture = {
> @@ -1070,14 +1077,14 @@ static struct snd_soc_dai_driver sun4i_i2s_dai = {
>   .channels_min = 1,
>   .channels_max = 8,
>   .rates = SNDRV_PCM_RATE_8000_192000,
> - .formats = SNDRV_PCM_FMTBIT_S16_LE,
> + .formats = SUN4I_FORMATS,
>   },
>   .playback = {
>   .stream_name = "Playback",
>   .channels_min = 1,
>   .channels_max = 8,
>   .rates = SNDRV_PCM_RATE_8000_192000,
> - .formats = SNDRV_PCM_FMTBIT_S16_LE,
> + .formats = SUN4I_FORMATS,
>   },
>   .ops = _i2s_dai_ops,
>   .symmetric_rates = 1,
>

Re: [linux-sunxi] [PATCH 04/16] ASoC: sun4i-i2s: Set sign extend sample

2020-07-09 Thread Samuel Holland

On 7/4/20 6:38 AM, Clément Péron wrote:
> From: Marcus Cooper 
> 
> On the newer SoCs such as the H3 and A64 this is set by default
> to transfer a 0 after each sample in each slot. However the A10
> and A20 SoCs that this driver was developed on had a default
> setting where it padded the audio gain with zeros.
> 
> This isn't a problem while we have only support for 16bit audio
> but with larger sample resolution rates in the pipeline then SEXT
> bits should be cleared so that they also pad at the LSB. Without
> this the audio gets distorted.
> 
> Set sign extend sample for all the sunxi generations even if they
> are not affected. This will keep coherency and avoid relying on
> default.
> 
> Signed-off-by: Marcus Cooper 
> Signed-off-by: Clément Péron 
> ---
>  sound/soc/sunxi/sun4i-i2s.c | 22 ++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/sound/soc/sunxi/sun4i-i2s.c b/sound/soc/sunxi/sun4i-i2s.c
> index 8bae97efea30..f78167e152ce 100644
> --- a/sound/soc/sunxi/sun4i-i2s.c
> +++ b/sound/soc/sunxi/sun4i-i2s.c
> @@ -48,6 +48,9 @@
>  #define SUN4I_I2S_FMT0_FMT_I2S   (0 << 0)
>  
>  #define SUN4I_I2S_FMT1_REG   0x08
> +#define SUN4I_I2S_FMT1_REG_SEXT_MASK BIT(8)
> +#define SUN4I_I2S_FMT1_REG_SEXT(sext)((sext) << 8)
> +
>  #define SUN4I_I2S_FIFO_TX_REG0x0c
>  #define SUN4I_I2S_FIFO_RX_REG0x10
>  
> @@ -105,6 +108,9 @@
>  #define SUN8I_I2S_FMT0_BCLK_POLARITY_INVERTED(1 << 7)
>  #define SUN8I_I2S_FMT0_BCLK_POLARITY_NORMAL  (0 << 7)
>  
> +#define SUN8I_I2S_FMT1_REG_SEXT_MASK GENMASK(5, 4)
> +#define SUN8I_I2S_FMT1_REG_SEXT(sext)((sext) << 4)
> +
>  #define SUN8I_I2S_INT_STA_REG0x0c
>  #define SUN8I_I2S_FIFO_TX_REG0x20
>  
> @@ -663,6 +669,12 @@ static int sun4i_i2s_set_soc_fmt(const struct sun4i_i2s 
> *i2s,
>   }
>   regmap_update_bits(i2s->regmap, SUN4I_I2S_CTRL_REG,
>  SUN4I_I2S_CTRL_MODE_MASK, val);
> +
> + /* Set sign extension to pad out LSB with 0 */
> + regmap_update_bits(i2s->regmap, SUN4I_I2S_FMT1_REG,
> +SUN4I_I2S_FMT1_REG_SEXT_MASK,
> +SUN4I_I2S_FMT1_REG_SEXT(0));
> +

This is just a note; I'm not suggesting a change here:

This does nothing, because SUN4I_I2S_FMT1_REG only affects PCM mode, which is
not implemented in the driver for the sun4i generation of hardware. PCM mode
requires setting bit 4 of SUN4I_I2S_CTRL_REG, and then configuring
SUN4I_I2S_FMT1_REG instead of SUN4I_I2S_FMT0_REG.
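
For reference, a small user-space model of what the regmap_update_bits() calls in the patch do to the register value, whether or not the hardware honors the field in I2S mode. This is a sketch only: the demo_ helpers are hypothetical stand-ins for BIT(), GENMASK() and the read-modify-write that regmap performs, and the register contents are made up.

```c
#include <stdint.h>

/* Stand-ins for the kernel's BIT() and GENMASK() on a 32-bit register. */
#define DEMO_BIT(n)		(UINT32_C(1) << (n))
#define DEMO_GENMASK(h, l) \
	((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l)))

/* Field layout taken from the defines added by the patch. */
#define DEMO_SUN4I_SEXT_MASK	DEMO_BIT(8)
#define DEMO_SUN4I_SEXT(v)	((uint32_t)(v) << 8)
#define DEMO_SUN8I_SEXT_MASK	DEMO_GENMASK(5, 4)
#define DEMO_SUN8I_SEXT(v)	((uint32_t)(v) << 4)

/* Read-modify-write: only the bits selected by mask can change. */
uint32_t demo_update_bits(uint32_t reg, uint32_t mask, uint32_t val)
{
	return (reg & ~mask) | (val & mask);
}
```

Writing SEXT(0) through the mask clears only the SEXT field and leaves every other bit of FMT1 untouched.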

Regards,
Samuel

[PATCH 2/2] Input: elan_i2c - High resolution report for new pattern 2.

2020-07-09 Thread Jingle Wu

Higher-resolution touchpads are going to be produced, so the main
modifications are as follows:
1. The former resolution bits are not enough. Extend the
resolution bits from 12 to 16 bits.
2. Add report ID 0x60 for higher-resolution touchpads.
3. Move the position of the mk value in the report packet.

Signed-off-by: Jingle Wu 
---
 drivers/input/mouse/elan_i2c.h   |   7 +-
 drivers/input/mouse/elan_i2c_core.c  | 134 +++
 drivers/input/mouse/elan_i2c_i2c.c   |  48 +++---
 drivers/input/mouse/elan_i2c_smbus.c |  14 ++-
 4 files changed, 170 insertions(+), 33 deletions(-)

diff --git a/drivers/input/mouse/elan_i2c.h b/drivers/input/mouse/elan_i2c.h
index f28b747978f5..71fff2cef8b5 100644
--- a/drivers/input/mouse/elan_i2c.h
+++ b/drivers/input/mouse/elan_i2c.h
@@ -78,9 +78,12 @@ struct elan_transport_ops {
int (*write_fw_block)(struct i2c_client *client, u16 fw_page_size,
  const u8 *page, u16 checksum, int idx);
int (*finish_fw_update)(struct i2c_client *client,
-   struct completion *reset_done);
+   struct completion *reset_done,
+   int report_len);
 
-   int (*get_report)(struct i2c_client *client, u8 *report);
+   int (*get_report_length)(struct i2c_client *client, int *report_len);
+   int (*get_report)(struct i2c_client *client, u8 *report,
+   int report_len);
int (*get_pressure_adjustment)(struct i2c_client *client,
   int *adjustment);
int (*get_pattern)(struct i2c_client *client, u8 *pattern);
diff --git a/drivers/input/mouse/elan_i2c_core.c 
b/drivers/input/mouse/elan_i2c_core.c
index 0703f7d0d02d..a1bdb25c2450 100644
--- a/drivers/input/mouse/elan_i2c_core.c
+++ b/drivers/input/mouse/elan_i2c_core.c
@@ -42,6 +42,8 @@
 
 #define DRIVER_NAME"elan_i2c"
 #define ELAN_VENDOR_ID 0x04f3
+#define ELAN_I2C_INTERFACE 1
+#define ELAN_SMBUS_INTERFACE   2
 #define ETP_MAX_PRESSURE   255
 #define ETP_FWIDTH_REDUCE  90
 #define ETP_FINGER_WIDTH   15
@@ -50,12 +52,13 @@
 #define ETP_MAX_FINGERS5
 #define ETP_FINGER_DATA_LEN5
 #define ETP_REPORT_ID  0x5D
+#define ETP_REPORT_ID2 0x60
 #define ETP_TP_REPORT_ID   0x5E
 #define ETP_REPORT_ID_OFFSET   2
 #define ETP_TOUCH_INFO_OFFSET  3
 #define ETP_FINGER_DATA_OFFSET 4
 #define ETP_HOVER_INFO_OFFSET  30
-#define ETP_MAX_REPORT_LEN 34
+#define ETP_MAX_REPORT_LEN 39
 
 /* The main device structure */
 struct elan_tp_data {
@@ -72,6 +75,8 @@ struct elan_tp_data {
 
struct mutexsysfs_mutex;
 
+   int interface;
+
unsigned intmax_x;
unsigned intmax_y;
unsigned intwidth_x;
@@ -85,6 +90,7 @@ struct elan_tp_data {
u8  sm_version;
u8  iap_version;
u16 fw_checksum;
+   int report_len;
int pressure_adjustment;
u8  mode;
u16 ic_type;
@@ -354,6 +360,10 @@ static int elan_query_device_info(struct elan_tp_data 
*data)
if (error)
return error;
 
+   error = data->ops->get_report_length(data->client, >report_len);
+   if (error)
+   return error;
+
error = elan_get_fwinfo(data->ic_type, data->iap_version, 
>fw_validpage_count,
>fw_signature_address,
@@ -366,16 +376,21 @@ static int elan_query_device_info(struct elan_tp_data 
*data)
return 0;
 }
 
-static unsigned int elan_convert_resolution(u8 val)
+static unsigned int elan_convert_resolution(u8 val, u8 pattern)
 {
/*
-* (value from firmware) * 10 + 790 = dpi
-*
+* pattern <= 0x01:
+*  (value from firmware) * 10 + 790 = dpi
+* else
+*  ((value from firmware) + 3) * 100 = dpi
 * We also have to convert dpi to dots/mm (*10/254 to avoid floating
 * point).
 */
 
-   return ((int)(char)val * 10 + 790) * 10 / 254;
+   if (pattern <= 0x01)
+   return ((int)(char)val * 10 + 790) * 10 / 254;
+   else
+   return (((int)(char)val + 3) * 100) * 10 / 254;
 }
 
 static int elan_query_device_parameters(struct elan_tp_data *data)
@@ -424,8 +439,8 @@ static int elan_query_device_parameters(struct elan_tp_data 
*data)
if (error)
return error;
 
-   data->x_res = elan_convert_resolution(hw_x_res);
-   data->y_res = elan_convert_resolution(hw_y_res);
+   data->x_res = elan_convert_resolution(hw_x_res, data->pattern);
+   data->y_res = elan_convert_resolution(hw_y_res, data->pattern);
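
The two formulas in elan_convert_resolution() above can be checked in user space with a minimal sketch like the following. The function name is hypothetical, and the sketch assumes the firmware value is a signed 8-bit quantity (the patch expresses this with a cast through char, whose signedness depends on the platform).

```c
#include <stdint.h>

/*
 * User-space model of the conversion in the patch:
 *   pattern <= 0x01: dpi = value * 10 + 790
 *   otherwise:       dpi = (value + 3) * 100
 * followed by dpi -> dots/mm (* 10 / 254, integer arithmetic).
 */
unsigned int demo_convert_resolution(uint8_t val, uint8_t pattern)
{
	int v = (int8_t)val;	/* assumption: firmware value is signed */
	int dpi;

	if (pattern <= 0x01)
		dpi = v * 10 + 790;
	else
		dpi = (v + 3) * 100;

	return (unsigned int)(dpi * 10 / 254);
}
```

For example, a firmware value of 0 gives 790 dpi (31 dots/mm) under the old pattern and 300 dpi (11 dots/mm) under the new one.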

[PATCH 1/2] Input: elan_i2c - Add ic type 0x11 0x13 0x14.

2020-07-09 Thread Jingle Wu

Modify the IAP method for all ICs.
Get the correct value of ic_type for both the old and new firmware
patterns.

Signed-off-by: Jingle Wu 
---
 drivers/input/mouse/elan_i2c.h   |   6 +-
 drivers/input/mouse/elan_i2c_core.c  |  52 -
 drivers/input/mouse/elan_i2c_i2c.c   | 109 ++-
 drivers/input/mouse/elan_i2c_smbus.c |  10 +--
 4 files changed, 134 insertions(+), 43 deletions(-)

diff --git a/drivers/input/mouse/elan_i2c.h b/drivers/input/mouse/elan_i2c.h
index a9074ac9364f..f28b747978f5 100644
--- a/drivers/input/mouse/elan_i2c.h
+++ b/drivers/input/mouse/elan_i2c.h
@@ -33,6 +33,8 @@
 #define ETP_FW_IAP_PAGE_ERR(1 << 5)
 #define ETP_FW_IAP_INTF_ERR(1 << 4)
 #define ETP_FW_PAGE_SIZE   64
+#define ETP_FW_PAGE_SIZE_128   128
+#define ETP_FW_PAGE_SIZE_512   512
 #define ETP_FW_SIGNATURE_SIZE  6
 
 struct i2c_client;
@@ -72,8 +74,8 @@ struct elan_transport_ops {
int (*iap_get_mode)(struct i2c_client *client, enum tp_mode *mode);
int (*iap_reset)(struct i2c_client *client);
 
-   int (*prepare_fw_update)(struct i2c_client *client);
-   int (*write_fw_block)(struct i2c_client *client,
+   int (*prepare_fw_update)(struct i2c_client *client, u16 ic_type);
+   int (*write_fw_block)(struct i2c_client *client, u16 fw_page_size,
  const u8 *page, u16 checksum, int idx);
int (*finish_fw_update)(struct i2c_client *client,
struct completion *reset_done);
diff --git a/drivers/input/mouse/elan_i2c_core.c 
b/drivers/input/mouse/elan_i2c_core.c
index 3f9354baac4b..0703f7d0d02d 100644
--- a/drivers/input/mouse/elan_i2c_core.c
+++ b/drivers/input/mouse/elan_i2c_core.c
@@ -89,7 +89,8 @@ struct elan_tp_data {
u8  mode;
u16 ic_type;
u16 fw_validpage_count;
-   u16 fw_signature_address;
+   u16 fw_page_size;
+   u32 fw_signature_address;
 
boolirq_wake;
 
@@ -100,8 +101,10 @@ struct elan_tp_data {
boolmiddle_button;
 };
 
-static int elan_get_fwinfo(u16 ic_type, u16 *validpage_count,
-  u16 *signature_address)
+static int elan_get_fwinfo(u16 ic_type, u8 iap_version, 
+   u16 *validpage_count,
+  u32 *signature_address,
+  u16 *page_size)
 {
switch (ic_type) {
case 0x00:
@@ -124,18 +127,34 @@ static int elan_get_fwinfo(u16 ic_type, u16 
*validpage_count,
*validpage_count = 640;
break;
case 0x10:
+   case 0x14:
*validpage_count = 1024;
break;
+   case 0x11:
+   *validpage_count = 1280;
+   break;
+   case 0x13:
+   *validpage_count = 2048;
+   break;
default:
/* unknown ic type clear value */
*validpage_count = 0;
*signature_address = 0;
+   *page_size = 0;
return -ENXIO;
}
 
*signature_address =
(*validpage_count * ETP_FW_PAGE_SIZE) - ETP_FW_SIGNATURE_SIZE;
 
+   if ((ic_type == 0x14) && (iap_version >= 2)) {
+   *validpage_count /= 8;
+   *page_size = ETP_FW_PAGE_SIZE_512;
+   } else if ((ic_type >= 0x0D) && (iap_version >= 1)) {
+   *validpage_count /= 2;
+   *page_size = ETP_FW_PAGE_SIZE_128;
+   } else
+   *page_size = ETP_FW_PAGE_SIZE;
return 0;
 }
 
@@ -312,7 +331,6 @@ static int elan_initialize(struct elan_tp_data *data)
 static int elan_query_device_info(struct elan_tp_data *data)
 {
int error;
-   u16 ic_type;
 
error = data->ops->get_version(data->client, false, >fw_version);
if (error)
@@ -336,13 +354,10 @@ static int elan_query_device_info(struct elan_tp_data 
*data)
if (error)
return error;
 
-   if (data->pattern == 0x01)
-   ic_type = data->ic_type;
-   else
-   ic_type = data->iap_version;
-
-   error = elan_get_fwinfo(ic_type, >fw_validpage_count,
-   >fw_signature_address);
+   error = elan_get_fwinfo(data->ic_type, data->iap_version, 
+   >fw_validpage_count,
+   >fw_signature_address,
+   >fw_page_size);
if (error)
dev_warn(>client->dev,
 "unexpected iap version %#04x (ic type: %#04x), 
firmware update will not work\n",
@@ -430,14 +445,14 @@ static int elan_query_device_parameters(struct 
elan_tp_data *data)
  * IAP firmware updater related routines
  **
  */
-static int elan_write_fw_block(struct elan_tp_data *data,
+static int
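
The page-size selection added to elan_get_fwinfo() in the patch above can be modeled in user space roughly as follows. This is a sketch, not the driver: the demo_ names are hypothetical, and only the IC types visible in the hunk (0x10, 0x11, 0x13, 0x14) are handled; the cases elided from the diff context are treated as unknown here.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE		64
#define DEMO_PAGE_SIZE_128	128
#define DEMO_PAGE_SIZE_512	512
#define DEMO_SIGNATURE_SIZE	6

struct demo_fwinfo {
	uint16_t validpage_count;
	uint32_t signature_address;
	uint16_t page_size;
};

int demo_get_fwinfo(uint16_t ic_type, uint8_t iap_version,
		    struct demo_fwinfo *out)
{
	switch (ic_type) {
	case 0x10:
	case 0x14:
		out->validpage_count = 1024;
		break;
	case 0x11:
		out->validpage_count = 1280;
		break;
	case 0x13:
		out->validpage_count = 2048;
		break;
	default:
		/* Unknown here (includes IC types not shown in the hunk). */
		out->validpage_count = 0;
		out->signature_address = 0;
		out->page_size = 0;
		return -ENXIO;
	}

	/* Computed in 64-byte pages before the count is rescaled. */
	out->signature_address =
		(uint32_t)out->validpage_count * DEMO_PAGE_SIZE -
		DEMO_SIGNATURE_SIZE;

	if (ic_type == 0x14 && iap_version >= 2) {
		out->validpage_count /= 8;
		out->page_size = DEMO_PAGE_SIZE_512;
	} else if (ic_type >= 0x0D && iap_version >= 1) {
		out->validpage_count /= 2;
		out->page_size = DEMO_PAGE_SIZE_128;
	} else {
		out->page_size = DEMO_PAGE_SIZE;
	}

	return 0;
}
```

For IC type 0x13 the signature address comes out as 2048 * 64 - 6 = 131066, which no longer fits in 16 bits and is consistent with the patch widening fw_signature_address to u32.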

RE: [PATCH v3 06/14] vfio/type1: Add VFIO_IOMMU_PASID_REQUEST (alloc/free)

2020-07-09 Thread Liu, Yi L

Hi Alex, 

> From: Alex Williamson 
> Sent: Thursday, July 9, 2020 10:28 PM
> 
> On Thu, 9 Jul 2020 07:16:31 +
> "Liu, Yi L"  wrote:
> 
> > Hi Alex,
> >
> > After more thinking, it looks like adding an r-b tree is still not enough to
> > solve the potential problem of freeing a range of PASIDs in one ioctl. If the
> > caller gives [0, MAX_UINT] in the free request, the kernel still has to
> > loop over all the PASIDs and search the r-b tree. Even if VFIO tracks the
> > smallest/largest allocated PASID and limits the free range to an accurate
> > range, it is still not efficient. For example, the user has allocated two PASIDs
> > (1 and 999) and gives the [0, MAX_UINT] range in the free request. VFIO
> > will limit the free range to [1, 999], but still needs to loop over PASIDs 1 -
> > 999 and search the r-b tree.
> 
> That sounds like a poor tree implementation.  Look at vfio_find_dma()
> for instance, it returns a node within the specified range.  If the
> tree has two nodes within the specified range we should never need to
> call a search function like vfio_find_dma() more than three times.  We
> call it once, get the first node, remove it.  Call it again, get the
> other node, remove it.  Call a third time, find no matches, we're done.
> So such an implementation limits searches to N+1 where N is the number
> of nodes within the range.

I see. When a free range comes from the user, use that range to find the
matching PASIDs in the r-b tree. For the example I mentioned, giving
[0, MAX_UINT] will find two nodes, while giving [0, 100] will find only
one. Even so, it will still take some time if the user holds a bunch of
PASIDs and gives a big free range.
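
The "N+1 searches" pattern described above can be sketched in user space as follows. This is only an illustration: a small sorted array with a binary search stands in for the r-b tree, and all demo_ names are hypothetical rather than VFIO symbols.

```c
#include <stddef.h>

#define DEMO_MAX_IDS 16

struct demo_set {
	unsigned int ids[DEMO_MAX_IDS];	/* kept sorted ascending */
	size_t n;
	unsigned int searches;		/* number of range lookups done */
};

/* Insert an ID, keeping the array sorted. Returns 0, or -1 if full. */
int demo_insert(struct demo_set *s, unsigned int id)
{
	size_t i;

	if (s->n == DEMO_MAX_IDS)
		return -1;
	for (i = s->n; i > 0 && s->ids[i - 1] > id; i--)
		s->ids[i] = s->ids[i - 1];
	s->ids[i] = id;
	s->n++;
	return 0;
}

/* Return the index of the first ID within [min, max], or -1 if none. */
long demo_find_in_range(struct demo_set *s, unsigned int min,
			unsigned int max)
{
	size_t lo = 0, hi = s->n;

	s->searches++;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (s->ids[mid] < min)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo < s->n && s->ids[lo] <= max)
		return (long)lo;
	return -1;
}

static void demo_remove_at(struct demo_set *s, size_t idx)
{
	for (; idx + 1 < s->n; idx++)
		s->ids[idx] = s->ids[idx + 1];
	s->n--;
}

/* Free every ID in [min, max]: one lookup per hit, plus one final miss. */
unsigned int demo_free_range(struct demo_set *s, unsigned int min,
			     unsigned int max)
{
	unsigned int freed = 0;
	long idx;

	while ((idx = demo_find_in_range(s, min, max)) >= 0) {
		demo_remove_at(s, (size_t)idx);
		freed++;
	}
	return freed;
}
```

With PASIDs 1 and 999 allocated, freeing the full range takes three lookups in this model (two hits and one miss), independent of how wide the requested range is.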

> > So I'm wondering whether we can fall back to the prior proposal, which only
> > frees one PASID per free request. What is your opinion?
> 
> Doesn't it still seem like it would be a useful user interface to have
> a mechanism to free all pasids, by calling with exactly [0, MAX_UINT]?
> I'm not sure if there's another use case for this given that the user
> doesn't have strict control of the pasid values they get.  Thanks,

I don't have such a use case either. Perhaps we may allow it in the future by
adding a flag. But if it's still useful, I may try your suggestion. :-)

Regards,
Yi Liu

> Alex
> 
> > > From: Liu, Yi L 
> > > Sent: Thursday, July 9, 2020 10:26 AM
> > >
> > > Hi Kevin,
> > >
> > > > From: Tian, Kevin 
> > > > Sent: Thursday, July 9, 2020 10:18 AM
> > > >
> > > > > From: Liu, Yi L 
> > > > > Sent: Thursday, July 9, 2020 10:08 AM
> > > > >
> > > > > Hi Kevin,
> > > > >
> > > > > > From: Tian, Kevin 
> > > > > > Sent: Thursday, July 9, 2020 9:57 AM
> > > > > >
> > > > > > > From: Liu, Yi L 
> > > > > > > Sent: Thursday, July 9, 2020 8:32 AM
> > > > > > >
> > > > > > > Hi Alex,
> > > > > > >
> > > > > > > > Alex Williamson 
> > > > > > > > Sent: Thursday, July 9, 2020 3:55 AM
> > > > > > > >
> > > > > > > > On Wed, 8 Jul 2020 08:16:16 + "Liu, Yi L"
> > > > > > > >  wrote:
> > > > > > > >
> > > > > > > > > Hi Alex,
> > > > > > > > >
> > > > > > > > > > From: Liu, Yi L < yi.l@intel.com>
> > > > > > > > > > Sent: Friday, July 3, 2020 2:28 PM
> > > > > > > > > >
> > > > > > > > > > Hi Alex,
> > > > > > > > > >
> > > > > > > > > > > From: Alex Williamson 
> > > > > > > > > > > Sent: Friday, July 3, 2020 5:19 AM
> > > > > > > > > > >
> > > > > > > > > > > On Wed, 24 Jun 2020 01:55:19 -0700 Liu Yi L
> > > > > > > > > > >  wrote:
> > > > > > > > > > >
> > > > > > > > > > > > This patch allows user space to request PASID
> > > > > > > > > > > > allocation/free,
> > > > > e.g.
> > > > > > > > > > > > when serving the request from the guest.
> > > > > > > > > > > >
> > > > > > > > > > > > PASIDs that are not freed by userspace are
> > > > > > > > > > > > automatically freed
> > > > > > > when
> > > > > > > > > > > > the IOASID set is destroyed when process exits.
> > > > > > > > > [...]
> > > > > > > > > > > > +static int vfio_iommu_type1_pasid_request(struct
> > > > > > > > > > > > +vfio_iommu
> > > > > > > *iommu,
> > > > > > > > > > > > + unsigned long 
> > > > > > > > > > > > arg) {
> > > > > > > > > > > > +   struct vfio_iommu_type1_pasid_request req;
> > > > > > > > > > > > +   unsigned long minsz;
> > > > > > > > > > > > +
> > > > > > > > > > > > +   minsz = offsetofend(struct
> > > > vfio_iommu_type1_pasid_request,
> > > > > > > > range);
> > > > > > > > > > > > +
> > > > > > > > > > > > +   if (copy_from_user(, (void __user *)arg, 
> > > > > > > > > > > > minsz))
> > > > > > > > > > > > +   return -EFAULT;
> > > > > > > > > > > > +
> > > > > > > > > > > > +   if (req.argsz < minsz || (req.flags &
> > > > > > > > ~VFIO_PASID_REQUEST_MASK))
> > > > > > > > > > > > +   return -EINVAL;
> > > > > > > > > > > > +
> > > > > > > > > > > > +   if (req.range.min > req.range.max)
> > > > > > > > > > >
> > > > > > > > > > > Is it exploitable that a user can spin

Re: [PATCH RFC v8 02/11] vhost: use batched get_vq_desc version

2020-07-09 Thread Eugenio Perez Martin

On Fri, Jul 10, 2020 at 5:56 AM Jason Wang  wrote:
>
>
> On 2020/7/10 上午1:37, Michael S. Tsirkin wrote:
> > On Thu, Jul 09, 2020 at 06:46:13PM +0200, Eugenio Perez Martin wrote:
> >> On Wed, Jul 1, 2020 at 4:10 PM Jason Wang  wrote:
> >>>
> >>> On 2020/7/1 下午9:04, Eugenio Perez Martin wrote:
>  On Wed, Jul 1, 2020 at 2:40 PM Jason Wang  wrote:
> > On 2020/7/1 下午6:43, Eugenio Perez Martin wrote:
> >> On Tue, Jun 23, 2020 at 6:15 PM Eugenio Perez Martin
> >>  wrote:
> >>> On Mon, Jun 22, 2020 at 6:29 PM Michael S. Tsirkin  
> >>> wrote:
>  On Mon, Jun 22, 2020 at 06:11:21PM +0200, Eugenio Perez Martin wrote:
> > On Mon, Jun 22, 2020 at 5:55 PM Michael S. Tsirkin 
> >  wrote:
> >> On Fri, Jun 19, 2020 at 08:07:57PM +0200, Eugenio Perez Martin 
> >> wrote:
> >>> On Mon, Jun 15, 2020 at 2:28 PM Eugenio Perez Martin
> >>>  wrote:
>  On Thu, Jun 11, 2020 at 5:22 PM Konrad Rzeszutek Wilk
>   wrote:
> > On Thu, Jun 11, 2020 at 07:34:19AM -0400, Michael S. Tsirkin 
> > wrote:
> >> As testing shows no performance change, switch to that now.
> > What kind of testing? 100GiB? Low latency?
> >
>  Hi Konrad.
> 
>  I tested this version of the patch:
>  https://lkml.org/lkml/2019/10/13/42
> 
>  It was tested for throughput with DPDK's testpmd (as described in
>  http://doc.dpdk.org/guides/howto/virtio_user_as_exceptional_path.html)
>  and kernel pktgen. No latency tests were performed by me. Maybe 
>  it is
>  interesting to perform a latency test or just a different set of 
>  tests
>  over a recent version.
> 
>  Thanks!
> >>> I have repeated the tests with v9, and results are a little bit 
> >>> different:
> >>> * If I test opening it with testpmd, I see no change between 
> >>> versions
> >> OK that is testpmd on guest, right? And vhost-net on the host?
> >>
> > Hi Michael.
> >
> > No, sorry, as described in
> > http://doc.dpdk.org/guides/howto/virtio_user_as_exceptional_path.html.
> > But I could add to test it in the guest too.
> >
> > These kinds of raw packets "bursts" do not show performance
> > differences, but I could test deeper if you think it would be worth
> > it.
>  Oh ok, so this is without guest, with virtio-user.
>  It might be worth checking dpdk within guest too just
>  as another data point.
> 
> >>> Ok, I will do it!
> >>>
> >>> * If I forward packets between two vhost-net interfaces in the 
> >>> guest
> >>> using a linux bridge in the host:
> >> And here I guess you mean virtio-net in the guest kernel?
> > Yes, sorry: Two virtio-net interfaces connected with a linux bridge 
> > in
> > the host. More precisely:
> > * Adding one of the interfaces to another namespace, assigning it an
> > IP, and starting netserver there.
> > * Assign another IP in the range manually to the other virtual net
> > interface, and start the desired test there.
> >
> > If you think it would be better to perform then differently please 
> > let me know.
>  Not sure why you bother with namespaces since you said you are
>  using L2 bridging. I guess it's unimportant.
> 
> >>> Sorry, I think I should have provided more context about that.
> >>>
> >>> The only reason to use namespaces is to force the traffic of these
> >>> netperf tests to go through the external bridge. To test netperf
> >>> different possibilities than the testpmd (or pktgen or others "blast
> >>> of frames unconditionally" tests).
> >>>
> >>> This way, I make sure that is the same version of everything in the
> >>> guest, and is a little bit easier to manage cpu affinity, start and
> >>> stop testing...
> >>>
> >>> I could use a different VM for sending and receiving, but I find this
> >>> way a faster one and it should not introduce a lot of noise. I can
> >>> test with two VM if you think that this use of network namespace
> >>> introduces too much noise.
> >>>
> >>> Thanks!
> >>>
> >>>  - netperf UDP_STREAM shows a performance increase of 1.8, 
> >>> almost
> >>> doubling performance. This gets lower as frame size increase.
> >> Regarding UDP_STREAM:
> >> * with event_idx=on: The performance difference is reduced a lot if
> >> applied affinity properly (manually assigning CPU on host/guest and
> >> setting IRQs on guest), making them perform equally with and without
> >> the patch again. Maybe the batching makes the

Re: [PATCH 1/5] lib: Add a generic version of devmem_is_allowed()

2020-07-09 Thread Christoph Hellwig

On Thu, Jul 09, 2020 at 11:49:21PM +0300, Mike Rapoport wrote:
> > +#ifndef CONFIG_GENERIC_DEVMEM_IS_ALLOWED
> > +extern int devmem_is_allowed(unsigned long pfn);
> > +#endif

Nit: no need for the extern here.

> > +config GENERIC_LIB_DEVMEM_IS_ALLOWED
> > +   bool
> > +   select ARCH_HAS_DEVMEM_IS_ALLOWED
> 
> This seems to work the other way around from the usual Kconfig chains.
> In the most cases ARCH_HAS_SOMETHING selects GENERIC_SOMETHING.
> 
> I believe nicer way would be to make 
> 
> config STRICT_DEVMEM
>   bool "Filter access to /dev/mem"
>   depends on MMU && DEVMEM
>   depends on ARCH_HAS_DEVMEM_IS_ALLOWED || GENERIC_LIB_DEVMEM_IS_ALLOWED
> 
> config GENERIC_LIB_DEVMEM_IS_ALLOWED
>   bool
> 
> and then s/select ARCH_HAS_DEVMEM_IS_ALLOWED/select 
> GENERIC_LIB_DEVMEM_IS_ALLOWED/
> in the arch Kconfigs and drop ARCH_HAS_DEVMEM_IS_ALLOWED in the end.

To take a step back:  Is there any reason to not just always enable
STRICT_DEVMEM? Maybe for a few architectures that don't currently
support a strict /dev/mem the generic version isn't quite correct, but
someone selecting the option and finding the issue is the best way to
figure that out.

Re: Linux kernel in-tree Rust support

2020-07-09 Thread Josh Triplett

On Thu, Jul 09, 2020 at 11:41:47AM -0700, Nick Desaulniers wrote:
> Hello folks,
> I'm working on putting together an LLVM "Micro Conference" for the
> upcoming Linux Plumbers Conf
> (https://www.linuxplumbersconf.org/event/7/page/47-attend).  It's not
> solidified yet, but I would really like to run a session on support
> for Rust "in tree."  I suspect we could cover technical aspects of
> what that might look like (I have a prototype of that, was trivial to
> wire up KBuild support), but also a larger question of "should we do
> this?" or "how might we place limits on where this can be used?"
> 
> Question to folks explicitly in To:, are you planning on attending plumbers?
> 
> If so, would this be an interesting topic that you'd participate in?

I hadn't planned to attend the virtual event, but this sounds like a
topic I absolutely have to attend. Please follow up if this proposal
gets accepted.

I'd love to see a path to incorporating Rust into the kernel, as long as
we can ensure that:
- There are appropriate Rustic interfaces that are natural and safe to
  use (not just C FFI, and not *just* trivial transformations like
  slices instead of buffer+len pairs).
- Those Rustic interfaces are easy to maintain and evolve with the kernel.
- We provide compelling use cases that go beyond just basic safety, such
  as concurrency checking, or lifetimes for object ownership.
- We make Rust fit naturally into the kernel's norms and standards,
  while also introducing some of Rust's norms and standards where they
  make sense. (We want to fit into the kernel, and at the same time, we
  don't want to hastily saw off all the corners that don't immediately
  fit, because some of those corners provide value. Let's take our
  time.)
- We move slowly and carefully, making sure it's a gradual introduction,
  and give people time to incorporate the Rust toolchain into their
  kernel workflows.

Also, with my "Rust language team lead" hat on, I'd be happy to have the
Linux kernel feeding into Rust language development priorities. If
building Rustic interfaces within the kernel requires some additional
language features, we should see what enhancements to the language would
best serve those requirements. I've often seen the sentiment that
co-evolving Linux and a C compiler would be beneficial for both; I think
the same would be true of Linux and the Rust compiler.

Re: [PATCH v3 4/4] iommu/vt-d: Add page response ops support

2020-07-09 Thread Lu Baolu


Hi Kevin,

On 2020/7/10 10:42, Tian, Kevin wrote:

From: Lu Baolu 
Sent: Thursday, July 9, 2020 3:06 PM

After page requests are handled, software must respond to the device
which raised the page request with the result. This is done through
the iommu ops.page_response if the request was reported to outside of
vendor iommu driver through iommu_report_device_fault(). This adds the
VT-d implementation of page_response ops.

Co-developed-by: Jacob Pan 
Signed-off-by: Jacob Pan 
Co-developed-by: Liu Yi L 
Signed-off-by: Liu Yi L 
Signed-off-by: Lu Baolu 
---
  drivers/iommu/intel/iommu.c |   1 +
  drivers/iommu/intel/svm.c   | 100

  include/linux/intel-iommu.h |   3 ++
  3 files changed, 104 insertions(+)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 4a6b6960fc32..98390a6d8113 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -6057,6 +6057,7 @@ const struct iommu_ops intel_iommu_ops = {
.sva_bind   = intel_svm_bind,
.sva_unbind = intel_svm_unbind,
.sva_get_pasid  = intel_svm_get_pasid,
+   .page_response  = intel_svm_page_response,
  #endif
  };

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index d24e71bac8db..839d2af377b6 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -1082,3 +1082,103 @@ int intel_svm_get_pasid(struct iommu_sva *sva)

return pasid;
  }
+
+int intel_svm_page_response(struct device *dev,
+   struct iommu_fault_event *evt,
+   struct iommu_page_response *msg)
+{
+   struct iommu_fault_page_request *prm;
+   struct intel_svm_dev *sdev = NULL;
+   struct intel_svm *svm = NULL;
+   struct intel_iommu *iommu;
+   bool private_present;
+   bool pasid_present;
+   bool last_page;
+   u8 bus, devfn;
+   int ret = 0;
+   u16 sid;
+
+   if (!dev || !dev_is_pci(dev))
+   return -ENODEV;
+
+   iommu = device_to_iommu(dev, , );
+   if (!iommu)
+   return -ENODEV;
+
+   if (!msg || !evt)
+   return -EINVAL;
+
+   mutex_lock(_mutex);
+
+   prm = >fault.prm;
+   sid = PCI_DEVID(bus, devfn);
+   pasid_present = prm->flags &
IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
+   private_present = prm->flags &
IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA;
+   last_page = prm->flags &
IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;
+
+   if (pasid_present) {
+   if (prm->pasid == 0 || prm->pasid >= PASID_MAX) {
+   ret = -EINVAL;
+   goto out;
+   }
+
+   ret = pasid_to_svm_sdev(dev, prm->pasid, , );
+   if (ret || !sdev) {
+   ret = -ENODEV;
+   goto out;
+   }
+
+   /*
+* For responses from userspace, need to make sure that the
+* pasid has been bound to its mm.
+   */
+   if (svm->flags & SVM_FLAG_GUEST_MODE) {
+   struct mm_struct *mm;
+
+   mm = get_task_mm(current);
+   if (!mm) {
+   ret = -EINVAL;
+   goto out;
+   }
+
+   if (mm != svm->mm) {
+   ret = -ENODEV;
+   mmput(mm);
+   goto out;
+   }
+
+   mmput(mm);
+   }
+   } else {
+   pr_err_ratelimited("Invalid page response: no pasid\n");
+   ret = -EINVAL;
+   goto out;


check pasid=0 first, then no need to indent so many lines above.


Yes.




+   }
+
+   /*
+* Per VT-d spec. v3.0 ch7.7, system software must respond
+* with page group response if private data is present (PDP)
+* or last page in group (LPIG) bit is set. This is an
+* additional VT-d requirement beyond PCI ATS spec.
+*/


What is the behavior if system software doesn't follow the requirement?
en... maybe the question is really about whether the information in prm
comes from userspace or from internally-recorded info in iommu core.
The former cannot be trusted. The latter one is OK.


We require a page response when reporting such an event. The upper layer
(IOMMU core or VFIO) will be implemented with a timer; if userspace
doesn't respond in time, the timer will expire and a FAILURE
response will be sent to the device.

Best regards,
baolu



Thanks
Kevin


+   if (last_page || private_present) {
+   struct qi_desc desc;
+
+   desc.qw0 = QI_PGRP_PASID(prm->pasid) | QI_PGRP_DID(sid)
|
+   QI_PGRP_PASID_P(pasid_present) |
+   QI_PGRP_PDP(private_present) |
+

[RESEND: PATCH v5 1/4] regulator: Allow regulators to verify enabled during enable()

2020-07-09 Thread Sumit Semwal

Some regulators might need to verify that they have indeed been enabled
after the enable() call is made and enable_time delay has passed.

This is implemented by repeatedly checking is_enabled() up to
poll_enabled_time, waiting for the already calculated enable delay in
each iteration.
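
As a rough illustration of the polling loop added to _regulator_do_enable() in the diff below, here is a user-space model. It is a sketch under stated assumptions: struct demo_regulator and the demo_ helpers are hypothetical, time is simulated rather than slept, and a single "is it enabled yet" check stands in for the get_status()/is_enabled() pair.

```c
#include <errno.h>

/* Hypothetical regulator that reports "enabled" after ready_after_us. */
struct demo_regulator {
	unsigned int ready_after_us;
	unsigned int elapsed_us;	/* simulated time spent waiting */
};

static int demo_is_enabled(const struct demo_regulator *r)
{
	return r->elapsed_us >= r->ready_after_us;
}

static void demo_delay(struct demo_regulator *r, unsigned int us)
{
	r->elapsed_us += us;	/* simulated delay, no real sleeping */
}

/*
 * Wait for the regulator to come up. With poll_us == 0 this is a plain
 * delay; otherwise the state is checked every poll_us until
 * enable_delay_us has been used up.
 */
int demo_enable_wait(struct demo_regulator *r, unsigned int enable_delay_us,
		     unsigned int poll_us)
{
	/* Signed on purpose, so the subtraction below cannot wrap around. */
	long long remaining = enable_delay_us;

	if (!poll_us) {
		demo_delay(r, enable_delay_us);
		return 0;
	}

	while (remaining > 0) {
		demo_delay(r, poll_us);
		if (demo_is_enabled(r))
			return 0;
		remaining -= poll_us;
	}

	return -ETIMEDOUT;
}
```

The model uses a signed counter so that a polling interval which does not evenly divide the enable delay still terminates the loop; that is a choice made for this sketch and not a statement about the patch.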

Signed-off-by: Sumit Semwal 

---
v3: addressed minor review comments, improved kernel doc
v2: Address review comments, including swapping enable_time and 
poll_enabled_time.
---
 drivers/regulator/core.c | 63 +++-
 include/linux/regulator/driver.h |  5 +++
 2 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index 03154f5b939f..538a2779986a 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -2347,6 +2347,37 @@ static void _regulator_enable_delay(unsigned int delay)
udelay(us);
 }
 
+/**
+ * _regulator_check_status_enabled
+ *
+ * A helper function to check if the regulator status can be interpreted
+ * as 'regulator is enabled'.
+ * @rdev: the regulator device to check
+ *
+ * Return:
+ * * 1 - if status shows regulator is in enabled state
+ * * 0 - if not enabled state
+ * * Error Value   - as received from ops->get_status()
+ */
+static inline int _regulator_check_status_enabled(struct regulator_dev *rdev)
+{
+   int ret = rdev->desc->ops->get_status(rdev);
+
+   if (ret < 0) {
+   rdev_info(rdev, "get_status returned error: %d\n", ret);
+   return ret;
+   }
+
+   switch (ret) {
+   case REGULATOR_STATUS_OFF:
+   case REGULATOR_STATUS_ERROR:
+   case REGULATOR_STATUS_UNDEFINED:
+   return 0;
+   default:
+   return 1;
+   }
+}
+
 static int _regulator_do_enable(struct regulator_dev *rdev)
 {
int ret, delay;
@@ -2407,7 +2438,37 @@ static int _regulator_do_enable(struct regulator_dev 
*rdev)
 * together.  */
trace_regulator_enable_delay(rdev_get_name(rdev));
 
-   _regulator_enable_delay(delay);
+   /* If poll_enabled_time is set, poll upto the delay calculated
+* above, delaying poll_enabled_time uS to check if the regulator
+* actually got enabled.
+* If the regulator isn't enabled after enable_delay has
+* expired, return -ETIMEDOUT.
+*/
+   if (rdev->desc->poll_enabled_time) {
+   unsigned int time_remaining = delay;
+
+   while (time_remaining > 0) {
+   _regulator_enable_delay(rdev->desc->poll_enabled_time);
+
+   if (rdev->desc->ops->get_status) {
+   ret = _regulator_check_status_enabled(rdev);
+   if (ret < 0)
+   return ret;
+   else if (ret)
+   break;
+   } else if (rdev->desc->ops->is_enabled(rdev))
+   break;
+
+   time_remaining -= rdev->desc->poll_enabled_time;
+   }
+
+   if (time_remaining <= 0) {
+   rdev_err(rdev, "Enabled check timed out\n");
+   return -ETIMEDOUT;
+   }
+   } else {
+   _regulator_enable_delay(delay);
+   }
 
trace_regulator_enable_complete(rdev_get_name(rdev));
 
diff --git a/include/linux/regulator/driver.h b/include/linux/regulator/driver.h
index 7eb9fea8e482..436df3ba0b2a 100644
--- a/include/linux/regulator/driver.h
+++ b/include/linux/regulator/driver.h
@@ -305,6 +305,9 @@ enum regulator_type {
  * @enable_time: Time taken for initial enable of regulator (in uS).
  * @off_on_delay: guard time (in uS), before re-enabling a regulator
  *
+ * @poll_enabled_time: The polling interval (in uS) to use while checking that
+ * the regulator was actually enabled. Max upto 
enable_time.
+ *
  * @of_map_mode: Maps a hardware mode defined in a DeviceTree to a standard 
mode
  */
 struct regulator_desc {
@@ -372,6 +375,8 @@ struct regulator_desc {
 
unsigned int off_on_delay;
 
+   unsigned int poll_enabled_time;
+
unsigned int (*of_map_mode)(unsigned int mode);
 };
 
-- 
2.27.0

[RESEND PATCH v5 3/4] arm64: dts: qcom: pmi8998: Add nodes for LAB and IBB regulators

2020-07-09 Thread Sumit Semwal

From: Nisha Kumari 

This patch adds devicetree nodes for LAB and IBB regulators.

Signed-off-by: Nisha Kumari 
Signed-off-by: Sumit Semwal 
  [sumits: Updated for better compatible strings and names]

---
v5: sumits: removed interrupt-names, since there is only one
interrupt per node
v4: sumits: removed labibb label which is not needed
v3: sumits: updated interrupt-names as per review comments
v2: sumits: updated for better compatible string and names
---
 arch/arm64/boot/dts/qcom/pmi8998.dtsi | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/pmi8998.dtsi b/arch/arm64/boot/dts/qcom/pmi8998.dtsi
index 23f9146a161e..d016b12967eb 100644
--- a/arch/arm64/boot/dts/qcom/pmi8998.dtsi
+++ b/arch/arm64/boot/dts/qcom/pmi8998.dtsi
@@ -25,5 +25,17 @@ pmi8998_lsid1: pmic@3 {
reg = <0x3 SPMI_USID>;
#address-cells = <1>;
#size-cells = <0>;
+
+   labibb {
+   compatible = "qcom,pmi8998-lab-ibb";
+
+   ibb: ibb {
+   interrupts = <0x3 0xdc 0x2 IRQ_TYPE_EDGE_RISING>;
+   };
+
+   lab: lab {
+   interrupts = <0x3 0xde 0x0 IRQ_TYPE_EDGE_RISING>;
+   };
+   };
};
 };
-- 
2.27.0

[RESEND PATCH v5 2/4] dt-bindings: regulator: Add labibb regulator

2020-07-09 Thread Sumit Semwal

From: Nisha Kumari 

Adding the devicetree binding for labibb regulator.

Signed-off-by: Nisha Kumari 
Signed-off-by: Sumit Semwal 
 [sumits: cleanup as per review comments and update to yaml]

---
v5: Addressed review comments - removed interrupt-names, changed to
 dual license, added unevaluatedProperties: false
v4: fixed dt_binding_check issues
v3: moved to yaml
v2: updated for better compatible string and names.
---
 .../regulator/qcom-labibb-regulator.yaml  | 70 +++
 1 file changed, 70 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/regulator/qcom-labibb-regulator.yaml

diff --git a/Documentation/devicetree/bindings/regulator/qcom-labibb-regulator.yaml b/Documentation/devicetree/bindings/regulator/qcom-labibb-regulator.yaml
new file mode 100644
index ..085cbd1ad8d0
--- /dev/null
+++ b/Documentation/devicetree/bindings/regulator/qcom-labibb-regulator.yaml
@@ -0,0 +1,70 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/regulator/qcom-labibb-regulator.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm's LAB(LCD AMOLED Boost)/IBB(Inverting Buck Boost) Regulator
+
+maintainers:
+  - Sumit Semwal 
+
+description:
+  LAB can be used as a positive boost power supply and IBB can be used as a
+  negative boost power supply for display panels. Currently implemented for
+  pmi8998.
+
+properties:
+  compatible:
+    const: qcom,pmi8998-lab-ibb
+
+  lab:
+    type: object
+
+    properties:
+
+      interrupts:
+        maxItems: 1
+        description:
+          Short-circuit interrupt for lab.
+
+    required:
+    - interrupts
+
+  ibb:
+    type: object
+
+    properties:
+
+      interrupts:
+        maxItems: 1
+        description:
+          Short-circuit interrupt for ibb.
+
+    required:
+    - interrupts
+
+required:
+  - compatible
+
+unevaluatedProperties: false
+
+examples:
+  - |
+    #include
+
+    labibb {
+      compatible = "qcom,pmi8998-lab-ibb";
+
+      lab {
+        interrupts = <0x3 0x0 IRQ_TYPE_EDGE_RISING>;
+        interrupt-names = "sc-err";
+      };
+
+      ibb {
+        interrupts = <0x3 0x2 IRQ_TYPE_EDGE_RISING>;
+        interrupt-names = "sc-err";
+      };
+    };
+
+...
-- 
2.27.0

Re: a question of split_huge_page

2020-07-09 Thread Mika Penttilä



On 10.7.2020 7.51, Alex Shi wrote:
>
> On 2020/7/10 at 12:07 AM, Kirill A. Shutemov wrote:
>> On Thu, Jul 09, 2020 at 04:50:02PM +0100, Matthew Wilcox wrote:
>>> On Thu, Jul 09, 2020 at 11:11:11PM +0800, Alex Shi wrote:
>>>> Hi Kirill & Matthew,
>>>>
>>>> In the func call chain, from split_huge_page() to lru_add_page_tail(),
>>>> Seems tail pages are added to lru list at line 963, but in this scenario
>>>> the head page has no lru bit and isn't set the bit later. Why we do this?
>>>> or do I miss sth?
>>> I don't understand how we get to split_huge_page() with a page that's
>>> not on an LRU list.  Both anonymous and page cache pages should be on
>>> an LRU list.  What am I missing?
>
> Thanks a lot for quick reply!
> What confuses me is the call chain from __iommu_dma_alloc_pages()
> to split_huge_page(); in that function, the page being split comes from
>   page = alloc_pages_node(nid, alloc_flags, order);
> And if the pages were added to the lru, they may be reclaimed and lost,
> which would be a panic bug. But in fact, this has never happened for a long time.
> Also I put a BUG() at the line; it's never triggered in ltp or run_vmtests


In __iommu_dma_alloc_pages(), after split_huge_page(), who is taking a
reference on the tail pages? It seems the tail pages are freed and the
function erroneously returns them in the pages[] array for use?

> in kselftest.
>
>> Right, and it's never got removed from LRU during the split. The tail
>> pages have to be added to LRU because they now separate from the tail
>> page.
>>
> According to the explanation, it looks like we could remove this code path,
> since it is never reached (based on my v15 patchset). Any comments?
>
> Thanks
> Alex
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 7c52c5228aab..c28409509ad3 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2357,17 +2357,6 @@ static void lru_add_page_tail(struct page *head, struct page *page_tail,
> if (!list)
> SetPageLRU(page_tail);
>
> if (likely(PageLRU(head)))
> list_add_tail(&page_tail->lru, &head->lru);
> else if (list) {
> /* page reclaim is reclaiming a huge page */
> get_page(page_tail);
> list_add_tail(&page_tail->lru, list);
> -   } else {
> -   /*
> -* Head page has not yet been counted, as an hpage,
> -* so we must account for each subpage individually.
> -*
> -* Put page_tail on the list at the correct position
> -* so they all end up in order.
> -*/
> -   VM_BUG_ON_PAGE(1, head);
> -   add_page_to_lru_list_tail(page_tail, lruvec,
> - page_lru(page_tail));
> }
>  }




[PATCH] spi: spi-cadence: add support for chip select high

2020-07-09 Thread Shreyas Joshi

The spi-cadence driver should support spi-cs-high in its mode bits so that
peripherals that need an active-high chip select can use it. Add the
SPI_CS_HIGH flag to the supported mode bits.

Signed-off-by: Shreyas Joshi 
---
 drivers/spi/spi-cadence.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/spi/spi-cadence.c b/drivers/spi/spi-cadence.c
index 82a0ee09cbe1..2b6b9c1ad9d0 100644
--- a/drivers/spi/spi-cadence.c
+++ b/drivers/spi/spi-cadence.c
@@ -556,7 +556,7 @@ static int cdns_spi_probe(struct platform_device *pdev)
master->unprepare_transfer_hardware = cdns_unprepare_transfer_hardware;
master->set_cs = cdns_spi_chipselect;
master->auto_runtime_pm = true;
-   master->mode_bits = SPI_CPOL | SPI_CPHA;
+   master->mode_bits = SPI_CPOL | SPI_CPHA | SPI_CS_HIGH;
/* Set to default valid value */
master->max_speed_hz = clk_get_rate(xspi->ref_clk) / 4;
--
2.20.1
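
For context on why this one-line change matters: the SPI core refuses to set up a device whose requested mode contains bits the controller has not advertised in mode_bits. The following is a minimal user-space sketch of that check, assuming the flag values match include/linux/spi/spi.h; unsupported_bits() is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>

/* Mode flags, assumed to match the values in include/linux/spi/spi.h */
#define SPI_CPHA    0x01u
#define SPI_CPOL    0x02u
#define SPI_CS_HIGH 0x04u

/*
 * Hypothetical helper: returns the requested mode bits that the
 * controller does not advertise; zero means the device is accepted.
 */
static uint32_t unsupported_bits(uint32_t requested, uint32_t controller_bits)
{
    return requested & ~controller_bits;
}
```

With the old mask (SPI_CPOL | SPI_CPHA), a device asking for SPI_CS_HIGH leaves that bit unsupported; with the new mask the result is zero.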

[RESEND: PATCH v5 0/4] Qualcomm labibb regulator driver

2020-07-09 Thread Sumit Semwal

This series adds a driver for LAB/IBB regulators found on some Qualcomm SoCs.
These regulators provide positive and/or negative boost power supplies
for LCD/LED display panels connected to the SoC.

This series adds the support for pmi8998 PMIC found in SDM845 family of SoCs.

Changes from v4:
- v4 Review comments incorporated
  - simplified the driver: removed of_get_child_by_name(); use ENABLE_CTL
register and switch over to use the regulator_*_regmap helpers
  - improved kerneldoc
  - In the dt-bindings: removed interrupt-names, since there is only one
    interrupt per node; changed to dual license; added
    unevaluatedProperties: false
  - Since the Short Circuit handling needs more details from QC engineers,
drop the SC handling patch from this series, to submit it later

Changes from v3:
- Handled review comments from v3
- In core, swapped the meaning of enable_time and poll_enabled_time; so we
   wait for total enable_time delay, and poll in-between at poll_enabled_time
   interval now.
- fixed dt_bindings_check issues in dt-bindings patch.
- Cleanup of register_labibb_regulator(), and adapted to updated meaning of
   poll_enabled_time.

Changes from v2:
- Review comments from v2
- Moved the poll-to-check-enabled functionality to regulator core.
- Used more core features to simplify enable/disable functions.
- Moved the devicetree binding to yaml.
- Updated interrupt-names and simplified handling.

Changes from v1:
- Incorporated review comments from v1
- Changed from virtual-regulator based handling to individual regulator based
  handling.
- Reworked the core to merge most of enable/disable functions, combine the
  regulator_ops into one and allow for future variations.
- is_enabled() is now _really_ is_enabled()
- Simplified the SC interrupt handling - use regmap_read_poll_timeout,
  REGULATOR_EVENT_OVER_CURRENT handling and notification to clients.

Nisha Kumari (3):
  dt-bindings: regulator: Add labibb regulator
  arm64: dts: qcom: pmi8998: Add nodes for LAB and IBB regulators
  regulator: qcom: Add labibb driver

Sumit Semwal (1):
  regulator: Allow regulators to verify enabled during enable()

 .../regulator/qcom-labibb-regulator.yaml  |  70 +++
 arch/arm64/boot/dts/qcom/pmi8998.dtsi |  12 ++
 drivers/regulator/Kconfig |  10 +
 drivers/regulator/Makefile|   1 +
 drivers/regulator/core.c  |  63 ++-
 drivers/regulator/qcom-labibb-regulator.c | 175 ++
 include/linux/regulator/driver.h  |   5 +
 7 files changed, 335 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/devicetree/bindings/regulator/qcom-labibb-regulator.yaml
 create mode 100644 drivers/regulator/qcom-labibb-regulator.c

-- 
2.27.0

[RESEND PATCH v5 4/4] regulator: qcom: Add labibb driver

2020-07-09 Thread Sumit Semwal

From: Nisha Kumari 

Qualcomm platforms have LAB(LCD AMOLED Boost)/IBB(Inverting Buck Boost)
regulators, labibb for short, which are used as power supply for
LCD Mode displays.

This patch adds labibb regulator driver for pmi8998 PMIC, found on
SDM845 platforms.

Signed-off-by: Nisha Kumari 
  [sumits: reworked the driver design as per upstream review]
Signed-off-by: Sumit Semwal 

---
v5: sumits: review comments addressed
 - removed dev_info about registering
 - removed of_get_child_by_name()
 - changed from using STATUS1 register to using ENABLE_CTL; this
   allows us to use the regulator_*_regmap helpers and makes this
   code cleaner. (In limited testing, STATUS1 seemed to report the
   change faster than ENABLE_CTL, but in absence of mechanism to
   validate if the regulator has indeed fully ramped when STATUS1
   starts flagging, broonie suggested to use the slower ENABLE_CTL
   path for safety).
v4: sumits: address review comments from v3, including cleaning up
 register_labibb_regulator(), and adapted to updated meaning of
 poll_enabled_time.
v3: sumits: addressed review comments from v2; moved to use core
 regulator features like enable_time, off_on_delay, and the newly
 added poll_enabled_time. Moved the check_enabled functionality
 to core framework via poll_enabled_time.
v2: sumits: reworked the driver for more common code, and addressed
 review comments from v1
---
 drivers/regulator/Kconfig |  10 ++
 drivers/regulator/Makefile|   1 +
 drivers/regulator/qcom-labibb-regulator.c | 175 ++
 3 files changed, 186 insertions(+)
 create mode 100644 drivers/regulator/qcom-labibb-regulator.c

diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig
index 8f677f5d79b4..c6377df023bc 100644
--- a/drivers/regulator/Kconfig
+++ b/drivers/regulator/Kconfig
@@ -1178,5 +1178,15 @@ config REGULATOR_WM8994
  This driver provides support for the voltage regulators on the
  WM8994 CODEC.
 
+config REGULATOR_QCOM_LABIBB
+   tristate "QCOM LAB/IBB regulator support"
+   depends on SPMI || COMPILE_TEST
+   help
+ This driver supports Qualcomm's LAB/IBB regulators present on the
+ Qualcomm's PMIC chip pmi8998. QCOM LAB and IBB are SPMI
+ based PMIC implementations. LAB can be used as positive
+ boost regulator and IBB can be used as a negative boost regulator
+ for LCD display panel.
+
 endif
 
diff --git a/drivers/regulator/Makefile b/drivers/regulator/Makefile
index e8f163371071..2c2b0861df76 100644
--- a/drivers/regulator/Makefile
+++ b/drivers/regulator/Makefile
@@ -88,6 +88,7 @@ obj-$(CONFIG_REGULATOR_MT6323)+= mt6323-regulator.o
 obj-$(CONFIG_REGULATOR_MT6358) += mt6358-regulator.o
 obj-$(CONFIG_REGULATOR_MT6380) += mt6380-regulator.o
 obj-$(CONFIG_REGULATOR_MT6397) += mt6397-regulator.o
+obj-$(CONFIG_REGULATOR_QCOM_LABIBB) += qcom-labibb-regulator.o
 obj-$(CONFIG_REGULATOR_QCOM_RPM) += qcom_rpm-regulator.o
 obj-$(CONFIG_REGULATOR_QCOM_RPMH) += qcom-rpmh-regulator.o
 obj-$(CONFIG_REGULATOR_QCOM_SMD_RPM) += qcom_smd-regulator.o
diff --git a/drivers/regulator/qcom-labibb-regulator.c b/drivers/regulator/qcom-labibb-regulator.c
new file mode 100644
index ..8c7dd1928380
--- /dev/null
+++ b/drivers/regulator/qcom-labibb-regulator.c
@@ -0,0 +1,175 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2020, The Linux Foundation. All rights reserved.
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define REG_PERPH_TYPE  0x04
+
+#define QCOM_LAB_TYPE  0x24
+#define QCOM_IBB_TYPE  0x20
+
+#define PMI8998_LAB_REG_BASE   0xde00
+#define PMI8998_IBB_REG_BASE   0xdc00
+
+#define REG_LABIBB_STATUS1 0x08
+#define REG_LABIBB_ENABLE_CTL  0x46
+#define LABIBB_STATUS1_VREG_OK_BIT BIT(7)
+#define LABIBB_CONTROL_ENABLE  BIT(7)
+
+#define LAB_ENABLE_CTL_MASK BIT(7)
+#define IBB_ENABLE_CTL_MASK (BIT(7) | BIT(6))
+
+#define LABIBB_OFF_ON_DELAY 1000
+#define LAB_ENABLE_TIME (LABIBB_OFF_ON_DELAY * 2)
+#define IBB_ENABLE_TIME (LABIBB_OFF_ON_DELAY * 10)
+#define LABIBB_POLL_ENABLED_TIME   1000
+
+struct labibb_regulator {
+   struct regulator_desc   desc;
+   struct device   *dev;
+   struct regmap   *regmap;
+   struct regulator_dev    *rdev;
+   u16 base;
+   u8  type;
+};
+
+struct labibb_regulator_data {
+   const char  *name;
+   u8  type;
+   u16 base;
+   struct regulator_desc   *desc;
+};
+
+static struct regulator_ops qcom_labibb_ops = {
+   .enable =

[PATCH v2 2/3] powerpc/powernv/idle: save-restore DAWR0,DAWRX0 for P10

2020-07-09 Thread Pratik Rajesh Sampat

The additional registers DAWR0 and DAWRX0 may be lost on Power10 for
stop levels < 4.
Therefore, save the values of these SPRs before entering a "stop"
state and restore them on wakeup.

Signed-off-by: Pratik Rajesh Sampat 
---
 arch/powerpc/platforms/powernv/idle.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 19d94d021357..f2e2a6a4c274 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -600,6 +600,8 @@ struct p9_sprs {
u64 iamr;
u64 amor;
u64 uamor;
+   u64 dawr0;
+   u64 dawrx0;
 };
 
 static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
@@ -687,6 +689,10 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
sprs.iamr   = mfspr(SPRN_IAMR);
sprs.amor   = mfspr(SPRN_AMOR);
sprs.uamor  = mfspr(SPRN_UAMOR);
+   if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+   sprs.dawr0 = mfspr(SPRN_DAWR0);
+   sprs.dawrx0 = mfspr(SPRN_DAWRX0);
+   }
 
srr1 = isa300_idle_stop_mayloss(psscr); /* go idle */
 
@@ -710,6 +716,10 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
mtspr(SPRN_IAMR,sprs.iamr);
mtspr(SPRN_AMOR,sprs.amor);
mtspr(SPRN_UAMOR,   sprs.uamor);
+   if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+   mtspr(SPRN_DAWR0, sprs.dawr0);
+   mtspr(SPRN_DAWRX0, sprs.dawrx0);
+   }
 
/*
 * Workaround for POWER9 DD2.0, if we lost resources, the ERAT
-- 
2.25.4
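
The save/restore pattern added by this patch can be illustrated outside the kernel. This is only a sketch: struct fake_cpu, struct saved_sprs and the three helpers are hypothetical stand-ins, with plain struct fields replacing mfspr()/mtspr() and a boolean replacing cpu_has_feature(CPU_FTR_ARCH_31).

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the two SPRs; real code would use mfspr()/mtspr(). */
struct fake_cpu {
    uint64_t dawr0;
    uint64_t dawrx0;
    bool has_arch_31; /* models cpu_has_feature(CPU_FTR_ARCH_31) */
};

struct saved_sprs {
    uint64_t dawr0;
    uint64_t dawrx0;
};

/* Save both registers before idling, only where they exist. */
static void save_sprs(const struct fake_cpu *cpu, struct saved_sprs *s)
{
    if (cpu->has_arch_31) {
        s->dawr0 = cpu->dawr0;
        s->dawrx0 = cpu->dawrx0;
    }
}

/* Models a stop state that clobbers both registers. */
static void enter_stop(struct fake_cpu *cpu)
{
    cpu->dawr0 = 0;
    cpu->dawrx0 = 0;
}

/* Write the saved values back on wakeup. */
static void restore_sprs(struct fake_cpu *cpu, const struct saved_sprs *s)
{
    if (cpu->has_arch_31) {
        cpu->dawr0 = s->dawr0;
        cpu->dawrx0 = s->dawrx0;
    }
}
```

Calling save_sprs(), enter_stop() and restore_sprs() in that order leaves the modelled registers with their pre-idle values on an ISA 3.1 CPU and is a no-op otherwise.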

[PATCH v2 1/3] powerpc/powernv/idle: Exclude mfspr on HID1,4,5 on P9 and above

2020-07-09 Thread Pratik Rajesh Sampat

From POWER9 onwards, support for the registers HID1, HID4 and HID5 has
been withdrawn.
Although mfspr on these registers worked on Power9, it is not recognized
by the Power10 simulator. Move their assignment under the check for
machines older than Power9.

Signed-off-by: Pratik Rajesh Sampat 
Reviewed-by: Gautham R. Shenoy 
---
 arch/powerpc/platforms/powernv/idle.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 2dd467383a88..19d94d021357 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -73,9 +73,6 @@ static int pnv_save_sprs_for_deep_states(void)
 */
uint64_t lpcr_val   = mfspr(SPRN_LPCR);
uint64_t hid0_val   = mfspr(SPRN_HID0);
-   uint64_t hid1_val   = mfspr(SPRN_HID1);
-   uint64_t hid4_val   = mfspr(SPRN_HID4);
-   uint64_t hid5_val   = mfspr(SPRN_HID5);
uint64_t hmeer_val  = mfspr(SPRN_HMEER);
uint64_t msr_val = MSR_IDLE;
uint64_t psscr_val = pnv_deepest_stop_psscr_val;
@@ -117,6 +114,9 @@ static int pnv_save_sprs_for_deep_states(void)
 
/* Only p8 needs to set extra HID regiters */
if (!cpu_has_feature(CPU_FTR_ARCH_300)) {
+   uint64_t hid1_val = mfspr(SPRN_HID1);
+   uint64_t hid4_val = mfspr(SPRN_HID4);
+   uint64_t hid5_val = mfspr(SPRN_HID5);
 
rc = opal_slw_set_reg(pir, SPRN_HID1, hid1_val);
if (rc != 0)
-- 
2.25.4

[PATCH v2 3/3] powerpc/powernv/idle: Rename pnv_first_spr_loss_level variable

2020-07-09 Thread Pratik Rajesh Sampat

Rename the variable "pnv_first_spr_loss_level" to
"pnv_first_fullstate_loss_level".

pnv_first_spr_loss_level is supposed to be the earliest state that has
OPAL_PM_LOSE_FULL_CONTEXT set. However, shallow states lose SPR values
too, which makes the old name incorrect.

Signed-off-by: Pratik Rajesh Sampat 
---
 arch/powerpc/platforms/powernv/idle.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index f2e2a6a4c274..d54e7ef234e3 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -48,7 +48,7 @@ static bool default_stop_found;
  * First stop state levels when SPR and TB loss can occur.
  */
 static u64 pnv_first_tb_loss_level = MAX_STOP_STATE + 1;
-static u64 pnv_first_spr_loss_level = MAX_STOP_STATE + 1;
+static u64 pnv_first_fullstate_loss_level = MAX_STOP_STATE + 1;
 
 /*
  * psscr value and mask of the deepest stop idle state.
@@ -659,7 +659,7 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
  */
mmcr0   = mfspr(SPRN_MMCR0);
}
-   if ((psscr & PSSCR_RL_MASK) >= pnv_first_spr_loss_level) {
+   if ((psscr & PSSCR_RL_MASK) >= pnv_first_fullstate_loss_level) {
sprs.lpcr   = mfspr(SPRN_LPCR);
sprs.hfscr  = mfspr(SPRN_HFSCR);
sprs.fscr   = mfspr(SPRN_FSCR);
@@ -751,7 +751,7 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
 * just always test PSSCR for SPR/TB state loss.
 */
pls = (psscr & PSSCR_PLS) >> PSSCR_PLS_SHIFT;
-   if (likely(pls < pnv_first_spr_loss_level)) {
+   if (likely(pls < pnv_first_fullstate_loss_level)) {
if (sprs_saved)
atomic_stop_thread_idle();
goto out;
@@ -1098,7 +1098,7 @@ static void __init pnv_power9_idle_init(void)
 * the deepest loss-less (OPAL_PM_STOP_INST_FAST) stop state.
 */
pnv_first_tb_loss_level = MAX_STOP_STATE + 1;
-   pnv_first_spr_loss_level = MAX_STOP_STATE + 1;
+   pnv_first_fullstate_loss_level = MAX_STOP_STATE + 1;
for (i = 0; i < nr_pnv_idle_states; i++) {
int err;
 struct pnv_idle_states_t *state = &pnv_idle_states[i];
@@ -1109,8 +1109,8 @@ static void __init pnv_power9_idle_init(void)
pnv_first_tb_loss_level = psscr_rl;
 
if ((state->flags & OPAL_PM_LOSE_FULL_CONTEXT) &&
-(pnv_first_spr_loss_level > psscr_rl))
-   pnv_first_spr_loss_level = psscr_rl;
+(pnv_first_fullstate_loss_level > psscr_rl))
+   pnv_first_fullstate_loss_level = psscr_rl;
 
/*
 * The idle code does not deal with TB loss occurring
@@ -1121,8 +1121,8 @@ static void __init pnv_power9_idle_init(void)
 * compatibility.
 */
if ((state->flags & OPAL_PM_TIMEBASE_STOP) &&
-(pnv_first_spr_loss_level > psscr_rl))
-   pnv_first_spr_loss_level = psscr_rl;
+(pnv_first_fullstate_loss_level > psscr_rl))
+   pnv_first_fullstate_loss_level = psscr_rl;
 
 err = validate_psscr_val_mask(&state->psscr_val,
   &state->psscr_mask,
@@ -1168,7 +1168,7 @@ static void __init pnv_power9_idle_init(void)
}
 
 pr_info("cpuidle-powernv: First stop level that may lose SPRs = 0x%llx\n",
-   pnv_first_spr_loss_level);
+   pnv_first_fullstate_loss_level);
 
 pr_info("cpuidle-powernv: First stop level that may lose timebase = 0x%llx\n",
pnv_first_tb_loss_level);
-- 
2.25.4
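
How the renamed variable gets its value in pnv_power9_idle_init() can be sketched as a minimum search over the idle states. This is an illustration under assumptions, not the kernel code: struct idle_state, the two FLAG_* values and first_fullstate_loss_level() are hypothetical, and MAX_STOP_STATE is assumed to be 0xF.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_STOP_STATE     0xFu /* assumed value of the kernel constant */
#define FLAG_LOSE_FULL_CTX 0x1u /* stand-in for OPAL_PM_LOSE_FULL_CONTEXT */
#define FLAG_TIMEBASE_STOP 0x2u /* stand-in for OPAL_PM_TIMEBASE_STOP */

struct idle_state {
    uint32_t flags;
    uint64_t psscr_rl; /* requested level of this stop state */
};

/*
 * Returns the shallowest requested level whose state loses full context.
 * Timebase loss is treated like full context loss, as in the patch.
 * If no state qualifies, the result is MAX_STOP_STATE + 1.
 */
static uint64_t first_fullstate_loss_level(const struct idle_state *st,
                                           size_t n)
{
    uint64_t level = MAX_STOP_STATE + 1;

    for (size_t i = 0; i < n; i++) {
        if ((st[i].flags & (FLAG_LOSE_FULL_CTX | FLAG_TIMEBASE_STOP)) &&
            st[i].psscr_rl < level)
            level = st[i].psscr_rl;
    }
    return level;
}
```

A stop request below the returned level is then assumed to keep its full state, which is the comparison power9_idle_stop() performs against the PSSCR fields.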

Re: [PATCH] fpga: dfl: pci: add device id for Intel FPGA PAC N3000

2020-07-09 Thread Xu Yilun

On Thu, Jul 09, 2020 at 06:00:40AM -0700, Tom Rix wrote:
> 
> On 7/9/20 3:14 AM, Wu, Hao wrote:
> >> On Thu, Jul 09, 2020 at 05:10:49PM +0800, Wu, Hao wrote:
> >>>> Subject: [PATCH] fpga: dfl: pci: add device id for Intel FPGA PAC N3000
> >>>>
> >>>> Add PCIe Device ID for Intel FPGA PAC N3000.
> >>>>
> >>>> Signed-off-by: Wu Hao 
> >>>> Signed-off-by: Xu Yilun 
> >>>> Signed-off-by: Matthew Gerlach 
> >>>> Signed-off-by: Russ Weight 
> >>>> ---
> >>>>  drivers/fpga/dfl-pci.c | 2 ++
> >>>>  1 file changed, 2 insertions(+)
> >>>>
> >>>> diff --git a/drivers/fpga/dfl-pci.c b/drivers/fpga/dfl-pci.c
> >>>> index 73b5153..824aecf 100644
> >>>> --- a/drivers/fpga/dfl-pci.c
> >>>> +++ b/drivers/fpga/dfl-pci.c
> >>>> @@ -64,6 +64,7 @@ static void cci_pci_free_irq(struct pci_dev *pcidev)
> >>>>  #define PCIE_DEVICE_ID_PF_INT_5_X 0xBCBD
> >>>>  #define PCIE_DEVICE_ID_PF_INT_6_X 0xBCC0
> >>>>  #define PCIE_DEVICE_ID_PF_DSC_1_X 0x09C4
> >>>> +#define PCIE_DEVICE_ID_PF_PAC_N3000 0x0B30
> >>> Should we drop _PF_ here? and also do you want _INTEL_ here?
> >> I think we could keep _PF_, also there is no need to support VF of pac
> >> n3000 in product now, but it does exist (ID: 0x0b31).
> 
> I was wondering about the vf id, thanks!
> 
> >>
> >> And add _INTEL_ is good to me.
> >>
> >> Then how about this one:
> >>   #define PCIE_DEVICE_ID_PF_INTEL_PAC_N3000 0x0B30
> > I am just considering the alignment with ids defined in include/linux/pci_ids.h
> > So drop _PF_ before _INTEL_ would be better? : )
> 
> To be consistent, all the id's are intel and all could drop pf.

That's good to me after checking the pci_ids.h. So we have:

#define PCIE_DEVICE_ID_INTEL_PAC_N3000 0x0B30

> 
> Tom
> 
> >
> > Thanks
> > Hao
> >

Re: [PATCH 0/5] iommu/arm-smmu: Support maintaining bootloader mappings

2020-07-09 Thread John Stultz

On Wed, Jul 8, 2020 at 10:02 PM Bjorn Andersson
 wrote:
>
> Based on previous attempts and discussions this is the latest attempt at
> inheriting stream mappings set up by the bootloader, for e.g. boot splash or
> efifb.
>
> The first patch is an implementation of Robin's suggestion that we should just
> mark the relevant stream mappings as BYPASS. Relying on something else to set
> up the stream mappings wanted - e.g. by reading it back in platform specific
> implementation code.
>
> The series then tackles the problem seen in most versions of Qualcomm
> firmware, that the hypervisor intercepts BYPASS writes and turns them into
> FAULTs. It does this by allocating context banks for identity domains as
> well, with translation disabled.
>
> Lastly it amends the stream mapping initialization code to allocate a specific
> identity domain that is used for any mappings inherited from the bootloader,
> if the above Qualcomm quirk is required.
>
>
> The series has been tested and shown to allow booting SDM845, SDM850, SM8150,
> SM8250 with boot splash screen setup by the bootloader. Specifically it also
> allows the Lenovo Yoga C630 to boot with SMMU and efifb enabled.

This series allows the db845c to boot successfully! (Without it we crash!)
It would be really great to have this upstream!

For the series:
  Tested-by: John Stultz 

Thanks so much!
-john

Re: [PATCH v3 3/4] iommu/vt-d: Report page request faults for guest SVA

2020-07-09 Thread Lu Baolu


Hi Kevin,

On 2020/7/10 10:24, Tian, Kevin wrote:

From: Lu Baolu 
Sent: Thursday, July 9, 2020 3:06 PM

A pasid might be bound to a page table from a VM guest via the iommu
ops.sva_bind_gpasid. In this case, when a DMA page fault is detected
on the physical IOMMU, we need to inject the page fault request into
the guest. After the guest completes handling the page fault, a page
response needs to be sent back via the iommu ops.page_response().

This adds support to report a page request fault. Any external module
which is interested in handling this fault should register a notifier
with iommu_register_device_fault_handler().

Co-developed-by: Jacob Pan 
Signed-off-by: Jacob Pan 
Co-developed-by: Liu Yi L 
Signed-off-by: Liu Yi L 
Signed-off-by: Lu Baolu 
---
  drivers/iommu/intel/svm.c | 103 +++---
  1 file changed, 85 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index c23167877b2b..d24e71bac8db 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -815,8 +815,63 @@ static void intel_svm_drain_prq(struct device *dev, int pasid)
}
  }

+static int prq_to_iommu_prot(struct page_req_dsc *req)
+{
+   int prot = 0;
+
+   if (req->rd_req)
+   prot |= IOMMU_FAULT_PERM_READ;
+   if (req->wr_req)
+   prot |= IOMMU_FAULT_PERM_WRITE;
+   if (req->exe_req)
+   prot |= IOMMU_FAULT_PERM_EXEC;
+   if (req->pm_req)
+   prot |= IOMMU_FAULT_PERM_PRIV;
+
+   return prot;
+}
+
+static int
+intel_svm_prq_report(struct device *dev, struct page_req_dsc *desc)
+{
+   struct iommu_fault_event event;
+
+   /* Fill in event data for device specific processing */
+   memset(&event, 0, sizeof(struct iommu_fault_event));
+   event.fault.type = IOMMU_FAULT_PAGE_REQ;
+   event.fault.prm.addr = desc->addr;
+   event.fault.prm.pasid = desc->pasid;
+   event.fault.prm.grpid = desc->prg_index;
+   event.fault.prm.perm = prq_to_iommu_prot(desc);
+
+   if (!dev || !dev_is_pci(dev))
+   return -ENODEV;


move the check before memset.


Yes.




+
+   if (desc->lpig)
+   event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;
+   if (desc->pasid_present) {
+   event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
+   event.fault.prm.flags |= IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID;
+   }


if pasid is not present, should we return error directly instead of
submitting the req and let iommu core to figure out? I don't have


This has been done in prq_event_thread(), so I don't need to check it
again here.


a strong preference, thus:

Reviewed-by: Kevin Tian 


Thanks a lot for your time.

Best regards,
baolu




+   if (desc->priv_data_present) {
+   /*
+* Set last page in group bit if private data is present,
+* page response is required as it does for LPIG.
+* iommu_report_device_fault() doesn't understand this vendor
+* specific requirement thus we set last_page as a workaround.
+*/
+   event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;
+   event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA;
+   memcpy(event.fault.prm.private_data, desc->priv_data,
+  sizeof(desc->priv_data));
+   }
+
+   return iommu_report_device_fault(dev, &event);
+}
+
  static irqreturn_t prq_event_thread(int irq, void *d)
  {
+   struct intel_svm_dev *sdev = NULL;
struct intel_iommu *iommu = d;
struct intel_svm *svm = NULL;
int head, tail, handled = 0;
@@ -828,7 +883,6 @@ static irqreturn_t prq_event_thread(int irq, void *d)
tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK;
head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK;
while (head != tail) {
-   struct intel_svm_dev *sdev;
struct vm_area_struct *vma;
struct page_req_dsc *req;
struct qi_desc resp;
@@ -864,6 +918,20 @@ static irqreturn_t prq_event_thread(int irq, void *d)
}
}

+   if (!sdev || sdev->sid != req->rid) {
+   struct intel_svm_dev *t;
+
+   sdev = NULL;
+   rcu_read_lock();
+   list_for_each_entry_rcu(t, &svm->devs, list) {
+   if (t->sid == req->rid) {
+   sdev = t;
+   break;
+   }
+   }
+   rcu_read_unlock();
+   }
+
result = QI_RESP_INVALID;
/* Since we're using init_mm.pgd directly, we should never take
 * any faults on kernel addresses. */
@@ -874,6 +942,17 @@ static

[PATCH v2 0/3] Power10 basic energy management

2020-07-09 Thread Pratik Rajesh Sampat

Changelog v1 --> v2:
1. Save-restore DAWR and DAWRX unconditionally as they are lost in
shallow idle states too
2. Rename pnv_first_spr_loss_level to pnv_first_fullstate_loss_level to
correct naming terminology

Pratik Rajesh Sampat (3):
  powerpc/powernv/idle: Exclude mfspr on HID1,4,5 on P9 and above
  powerpc/powernv/idle: save-restore DAWR0,DAWRX0 for P10
  powerpc/powernv/idle: Rename pnv_first_spr_loss_level variable

 arch/powerpc/platforms/powernv/idle.c | 34 +--
 1 file changed, 22 insertions(+), 12 deletions(-)

-- 
2.25.4

Re: WARNING: at mm/mremap.c:211 move_page_tables in i386

2020-07-09 Thread Linus Torvalds

On Thu, Jul 9, 2020 at 9:29 PM Naresh Kamboju  wrote:
>
> Your patch applied and re-tested.
> warning triggered 10 times.
>
> old: bfe0-c000 new: bfa0 (val: 7d530067)

Hmm.. It's not even the overlapping case, it's literally just "move
exactly 2MB of page tables exactly one pmd down". Which should be the
nice efficient case where we can do it without modifying the lower
page tables at all, we just move the PMD entry.

There shouldn't be anything in the new address space from bfa0-bfdf.

That PMD value obviously says differently, but it looks like a nice
normal PMD value, nothing bad there.

I'm starting to think that the issue might be that this is because the
stack segment is special. Not only does it have the growsdown flag,
but that whole thing has the magic guard page logic.

So I wonder if we have installed a guard page _just_ below the old
stack, so that we have populated that pmd because of that.

We used to have an _actual_ guard page and then play nasty games with
vm_start logic. We've gotten rid of that, though, and now we have that
"stack_guard_gap" logic that _should_ mean that vm_start is always
exact and proper (and that free_pgtables() should have emptied it), but
maybe we have some case we forgot about.

> [  741.511684] WARNING: CPU: 1 PID: 15173 at mm/mremap.c:211 move_page_tables.cold+0x0/0x2b
> [  741.598159] Call Trace:
> [  741.600694]  setup_arg_pages+0x22b/0x310
> [  741.621687]  load_elf_binary+0x31e/0x10f0
> [  741.633839]  __do_execve_file+0x5a8/0xbf0
> [  741.637893]  __ia32_sys_execve+0x2a/0x40
> [  741.641875]  do_syscall_32_irqs_on+0x3d/0x2c0
> [  741.657660]  do_fast_syscall_32+0x60/0xf0
> [  741.661691]  do_SYSENTER_32+0x15/0x20
> [  741.665373]  entry_SYSENTER_32+0x9f/0xf2
> [  741.734151]  old: bfe0-c000 new: bfa0 (val: 7d530067)

Nothing looks bad, and the ELF loading phase memory map should be
really quite simple.

The only half-way unusual thing is that you have basically exactly 2MB
of stack at execve time (easy enough to tune by just setting argv/env
right), and it's moved down by exactly 2MB.

And that latter thing is just due to randomization, see
arch_align_stack() in arch/x86/kernel/process.c.

So that would explain why it doesn't happen every time.

What happens if you apply the attached patch to *always* force the 2MB
shift (rather than moving the stack by a random amount), and then run
the other program (t.c -> compiled to "a.out").

The comment should be obvious. But it's untested, I might have gotten
the math wrong. I don't run in a 32-bit environment.

Linus

patch
Description: Binary data
#define _GNU_SOURCE
#include <unistd.h>

static char one_kb[1024] = {
	[0 ... 1022] = 'a',
	0
};

/*
 * Each string is 1kB, so we would need 2048 strings to fill a 2MB stack.
 *
 * But we have the string pointers themselves: 4 bytes per string, so
 * that would be an additional 8kB on top of the 2MB of strings. Plus
 * we have the two NULL terminators (8 bytes) for argv/envp.
 *
 * And then we have the ELF AUX fields, which is a few hundred bytes too.
 *
 * And then we need the call stack frame etc, and only need to come within
 * 4kB of the 2MB stack target.
 *
 * So instead of using 2048 strings to fill up 2MB exactly, we want to fill up
 * basically 2MB-12kB, and let the AUX info etc go into the last page.
 *
 * So 2036 1kB strings, plus noise.
 */

static char *argv[] = {
	[0] = "/bin/echo",
	[1 ... 2036] = one_kb,
	NULL
};

static char *envp[] = {
	NULL
};

int main(int argc, char **envp)
{
	/*
	 * Don't do this recursively, and sleep so people can look at /proc/<pid>/maps
	 */
	if (argc > 1000) {
		sleep(100);
		return 0;
	}
	return execvpe("./a.out", argv, envp);
}

Re: [RESEND PATCH 1/2] fpga: dfl: pci: reduce the scope of variable 'ret'

2020-07-09 Thread Xu Yilun

On Thu, Jul 09, 2020 at 06:18:18AM -0700, Tom Rix wrote:
> I think a better change is to use the ret variable, like this
> 
> --- a/drivers/fpga/dfl-pci.c
> +++ b/drivers/fpga/dfl-pci.c
> @@ -312,7 +312,7 @@ static int cci_pci_sriov_configure(struct pci_dev 
> *pcidev, int num_vfs)
>     }
>     }
>  
> -   return num_vfs;
> +   return ret;
>  }
> 
> The existing use of returning num_vfs is not right, the function should 
> return 0/err not num_vfs. currently it is reusing the 0 passed in with 
> num_vfs to mean disable as the 0 return status.  it should be properly 
> returning ret.

The sriov_configure callback should return a negative value on error and
num_vfs on success.

See Documentation/PCI/pci-iov-howto.rst

and also drivers/pci/iov.c:

  static ssize_t sriov_numvfs_store(struct device *dev, ...)
  {
          ...

          ret = pdev->driver->sriov_configure(pdev, num_vfs);
          if (ret < 0)
                  goto exit;

          if (ret != num_vfs)
                  pci_warn(pdev, "%d VFs requested; only %d enabled\n",
                           num_vfs, ret);

          ...
  }

> 
> Tom
> 
> On 7/9/20 1:12 AM, Xu Yilun wrote:
> > This is to fix lkp cppcheck warnings:
> >
> >  drivers/fpga/dfl-pci.c:230:6: warning: The scope of the variable 'ret' can 
> > be reduced. [variableScope]
> > int ret = 0;
> > ^
> >
> >  drivers/fpga/dfl-pci.c:230:10: warning: Variable 'ret' is assigned a value 
> > that is never used. [unreadVariable]
> > int ret = 0;
> > ^
> >
> > Fixes: 3c2760b78f90 ("fpga: dfl: pci: fix return value of 
> > cci_pci_sriov_configure")
> > Reported-by: kbuild test robot 
> > Signed-off-by: Xu Yilun 
> > Acked-by: Wu Hao 
> > ---
> >  drivers/fpga/dfl-pci.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/fpga/dfl-pci.c b/drivers/fpga/dfl-pci.c
> > index 4a14a24..73b5153 100644
> > --- a/drivers/fpga/dfl-pci.c
> > +++ b/drivers/fpga/dfl-pci.c
> > @@ -285,7 +285,6 @@ static int cci_pci_sriov_configure(struct pci_dev 
> > *pcidev, int num_vfs)
> >  {
> > struct cci_drvdata *drvdata = pci_get_drvdata(pcidev);
> > struct dfl_fpga_cdev *cdev = drvdata->cdev;
> > -   int ret = 0;
> >  
> > if (!num_vfs) {
> > /*
> > @@ -297,6 +296,8 @@ static int cci_pci_sriov_configure(struct pci_dev 
> > *pcidev, int num_vfs)
> > dfl_fpga_cdev_config_ports_pf(cdev);
> >  
> > } else {
> > +   int ret;
> > +
> > /*
> >  * before enable SRIOV, put released ports into VF access mode
> >  * first of all.

RE: [PATCH] exfat: fix wrong size update of stream entry by typo

2020-07-09 Thread Namjae Jeon

> The stream.size field is updated to the value of create timestamp of the file 
> entry. Fix this to use
> correct stream entry pointer.
> 
> Fixes: 29bbb14bfc80 ("exfat: fix incorrect update of stream entry in 
> __exfat_truncate()")
> Signed-off-by: Hyeongseok Kim 
My bad, Pushed it into exfat #dev.
Thanks!

[PATCH] selftests/livepatch: adopt to newer sysctl error format

2020-07-09 Thread Kamalesh Babulal

With procps v3.3.16, the sysctl command no longer prints the set key and
value on error.  This change breaks the livepatch selftest test-ftrace.sh,
which tests the interaction with sysctl ftrace_enabled:

 # selftests: livepatch: test-ftrace.sh
 # TEST: livepatch interaction with ftrace_enabled sysctl ... not ok
 #
 # --- expected
 # +++ result
 # @@ -16,7 +16,7 @@ livepatch: 'test_klp_livepatch': initial
 #  livepatch: 'test_klp_livepatch': starting patching transition
 #  livepatch: 'test_klp_livepatch': completing patching transition
 #  livepatch: 'test_klp_livepatch': patching complete
 # -livepatch: sysctl: setting key "kernel.ftrace_enabled": Device or
resource busy kernel.ftrace_enabled = 0
 # +livepatch: sysctl: setting key "kernel.ftrace_enabled": Device or
resource busy
 #  % echo 0 > /sys/kernel/livepatch/test_klp_livepatch/enabled
 #  livepatch: 'test_klp_livepatch': initializing unpatching transition
 #  livepatch: 'test_klp_livepatch': starting unpatching transition
 #
 # ERROR: livepatch kselftest(s) failed

On setting sysctl kernel.ftrace_enabled={0,1} successfully, the set key
and value are still displayed.

This patch fixes the test by limiting the output in both cases to eight
words, which covers either the error message (on failure) or the set key
and value (on success). The upper bound of eight words is just enough to
display the only tracked error message. Also, adjust the check_result
string in test-ftrace.sh to match the expected output.

With the patch, the test-ftrace.sh passes on v3.3.15, v3.3.16 versions
of sysctl:
 ...
 # selftests: livepatch: test-ftrace.sh
 # TEST: livepatch interaction with ftrace_enabled sysctl ... ok
 ok 5 selftests: livepatch: test-ftrace.sh

Signed-off-by: Kamalesh Babulal 
---
Based on livepatching/for-5.9/selftests-cleanup, to be merged
through livepatching.git

 tools/testing/selftests/livepatch/functions.sh   | 3 ++-
 tools/testing/selftests/livepatch/test-ftrace.sh | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/livepatch/functions.sh 
b/tools/testing/selftests/livepatch/functions.sh
index 36648ca367c2..e3c0490d5a45 100644
--- a/tools/testing/selftests/livepatch/functions.sh
+++ b/tools/testing/selftests/livepatch/functions.sh
@@ -75,7 +75,8 @@ function set_dynamic_debug() {
 }
 
 function set_ftrace_enabled() {
-   result=$(sysctl kernel.ftrace_enabled="$1" 2>&1 | paste --serial 
--delimiters=' ')
+   result=$(sysctl kernel.ftrace_enabled="$1" 2>&1 | paste --serial 
--delimiters=' ' | \
+cut -d" " -f1-8)
echo "livepatch: $result" > /dev/kmsg
 }
 
diff --git a/tools/testing/selftests/livepatch/test-ftrace.sh 
b/tools/testing/selftests/livepatch/test-ftrace.sh
index 9160c9ec3b6f..552e165512f4 100755
--- a/tools/testing/selftests/livepatch/test-ftrace.sh
+++ b/tools/testing/selftests/livepatch/test-ftrace.sh
@@ -51,7 +51,7 @@ livepatch: '$MOD_LIVEPATCH': initializing patching transition
 livepatch: '$MOD_LIVEPATCH': starting patching transition
 livepatch: '$MOD_LIVEPATCH': completing patching transition
 livepatch: '$MOD_LIVEPATCH': patching complete
-livepatch: sysctl: setting key \"kernel.ftrace_enabled\": Device or resource 
busy kernel.ftrace_enabled = 0
+livepatch: sysctl: setting key \"kernel.ftrace_enabled\": Device or resource 
busy
 % echo 0 > /sys/kernel/livepatch/$MOD_LIVEPATCH/enabled
 livepatch: '$MOD_LIVEPATCH': initializing unpatching transition
 livepatch: '$MOD_LIVEPATCH': starting unpatching transition

base-commit: 3fd9bd8b7e41a1908bf8bc0cd06606f2b787cd39
-- 
2.26.2

Re: [PATCH] pinctrl: qcom: Handle broken PDC dual edge case on sc7180

2020-07-09 Thread Maulik Shah


Hi Doug,

On 7/9/2020 2:46 AM, Douglas Anderson wrote:

As per Qualcomm, there is a PDC hardware issue (with the specific IP
rev that exists on sc7180) that causes the PDC not to work properly
when configured to handle dual edges.

Let's work around this by emulating only ever letting our parent see
requests for single edge interrupts on affected hardware.

Fixes: e35a6ae0eb3a ("pinctrl/msm: Setup GPIO chip in hierarchy")
Signed-off-by: Douglas Anderson 
---
As far as I can tell everything here should work and the limited
testing I'm able to give it shows that, in fact, I can detect both
edges.

Please give this an extra thorough review since it's trying to find
the exact right place to insert this code and I'm not massively
familiar with all the frameworks.

If someone has hardware where it's easy to stress test this that'd be
wonderful too.  The board I happen to have in front of me doesn't have
any easy-to-toggle GPIOs where I can just poke a button or a switch to
generate edges.  My testing was done by hacking the "write protect"
GPIO on my board into gpio-keys as a dual-edge interrupt and then
sending commands to our security chip to toggle it--not exactly great
for testing to make sure there are no race conditions if the interrupt
bounces a lot.

  drivers/pinctrl/qcom/pinctrl-msm.c| 80 +++
  drivers/pinctrl/qcom/pinctrl-msm.h|  4 ++
  drivers/pinctrl/qcom/pinctrl-sc7180.c |  1 +
  3 files changed, 85 insertions(+)

diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c 
b/drivers/pinctrl/qcom/pinctrl-msm.c
index 83b7d64bc4c1..45ca09ebb7b3 100644
--- a/drivers/pinctrl/qcom/pinctrl-msm.c
+++ b/drivers/pinctrl/qcom/pinctrl-msm.c
@@ -860,6 +860,79 @@ static void msm_gpio_irq_ack(struct irq_data *d)
raw_spin_unlock_irqrestore(&pctrl->lock, flags);
  }
  
+/**

+ * msm_gpio_update_dual_edge_parent() - Prime next edge for IRQs handled by 
parent.
+ * @d: The irq dta.
+ *
+ * This is much like msm_gpio_update_dual_edge_pos() but for IRQs that are
+ * normally handled by the parent irqchip.  The logic here is slightly
+ * different due to what's easy to do with our parent, but in principle it's
+ * the same.
+ */
+static void msm_gpio_update_dual_edge_parent(struct irq_data *d)
+{
+   struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
+   struct msm_pinctrl *pctrl = gpiochip_get_data(gc);
+   const struct msm_pingroup *g = &pctrl->soc->groups[d->hwirq];
+   unsigned long flags;
+   int loop_limit = 100;
+   unsigned int val;
+   unsigned int type;
+
+   /* Read the value and make a guess about what edge we need to catch */
+   val = msm_readl_io(pctrl, g) & BIT(g->in_bit);
+   type = val ? IRQ_TYPE_EDGE_FALLING : IRQ_TYPE_EDGE_RISING;
+
+   raw_spin_lock_irqsave(&pctrl->lock, flags);

Can you please move this spinlock up so that it covers the two lines above as well?

+   do {
+   /* Set the parent to catch the next edge */
+   irq_chip_set_type_parent(d, type);
+
+   /*
+* Possibly the line changed between when we last read "val"
+* (and decided what edge we needed) and when set the edge.
+* If the value didn't change (or changed and then changed
+* back) then we're done.
+*/
+   val = msm_readl_io(pctrl, g) & BIT(g->in_bit);
+   if (type == IRQ_TYPE_EDGE_RISING) {
+   if (!val)
+   break;
+   type = IRQ_TYPE_EDGE_FALLING;
+   } else if (type == IRQ_TYPE_EDGE_FALLING) {
+   if (val)
+   break;
+   type = IRQ_TYPE_EDGE_RISING;
+   }
+   } while (loop_limit-- > 0);
+   raw_spin_unlock_irqrestore(&pctrl->lock, flags);
+
+   if (!loop_limit)
+   dev_err(pctrl->dev, "dual-edge irq failed to stabilize\n");


You will never enter this if condition when the loop runs out, since
loop_limit will be negative after the do..while loop above.


You need to update this check to if (loop_limit <= 0).

Other than the above comments, this change looks good to me.

Reviewed-by: Maulik Shah 

Tested-by: Maulik Shah 

Thanks,
Maulik


+}
+
+void msm_gpio_handle_dual_edge_parent_irq(struct irq_desc *desc)
+{
+   struct irq_data *d = &desc->irq_data;
+
+   /* Make sure we're primed for the next edge */
+   msm_gpio_update_dual_edge_parent(d);
+
+   /* Pass on to the normal interrupt handler */
+   handle_fasteoi_irq(desc);
+}
+
+static bool msm_gpio_needs_dual_edge_parent_workaround(struct irq_data *d,
+  unsigned int type)
+{
+   struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
+   struct msm_pinctrl *pctrl = gpiochip_get_data(gc);
+
+   return type == IRQ_TYPE_EDGE_BOTH &&
+  pctrl->soc->wakeirq_dual_edge_errata && d->parent_data &&
+  test_bit(d->hwirq, pctrl->skip_wake_irqs);
+}
+
  static int

Re: [PATCH 2/4] interconnect: Add get_bw() callback

2020-07-09 Thread Mike Tipton


On 7/9/2020 4:07 AM, Georgi Djakov wrote:

The interconnect controller hardware may support querying the current
bandwidth settings, so add a callback for providers to implement this
functionality if supported.

Signed-off-by: Georgi Djakov 
---
  drivers/interconnect/core.c   | 3 ++-
  include/linux/interconnect-provider.h | 2 ++
  2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
index e53adfee1ee3..edbfe8380e83 100644
--- a/drivers/interconnect/core.c
+++ b/drivers/interconnect/core.c
@@ -926,7 +926,8 @@ void icc_node_add(struct icc_node *node, struct 
icc_provider *provider)
list_add_tail(&node->node_list, &provider->nodes);
  
  	/* get the bandwidth value and sync it with the hardware */

-   if (node->init_bw && provider->set) {
+   if (provider->get_bw && provider->set) {
+   provider->get_bw(node, &node->init_bw);


I'm not sure what benefit this callback provides over the provider 
directly setting init_bw in the structure. Additionally, "get_bw" is a 
more generic callback than just for getting the *initial* BW 
requirement. Currently it's only used that way, but staying true to the 
spirit of the callback would require us to return the current BW at any 
point in time.


We can only detect the current HW vote at BCM-level granularity and a 
single BCM can have many nodes. So since this callback is being used to 
determine init_bw, we'd end up voting way more nodes than necessary. In 
practice we'll only need to enforce minimum initial BW on a small subset 
of them, but I wouldn't want to hardcode that init-specific logic in a 
generic "get_bw" callback.



node->peak_bw = node->init_bw;
provider->set(node, node);
}
diff --git a/include/linux/interconnect-provider.h 
b/include/linux/interconnect-provider.h
index 153fb7616f96..329eccb19f58 100644
--- a/include/linux/interconnect-provider.h
+++ b/include/linux/interconnect-provider.h
@@ -38,6 +38,7 @@ struct icc_node *of_icc_xlate_onecell(struct of_phandle_args 
*spec,
   * @aggregate: pointer to device specific aggregate operation function
   * @pre_aggregate: pointer to device specific function that is called
   *   before the aggregation begins (optional)
+ * @get_bw: pointer to device specific function to get current bandwidth
   * @xlate: provider-specific callback for mapping nodes from phandle arguments
   * @dev: the device this interconnect provider belongs to
   * @users: count of active users
@@ -50,6 +51,7 @@ struct icc_provider {
int (*aggregate)(struct icc_node *node, u32 tag, u32 avg_bw,
 u32 peak_bw, u32 *agg_avg, u32 *agg_peak);
void (*pre_aggregate)(struct icc_node *node);
+   int (*get_bw)(struct icc_node *node, u32 *bw);
struct icc_node* (*xlate)(struct of_phandle_args *spec, void *data);
struct device   *dev;
int users;

Re: [PATCH 1/4] interconnect: Add sync state support

2020-07-09 Thread Mike Tipton


On 7/9/2020 4:07 AM, Georgi Djakov wrote:

The bootloaders often do some initial configuration of the interconnects
in the system and we want to keep this configuration until all consumers
have probed and expressed their bandwidth needs. This is because we don't
want to change the configuration by starting to disable unused paths until
every user had a chance to request the amount of bandwidth it needs.

To accomplish this we will implement an interconnect specific sync_state
callback which will synchronize (aggregate and set) the current bandwidth
settings when all consumers have been probed.

Signed-off-by: Georgi Djakov 
---
  drivers/interconnect/core.c   | 56 +++
  include/linux/interconnect-provider.h |  3 ++
  2 files changed, 59 insertions(+)

diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
index e5f998744501..e53adfee1ee3 100644
--- a/drivers/interconnect/core.c
+++ b/drivers/interconnect/core.c
@@ -26,6 +26,8 @@
  
  static DEFINE_IDR(icc_idr);

  static LIST_HEAD(icc_providers);
+static int providers_count;
+static bool synced_state;
  static DEFINE_MUTEX(icc_lock);
  static struct dentry *icc_debugfs_dir;
  
@@ -255,6 +257,10 @@ static int aggregate_requests(struct icc_node *node)

continue;
p->aggregate(node, r->tag, r->avg_bw, r->peak_bw,
 &node->avg_bw, &node->peak_bw);
+
+   /* during boot use the initial bandwidth as a floor value */
+   if (!synced_state)
+   node->peak_bw = max(node->peak_bw, node->init_bw);
}
  
  	return 0;

@@ -919,6 +925,12 @@ void icc_node_add(struct icc_node *node, struct 
icc_provider *provider)
node->provider = provider;
list_add_tail(&node->node_list, &provider->nodes);
  
+	/* get the bandwidth value and sync it with the hardware */

+   if (node->init_bw && provider->set) {
+   node->peak_bw = node->init_bw;
+   provider->set(node, node);
+   }
+


We'll need separate initial values for avg_bw/peak_bw. Some of our BCMs 
only support one or the other. Voting for one it doesn't support is a 
NOP. Additionally, some targets bring multiple subsystems out of reset 
in bootloaders and in those cases we'd need BCM to sum our initial 
avg_bw with the other subsystems.



mutex_unlock(&icc_lock);
  }
  EXPORT_SYMBOL_GPL(icc_node_add);
@@ -1014,8 +1026,52 @@ int icc_provider_del(struct icc_provider *provider)
  }
  EXPORT_SYMBOL_GPL(icc_provider_del);
  
+static int of_count_icc_providers(struct device_node *np)

+{
+   struct device_node *child;
+   int count = 0;
+
+   for_each_available_child_of_node(np, child) {
+   if (of_property_read_bool(child, "#interconnect-cells"))
+   count++;
+   count += of_count_icc_providers(child);
+   }
+   of_node_put(np);
+
+   return count;
+}
+
+void icc_sync_state(struct device *dev)
+{
+   struct icc_provider *p;
+   struct icc_node *n;
+   static int count;
+
+   count++;
+
+   if (count < providers_count)
+   return;
+
+   mutex_lock(&icc_lock);
+   list_for_each_entry(p, &icc_providers, provider_list) {
+   dev_dbg(p->dev, "interconnect provider is in synced state\n");
+   list_for_each_entry(n, &p->nodes, node_list) {
+   aggregate_requests(n);
+   p->set(n, n);


We could skip re-aggregating/setting for nodes that don't specify an 
initial BW. That'll save a lot of unnecessary HW voting. In practice, 
we'll only need to specify an initial minimum BW for a small subset of 
nodes (likely only one per-BCM we care about). Technically we only need 
to re-aggregate/set if the current BW vote is limited by init_bw, but 
that optimization is less important than skipping the majority that 
don't have an init_bw.



+   }
+   }
+   mutex_unlock(&icc_lock);
+   synced_state = true;
+}
+EXPORT_SYMBOL_GPL(icc_sync_state);
+
  static int __init icc_init(void)
  {
+   struct device_node *root = of_find_node_by_path("/");
+
+   providers_count = of_count_icc_providers(root);
+   of_node_put(root);
+
icc_debugfs_dir = debugfs_create_dir("interconnect", NULL);
debugfs_create_file("interconnect_summary", 0444,
icc_debugfs_dir, NULL, &icc_summary_fops);
diff --git a/include/linux/interconnect-provider.h 
b/include/linux/interconnect-provider.h
index 0c494534b4d3..153fb7616f96 100644
--- a/include/linux/interconnect-provider.h
+++ b/include/linux/interconnect-provider.h
@@ -71,6 +71,7 @@ struct icc_provider {
   * @req_list: a list of QoS constraint requests associated with this node
   * @avg_bw: aggregated value of average bandwidth requests from all consumers
   * @peak_bw: aggregated value of peak bandwidth requests from all consumers
+ * @init_bw: the bandwidth value that is read from the hardware during init
   *

Re: [PATCH 1/2] KVM: X86: Move ignore_msrs handling upper the stack

2020-07-09 Thread Sean Christopherson

On Fri, Jul 10, 2020 at 12:11:54AM +0200, Paolo Bonzini wrote:
> On 09/07/20 23:50, Peter Xu wrote:
> >> Sean: Objection your honor.
> >> Paolo: Overruled, you're wrong.
> >> Sean: Phooey.
> >>
> >> My point is that even though I still object to this series, Paolo has final
> >> say.
> >
> > I could be wrong, but I feel like Paolo was really respecting your input, as
> > always.
> 
> I do respect Sean's input

Ya, my comments were in jest.  Sorry if I implied I was grumpy about Paolo
taking this patch, because I'm not.  Just stubborn :-)

> but I also believe that in this case there's three questions:
> 
> a) should KVM be allowed to use the equivalent of rdmsr*_safe() on guest
> MSRs?  I say a mild yes, Sean says a strong no.

It's more that I don't think host_initiated=true is the equivalent of
rdmsr_safe().  It kind of holds true for rdmsr, but that's most definitely
not the case for wrmsr where host_initiated=true completely changes what
is/isn't allowed.  And if using host_initiated=true for rdmsr is allowed,
then logically using it for wrmsr is also allowed.

> b) is it good to separate the "1" and "-EINVAL" results so that
> ignore_msrs handling can be moved out of the MSR access functions?  I
> say yes because KVM should never rely on ignore_msrs; Sean didn't say
> anything (it's not too relevant if you answer no to the first question).
> 
> c) is it possible to reimplement TSX_CTRL_MSR to avoid using the
> equivalent of rdmsr*_safe()?  Sean says yes and it's not really possible
> to argue against that, but then it doesn't really matter if you answer
> yes to the first two questions.
> 
> Sean sees your patch mostly as answering "yes" to the question (a), and
> therefore disagrees with it.  I see your patch mostly as answering "yes"
> to question (b), and therefore like it.  I would also accept a patch
> that reimplements TSX_CTRL_MSR (question c), but I consider your patch
> to be an improvement anyway (question b).
> 
> > It's just as simple as a 2:1 vote, isn't it? (I can still count myself
> > in for the vote, right? :)
> 
> I do have the final say but I try to use that as little as possible (or
> never).  And then it happens that ever so rare disagreements cluster in
> the same week!
> 
> The important thing is to analyze the source of the disagreement.
> Usually when that happens, it's because a change has multiple purposes
> and people see it in a different way.
> 
> In this case, I'm happy to accept this patch (and overrule Sean) not
> because he's wrong on question (a), but because in my opinion the actual
> motivation of the patch is question (b).
> 
> To be fair, I would prefer it if ignore_msrs didn't apply to
> host-initiated MSR accesses at all (only guest accesses).  That would
> make this series much much simpler.  It wouldn't solve the disagreement
> on question (a), but perhaps it would be a patch that Sean would agree on.

I think I could get behind that.  It shouldn't interfere with my crusade to
vanquish host_initiated :-)

drivers/video/fbdev/tdfxfb.c:1120:27: sparse: sparse: incorrect type in argument 1 (different address spaces)

2020-07-09 Thread kernel test robot

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   0bddd227f3dc55975e2b8dfa7fc6f959b062a2c7
commit: 670d0a4b10704667765f7d18f7592993d02783aa sparse: use identifiers to 
define address spaces
date:   3 weeks ago
config: mips-randconfig-s031-20200710 (attached as .config)
compiler: mips64el-linux-gcc (GCC) 9.3.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.2-37-gc9676a3b-dirty
git checkout 670d0a4b10704667765f7d18f7592993d02783aa
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 
CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=mips 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


sparse warnings: (new ones prefixed by >>)

>> drivers/video/fbdev/tdfxfb.c:1120:27: sparse: sparse: incorrect type in 
>> argument 1 (different address spaces) @@ expected void *__s @@ got 
>> unsigned char [noderef] [usertype] __iomem *cursorbase @@
   drivers/video/fbdev/tdfxfb.c:1120:27: sparse: expected void *__s
   drivers/video/fbdev/tdfxfb.c:1120:27: sparse: got unsigned char 
[noderef] [usertype] __iomem *cursorbase
   drivers/video/fbdev/tdfxfb.c:1131:33: sparse: sparse: cast removes address 
space '__iomem' of expression
   drivers/video/fbdev/tdfxfb.c:1134:33: sparse: sparse: cast removes address 
space '__iomem' of expression
   arch/mips/include/asm/io.h:354:1: sparse: sparse: cast to restricted __le32

[PATCH] Restore gcc check in mips asm/unroll.h

2020-07-09 Thread Cesar Eduardo Barros

While raising the gcc version requirement to 4.9, the compile-time check
in the unroll macro was accidentally changed from being used on gcc and
clang to being used on clang only.

Restore the gcc check, changing it from "gcc >= 4.7" to "all gcc".

Fixes: 6ec4476ac825 ("Raise gcc version requirement to 4.9")
Signed-off-by: Cesar Eduardo Barros 
---
 arch/mips/include/asm/unroll.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/unroll.h b/arch/mips/include/asm/unroll.h
index 8ed660adc84f..49009319ac2c 100644
--- a/arch/mips/include/asm/unroll.h
+++ b/arch/mips/include/asm/unroll.h
@@ -25,7 +25,8 @@
 * generate reasonable code for the switch statement,   \
 * so we skip the sanity check for those compilers. \
 */ \
-   BUILD_BUG_ON((CONFIG_CLANG_VERSION >= 8) && \
+   BUILD_BUG_ON((CONFIG_CC_IS_GCC ||   \
+ CONFIG_CLANG_VERSION >= 8) && \
 !__builtin_constant_p(times)); \
\
switch (times) {\
-- 
2.26.2

Re: a question of split_huge_page

2020-07-09 Thread Alex Shi




On 2020/7/10 12:07 AM, Kirill A. Shutemov wrote:
> On Thu, Jul 09, 2020 at 04:50:02PM +0100, Matthew Wilcox wrote:
>> On Thu, Jul 09, 2020 at 11:11:11PM +0800, Alex Shi wrote:
>>> Hi Kirill & Matthew,
>>>
>>> In the func call chain, from split_huge_page() to lru_add_page_tail(),
>>> Seems tail pages are added to lru list at line 963, but in this scenario
>>> the head page has no lru bit and isn't set the bit later. Why we do this?
>>> or do I miss sth?
>>
>> I don't understand how we get to split_huge_page() with a page that's
>> not on an LRU list.  Both anonymous and page cache pages should be on
>> an LRU list.  What am I missing?
>


Thanks a lot for the quick reply!
What confuses me is the call chain from __iommu_dma_alloc_pages()
to split_huge_page(). In that function, the page being split comes from
page = alloc_pages_node(nid, alloc_flags, order);
If such pages were added to the LRU, they might be reclaimed and lost,
which would be a panic bug. But in fact this has never happened for a long time.
I also put a BUG() at that line, and it is never triggered in ltp or in
run_vmtests in kselftest.

> Right, and it's never got removed from LRU during the split. The tail
> pages have to be added to LRU because they now separate from the tail
> page.
> 
According to the explanation, it looks like we could remove this code path,
since it is never reached (based on my v15 patchset). Any comments?

Thanks
Alex

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 7c52c5228aab..c28409509ad3 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2357,17 +2357,6 @@ static void lru_add_page_tail(struct page *head, struct 
page *page_tail,
if (!list)
SetPageLRU(page_tail);

if (likely(PageLRU(head)))
list_add_tail(&page_tail->lru, &head->lru);
else if (list) {
/* page reclaim is reclaiming a huge page */
get_page(page_tail);
list_add_tail(&page_tail->lru, list);
-   } else {
-   /*
-* Head page has not yet been counted, as an hpage,
-* so we must account for each subpage individually.
-*
-* Put page_tail on the list at the correct position
-* so they all end up in order.
-*/
-   VM_BUG_ON_PAGE(1, head);
-   add_page_to_lru_list_tail(page_tail, lruvec,
- page_lru(page_tail));
}
 }

Re: [PATCH v2 0/2] spi: spi-qcom-qspi: Avoid some per-transfer overhead

2020-07-09 Thread Akash Asthana




On 7/9/2020 8:21 PM, Douglas Anderson wrote:

Not to be confused with the similar series I posted for the _other_
Qualcomm SPI controller (spi-geni-qcom) [1], this one avoids the
overhead on the Quad SPI controller.

It's based atop the current Qualcomm tree including Rajendra's ("spi:
spi-qcom-qspi: Use OPP API to set clk/perf state").  As discussed in
individual patches, these could ideally land through the Qualcomm tree
with Mark's Ack.

Measuring:
* Before OPP / Interconnect patches reading all flash takes: ~3.4 seconds
* After OPP / Interconnect patches reading all flash takes: ~4.7 seconds
* After this patch reading all flash takes: ~3.3 seconds

[1] https://lore.kernel.org/r/20200702004509.2333554-1-diand...@chromium.org
[2] 
https://lore.kernel.org/r/1593769293-6354-2-git-send-email-rna...@codeaurora.org

Changes in v2:
- Return error from runtime resume if dev_pm_opp_set_rate() fails.

Douglas Anderson (2):
   spi: spi-qcom-qspi: Avoid clock setting if not needed
   spi: spi-qcom-qspi: Set an autosuspend delay of 250 ms

  drivers/spi/spi-qcom-qspi.c | 43 -
  1 file changed, 33 insertions(+), 10 deletions(-)

Reviewed-by: Akash Asthana 

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

linux-next: manual merge of the tip tree with the spi tree

2020-07-09 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the tip tree got a conflict in:

  drivers/spi/spi.c

between commit:

  60a883d119ab ("spi: use kthread_create_worker() helper")

from the spi tree and commit:

  3070da33400c ("sched,spi: Convert to sched_set_fifo*()")

from the tip tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/spi/spi.c
index 1d7bba434225,5a4f0bfce474..
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@@ -1614,11 -1592,9 +1614,9 @@@ EXPORT_SYMBOL_GPL(spi_take_timestamp_po
   */
  static void spi_set_thread_rt(struct spi_controller *ctlr)
  {
-   struct sched_param param = { .sched_priority = MAX_RT_PRIO / 2 };
- 
dev_info(&ctlr->dev,
"will run message pump with realtime priority\n");
-   sched_setscheduler(ctlr->kworker->task, SCHED_FIFO, &param);
 -  sched_set_fifo(ctlr->kworker_task);
++  sched_set_fifo(ctlr->kworker->task);
  }
  
  static int spi_init_queue(struct spi_controller *ctlr)



Re: [PATCH v2] spi: spi-geni-qcom: Set the clock properly at runtime resume

2020-07-09 Thread Akash Asthana




On 7/9/2020 8:10 PM, Douglas Anderson wrote:

In the patch ("spi: spi-geni-qcom: Avoid clock setting if not needed")
we avoid a whole pile of clock code.  As part of that, we should have
restored the clock at runtime resume.  Do that.

It turns out that, at least with today's configurations, this doesn't
actually matter.  That's because none of the current device trees have
an OPP table for geni SPI yet.  That makes dev_pm_opp_set_rate(dev, 0)
a no-op.  This is why it wasn't noticed in the testing of the original
patch.  It's still a good idea to fix, though.

Reviewed-by: Akash Asthana 

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

Re: [PATCH v2 02/10] x86/percpu: Clean up percpu_to_op()

2020-07-09 Thread Brian Gerst

On Thu, Jul 9, 2020 at 6:30 AM Peter Zijlstra  wrote:
>
> On Sat, May 30, 2020 at 06:11:19PM -0400, Brian Gerst wrote:
> > + if (0) {\
> > + typeof(_var) pto_tmp__; \
> > + pto_tmp__ = (_val); \
> > + (void)pto_tmp__;\
> > + }   \
>
> This is repeated at least once more; and it looks very similar to
> __typecheck() and typecheck() but is yet another variant afaict.

The problem with typecheck() is that it will complain about a mismatch
between unsigned long and u64 (defined as unsigned long long) even
though both are 64-bits wide on x86-64.  Cleaning that mess up is
beyond the scope of this series, so I kept the existing checks.

--
Brian Gerst
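
For readers following the typecheck() point above, here is a minimal userspace sketch of the two styles of check. It assumes GNU C (__typeof__ and statement expressions). The names assign_check(), strict_check() and store_checked() are made up for illustration, and strict_check() is written from memory in the style of the kernel's typecheck() rather than copied from it.

```c
/*
 * Assignment-style check, modelled on the "if (0)" block quoted above.
 * The dead branch is still type-checked, so a value that cannot be
 * assigned to _var (a struct, say) fails to build, while integer types
 * of different rank are accepted through the usual conversions.
 */
#define assign_check(_var, _val)			\
do {							\
	if (0) {					\
		__typeof__(_var) tmp__;			\
		tmp__ = (_val);				\
		(void)tmp__;				\
	}						\
} while (0)

/*
 * Strict check: comparing pointers to two distinct types draws a
 * compiler diagnostic even when both types have the same width, which
 * is the unsigned long vs unsigned long long complaint described above.
 */
#define strict_check(type, x)				\
({	type dummy__;					\
	__typeof__(x) dummy2__;				\
	(void)(&dummy__ == &dummy2__);			\
	1;						\
})

/* Store a 64-bit value into an unsigned long slot after the loose check. */
static unsigned long store_checked(unsigned long long val)
{
	unsigned long slot = 0;

	assign_check(slot, val);	/* builds: an implicit conversion exists */
	/* strict_check(unsigned long, val) would be flagged: types differ */
	slot = (unsigned long)val;
	return slot;
}
```

The sketch only shows why the looser check tolerates the u64 / unsigned long mix; it says nothing about which variant the series should end up using.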

Re: [PATCH] proc/sysctl: make protected_* world readable

2020-07-09 Thread Kees Cook

On Thu, Jul 09, 2020 at 04:51:15PM -0700, Julius Hemanth Pitti wrote:
> protected_* files have 600 permissions which prevents
> non-superuser from reading them.
> 
> Containers like "AWS greengrass" refuse to launch unless
> protected_hardlinks and protected_symlinks are set. When
> containers like these run with "userns-remap" or "--user"
> mapping container's root to non-superuser on host, they
> fail to run due to denied read access to these files.
> 
> As these protections are hardly a secret, and do not
> pose any security risk, make them world readable.
> 
> Though above greengrass usecase needs read access to
> only protected_hardlinks and protected_symlinks files,
> setting all other protected_* files to 644 to keep
> consistency.
> 
> Fixes: 800179c9b8a1 ("fs: add link restrictions")
> Signed-off-by: Julius Hemanth Pitti 

Acked-by: Kees Cook 

I had originally proposed it as 0644, but Ingo asked that it have
a more conservative default value[1]. I figured that given the settings
can be discovered easily, it's not worth much. And if there are legit
cases where things are improved, I don't have a problem switching this
back.

Ingo, any thoughts on this now, 8 years later in the age of containers?
:)

(One devil's advocate question: as a workaround, you are able to just
change those files to 0644 after mounting /proc, yes? But regardless,
why get in people's way for no justifiable reason.)

-Kees

[1] https://lore.kernel.org/lkml/20120105091704.gb3...@elte.hu/

-- 
Kees Cook
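
As an illustration of what the container runtimes in the quoted report are doing, the following is a minimal sketch of reading one of these sysctls from an unprivileged process. read_sysctl_long() is a hypothetical helper, not taken from any of the tools mentioned; the only assumption about the kernel is the path /proc/sys/fs/protected_hardlinks.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Read a single integer sysctl such as /proc/sys/fs/protected_hardlinks.
 * Returns 0 and stores the value on success, or a negative errno.
 * With the file at mode 0600 an unprivileged caller is expected to get
 * -EACCES from the open; with mode 0644 the read can succeed.
 */
static int read_sysctl_long(const char *path, long *value)
{
	FILE *f = fopen(path, "r");
	int ret = 0;

	if (!f)
		return -errno;
	if (fscanf(f, "%ld", value) != 1)
		ret = -EINVAL;
	fclose(f);
	return ret;
}
```

A caller that treats any error as "protection not enabled" fails in exactly the way described in the report, which is why the file mode matters even though the value itself is not sensitive.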

Re: WARNING: at mm/mremap.c:211 move_page_tables in i386

2020-07-09 Thread Naresh Kamboju

On Fri, 10 Jul 2020 at 00:42, Linus Torvalds
 wrote:
>
> On Wed, Jul 8, 2020 at 10:28 PM Naresh Kamboju
>  wrote:
> >
> > While running LTP mm test suite on i386 or qemu_i386 this kernel warning
> > has been noticed from stable 5.4 to stable 5.7 branches and mainline
> > 5.8.0-rc4 and linux next.
>
> Hmm
>
> If this is repeatable, would you mind making the warning also print
> out the old range and new addresses and pmd value?

I applied your patch and re-tested.
The warning triggered 10 times.

old: bfe0-c000 new: bfa0 (val: 7d530067)

Here is the crash output log,
thp01.c:98: PASS: system didn't crash.
[  741.507000] [ cut here ]
[  741.511684] WARNING: CPU: 1 PID: 15173 at mm/mremap.c:211 move_page_tables.cold+0x0/0x2b
[  741.519812] Modules linked in: x86_pkg_temp_thermal fuse
[  741.525163] CPU: 1 PID: 15173 Comm: true Not tainted 5.8.0-rc4 #1
[  741.531313] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.2 05/23/2018
[  741.538760] EIP: move_page_tables.cold+0x0/0x2b
[  741.543337] Code: b1 a0 03 00 00 81 c1 cc 04 00 00 bb ea ff ff ff 51 68 e0 bc 68 d8 c6 05 dc 29 97 d8 01 e8 13 26 e9 ff 83 c4 0c e9 70 ea ff ff <0f> 0b 52 50 ff 75 08 ff 75 b4 ff 75 d4 68 3c bd 68 d8 e8 f4 25 e9
[  741.562140] EAX: 7d530067 EBX: e9c90ff8 ECX:  EDX: 
[  741.568456] ESI:  EDI: 7d5ba007 EBP: cef67dd0 ESP: cef67d28
[  741.574776] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010202
[  741.581623] CR0: 80050033 CR2: b7d53f50 CR3: 107da000 CR4: 003406f0
[  741.587941] DR0:  DR1:  DR2:  DR3: 
[  741.594259] DR6: fffe0ff0 DR7: 0400
[  741.598159] Call Trace:
[  741.600694]  setup_arg_pages+0x22b/0x310
[  741.604654]  ? _raw_spin_unlock_irqrestore+0x45/0x50
[  741.609677]  ? trace_hardirqs_on+0x4b/0x110
[  741.613930]  ? get_random_u32+0x4e/0x80
[  741.617809]  ? get_random_u32+0x4e/0x80
[  741.621687]  load_elf_binary+0x31e/0x10f0
[  741.625714]  ? __do_execve_file+0x5b4/0xbf0
[  741.629917]  ? find_held_lock+0x24/0x80
[  741.633839]  __do_execve_file+0x5a8/0xbf0
[  741.637893]  __ia32_sys_execve+0x2a/0x40
[  741.641875]  do_syscall_32_irqs_on+0x3d/0x2c0
[  741.646246]  ? find_held_lock+0x24/0x80
[  741.650105]  ? lock_release+0x8a/0x260
[  741.653890]  ? __might_fault+0x41/0x80
[  741.657660]  do_fast_syscall_32+0x60/0xf0
[  741.661691]  do_SYSENTER_32+0x15/0x20
[  741.665373]  entry_SYSENTER_32+0x9f/0xf2
[  741.669328] EIP: 0xb7f38549
[  741.672140] Code: Bad RIP value.
[  741.675430] EAX: ffda EBX: bfe19bf0 ECX: 08067420 EDX: bfe19e24
[  741.681708] ESI: 08058a14 EDI: bfe19bf9 EBP: bfe19c98 ESP: bfe19bc8
[  741.687991] DS: 007b ES: 007b FS:  GS: 0033 SS: 007b EFLAGS: 0292
[  741.694804] irq event stamp: 23911
[  741.698253] hardirqs last  enabled at (23929): [] console_unlock+0x4a5/0x610
[  741.706181] hardirqs last disabled at (23946): [] console_unlock+0x8a/0x610
[  741.714041] softirqs last  enabled at (23962): [] __do_softirq+0x2dc/0x3da
[  741.721849] softirqs last disabled at (23973): [] call_on_stack+0x45/0x50
[  741.729513] ---[ end trace 170f646c1b6225e0 ]---
[  741.734151]  old: bfe0-c000 new: bfa0 (val: 7d530067)

Build link: https://builds.tuxbuild.com/1cwiUvFIB4M0hPyB1eA3cA/
vmlinux: https://builds.tuxbuild.com/1cwiUvFIB4M0hPyB1eA3cA/vmlinux.xz
system.map: https://builds.tuxbuild.com/1cwiUvFIB4M0hPyB1eA3cA/System.map


Full test log:
https://lkft.validation.linaro.org/scheduler/job/1554181#L10557

- Naresh

>
> Something like the attached (UNTESTED!) patch.
>
>  Linus

Re: [PATCH] KVM: x86/mmu: Add capability to zap only sptes for the affected memslot

2020-07-09 Thread Sean Christopherson

+Alex, whom I completely spaced on Cc'ing.

Alex, this is related to the dreaded VFIO memslot zapping issue from last
year.  Start of thread: https://patchwork.kernel.org/patch/11640719/.

The TL;DR of below: can you try the attached patch with your reproducer
from the original bug[*]?  I honestly don't know whether it has a legitimate
chance of working, but it's the one thing in all of this that I know was
definitely a bug.  I'd like to test it out if only to sate my curiosity.
Absolutely no rush.

[*] https://patchwork.kernel.org/patch/10798453/#22817321

On Fri, Jul 10, 2020 at 12:18:17AM +0200, Paolo Bonzini wrote:
> On 09/07/20 23:12, Sean Christopherson wrote:
> >> It's bad that we have no clue what's causing the bad behavior, but I
> >> don't think it's wise to have a bug that is known to happen when you
> >> enable the capability. :/
> 
> (Note that this wasn't a NACK, though subtly so).
> 
> > I don't necessarily disagree, but at the same time it's entirely possible
> > it's a Qemu bug.
> 
> No, it cannot be.  QEMU is not doing anything but
> KVM_SET_USER_MEMORY_REGION, and it's doing that synchronously with
> writes to the PCI configuration space BARs.

I'm not saying it's likely, but it's certainly possible.  The failure
went away when KVM zapped SPTEs for seemingly unrelated addresses, i.e. the
error likely goes beyond just the memslot aspect.

> > Even if this is a kernel bug, I'm fairly confident at this point that it's
> > not a KVM bug.  Or rather, if it's a KVM "bug", then there's a fundamental
> > dependency in memslot management that needs to be rooted out and documented.
> 
> Heh, here my surmise is that it cannot be anything but a KVM bug,
> because memslots are not used by anything outside KVM...  But maybe I'm
> missing something.

As above, it's not really a memslot issue, it's more of a paging/TLB issue,
or possibly none of the above.  E.g. it could be a timing bug that goes away
simply because zapping and rebuilding slows things down to the point where
the timing window is closed.

I should have qualified "fairly confident ... that it's not a KVM bug" as
"not a KVM bug related to removing SPTEs for the deleted/moved memslot _as
implemented in this patch_".

Digging back through the old thread, I don't think we ever tried passing
%true for @lock_flush_tlb when zapping rmaps.  And a comment from Alex also
caught my eye, where he said of the following: "If anything, removing this
chunk seems to make things worse."

	if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
		kvm_mmu_remote_flush_or_zap(kvm, &invalid_list, flush);
		flush = false;
		cond_resched_lock(&kvm->mmu_lock);
	}

A somewhat far fetched theory is that passing %false to kvm_zap_rmapp()
via slot_handle_all_level() created a window where a vCPU could have both
the old stale entry and the new memslot entry in its TLB if the equivalent
to above lock dropping in slot_handle_level_range() triggered.

Removing the above intermediate flush would exacerbate the theoretical
problem by further delaying the flush, i.e. would create a bigger window
for the guest to access the stale SPTE.

Where things get really far fetched is how zapping seemingly random SPTEs
fits in.  Best crazy guess is that zapping enough random things while holding
MMU lock would eventually zap a SPTE that caused the guest to block in the
EPT violation handler.

I'm not exactly confident that the correct zapping approach will actually
resolve the VFIO issue, but I think it's worth trying since not flushing
during rmap zapping was definitely a bug.

> > And we're kind of in a catch-22; it'll be extremely difficult to narrow down
> > exactly who is breaking what without being able to easily test the optimized
> > zapping with other VMMs and/or setups.
> 
> I agree with this, and we could have a config symbol that depends on
> BROKEN and enables it unconditionally.  However a capability is the
> wrong tool.

Ya, a capability is a bad idea.  I was coming at it from the angle that, if
there is a fundamental requirement with e.g. GPU passthrough that requires
zapping all SPTEs, then enabling the precise capability on a per-VM basis
would make sense.  But adding something to the ABI on pure speculation is
silly.

From b68a2e6095d76574322ce7cf6e63406436fef36d Mon Sep 17 00:00:00 2001
From: Sean Christopherson 
Date: Thu, 9 Jul 2020 21:25:11 -0700
Subject: [PATCH] KVM: x86/mmu: Zap only relevant last/leaf sptes when removing
 a memslot

Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/mmu/mmu.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3dd0af7e75151..9f468337f832c 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5810,7 +5810,18 @@ static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
 			struct kvm_memory_slot *slot,
 			struct kvm_page_track_notifier_node *node)
 {
-

Re: [PATCH] pinctrl: qcom: Handle broken PDC dual edge case on sc7180

2020-07-09 Thread Rajendra Nayak




On 7/9/2020 2:46 AM, Douglas Anderson wrote:

As per Qualcomm, there is a PDC hardware issue (with the specific IP
rev that exists on sc7180) that causes the PDC not to work properly
when configured to handle dual edges.

Let's work around this by emulating only ever letting our parent see
requests for single edge interrupts on affected hardware.

Fixes: e35a6ae0eb3a ("pinctrl/msm: Setup GPIO chip in hierarchy")
Signed-off-by: Douglas Anderson 


Thanks Doug, this looks like a much better solution than what I was
proposing :)

Reviewed-by: Rajendra Nayak 


---
As far as I can tell everything here should work and the limited
testing I'm able to give it shows that, in fact, I can detect both
edges.

Please give this an extra thorough review since it's trying to find
the exact right place to insert this code and I'm not massively
familiar with all the frameworks.

If someone has hardware where it's easy to stress test this that'd be
wonderful too.  The board I happen to have in front of me doesn't have
any easy-to-toggle GPIOs where I can just poke a button or a switch to
generate edges.  My testing was done by hacking the "write protect"
GPIO on my board into gpio-keys as a dual-edge interrupt and then
sending commands to our security chip to toggle it--not exactly great
for testing to make sure there are no race conditions if the interrupt
bounces a lot.
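
For readers who want to reason about the race mentioned above without hardware, the emulation idea can be sketched in userspace roughly as follows. Everything here is hypothetical scaffolding: struct fake_line replays a fixed list of line samples in place of the real register read, and prime_next_edge() reports failure through its return value instead of checking the counter after the loop. It is a sketch of the approach, not the driver code.

```c
#include <stdbool.h>

enum edge { EDGE_RISING, EDGE_FALLING };

/* Stand-in for the GPIO input: replays samples, then holds the last one. */
struct fake_line {
	const bool *samples;
	int nsamples;
	int pos;
	enum edge armed;	/* edge the (pretend) parent is set to catch */
};

static bool read_level(struct fake_line *l)
{
	bool v = l->samples[l->pos];

	if (l->pos < l->nsamples - 1)
		l->pos++;
	return v;
}

/*
 * Arm the single edge that should come next given the current level,
 * then re-read the line. If the level still matches the guess we are
 * done; otherwise the line moved underneath us, so flip and retry.
 * Returns 0 on success, -1 if the line never settled within loop_limit.
 */
static int prime_next_edge(struct fake_line *l, int loop_limit)
{
	bool level = read_level(l);
	enum edge type = level ? EDGE_FALLING : EDGE_RISING;

	while (loop_limit-- > 0) {
		l->armed = type;
		level = read_level(l);
		if (type == EDGE_RISING && !level)
			return 0;
		if (type == EDGE_FALLING && level)
			return 0;
		type = level ? EDGE_FALLING : EDGE_RISING;
	}
	return -1;
}
```

Feeding the helper an alternating sample list is a cheap way to exercise the "interrupt bounces a lot" case that was hard to reproduce on the board described above.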

  drivers/pinctrl/qcom/pinctrl-msm.c| 80 +++
  drivers/pinctrl/qcom/pinctrl-msm.h|  4 ++
  drivers/pinctrl/qcom/pinctrl-sc7180.c |  1 +
  3 files changed, 85 insertions(+)

diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c b/drivers/pinctrl/qcom/pinctrl-msm.c
index 83b7d64bc4c1..45ca09ebb7b3 100644
--- a/drivers/pinctrl/qcom/pinctrl-msm.c
+++ b/drivers/pinctrl/qcom/pinctrl-msm.c
@@ -860,6 +860,79 @@ static void msm_gpio_irq_ack(struct irq_data *d)
raw_spin_unlock_irqrestore(&pctrl->lock, flags);
  }
  
+/**
+ * msm_gpio_update_dual_edge_parent() - Prime next edge for IRQs handled by parent.
+ * @d: The irq dta.
+ *
+ * This is much like msm_gpio_update_dual_edge_pos() but for IRQs that are
+ * normally handled by the parent irqchip.  The logic here is slightly
+ * different due to what's easy to do with our parent, but in principle it's
+ * the same.
+ */
+static void msm_gpio_update_dual_edge_parent(struct irq_data *d)
+{
+   struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
+   struct msm_pinctrl *pctrl = gpiochip_get_data(gc);
+   const struct msm_pingroup *g = &pctrl->soc->groups[d->hwirq];
+   unsigned long flags;
+   int loop_limit = 100;
+   unsigned int val;
+   unsigned int type;
+
+   /* Read the value and make a guess about what edge we need to catch */
+   val = msm_readl_io(pctrl, g) & BIT(g->in_bit);
+   type = val ? IRQ_TYPE_EDGE_FALLING : IRQ_TYPE_EDGE_RISING;
+
+   raw_spin_lock_irqsave(&pctrl->lock, flags);
+   do {
+   /* Set the parent to catch the next edge */
+   irq_chip_set_type_parent(d, type);
+
+   /*
+* Possibly the line changed between when we last read "val"
+* (and decided what edge we needed) and when set the edge.
+* If the value didn't change (or changed and then changed
+* back) then we're done.
+*/
+   val = msm_readl_io(pctrl, g) & BIT(g->in_bit);
+   if (type == IRQ_TYPE_EDGE_RISING) {
+   if (!val)
+   break;
+   type = IRQ_TYPE_EDGE_FALLING;
+   } else if (type == IRQ_TYPE_EDGE_FALLING) {
+   if (val)
+   break;
+   type = IRQ_TYPE_EDGE_RISING;
+   }
+   } while (loop_limit-- > 0);
+   raw_spin_unlock_irqrestore(&pctrl->lock, flags);
+
+   if (!loop_limit)
+   dev_err(pctrl->dev, "dual-edge irq failed to stabilize\n");
+}
+
+void msm_gpio_handle_dual_edge_parent_irq(struct irq_desc *desc)
+{
+   struct irq_data *d = &desc->irq_data;
+
+   /* Make sure we're primed for the next edge */
+   msm_gpio_update_dual_edge_parent(d);
+
+   /* Pass on to the normal interrupt handler */
+   handle_fasteoi_irq(desc);
+}
+
+static bool msm_gpio_needs_dual_edge_parent_workaround(struct irq_data *d,
+  unsigned int type)
+{
+   struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
+   struct msm_pinctrl *pctrl = gpiochip_get_data(gc);
+
+   return type == IRQ_TYPE_EDGE_BOTH &&
+  pctrl->soc->wakeirq_dual_edge_errata && d->parent_data &&
+  test_bit(d->hwirq, pctrl->skip_wake_irqs);
+}
+
  static int msm_gpio_irq_set_type(struct irq_data *d, unsigned int type)
  {
struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
@@ -868,6 +941,13 @@ static int msm_gpio_irq_set_type(struct irq_data *d, unsigned int type)
unsigned long flags;

Re: [f2fs-dev] [PATCH] f2fs: change the way of handling range.len in F2FS_IOC_SEC_TRIM_FILE

2020-07-09 Thread Daeho Jeong

To handle that case, I think we need to handle range.len == -1 differently.
When range.len is -1, we need to find every block that belongs to the
inode, regardless of i_size, and discard it.
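
For reference, the range handling in the patch quoted below can be sketched as a standalone function. This is a minimal sketch under stated assumptions: trim_end_addr() and its return convention (1 for "nothing to do") are invented for illustration, and it deliberately mirrors only the i_size-based clamping, so it does not cover blocks allocated beyond i_size, which is the open question in this thread.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Compute the exclusive end offset of a trim request.
 * Returns -EINVAL if start is at or past i_size, 1 if len is zero
 * (nothing to do), and 0 otherwise with *end_addr filled in.
 */
static int trim_end_addr(uint64_t start, uint64_t len, uint64_t i_size,
			 uint64_t *end_addr)
{
	if (start >= i_size)
		return -EINVAL;
	if (len == 0)
		return 1;
	/* len == (u64)-1 means "to end of file"; overlong ranges are clamped */
	if (len == UINT64_MAX || i_size - start < len)
		*end_addr = i_size;
	else
		*end_addr = start + len;
	return 0;
}
```

Since start < i_size at that point, i_size - start is smaller than UINT64_MAX for any realistic file size, which is why the explicit (u64)-1 test is described in the thread as covered by the second condition.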

On Fri, Jul 10, 2020 at 12:52 PM, Jaegeuk Kim wrote:
>
> On 07/10, Chao Yu wrote:
> > On 2020/7/10 11:31, Jaegeuk Kim wrote:
> > > On 07/10, Chao Yu wrote:
> > >> On 2020/7/10 11:02, Jaegeuk Kim wrote:
> > >>> On 07/10, Daeho Jeong wrote:
> >  From: Daeho Jeong 
> > 
> >  Changed the way of handling range.len of F2FS_IOC_SEC_TRIM_FILE.
> >   1. Added -1 value support for range.len to signify the end of file.
> >   2. If the end of the range passes over the end of file, it means until
> >  the end of file.
> >   3. ignored the case of that range.len is zero to prevent the function
> >  from making end_addr zero and triggering different behaviour of
> >  the function.
> > 
> >  Signed-off-by: Daeho Jeong 
> >  ---
> >   fs/f2fs/file.c | 16 +++-
> >   1 file changed, 7 insertions(+), 9 deletions(-)
> > 
> >  diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> >  index 368c80f8e2a1..1c4601f99326 100644
> >  --- a/fs/f2fs/file.c
> >  +++ b/fs/f2fs/file.c
> >  @@ -3813,21 +3813,19 @@ static int f2fs_sec_trim_file(struct file 
> >  *filp, unsigned long arg)
> >   file_start_write(filp);
> >   inode_lock(inode);
> > 
> >  -if (f2fs_is_atomic_file(inode) || 
> >  f2fs_compressed_file(inode)) {
> >  +if (f2fs_is_atomic_file(inode) || f2fs_compressed_file(inode) 
> >  ||
> >  +range.start >= inode->i_size) {
> >   ret = -EINVAL;
> >   goto err;
> >   }
> > 
> >  -if (range.start >= inode->i_size) {
> >  -ret = -EINVAL;
> >  +if (range.len == 0)
> >   goto err;
> >  -}
> > 
> >  -if (inode->i_size - range.start < range.len) {
> >  -ret = -E2BIG;
> >  -goto err;
> >  -}
> >  -end_addr = range.start + range.len;
> >  +if (range.len == (u64)-1 || inode->i_size - range.start < 
> >  range.len)
> >  +end_addr = inode->i_size;
> > >>
> > >> We can remove 'range.len == (u64)-1' condition since later condition
> > >> can cover this?
> > >>
> > >>>
> > >>> Hmm, what if there are blocks beyond i_size? Do we need to check 
> > >>> i_blocks for
> > >>
> > >> The blocks beyond i_size will never be written, there won't be any
> > >> valid message there, so we don't need to worry about that.
> > >
> > > I don't think we have a way to guarantee the order of i_size and block
> > > allocation in f2fs. See f2fs_write_begin and f2fs_write_end.
> >
> > However, write_begin & write_end are covered by inode_lock, it could not be
> > racy with inode size check in f2fs_sec_trim_file() as it hold inode_lock as
> > well?
>
> Like Daeho said, write_begin -> checkpoint -> power-cut can give bigger
> i_blocks than i_size.
>
> >
> > >
> > >>
> > >> Thanks,
> > >>
> > >>> ending criteria?
> > >>>
> >  +else
> >  +end_addr = range.start + range.len;
> > 
> >   to_end = (end_addr == inode->i_size);
> >   if (!IS_ALIGNED(range.start, F2FS_BLKSIZE) ||
> >  --
> >  2.27.0.383.g050319c2ae-goog
> > 
> > 
> > 
> >  ___
> >  Linux-f2fs-devel mailing list
> >  linux-f2fs-de...@lists.sourceforge.net
> >  https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> > >>>
> > >>>
> > >>> ___
> > >>> Linux-f2fs-devel mailing list
> > >>> linux-f2fs-de...@lists.sourceforge.net
> > >>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> > >>> .
> > >>>
> > > .
> > >

Re: [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags

2020-07-09 Thread Ezequiel Garcia

Hello Jonas,

In the context of the uAPI cleanup,
I'm revisiting this patch.

On Sun, 2019-09-01 at 12:45 +, Jonas Karlman wrote:
> Add DPB entry flags to help indicate when a reference frame is a field picture
> and how the DPB entry is referenced, top or bottom field or full frame.
> 
> Signed-off-by: Jonas Karlman 
> ---
>  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 
>  include/media/h264-ctrls.h   |  4 
>  2 files changed, 16 insertions(+)
> 
> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst 
> b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> index bc5dd8e76567..eb6c32668ad7 100644
> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> @@ -2022,6 +2022,18 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>  * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
>- 0x0004
>- The DPB entry is a long term reference frame
> +* - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
> +  - 0x0008
> +  - The DPB entry is a field picture
> +* - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
> +  - 0x0010
> +  - The DPB entry is a top field reference
> +* - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
> +  - 0x0020
> +  - The DPB entry is a bottom field reference
> +* - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
> +  - 0x0030
> +  - The DPB entry is a reference frame
>  
>  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
>  Specifies the decoding mode to use. Currently exposes slice-based and
> diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> index e877bf1d537c..76020ebd1e6c 100644
> --- a/include/media/h264-ctrls.h
> +++ b/include/media/h264-ctrls.h
> @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
>  #define V4L2_H264_DPB_ENTRY_FLAG_VALID   0x01
>  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE  0x02
>  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM   0x04
> +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE   0x08
> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP 0x10
> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM  0x20
> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME   0x30
>  

I've been going through the H264 spec and I'm unsure:
are all these flags semantically needed?

For instance, if one of REF_BOTTOM or REF_TOP (or both)
are set, doesn't that indicate it's a field picture?

Or conversely, if neither REF_BOTTOM nor REF_TOP is set,
then it's a frame picture?

Thanks,
Ezequiel
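
To make the question above concrete, here is a minimal sketch of how a driver might decode the reference bits under the encoding in the quoted patch. The flag values are the ones from the patch; enum dpb_ref_kind and both helper functions are hypothetical names introduced only for this illustration.

```c
#include <stdint.h>

/* Flag values as listed in the quoted patch. */
#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30

enum dpb_ref_kind { DPB_REF_NONE, DPB_REF_TOP, DPB_REF_BOTTOM, DPB_REF_FRAME };

/* REF_FRAME is REF_TOP | REF_BOTTOM, so mask first and then compare. */
static enum dpb_ref_kind dpb_entry_ref_kind(uint32_t flags)
{
	switch (flags & V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME) {
	case V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME:
		return DPB_REF_FRAME;
	case V4L2_H264_DPB_ENTRY_FLAG_REF_TOP:
		return DPB_REF_TOP;
	case V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM:
		return DPB_REF_BOTTOM;
	default:
		return DPB_REF_NONE;
	}
}

/* FIELD_PICTURE is carried in its own bit, independent of the REF_* bits. */
static int dpb_entry_is_field_picture(uint32_t flags)
{
	return (flags & V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE) != 0;
}
```

With this encoding, having both REF_TOP and REF_BOTTOM set is indistinguishable from REF_FRAME, so the REF_* bits alone cannot say whether the entry was coded as a field picture; whether that justifies keeping FIELD_PICTURE as a separate flag is the design question raised above.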

drivers/clk/clk-hsdk-pll.c:407:24: sparse: expected void

2020-07-09 Thread kernel test robot

Hi Stephen,

First bad commit (maybe != root cause):

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   0bddd227f3dc55975e2b8dfa7fc6f959b062a2c7
commit: bbd7ffdbef6888459f301c5889f3b14ada38b913 clk: Allow the common clk framework to be selectable
date:   9 weeks ago
config: openrisc-randconfig-s031-20200710 (attached as .config)
compiler: or1k-linux-gcc (GCC) 9.3.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # apt-get install sparse
        # sparse version: v0.6.2-37-gc9676a3b-dirty
        git checkout bbd7ffdbef6888459f301c5889f3b14ada38b913
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=openrisc

If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot 


sparse warnings: (new ones prefixed by >>)

   drivers/clk/clk-hsdk-pll.c:407:24: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected void *addr @@ got void [noderef]  *spec_regs @@
>> drivers/clk/clk-hsdk-pll.c:407:24: sparse: expected void *addr
   drivers/clk/clk-hsdk-pll.c:407:24: sparse: got void [noderef]  *spec_regs
   drivers/clk/clk-hsdk-pll.c:409:24: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected void *addr @@ got void [noderef]  *regs @@
   drivers/clk/clk-hsdk-pll.c:409:24: sparse: expected void *addr
   drivers/clk/clk-hsdk-pll.c:409:24: sparse: got void [noderef]  *regs

vim +407 drivers/clk/clk-hsdk-pll.c

daeeb438c052e3 Eugeniy Paltsev 2017-08-25  353  
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  354  static void __init of_hsdk_pll_clk_setup(struct device_node *node)
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  355  {
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  356  int ret;
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  357  const char *parent_name;
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  358  unsigned int num_parents;
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  359  struct hsdk_pll_clk *pll_clk;
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  360  struct clk_init_data init = { };
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  361  
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  362  pll_clk = kzalloc(sizeof(*pll_clk), GFP_KERNEL);
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  363  if (!pll_clk)
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  364  return;
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  365  
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  366  pll_clk->regs = of_iomap(node, 0);
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  367  if (!pll_clk->regs) {
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  368  pr_err("failed to map pll registers\n");
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  369  goto err_free_pll_clk;
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  370  }
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  371  
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  372  pll_clk->spec_regs = of_iomap(node, 1);
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  373  if (!pll_clk->spec_regs) {
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  374  pr_err("failed to map pll registers\n");
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  375  goto err_unmap_comm_regs;
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  376  }
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  377  
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  378  init.name = node->name;
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  379  init.ops = _pll_ops;
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  380  parent_name = of_clk_get_parent_name(node, 0);
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  381  init.parent_names = &parent_name;
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  382  num_parents = of_clk_get_parent_count(node);
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  383  if (num_parents > CGU_PLL_SOURCE_MAX) {
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  384  pr_err("too much clock parents: %u\n", num_parents);
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  385  goto err_unmap_spec_regs;
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  386  }
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  387  init.num_parents = num_parents;
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  388  
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  389  pll_clk->hw.init = &init;
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  390  pll_clk->pll_devdata = _pll_devdata;
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  391  
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  392  ret = clk_hw_register(NULL, &pll_clk->hw);
daeeb438c052e3 Eugeniy Paltsev 2017-08-25  393  if (ret) {

Re: WARNING: at mm/mremap.c:211 move_page_tables in i386

2020-07-09 Thread Naresh Kamboju

On Thu, 9 Jul 2020 at 13:55, Arnd Bergmann  wrote:
>
> On Thu, Jul 9, 2020 at 7:28 AM Naresh Kamboju wrote:
> >
> > While running LTP mm test suite on i386 or qemu_i386 this kernel warning
> > has been noticed from stable 5.4 to stable 5.7 branches and mainline
> > 5.8.0-rc4 and linux next.
>
> Are you able to correlate this with any particular test case in LTP, or does
> it happen for random processes?
>
> In the log you linked to, it happens once for ksm05.c and multiple times for
> thp01.c, sources here:
>
> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/ksm/ksm05.c
> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/thp/thp01.c
>
> Is it always these tests that trigger the warning, or sometimes others?

These two test cases are causing this warning multiple times on i386.

>
> When you say it happens with linux-5.4 stable, does that mean you don't see
> it with older versions? What is the last known working version?

I do not see it on stable-4.19 or older versions.
Sorry, I do not have a last known working commit id or version.

The warning first appeared in the first stable-rc 5.0 release.
I have evidence [1] showing it on 5.0.1.

>
> I also see that you give the virtual machine 16GB of RAM, but as you are
> running a 32-bit kernel without PAE, only 2.3GB end up being available,
> and some other LTP tests in the log run out of memory.
>
> You could check if the behavior changes if you give the kernel less memory,
> e.g. 768MB (lowmem only), or enable CONFIG_X86_PAE to let it use the
> entire 16GB.

Warning is still happening after enabling PAE config.
But the oom-killer messages are gone. Thank you.

CONFIG_HIGHMEM=y
CONFIG_X86_PAE=y

Full test log (the oom-killer messages are gone, but the kernel warning is still there):
https://lkft.validation.linaro.org/scheduler/job/1552606#L10357

build location:
https://builds.tuxbuild.com/puilcMcGVwzFMN5fDUhY4g/

[1] 
https://qa-reports.linaro.org/lkft/linux-stable-rc-5.0-oe/build/v5.0.1/testrun/1324990/suite/ltp-mm-tests/test/ksm02/log

---
[  775.646689] WARNING: CPU: 3 PID: 10858 at /srv/oe/build/tmp-lkft-glibc/work-shared/intel-core2-32/kernel-source/mm/mremap.c:211 move_page_tables+0x553/0x570
[  775.647006] Modules linked in: fuse
[  775.647006] CPU: 3 PID: 10858 Comm: true Not tainted 5.0.1 #1
[  775.647006] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[  775.647006] EIP: move_page_tables+0x553/0x570

- Naresh

Re: [PATCH v2] spi: spi-geni-qcom: Set the clock properly at runtime resume

2020-07-09 Thread Rajendra Nayak




On 7/9/2020 8:10 PM, Douglas Anderson wrote:

In the patch ("spi: spi-geni-qcom: Avoid clock setting if not needed")
we avoid a whole pile of clock code.  As part of that, we should have
restored the clock at runtime resume.  Do that.

It turns out that, at least with today's configurations, this doesn't
actually matter.  That's because none of the current device trees have
an OPP table for geni SPI yet.  That makes dev_pm_opp_set_rate(dev, 0)
a no-op.  This is why it wasn't noticed in the testing of the original
patch.  It's still a good idea to fix, though.

Signed-off-by: Douglas Anderson 
Acked-by: Mark Brown 


Reviewed-by: Rajendra Nayak 


---
Sending this as a separate patch even though I think the patch it's
fixing [1] hasn't landed yet.  I'd be happy if this was squashed into
that patch when landing if that suits everyone, but it could land on
its own too.

Like the patch it's fixing, this needs to target the Qualcomm tree in
order to avoid merge conflicts.

[1] 
https://lore.kernel.org/r/20200701174506.1.Icfdcee14649fc0a6c38e87477b28523d4e60bab3@changeid

Changes in v2:
- Return error from runtime resume if dev_pm_opp_set_rate() fails.

  drivers/spi/spi-geni-qcom.c | 10 +-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
index 97fac5ea6afd..0e11a90490ff 100644
--- a/drivers/spi/spi-geni-qcom.c
+++ b/drivers/spi/spi-geni-qcom.c
@@ -79,6 +79,7 @@ struct spi_geni_master {
u32 tx_wm;
u32 last_mode;
unsigned long cur_speed_hz;
+   unsigned long cur_sclk_hz;
unsigned int cur_bits_per_word;
unsigned int tx_rem_bytes;
unsigned int rx_rem_bytes;
@@ -116,6 +117,9 @@ static int get_spi_clk_cfg(unsigned int speed_hz,
ret = dev_pm_opp_set_rate(mas->dev, sclk_freq);
if (ret)
dev_err(mas->dev, "dev_pm_opp_set_rate failed %d\n", ret);
+   else
+   mas->cur_sclk_hz = sclk_freq;
+
return ret;
  }
  
@@ -670,7 +674,11 @@ static int __maybe_unused spi_geni_runtime_resume(struct device *dev)

if (ret)
return ret;
  
-	return geni_se_resources_on(&mas->se);
+   ret = geni_se_resources_on(&mas->se);
+   if (ret)
+   return ret;
+
+   return dev_pm_opp_set_rate(mas->dev, mas->cur_sclk_hz);
  }
  
  static int __maybe_unused spi_geni_suspend(struct device *dev)




--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

Re: [PATCH v3 6/9] drm/bridge: ti-sn65dsi86: Use 18-bit DP if we can

2020-07-09 Thread Doug Anderson

Hi,

On Thu, Jul 9, 2020 at 8:43 PM Steev Klimaszewski  wrote:
>
>
> On 7/9/20 10:17 PM, Steev Klimaszewski wrote:
> >
> > On 7/9/20 10:12 PM, Steev Klimaszewski wrote:
> >>
> >> On 7/9/20 9:14 PM, Doug Anderson wrote:
> >>> Hi,
> >>>
> >>> On Thu, Jul 9, 2020 at 6:38 PM Doug Anderson 
> >>> wrote:
>  Hi,
> 
>  On Thu, Jul 9, 2020 at 6:19 PM Steev Klimaszewski
>   wrote:
> > Hi Doug,
> >
> > I've been testing 5.8 and linux-next on the Lenovo Yoga C630, and
> > with this patch applied, there is really bad banding on the display.
> >
> > I'm really bad at explaining it, but you can see the differences
> > in the following:
> >
> > 24bit (pre-5.8) - https://dev.gentoo.org/~steev/files/image0.jpg
> >
> > 18bit (5.8/linux-next) -
> > https://dev.gentoo.org/~steev/files/image1.jpg
>  Presumably this means that your panel is defined improperly? If the
>  panel reports that it's a 6 bits per pixel panel but it's actually an
>  8 bits per pixel panel then you'll run into this problem.
> 
>  I would have to assume you have a bunch of out of tree patches to
>  support your hardware since I don't see any device trees in linux-next
>  (other than cheza) that use this bridge chip.  Otherwise I could try
>  to check and confirm that was the problem.
> >>> Ah, interesting.  Maybe you have the panel:
> >>>
> >>> boe,nv133fhm-n61
> >>>
> >>> As far as I can tell from the datasheet (I have the similar
> >>> boe,nv133fhm-n62) this is a 6bpp panel.  ...but if you feed it 8bpp
> >>> the banding goes away!  Maybe the panel itself knows how to dither???
> >>> ...or maybe the datasheet / edid are wrong and this is actually an
> >>> 8bpp panel.  Seems unlikely...
> >>>
> >>> In any case, one fix is to pick
> >>> ,
> >>>
> >>> though right now that patch is only enabled for sc7180.  Maybe you
> >>> could figure out how to apply it to your hardware?
> >>>
> >>> ...another fix would be to pretend that your panel is 8bpp even though
> >>> it's actually 6bpp.  Ironically if anyone ever tried to configure BPP
> >>> from the EDID they'd go back to 6bpp.  You can read the EDID of your
> >>> panel with this:
> >>>
> >>> bus=$(i2cdetect -l | grep sn65 | sed 's/i2c-\([0-9]*\).*$/\1/')
> >>> i2cdump ${bus} 0x50 i
> >>>
> >>> When I do that and then decode it on the "boe,nv133fhm-n62" panel, I
> >>> find:
> >>>
> >>> 6 bits per primary color channel
> >>>
> >>> -Doug
> >>
> >>
> >> Hi Doug,
> >>
> >> Decoding it does show it to be boe,nv133fhm-n61 - and yeah it does say
> >> it's 6-bit according to panelook's specs for it.
>
>
> I derped again...
>
> root@c630:~# bus=$(i2cdetect -l | grep sn65 | sed 's/i2c-\([0-9]*\).*$/\1/')
> root@c630:~# i2cdump ${bus} 0x50 i > edid
> WARNING! This program can confuse your I2C bus, cause data loss and worse!
> I will probe file /dev/i2c-16, address 0x50, mode i2c block
> Continue? [Y/n]
> root@c630:~# edid-decode edid
> edid-decode (hex):
>
> 00 ff ff ff ff ff ff 00 09 e5 d1 07 00 00 00 00
> 01 1c 01 04 a5 1d 11 78 0a 1d b0 a6 58 54 9e 26
> 0f 50 54 00 00 00 01 01 01 01 01 01 01 01 01 01
> 01 01 01 01 01 01 c0 39 80 18 71 38 28 40 30 20
> 36 00 26 a5 10 00 00 1a 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 1a 00 00 00 fe 00 42
> 4f 45 20 43 51 0a 20 20 20 20 20 20 00 00 00 fe
> 00 4e 56 31 33 33 46 48 4d 2d 4e 36 31 0a 00 9a
>
> 03 26 0a 77 ab 1c 05 71 6f 1d 8c f1 43 ce 6a bb
> fb d3 11 20 39 07 22 6e 65 68 77 70 d3 05 34 73
> 44 21 8b fd f5 6d 11 62 94 2a 7c fa 93 ba 6a 61
> 92 da 15 53 4c 39 eb f7 86 23 97 48 e9 39 09 d2
> 66 02 70 bb e2 77 0f 4a a3 a0 4c 72 6e 5d 47 70
> 43 c2 13 f3 b2 d9 b9 78 02 be 41 82 15 6a 28 dc
> 45 0f 9d eb 0f 2a cc e8 35 8d 34 7f 3e 84 5e a3
> 30 5e 1e 29 0a 48 0c d1 0a c4 08 31 03 a9 3b 29
>
> 
>
> EDID version: 1.4
> Manufacturer: BOE Model 2001 Serial Number 0
> Made in week 1 of 2018
> Digital display
> 8 bits per primary color channel
> DisplayPort interface
> Maximum image size: 29 cm x 17 cm
> Gamma: 2.20
> Supported color formats: RGB 4:4:4, YCrCb 4:4:4
> First detailed timing includes the native pixel format and preferred
> refresh rate
> Color Characteristics
>Red:   0.6484, 0.3447
>Green: 0.3310, 0.6181
>Blue:  0.1503, 0.0615
>White: 0.3125, 0.3281
> Established Timings I & II: none
> Standard Timings: none
> Detailed mode: Clock 147.840 MHz, 294 mm x 165 mm
> 1920 1968 2000 2200 ( 48  32 200)
> 1080 1083 1089 1120 (  3   6  31)
> +hsync -vsync
> VertFreq: 60.000 Hz, HorFreq: 67.200 kHz
> Manufacturer-Specified Display Descriptor (0x00): 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00 1a  
> Alphanumeric Data String: BOE CQ
> Alphanumeric Data String: NV133FHM-N61
> Checksum: 0x9a
>
> 
>
> Unknown EDID Extension Block 0x03
>03 26 0a 77 ab 1c 05 71 6f
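
The "8 bits per primary color channel" and "DisplayPort interface" lines in the decode above come from a single byte of the base block: byte 20, which is 0xa5 in this dump. A minimal sketch of decoding that byte follows, assuming the EDID 1.4 layout of the video input definition (bit 7 set means digital, bits 6..4 encode the color depth, bits 3..0 the interface). The function names are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the bits per primary color encoded in EDID 1.4 byte 20, or 0 if
 * the input is analog or the depth field is undefined/reserved.
 */
static int edid_bits_per_color(uint8_t video_input)
{
	static const int depth[8] = { 0, 6, 8, 10, 12, 14, 16, 0 };

	if (!(video_input & 0x80))	/* bit 7 clear: analog input */
		return 0;
	return depth[(video_input >> 4) & 0x7];
}

/* Return the digital interface code (bits 3..0); 5 means DisplayPort. */
static int edid_interface_code(uint8_t video_input)
{
	assert(video_input & 0x80);	/* only meaningful for digital inputs */
	return video_input & 0x0f;
}
```

For 0xa5 this yields 8 bits per color on a DisplayPort interface, matching the edid-decode output for the nv133fhm-n61; a 6 bpc panel on the same interface would report 0x95.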

[PATCH][v2] intel_idle: Customize IceLake server support

2020-07-09 Thread Chen Yu

On the ICX platform, the CPU frequency ramps up slowly
when the CPU is woken up from C-states deeper than or equal to C1E.
Although this feature does save energy in many cases,
it might also cause unexpected results. For example,
a workload might get unstable performance due to the uncertainty
of the CPU frequency. Besides, the CPU frequency might not be locked
to a specific level when the CPU utilization is low.

Thus this patch disables C1E auto-promotion and exposes
C1E as a separate idle state, so that C1E and C6 can
be disabled via sysfs when necessary.

Besides C1 and C1E, the exit latency of C6 was measured
by a dedicated tool. However, the exit latency (41us) exposed
by _CST is much smaller than the one we measured (128us). This
is probably because _CST uses the exit latency when woken
up from PC0+C6, rather than from PC6+C6, which is what we measured. Choose
the latter as we need the longest latency in theory.
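
To illustrate why the choice between 41us and 128us matters, here is a simplified sketch of how an idle governor might use exit_latency and target_residency to pick a state. It is only loosely modeled on cpuidle governors and is not the kernel's selection code; `pick_cstate()` and `struct cstate` are hypothetical, and the table values are the ones from icx_cstates[] in this patch:

```c
#include <assert.h>
#include <stddef.h>

struct cstate {
	const char *name;
	unsigned int exit_latency_us;
	unsigned int target_residency_us;
};

/* Values copied from the icx_cstates[] table in the patch. */
static const struct cstate icx[] = {
	{ "C1",    1,   1 },
	{ "C1E",   4,   4 },
	{ "C6",  128, 384 },
};

/*
 * Return the index of the deepest state whose exit latency fits the
 * latency limit and whose target residency fits the predicted idle time.
 * Falls back to index 0 (the shallowest state) if nothing else fits.
 */
static size_t pick_cstate(unsigned int predicted_idle_us,
			  unsigned int latency_limit_us)
{
	size_t i, best = 0;

	for (i = 0; i < sizeof(icx) / sizeof(icx[0]); i++) {
		if (icx[i].exit_latency_us > latency_limit_us)
			continue;
		if (icx[i].target_residency_us > predicted_idle_us)
			continue;
		best = i;
	}
	return best;
}
```

With a latency limit of 100us, this sketch stops at C1E because C6 is listed at 128us. Had the table used the 41us from _CST, C6 would have been picked even though the measured worst case exceeds the limit.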

Reported-by: kernel test robot 
Tested-by: Artem Bityutskiy 
Acked-by: Artem Bityutskiy 
Cc: Len Brown 
Cc: Rafael J. Wysocki 
Signed-off-by: Zhang Rui 
Signed-off-by: Chen Yu 
---
v2: Minor commit message refinement for better understanding.
--
 drivers/idle/intel_idle.c | 36 
 1 file changed, 36 insertions(+)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index f4495841bf68..1eab606d858b 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -752,6 +752,35 @@ static struct cpuidle_state skx_cstates[] __initdata = {
.enter = NULL }
 };
 
+static struct cpuidle_state icx_cstates[] __initdata = {
+   {
+   .name = "C1",
+   .desc = "MWAIT 0x00",
+   .flags = MWAIT2flg(0x00),
+   .exit_latency = 1,
+   .target_residency = 1,
+   .enter = &intel_idle,
+   .enter_s2idle = intel_idle_s2idle, },
+   {
+   .name = "C1E",
+   .desc = "MWAIT 0x01",
+   .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
+   .exit_latency = 4,
+   .target_residency = 4,
+   .enter = &intel_idle,
+   .enter_s2idle = intel_idle_s2idle, },
+   {
+   .name = "C6",
+   .desc = "MWAIT 0x20",
+   .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+   .exit_latency = 128,
+   .target_residency = 384,
+   .enter = &intel_idle,
+   .enter_s2idle = intel_idle_s2idle, },
+   {
+   .enter = NULL }
+};
+
 static struct cpuidle_state atom_cstates[] __initdata = {
{
.name = "C1E",
@@ -1056,6 +1085,12 @@ static const struct idle_cpu idle_cpu_skx __initconst = {
.use_acpi = true,
 };
 
+static const struct idle_cpu idle_cpu_icx __initconst = {
+   .state_table = icx_cstates,
+   .disable_promotion_to_c1e = true,
+   .use_acpi = true,
+};
+
 static const struct idle_cpu idle_cpu_avn __initconst = {
.state_table = avn_cstates,
.disable_promotion_to_c1e = true,
@@ -1110,6 +1145,7 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = {
X86_MATCH_INTEL_FAM6_MODEL(KABYLAKE_L,  &idle_cpu_skl),
X86_MATCH_INTEL_FAM6_MODEL(KABYLAKE,    &idle_cpu_skl),
X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X,   &idle_cpu_skx),
+   X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X,   &idle_cpu_icx),
X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNL,    &idle_cpu_knl),
X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNM,    &idle_cpu_knl),
X86_MATCH_INTEL_FAM6_MODEL(ATOM_GOLDMONT,   &idle_cpu_bxt),
-- 
2.17.1

mmotm 2020-07-09-21-00 uploaded

2020-07-09 Thread Andrew Morton

The mm-of-the-moment snapshot 2020-07-09-21-00 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (5.x
or 5.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.


A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

https://github.com/hnaz/linux-mm

The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is also available at

https://github.com/hnaz/linux-mm



This mmotm tree contains the following patches against 5.8-rc4:
(patches marked "*" will be included in linux-next)

  origin.patch
* mm-shuffle-dont-move-pages-between-zones-and-dont-read-garbage-memmaps.patch
* mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch
* mm-close-race-between-munmap-and-expand_upwards-downwards.patch
* mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
* vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
* mm-initialize-return-of-vm_insert_pages.patch
* mm-memcontrol-fix-oops-inside-mem_cgroup_get_nr_swap_pages.patch
* proc-kpageflags-prevent-an-integer-overflow-in-stable_page_flags.patch
* proc-kpageflags-do-not-use-uninitialized-struct-pages.patch
* mm-memcg-fix-refcount-error-while-moving-and-swapping.patch
* mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch
* mailmap-add-entry-for-mike-rapoport.patch
* checkpatch-test-git_dir-changes.patch
* kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch
* kbuild-move-wtype-limits-to-w=2.patch
* scripts-tagssh-collect-compiled-source-precisely.patch
* scripts-tagssh-collect-compiled-source-precisely-v2.patch
* bloat-o-meter-support-comparing-library-archives.patch
* scripts-decode_stacktrace-skip-missing-symbols.patch
* scripts-decode_stacktrace-guess-basepath-if-not-specified.patch
* scripts-decode_stacktrace-guess-path-to-modules.patch
* scripts-decode_stacktrace-guess-path-to-vmlinux-by-release-name.patch
* ocfs2-clear-links-count-in-ocfs2_mknod-if-an-error-occurs.patch
* ocfs2-fix-ocfs2-corrupt-when-iputting-an-inode.patch
* ocfs2-change-slot-number-type-s16-to-u16.patch
* ramfs-support-o_tmpfile.patch
* kernel-watchdog-flush-all-printk-nmi-buffers-when-hardlockup-detected.patch
  mm.patch
* mm-treewide-rename-kzfree-to-kfree_sensitive.patch
* mm-ksize-should-silently-accept-a-null-pointer.patch
* mm-expand-config_slab_freelist_hardened-to-include-slab.patch
* slab-add-naive-detection-of-double-free.patch
* slab-add-naive-detection-of-double-free-fix.patch
* mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks-fix.patch
* mm-slub-make-some-slub_debug-related-attributes-read-only.patch
* mm-slub-remove-runtime-allocation-order-changes.patch
* mm-slub-make-remaining-slub_debug-related-attributes-read-only.patch
* mm-slub-make-reclaim_account-attribute-read-only.patch
* mm-slub-introduce-static-key-for-slub_debug.patch
* mm-slub-introduce-kmem_cache_debug_flags.patch
* mm-slub-introduce-kmem_cache_debug_flags-fix.patch
* mm-slub-extend-checks-guarded-by-slub_debug-static-key.patch
* mm-slab-slub-move-and-improve-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj-fix.patch
* slub-drop-lockdep_assert_held-from-put_map.patch
* mm-kcsan-instrument-slab-slub-free-with-assert_exclusive_access.patch
* mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
* mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
* mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
* documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
* mm-handle-page-mapping-better-in-dump_page.patch
* mm-dump-compound-page-information-on-a-second-line.patch
*

Re: [PATCH RFC v8 02/11] vhost: use batched get_vq_desc version

2020-07-09 Thread Jason Wang




On 2020/7/10 上午1:37, Michael S. Tsirkin wrote:

On Thu, Jul 09, 2020 at 06:46:13PM +0200, Eugenio Perez Martin wrote:

On Wed, Jul 1, 2020 at 4:10 PM Jason Wang  wrote:


On 2020/7/1 下午9:04, Eugenio Perez Martin wrote:

On Wed, Jul 1, 2020 at 2:40 PM Jason Wang  wrote:

On 2020/7/1 下午6:43, Eugenio Perez Martin wrote:

On Tue, Jun 23, 2020 at 6:15 PM Eugenio Perez Martin
 wrote:

On Mon, Jun 22, 2020 at 6:29 PM Michael S. Tsirkin  wrote:

On Mon, Jun 22, 2020 at 06:11:21PM +0200, Eugenio Perez Martin wrote:

On Mon, Jun 22, 2020 at 5:55 PM Michael S. Tsirkin  wrote:

On Fri, Jun 19, 2020 at 08:07:57PM +0200, Eugenio Perez Martin wrote:

On Mon, Jun 15, 2020 at 2:28 PM Eugenio Perez Martin
 wrote:

On Thu, Jun 11, 2020 at 5:22 PM Konrad Rzeszutek Wilk
 wrote:

On Thu, Jun 11, 2020 at 07:34:19AM -0400, Michael S. Tsirkin wrote:

As testing shows no performance change, switch to that now.

What kind of testing? 100GiB? Low latency?


Hi Konrad.

I tested this version of the patch:
https://lkml.org/lkml/2019/10/13/42

It was tested for throughput with DPDK's testpmd (as described in
http://doc.dpdk.org/guides/howto/virtio_user_as_exceptional_path.html)
and kernel pktgen. No latency tests were performed by me. Maybe it is
interesting to perform a latency test or just a different set of tests
over a recent version.

Thanks!

I have repeated the tests with v9, and results are a little bit different:
* If I test opening it with testpmd, I see no change between versions

OK that is testpmd on guest, right? And vhost-net on the host?


Hi Michael.

No, sorry, as described in
http://doc.dpdk.org/guides/howto/virtio_user_as_exceptional_path.html.
But I could add to test it in the guest too.

These kinds of raw packets "bursts" do not show performance
differences, but I could test deeper if you think it would be worth
it.

Oh ok, so this is without guest, with virtio-user.
It might be worth checking dpdk within guest too just
as another data point.


Ok, I will do it!


* If I forward packets between two vhost-net interfaces in the guest
using a linux bridge in the host:

And here I guess you mean virtio-net in the guest kernel?

Yes, sorry: Two virtio-net interfaces connected with a linux bridge in
the host. More precisely:
* Adding one of the interfaces to another namespace, assigning it an
IP, and starting netserver there.
* Assign another IP in the range manually to the other virtual net
interface, and start the desired test there.

If you think it would be better to perform them differently, please let me know.

Not sure why you bother with namespaces since you said you are
using L2 bridging. I guess it's unimportant.


Sorry, I think I should have provided more context about that.

The only reason to use namespaces is to force the traffic of these
netperf tests to go through the external bridge, so that netperf
covers different scenarios than testpmd (or pktgen, or other "blast
frames unconditionally" tests).

This way, I make sure that the same version of everything runs in the
guest, and it is a little bit easier to manage CPU affinity, start and
stop testing...

I could use a different VM for sending and receiving, but I find this
way a faster one and it should not introduce a lot of noise. I can
test with two VM if you think that this use of network namespace
introduces too much noise.

Thanks!


 - netperf UDP_STREAM shows a performance increase of 1.8x, almost
doubling performance. This gets lower as frame size increases.

Regarding UDP_STREAM:
* with event_idx=on: The performance difference is reduced a lot if
affinity is applied properly (manually assigning CPUs on host/guest and
setting IRQs on guest), making them perform equally with and without
the patch again. Maybe the batching makes the scheduler perform
better.

Note that for UDP_STREAM, the result is pretty tricky to analyze. E.g.,
setting a sndbuf for TAP may help performance (by reducing drops).


Ok, will add that to the test. Thanks!


Actually, it's better to skip the UDP_STREAM test since:

- My understanding is that very few applications use a raw UDP stream
- It's hard to analyze (usually you need to count the drop ratio etc)



 - the rest of the tests go noticeably worse: UDP_RR goes from ~6347
transactions/sec to 5830

* Regarding UDP_RR, TCP_STREAM, and TCP_RR, proper CPU pinning makes
them perform similarly again, only a very small performance drop
observed. It could be just noise.
** All of them perform better than vanilla if event_idx=off, not sure
why. I can try to repeat them if you suspect that can be a test
failure.

* With testpmd and event_idx=off, if I send from the VM to host, I see
a performance increment especially in small packets. The buf api also
increases performance compared with only batching: Sending the minimum
packet size in testpmd makes pps go from 356kpps to 473 kpps.

What's your setup for this? The number looks rather low. I'd expect
1-2 Mpps at least.


Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz, 2

[PATCH v2] mm/hugetlb: split hugetlb_cma in nodes with memory

2020-07-09 Thread Barry Song

Rather than splitting hugetlb_cma across online nodes, it is better to split
it across nodes with memory.
Consider an ARM64 server with four NUMA nodes where only node0 has memory,
and hugetlb_cma=4G is set in bootargs.

Without this patch, I got the below printk:
hugetlb_cma: reserve 4096 MiB, up to 1024 MiB per node
hugetlb_cma: reserved 1024 MiB on node 0
hugetlb_cma: reservation failed: err -12, node 1
hugetlb_cma: reservation failed: err -12, node 2
hugetlb_cma: reservation failed: err -12, node 3

With this patch, I got the below printk:

hugetlb_cma: reserve 4096 MiB, up to 4096 MiB per node
hugetlb_cma: reserved 4096 MiB on node 0

So this patch makes the hugetlb_cma size consistent with users' setting
on ARM64 platforms.
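
The numbers in the two logs follow from simple arithmetic. Below is a small sketch of it, assuming the per-node limit is the requested size divided (rounding up) by the number of nodes it is split across, and that only nodes with memory can actually satisfy a reservation. The helper names are hypothetical and this is not the hugetlb_cma_reserve() code itself:

```c
#include <assert.h>

/* Round-up division, in the spirit of the kernel's DIV_ROUND_UP(). */
static unsigned long div_round_up(unsigned long n, unsigned long d)
{
	assert(d != 0);
	return (n + d - 1) / d;
}

/* Per-node limit when total_mib is split across nr_nodes nodes. */
static unsigned long cma_per_node_mib(unsigned long total_mib,
				      unsigned long nr_nodes)
{
	return div_round_up(total_mib, nr_nodes);
}

/*
 * Total that ends up reserved when the size is split across nr_split
 * nodes but only nr_with_memory of them can satisfy the reservation.
 */
static unsigned long cma_reserved_mib(unsigned long total_mib,
				      unsigned long nr_split,
				      unsigned long nr_with_memory)
{
	unsigned long per_node = cma_per_node_mib(total_mib, nr_split);
	unsigned long reserved = 0, node;

	for (node = 0; node < nr_with_memory && reserved < total_mib; node++) {
		unsigned long left = total_mib - reserved;

		reserved += left < per_node ? left : per_node;
	}
	return reserved;
}
```

Splitting 4096 MiB across four online nodes gives 1024 MiB per node, and with memory on node 0 only, just 1024 MiB is reserved. Splitting across the single node with memory reserves the full 4096 MiB.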

Jonathan Cameron tested this patch on an x86 platform. Jonathan figured out
that the boot code of x86 is quite different from arm64. On arm64 all nodes
are marked online at the same time. On x86, only nodes with memory are
initially marked as online:
initmem_init()->x86_numa_init()->numa_init()->
numa_register_memblks()->alloc_node_data()->node_set_online()
So at the time of the existing cma setup call only the memory-containing nodes
are online. The other nodes are brought up much later.
Therefore, on x86, the hugetlb_cma size is actually consistent with the
user's setting even though the system has nodes without memory.

The problem is always there if N_ONLINE != N_MEMORY. In the x86 case, it
is just hidden because N_ONLINE happens to match N_MEMORY during the boot
process when hugetlb_cma_reserve() gets called.

This patch documents this problem in the comment of hugetlb_cma_reserve()
and makes hugetlb_cma size optimal.

Cc: Roman Gushchin 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: H. Peter Anvin 
Cc: Mike Kravetz 
Cc: Mike Rapoport 
Cc: Andrew Morton 
Cc: Anshuman Khandual 
Cc: Jonathan Cameron 
Signed-off-by: Barry Song 
---
 -v2: document better according to Anshuman Khandual's comment

 arch/arm64/mm/init.c| 19 ++-
 arch/x86/kernel/setup.c | 12 +---
 include/linux/hugetlb.h |  7 +++
 mm/hugetlb.c|  4 ++--
 4 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 1e93cfc7c47a..420f5e55615c 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -420,15 +420,6 @@ void __init bootmem_init(void)
 
arm64_numa_init();
 
-   /*
-* must be done after arm64_numa_init() which calls numa_init() to
-* initialize node_online_map that gets used in hugetlb_cma_reserve()
-* while allocating required CMA size across online nodes.
-*/
-#ifdef CONFIG_ARM64_4K_PAGES
-   hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
-#endif
-
/*
 * Sparsemem tries to allocate bootmem in memory_present(), so must be
 * done after the fixed reservations.
@@ -438,6 +429,16 @@ void __init bootmem_init(void)
sparse_init();
zone_sizes_init(min, max);
 
+   /*
+* must be done after zone_sizes_init() which calls free_area_init()
+* that calls node_set_state() to initialize node_states[N_MEMORY]
+* because hugetlb_cma_reserve() will scan over nodes with N_MEMORY
+* state
+*/
+#ifdef CONFIG_ARM64_4K_PAGES
+   hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
+#endif
+
memblock_dump_all();
 }
 
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index a3767e74c758..a1a9712090ae 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1164,9 +1164,6 @@ void __init setup_arch(char **cmdline_p)
initmem_init();
dma_contiguous_reserve(max_pfn_mapped << PAGE_SHIFT);
 
-   if (boot_cpu_has(X86_FEATURE_GBPAGES))
-   hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
-
/*
 * Reserve memory for crash kernel after SRAT is parsed so that it
 * won't consume hotpluggable memory.
@@ -1180,6 +1177,15 @@ void __init setup_arch(char **cmdline_p)
 
x86_init.paging.pagetable_init();
 
+   /*
+* must be done after zone_sizes_init() which calls free_area_init()
+* that calls node_set_state() to initialize node_states[N_MEMORY]
+* because hugetlb_cma_reserve() will scan over nodes with N_MEMORY
+* state
+*/
+   if (boot_cpu_has(X86_FEATURE_GBPAGES))
+   hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
+
kasan_init();
 
/*
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 50650d0d01b9..6df411d91040 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -909,6 +909,13 @@ static inline spinlock_t *huge_pte_lock(struct hstate *h,
 }
 
 #if defined(CONFIG_HUGETLB_PAGE) && defined(CONFIG_CMA)
+/**
+ * hugetlb_cma_reserve() -- reserve CMA for gigantic pages on nodes with memory
+ *
+ * must be called after free_area_init() that updates N_MEMORY via node_set_state().
+ *

Re: [f2fs-dev] [PATCH] f2fs: change the way of handling range.len in F2FS_IOC_SEC_TRIM_FILE

2020-07-09 Thread Jaegeuk Kim

On 07/10, Chao Yu wrote:
> On 2020/7/10 11:31, Jaegeuk Kim wrote:
> > On 07/10, Chao Yu wrote:
> >> On 2020/7/10 11:02, Jaegeuk Kim wrote:
> >>> On 07/10, Daeho Jeong wrote:
>  From: Daeho Jeong 
> 
>  Changed the way of handling range.len of F2FS_IOC_SEC_TRIM_FILE.
>   1. Added -1 value support for range.len to signify the end of file.
>   2. If the end of the range passes over the end of file, it means until
>  the end of file.
>   3. ignored the case of that range.len is zero to prevent the function
>  from making end_addr zero and triggering different behaviour of
>  the function.
> 
>  Signed-off-by: Daeho Jeong 
>  ---
>   fs/f2fs/file.c | 16 +++-
>   1 file changed, 7 insertions(+), 9 deletions(-)
> 
>  diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>  index 368c80f8e2a1..1c4601f99326 100644
>  --- a/fs/f2fs/file.c
>  +++ b/fs/f2fs/file.c
>  @@ -3813,21 +3813,19 @@ static int f2fs_sec_trim_file(struct file *filp, unsigned long arg)
>   file_start_write(filp);
>   inode_lock(inode);
>   
>  -if (f2fs_is_atomic_file(inode) || f2fs_compressed_file(inode)) {
>  +if (f2fs_is_atomic_file(inode) || f2fs_compressed_file(inode) ||
>  +range.start >= inode->i_size) {
>   ret = -EINVAL;
>   goto err;
>   }
>   
>  -if (range.start >= inode->i_size) {
>  -ret = -EINVAL;
>  +if (range.len == 0)
>   goto err;
>  -}
>   
>  -if (inode->i_size - range.start < range.len) {
>  -ret = -E2BIG;
>  -goto err;
>  -}
>  -end_addr = range.start + range.len;
>  +if (range.len == (u64)-1 || inode->i_size - range.start < range.len)
>  +end_addr = inode->i_size;
> >>
> >> We can remove the 'range.len == (u64)-1' condition since the later
> >> condition covers this?
> >>
> >>>
> >>> Hmm, what if there are blocks beyond i_size? Do we need to check i_blocks 
> >>> for
> >>
> >> The blocks beyond i_size will never be written, there won't be any valid
> >> message there, so we don't need to worry about that.
> > 
> > I don't think we have a way to guarantee the order of i_size and block
> > allocation in f2fs. See f2fs_write_begin and f2fs_write_end.
> 
> However, write_begin & write_end are covered by inode_lock, so they cannot
> race with the inode size check in f2fs_sec_trim_file(), as it holds
> inode_lock as well?

Like Daeho said, write_begin -> checkpoint -> power-cut can give bigger i_blocks
than i_size.

> 
> > 
> >>
> >> Thanks,
> >>
> >>> ending criteria?
> >>>
>  +else
>  +end_addr = range.start + range.len;
>   
>   to_end = (end_addr == inode->i_size);
>   if (!IS_ALIGNED(range.start, F2FS_BLKSIZE) ||
>  -- 
>  2.27.0.383.g050319c2ae-goog

Re: [f2fs-dev] [PATCH] f2fs: don't skip writeback of quota data

2020-07-09 Thread Jaegeuk Kim

On 07/10, Chao Yu wrote:
> On 2020/7/10 11:26, Jaegeuk Kim wrote:
> > On 07/10, Chao Yu wrote:
> >> On 2020/7/10 3:05, Jaegeuk Kim wrote:
> >>> On 07/09, Chao Yu wrote:
>  On 2020/7/9 13:30, Jaegeuk Kim wrote:
> > It doesn't need to bypass flushing quota data in background.
> 
>  The condition is used to flush quota data in batch to avoid random
>  small-sized update, did you hit any problem here?
> >>>
> >>> I suspect this causes the fault injection test to get stuck waiting for
> >>> inode writeback completion. With this patch, it has been running without
> >>> any issue so far. I'll keep an eye on this.
> >>
> >> Hmmm.. so this patch may not fix the root cause, and it may be hiding the
> >> issue deeper.
> >>
> >> How about just keeping this patch in our private branch so that the fault
> >> injection test does not get stuck, until we find the root cause in the
> >> upstream code?
> > 
> > Well, I don't think this hides anything. When the issue happened, I saw
> > inodes stuck in writeback while only quota had some dirty data. At that
> > time, there were no dirty data pages from other inodes.
> 
> Okay,
> 
> > 
> > More specifically, I suspect __writeback_inodes_sb_nr() gives WB_SYNC_NONE
> > and waits for wb_wait_for_completion().
> 
> Did you record any callstack after the issue happened?

I found this.

[213389.297642]  __schedule+0x2dd/0x780
[213389.299224]  schedule+0x55/0xc0
[213389.300745]  wb_wait_for_completion+0x56/0x90
[213389.302469]  ? wait_woken+0x80/0x80
[213389.303997]  __writeback_inodes_sb_nr+0xa8/0xd0
[213389.305760]  writeback_inodes_sb+0x4b/0x60
[213389.307439]  sync_filesystem+0x2e/0xa0
[213389.308999]  generic_shutdown_super+0x27/0x110
[213389.310738]  kill_block_super+0x27/0x50
[213389.312327]  kill_f2fs_super+0x76/0xe0 [f2fs]
[213389.314014]  deactivate_locked_super+0x3b/0x80
[213389.315692]  deactivate_super+0x3e/0x50
[213389.317226]  cleanup_mnt+0x109/0x160
[213389.318718]  __cleanup_mnt+0x12/0x20
[213389.320177]  task_work_run+0x70/0xb0
[213389.321609]  exit_to_usermode_loop+0x131/0x160
[213389.323306]  do_syscall_64+0x170/0x1b0
[213389.324762]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[213389.326477] RIP: 0033:0x7fc4b5e6a35b

> 
> Still, I'm confused about why writing a directory's data can be skipped but
> quota's data cannot. What's the difference?

I suspect the blocking timing after cp_error differs between quota and dentry,
e.g., we block dir operations right after cp_error, while quota can dirty
pages at a finer granularity.

> 
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>> Thanks,
> >>>
> 
>  Thanks,
> 
> >
> > Signed-off-by: Jaegeuk Kim 
> > ---
> >  fs/f2fs/data.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > index 44645f4f914b6..72e8b50e588c1 100644
> > --- a/fs/f2fs/data.c
> > +++ b/fs/f2fs/data.c
> > @@ -3148,7 +3148,7 @@ static int __f2fs_write_data_pages(struct 
> > address_space *mapping,
> > if (unlikely(is_sbi_flag_set(sbi, SBI_POR_DOING)))
> > goto skip_write;
> >  
> > -   if ((S_ISDIR(inode->i_mode) || IS_NOQUOTA(inode)) &&
> > +   if (S_ISDIR(inode->i_mode) &&
> > wbc->sync_mode == WB_SYNC_NONE &&
> > > get_dirty_pages(inode) < nr_pages_to_skip(sbi, DATA) &&
> > f2fs_available_free_memory(sbi, DIRTY_DENTS))
> >
> >>> .
> >>>
> > .
> >
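
The effect of the one-line change quoted above can be seen by writing the skip condition as a predicate before and after the patch. The sketch below flattens the kernel expressions into plain fields of a hypothetical `struct wb_inputs`. It mirrors only the condition shown in the diff, not the rest of __f2fs_write_data_pages():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattened view of the inputs to the skip check. */
struct wb_inputs {
	bool is_dir;			/* S_ISDIR(inode->i_mode) */
	bool is_quota;			/* IS_NOQUOTA(inode) */
	bool sync_none;			/* wbc->sync_mode == WB_SYNC_NONE */
	unsigned int dirty_pages;	/* get_dirty_pages(inode) */
	unsigned int skip_threshold;	/* nr_pages_to_skip(sbi, DATA) */
	bool mem_available;		/* f2fs_available_free_memory(sbi, DIRTY_DENTS) */
};

/* The part of the condition that the patch leaves untouched. */
static bool batching_applies(const struct wb_inputs *in)
{
	assert(in != NULL);
	return in->sync_none &&
	       in->dirty_pages < in->skip_threshold &&
	       in->mem_available;
}

/* Before the patch: directories and quota files may both be skipped. */
static bool skip_write_before(const struct wb_inputs *in)
{
	return (in->is_dir || in->is_quota) && batching_applies(in);
}

/* After the patch: only directories may be skipped. */
static bool skip_write_after(const struct wb_inputs *in)
{
	return in->is_dir && batching_applies(in);
}
```

A quota file with a handful of dirty pages under WB_SYNC_NONE is skipped by the old predicate and written by the new one, which is the case suspected of leaving the unmount path waiting in wb_wait_for_completion().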

[PATCH] MAINTAINERS: Add entry for Broadcom BDC driver

2020-07-09 Thread Florian Fainelli

The Broadcom BDC driver did not have a MAINTAINERS entry which made it
escape review from Al and myself, add an entry so the relevant mailing
lists and people are copied.

Signed-off-by: Florian Fainelli 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 1d4aa7f942de..360d001b81b8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3434,6 +3434,14 @@ F:   drivers/bus/brcmstb_gisb.c
 F: drivers/pci/controller/pcie-brcmstb.c
 N: brcmstb
 
+BROADCOM BDC DRIVER
+M: Al Cooper 
+L: linux-...@vger.kernel.org
+L: bcm-kernel-feedback-l...@broadcom.com
+S: Maintained
+F: Documentation/devicetree/bindings/usb/brcm,bdc.txt
+F: drivers/usb/gadget/udc/bdc/
+
 BROADCOM BMIPS CPUFREQ DRIVER
 M: Markus Mayer 
 M: bcm-kernel-feedback-l...@broadcom.com
-- 
2.17.1

Re: [PATCH] Documentation/security-bugs: Explain why plain text is preferred

2020-07-09 Thread Willy Tarreau

On Thu, Jul 09, 2020 at 09:42:56PM +0100, Will Deacon wrote:
> Acked-by: Will Deacon 
> 
> Hopefully "plain text" implies unencrypted as much as it does "not html".

I would have liked "(i.e. not html)" to be added after "plain text", but
I figured that those who do that often don't even know what this means
so that will probably not help them avoid their messages being stored
into a spambox :-/

Acked-by: Willy Tarreau 

Willy

Re: [PATCH v3 6/9] drm/bridge: ti-sn65dsi86: Use 18-bit DP if we can

2020-07-09 Thread Steev Klimaszewski




On 7/9/20 10:17 PM, Steev Klimaszewski wrote:


On 7/9/20 10:12 PM, Steev Klimaszewski wrote:


On 7/9/20 9:14 PM, Doug Anderson wrote:

Hi,

On Thu, Jul 9, 2020 at 6:38 PM Doug Anderson  
wrote:

Hi,

On Thu, Jul 9, 2020 at 6:19 PM Steev Klimaszewski 
 wrote:

Hi Doug,

I've been testing 5.8 and linux-next on the Lenovo Yoga C630, and 
with this patch applied, there is really bad banding on the display.


I'm really bad at explaining it, but you can see the differences 
in the following:


24bit (pre-5.8) - https://dev.gentoo.org/~steev/files/image0.jpg

18bit (5.8/linux-next) - 
https://dev.gentoo.org/~steev/files/image1.jpg

Presumably this means that your panel is defined improperly? If the
panel reports that it's a 6 bits per pixel panel but it's actually an
8 bits per pixel panel then you'll run into this problem.

I would have to assume you have a bunch of out of tree patches to
support your hardware since I don't see any device trees in linux-next
(other than cheza) that use this bridge chip.  Otherwise I could try
to check and confirm that was the problem.

Ah, interesting.  Maybe you have the panel:

boe,nv133fhm-n61

As far as I can tell from the datasheet (I have the similar
boe,nv133fhm-n62) this is a 6bpp panel.  ...but if you feed it 8bpp
the banding goes away!  Maybe the panel itself knows how to dither???
...or maybe the datasheet / edid are wrong and this is actually an
8bpp panel.  Seems unlikely...

In any case, one fix is to pick
, 


though right now that patch is only enabled for sc7180.  Maybe you
could figure out how to apply it to your hardware?

...another fix would be to pretend that your panel is 8bpp even though
it's actually 6bpp.  Ironically if anyone ever tried to configure BPP
from the EDID they'd go back to 6bpp.  You can read the EDID of your
panel with this:

bus=$(i2cdetect -l | grep sn65 | sed 's/i2c-\([0-9]*\).*$/\1/')
i2cdump ${bus} 0x50 i

When I do that and then decode it on the "boe,nv133fhm-n62" panel, I 
find:


6 bits per primary color channel

-Doug



Hi Doug,

Decoding it does show it to be boe,nv133fhm-n61 - and yeah it does say 
it's 6-bit according to panelook's specs for it.



I derped again...

root@c630:~# bus=$(i2cdetect -l | grep sn65 | sed 's/i2c-\([0-9]*\).*$/\1/')
root@c630:~# i2cdump ${bus} 0x50 i > edid
WARNING! This program can confuse your I2C bus, cause data loss and worse!
I will probe file /dev/i2c-16, address 0x50, mode i2c block
Continue? [Y/n]
root@c630:~# edid-decode edid
edid-decode (hex):

00 ff ff ff ff ff ff 00 09 e5 d1 07 00 00 00 00
01 1c 01 04 a5 1d 11 78 0a 1d b0 a6 58 54 9e 26
0f 50 54 00 00 00 01 01 01 01 01 01 01 01 01 01
01 01 01 01 01 01 c0 39 80 18 71 38 28 40 30 20
36 00 26 a5 10 00 00 1a 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 1a 00 00 00 fe 00 42
4f 45 20 43 51 0a 20 20 20 20 20 20 00 00 00 fe
00 4e 56 31 33 33 46 48 4d 2d 4e 36 31 0a 00 9a

03 26 0a 77 ab 1c 05 71 6f 1d 8c f1 43 ce 6a bb
fb d3 11 20 39 07 22 6e 65 68 77 70 d3 05 34 73
44 21 8b fd f5 6d 11 62 94 2a 7c fa 93 ba 6a 61
92 da 15 53 4c 39 eb f7 86 23 97 48 e9 39 09 d2
66 02 70 bb e2 77 0f 4a a3 a0 4c 72 6e 5d 47 70
43 c2 13 f3 b2 d9 b9 78 02 be 41 82 15 6a 28 dc
45 0f 9d eb 0f 2a cc e8 35 8d 34 7f 3e 84 5e a3
30 5e 1e 29 0a 48 0c d1 0a c4 08 31 03 a9 3b 29



EDID version: 1.4
Manufacturer: BOE Model 2001 Serial Number 0
Made in week 1 of 2018
Digital display
8 bits per primary color channel
DisplayPort interface
Maximum image size: 29 cm x 17 cm
Gamma: 2.20
Supported color formats: RGB 4:4:4, YCrCb 4:4:4
First detailed timing includes the native pixel format and preferred 
refresh rate

Color Characteristics
  Red:   0.6484, 0.3447
  Green: 0.3310, 0.6181
  Blue:  0.1503, 0.0615
  White: 0.3125, 0.3281
Established Timings I & II: none
Standard Timings: none
Detailed mode: Clock 147.840 MHz, 294 mm x 165 mm
   1920 1968 2000 2200 ( 48  32 200)
   1080 1083 1089 1120 (  3   6  31)
   +hsync -vsync
   VertFreq: 60.000 Hz, HorFreq: 67.200 kHz
Manufacturer-Specified Display Descriptor (0x00): 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 1a  

Alphanumeric Data String: BOE CQ
Alphanumeric Data String: NV133FHM-N61
Checksum: 0x9a



Unknown EDID Extension Block 0x03
  03 26 0a 77 ab 1c 05 71 6f 1d 8c f1 43 ce 6a bb .&.w...qo...C.j.
  fb d3 11 20 39 07 22 6e 65 68 77 70 d3 05 34 73  ... 9."nehwp..4s
  44 21 8b fd f5 6d 11 62 94 2a 7c fa 93 ba 6a 61 D!...m.b.*|...ja
  92 da 15 53 4c 39 eb f7 86 23 97 48 e9 39 09 d2 ...SL9...#.H.9..
  66 02 70 bb e2 77 0f 4a a3 a0 4c 72 6e 5d 47 70 f.p..w.J..Lrn]Gp
  43 c2 13 f3 b2 d9 b9 78 02 be 41 82 15 6a 28 dc C..x..A..j(.
  45 0f 9d eb 0f 2a cc e8 35 8d 34 7f 3e 84 5e a3 E*..5.4.>.^.
  30 5e 1e 29 0a 48 0c d1 0a c4 08 31 03 a9 3b 29 0^.).H.1..;)
Checksum: 0x29 (should be 0x82)


- My edid does in fact say it's 8bit
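
The bit depth that edid-decode prints comes from a single byte of the base block. As a minimal sketch, assuming the EDID 1.4 layout (video input definition at offset 0x14, bit 7 set for digital, bits 6..4 encoding the color depth), the field can be decoded by hand; the function name and macro below are made up for illustration and are not part of any driver:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the "video input definition" byte in an EDID base block. */
#define EDID_VIDEO_INPUT_OFFSET 0x14

/*
 * Return the bits per primary color channel for an EDID 1.4 digital
 * display, or 0 if the display is analog, the field is undefined or
 * reserved, or the buffer is too short.
 */
static int edid_bits_per_color(const uint8_t *edid, size_t len)
{
	assert(edid != NULL);

	if (len <= EDID_VIDEO_INPUT_OFFSET)
		return 0;

	uint8_t input = edid[EDID_VIDEO_INPUT_OFFSET];

	if (!(input & 0x80))		/* bit 7 clear: analog input */
		return 0;

	unsigned int depth = (input >> 4) & 0x7;	/* bits 6..4 */

	if (depth == 0 || depth == 7)	/* undefined / reserved */
		return 0;

	return 4 + 2 * (int)depth;	/* 1 -> 6, 2 -> 8, ..., 6 -> 16 */
}
```

In the dump above the byte at offset 0x14 is a5, which this sketch decodes as a digital input with 8 bits per channel, consistent with what edid-decode reports for this panel.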

Re: [PATCH V4] mm/vmstat: Add events for THP migration without split

2020-07-09 Thread Anshuman Khandual




On 07/09/2020 11:12 PM, Zi Yan wrote:
> On 9 Jul 2020, at 12:39, Randy Dunlap wrote:
> 
>> On 7/9/20 9:34 AM, Zi Yan wrote:
>>> On 9 Jul 2020, at 11:34, Randy Dunlap wrote:
>>>
 Hi,

 I have a few comments on this.

 a. I reported it very early and should have been Cc-ed.

 b. A patch that applies to mmotm or linux-next would have been better
 than a full replacement patch.

 c. I tried replacing what I believe is the correct/same patch file in mmotm
 and still have build errors.

 (more below)

 On 7/9/20 2:39 AM, Anshuman Khandual wrote:

> ---
> Applies on 5.8-rc4.
>
> Changes in V4:
>
> - Changed THP_MIGRATION_FAILURE as THP_MIGRATION_FAIL per John
> - Dropped all conditional 'if' blocks in migrate_pages() per Andrew and 
> John
> - Updated migration events documentation per John
> - Updated thp_nr_pages variable as nr_subpages for an expected merge 
> conflict
> - Moved all new THP vmstat events into CONFIG_MIGRATION
> - Updated Cc list with Documentation/ and tracing related addresses
>
> Changes in V3: (https://patchwork.kernel.org/patch/11647237/)
>
> - Formatted new events documentation with 'fmt' tool per Matthew
> - Made events universally available i.e dropped ARCH_ENABLE_THP_MIGRATION
> - Added THP_MIGRATION_SPLIT
> - Updated trace_mm_migrate_pages() with THP events
> - Made THP events update normal page migration events as well
>
> Changes in V2: (https://patchwork.kernel.org/patch/11586893/)
>
> - Dropped PMD reference both from code and commit message per Matthew
> - Added documentation and updated the commit message per Daniel
>
> Changes in V1: (https://patchwork.kernel.org/patch/11564497/)
>
> - Changed function name as thp_pmd_migration_success() per John
> - Folded in a fix (https://patchwork.kernel.org/patch/11563009/) from Hugh
>
> Changes in RFC V2: (https://patchwork.kernel.org/patch/11554861/)
>
> - Decoupled and renamed VM events from their implementation per Zi and 
> John
> - Added THP_PMD_MIGRATION_FAILURE VM event upon allocation failure and 
> split
>
> Changes in RFC V1: (https://patchwork.kernel.org/patch/11542055/)
>
>  Documentation/vm/page_migration.rst | 27 +++
>  include/linux/vm_event_item.h   |  3 ++
>  include/trace/events/migrate.h  | 17 --
>  mm/migrate.c| 52 -
>  mm/vmstat.c |  3 ++
>  5 files changed, 91 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
> index 24fc7c3ae7d6..2e6ca53b9bbd 100644
> --- a/include/linux/vm_event_item.h
> +++ b/include/linux/vm_event_item.h
> @@ -56,6 +56,9 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
>  #endif
>  #ifdef CONFIG_MIGRATION
>   PGMIGRATE_SUCCESS, PGMIGRATE_FAIL,
> + THP_MIGRATION_SUCCESS,
> + THP_MIGRATION_FAIL,
> + THP_MIGRATION_SPLIT,
 These 3 new symbols are still only present if CONFIG_MIGRATION=y, but the 
 build errors
 are using these symbols even when CONFIG_MIGRATION is not set.

>  #endif
>  #ifdef CONFIG_COMPACTION
>   COMPACTMIGRATE_SCANNED, COMPACTFREE_SCANNED,
> diff --git a/mm/migrate.c b/mm/migrate.c
> index f37729673558..c706e3576cfc 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1429,22 +1429,35 @@ int migrate_pages(struct list_head *from, 
> new_page_t get_new_page,
>   enum migrate_mode mode, int reason)
>  {
>   int retry = 1;
> + int thp_retry = 1;
>   int nr_failed = 0;
>   int nr_succeeded = 0;
> + int nr_thp_succeeded = 0;
> + int nr_thp_failed = 0;
> + int nr_thp_split = 0;
>   int pass = 0;
> + bool is_thp = false;
>   struct page *page;
>   struct page *page2;
>   int swapwrite = current->flags & PF_SWAPWRITE;
> - int rc;
> + int rc, nr_subpages;
>
>   if (!swapwrite)
>   current->flags |= PF_SWAPWRITE;
>
> - for(pass = 0; pass < 10 && retry; pass++) {
> + for (pass = 0; pass < 10 && (retry || thp_retry); pass++) {
>   retry = 0;
> + thp_retry = 0;
>
>   list_for_each_entry_safe(page, page2, from, lru) {
>  retry:
> + /*
> +  * THP statistics is based on the source huge page.
> +  * Capture required information that might get lost
> +  * during migration.
> +  */
> + is_thp = PageTransHuge(page);
> + nr_subpages = hpage_nr_pages(page);
>   cond_resched();
>
>   if (PageHuge(page))
> @@

Re: [f2fs-dev] [PATCH] f2fs: don't skip writeback of quota data

2020-07-09 Thread Chao Yu

On 2020/7/10 11:26, Jaegeuk Kim wrote:
> On 07/10, Chao Yu wrote:
>> On 2020/7/10 3:05, Jaegeuk Kim wrote:
>>> On 07/09, Chao Yu wrote:
 On 2020/7/9 13:30, Jaegeuk Kim wrote:
> It doesn't need to bypass flushing quota data in background.

 The condition is used to flush quota data in batch to avoid random
 small-sized update, did you hit any problem here?
>>>
>>> I suspect this causes fault injection test being stuck by waiting for inode
>>> writeback completion. With this patch, it has been running w/o any issue so 
>>> far.
>>> I keep an eye on this.
>>
>> Hmmm.. so that this patch may not fix the root cause, and it may hiding the
>> issue deeper.
>>
>> How about just keeping this patch in our private branch to let fault 
>> injection
>> test not be stuck? until we find the root cause in upstream codes.
> 
> Well, I don't think this hides something. When the issue happens, I saw inodes
> being stuck due to writeback while only quota has some dirty data. At that 
> time,
> there was no dirty data page from other inodes.

Okay,

> 
> More specifically, I suspect __writeback_inodes_sb_nr() gives WB_SYNC_NONE and
> waits for wb_wait_for_completion().

Did you record any callstack after the issue happened?

I'm still confused about why writing a directory's data can be skipped but
writing quota data can't. What's the difference?

> 
>>
>> Thanks,
>>
>>>
>>> Thanks,
>>>

 Thanks,

>
> Signed-off-by: Jaegeuk Kim 
> ---
>  fs/f2fs/data.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 44645f4f914b6..72e8b50e588c1 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -3148,7 +3148,7 @@ static int __f2fs_write_data_pages(struct 
> address_space *mapping,
>   if (unlikely(is_sbi_flag_set(sbi, SBI_POR_DOING)))
>   goto skip_write;
>  
> - if ((S_ISDIR(inode->i_mode) || IS_NOQUOTA(inode)) &&
> + if (S_ISDIR(inode->i_mode) &&
>   wbc->sync_mode == WB_SYNC_NONE &&
>   get_dirty_pages(inode) < nr_pages_to_skip(sbi, DATA) &&
>   f2fs_available_free_memory(sbi, DIRTY_DENTS))
>
>>> .
>>>
> .
>

Re: [f2fs-dev] [PATCH] f2fs: change the way of handling range.len in F2FS_IOC_SEC_TRIM_FILE

2020-07-09 Thread Chao Yu

On 2020/7/10 11:31, Jaegeuk Kim wrote:
> On 07/10, Chao Yu wrote:
>> On 2020/7/10 11:02, Jaegeuk Kim wrote:
>>> On 07/10, Daeho Jeong wrote:
 From: Daeho Jeong 

 Changed the way of handling range.len of F2FS_IOC_SEC_TRIM_FILE.
  1. Added -1 value support for range.len to signify the end of file.
  2. If the end of the range passes over the end of file, it means until
 the end of file.
  3. ignored the case of that range.len is zero to prevent the function
 from making end_addr zero and triggering different behaviour of
 the function.

 Signed-off-by: Daeho Jeong 
 ---
  fs/f2fs/file.c | 16 +++-
  1 file changed, 7 insertions(+), 9 deletions(-)

 diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
 index 368c80f8e2a1..1c4601f99326 100644
 --- a/fs/f2fs/file.c
 +++ b/fs/f2fs/file.c
 @@ -3813,21 +3813,19 @@ static int f2fs_sec_trim_file(struct file *filp, 
 unsigned long arg)
file_start_write(filp);
inode_lock(inode);
  
 -  if (f2fs_is_atomic_file(inode) || f2fs_compressed_file(inode)) {
 +  if (f2fs_is_atomic_file(inode) || f2fs_compressed_file(inode) ||
 +  range.start >= inode->i_size) {
ret = -EINVAL;
goto err;
}
  
 -  if (range.start >= inode->i_size) {
 -  ret = -EINVAL;
 +  if (range.len == 0)
goto err;
 -  }
  
 -  if (inode->i_size - range.start < range.len) {
 -  ret = -E2BIG;
 -  goto err;
 -  }
 -  end_addr = range.start + range.len;
 +  if (range.len == (u64)-1 || inode->i_size - range.start < range.len)
 +  end_addr = inode->i_size;
>>
>> We can remove 'range.len == (u64)-1' condition since later condition can 
>> cover
>> this?
>>
>>>
>>> Hmm, what if there are blocks beyond i_size? Do we need to check i_blocks 
>>> for
>>
>> The blocks beyond i_size will never be written, there won't be any valid 
>> message
>> there, so we don't need to worry about that.
> 
> I don't think we have a way to guarantee the order of i_size and block
> allocation in f2fs. See f2fs_write_begin and f2fs_write_end.

However, write_begin & write_end are covered by inode_lock, it could not be
racy with inode size check in f2fs_sec_trim_file() as it hold inode_lock as
well?

> 
>>
>> Thanks,
>>
>>> ending criteria?
>>>
 +  else
 +  end_addr = range.start + range.len;
  
to_end = (end_addr == inode->i_size);
if (!IS_ALIGNED(range.start, F2FS_BLKSIZE) ||
 -- 
 2.27.0.383.g050319c2ae-goog



 ___
 Linux-f2fs-devel mailing list
 linux-f2fs-de...@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>
>>>
>>> .
>>>
> .
>
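
Chao's remark that the 'range.len == (u64)-1' test is redundant can be checked with a small user-space model of the end_addr computation. This is only a sketch: trim_end_addr() is a made-up name, i_size is modeled as an unsigned value no larger than INT64_MAX (loff_t is signed in the kernel), and a zero length is reported as success with the end set to the start, which is an assumption about the kernel's exit path:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the exclusive end offset of the trim range.
 * Returns 0 on success or -EINVAL if start is at or past i_size.
 */
static int trim_end_addr(uint64_t i_size, uint64_t start, uint64_t len,
			 uint64_t *end_addr)
{
	assert(end_addr != NULL);
	assert(i_size <= (uint64_t)INT64_MAX);	/* loff_t is signed */

	if (start >= i_size)
		return -EINVAL;

	if (len == 0) {			/* nothing to trim */
		*end_addr = start;
		return 0;
	}

	/* start < i_size here, so the subtraction cannot wrap. */
	if (i_size - start < len)
		*end_addr = i_size;	/* range runs past EOF: clamp */
	else
		*end_addr = start + len;

	return 0;
}
```

Because i_size - start never exceeds INT64_MAX, a length of (u64)-1 always satisfies 'i_size - start < len', so the clamp branch already covers the "until end of file" case without a separate comparison.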

Re: [PATCH V4] mm/vmstat: Add events for THP migration without split

2020-07-09 Thread Randy Dunlap

On 7/9/20 8:30 PM, Anshuman Khandual wrote:
> 
> On 07/09/2020 09:04 PM, Randy Dunlap wrote:
>> Hi,
>>
>> I have a few comments on this.
>>
>> a. I reported it very early and should have been Cc-ed.
> 
> I should have Cc-ed you on this V4 patch, sorry about that.
> 
>>
>> b. A patch that applies to mmotm or linux-next would have been better
>> than a full replacement patch.
> I have followed that (i.e patch on mmotm/next as fix) only when the
> required change is smaller as compared to the series on mmotm/next.
> But for others a new patch should be better which can be replaced
> on mmotm and next. At least that is my understanding and would like
> to be corrected otherwise.
> 
>>
>> c. I tried replacing what I believe is the correct/same patch file in mmotm
>> and still have build errors.
> 
> That should not have happened, all new THP migration events are with
> CONFIG_MIGRATION rather than CONFIG_TRANSPARENT_HUGEPAGE previously.

Yes, I could have been mistaken about that last part.  Sorry about that.

-- 
~Randy

Re: [PATCH] sysctl: add bound to panic_timeout to prevent overflow

2020-07-09 Thread Randy Dunlap

On 7/9/20 8:22 PM, Changming Liu wrote:
> Function panic() in kernel/panic.c will use panic_timeout
> multiplying 1000 as a loop boundery. So this multiplication

 boundary.

> can overflow when panic_timeout is greater than (INT_MAX/1000).
> And this results in a zero-delay panic, instead of a huge
> timeout as the user intends.
> 
> Fix this by adding bound check to make it no bigger than
> (INT_MAX/1000).
> 
> Signed-off-by: Changming Liu 
> ---
>  kernel/sysctl.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index db1ce7a..e60cf04 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -137,6 +137,9 @@ static int minolduid;
>  static int ngroups_max = NGROUPS_MAX;
>  static const int cap_last_cap = CAP_LAST_CAP;
>  
> +/* this is needed for setting boundery for panic_timeout to prevent it from 
> overflow*/

 boundary (or max value)   
overflow */

> +static int panic_time_max = INT_MAX / 1000;
> +
>  /*
>   * This is needed for proc_doulongvec_minmax of sysctl_hung_task_timeout_secs
>   * and hung_task_check_interval_secs
> @@ -1857,7 +1860,8 @@ static struct ctl_table kern_table[] = {
>   .data   = &panic_timeout,
>   .maxlen = sizeof(int),
>   .mode   = 0644,
> - .proc_handler   = proc_dointvec,
> + .proc_handler   = proc_dointvec_minmax,
> + .extra2 = &panic_time_max,
>   },
>  #ifdef CONFIG_COREDUMP
>   {
> 

thanks.
-- 
~Randy
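
The bound in the quoted patch can be illustrated outside the kernel. A minimal sketch, assuming a 32-bit or wider int: the first helper tests whether multiplying a timeout in seconds by 1000 would overflow, and the second models a handler that rejects out-of-range values the way a minmax-style upper bound would. Both function names are hypothetical and this is not the sysctl code itself:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MSEC_PER_SEC 1000

/* True if timeout_sec * 1000 would not fit in an int. */
static bool timeout_ms_overflows(int timeout_sec)
{
	return timeout_sec > INT_MAX / MSEC_PER_SEC;
}

/*
 * Store val in *dst only if it is within the upper bound.
 * Only the maximum is checked, mirroring the patch, which sets
 * .extra2 and leaves the minimum unrestricted.
 */
static int set_timeout_bounded(int *dst, int val)
{
	assert(dst != NULL);

	if (timeout_ms_overflows(val))
		return -EINVAL;

	*dst = val;
	return 0;
}
```

With the bound in place, the largest accepted value times 1000 still fits in an int, so the loop limit computed from it cannot wrap.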

Re: [PATCH] cpufreq: intel_pstate: Fix static checker warning for epp variable

2020-07-09 Thread Viresh Kumar

On 09-07-20, 13:05, Srinivas Pandruvada wrote:
> Fix warning for:
> drivers/cpufreq/intel_pstate.c:731 store_energy_performance_preference()
> error: uninitialized symbol 'epp'.
> 
> This warning is for a case, when energy_performance_preference attribute
> matches pre defined strings. In this case the value of raw epp will not
> be used to set EPP bits in MSR_HWP_REQUEST. So initializing with any
> value is fine.
> 
> Reported-by: Dan Carpenter 
> Signed-off-by: Srinivas Pandruvada 
> ---
> This patch is on top of bleed-edge branch at
> https://kernel.googlesource.com/pub/scm/linux/kernel/git/rafael/linux-pm
> 
>  drivers/cpufreq/intel_pstate.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 44c7b4677675..94cd07678ee3 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -709,7 +709,7 @@ static ssize_t store_energy_performance_preference(
>   struct cpudata *cpu_data = all_cpu_data[policy->cpu];
>   char str_preference[21];
>   bool raw = false;
> - u32 epp;
> + u32 epp = 0;
>   int ret;
>  
>   ret = sscanf(buf, "%20s", str_preference);

Acked-by: Viresh Kumar 

-- 
viresh

[PATCH] usb: gadget: bdc: use readl_poll_timeout() to simplify code

2020-07-09 Thread Chunfeng Yun

Use readl_poll_timeout() to poll register status

Signed-off-by: Chunfeng Yun 
---
 drivers/usb/gadget/udc/bdc/bdc_core.c | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/drivers/usb/gadget/udc/bdc/bdc_core.c 
b/drivers/usb/gadget/udc/bdc/bdc_core.c
index 02a3a77..fa173de 100644
--- a/drivers/usb/gadget/udc/bdc/bdc_core.c
+++ b/drivers/usb/gadget/udc/bdc/bdc_core.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -32,21 +33,14 @@
 static int poll_oip(struct bdc *bdc, int usec)
 {
u32 status;
-   /* Poll till STS!= OIP */
-   while (usec) {
-   status = bdc_readl(bdc->regs, BDC_BDCSC);
-   if (BDC_CSTS(status) != BDC_OIP) {
-   dev_dbg(bdc->dev,
-   "poll_oip complete status=%d",
-   BDC_CSTS(status));
-   return 0;
-   }
-   udelay(10);
-   usec -= 10;
-   }
-   dev_err(bdc->dev, "Err: operation timedout BDCSC: 0x%08x\n", status);
+   int ret;
 
-   return -ETIMEDOUT;
+   ret = readl_poll_timeout(bdc->regs + BDC_BDCSC, status,
+   (BDC_CSTS(status) != BDC_OIP), 10, usec);
+   if (ret)
+   dev_err(bdc->dev, "Err: operation timedout BDCSC: 0x%08x\n", 
status);
+
+   return ret;
 }
 
 /* Stop the BDC controller */
-- 
1.9.1
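
For readers unfamiliar with the pattern, the following is a user-space model of "read, test condition, give up after a timeout", which is the behavior the patch delegates to readl_poll_timeout(). It is a sketch and not the kernel macro: poll_timeout(), the callback types, and the fake register are all invented for illustration, and the delay between reads is only accounted for, not actually slept:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*read_reg_fn)(void *ctx);
typedef int (*cond_fn)(uint32_t val);

/*
 * Read the register until cond() holds or timeout_us has elapsed.
 * The register is always read at least once, so *last is always valid.
 */
static int poll_timeout(read_reg_fn read_reg, void *ctx, cond_fn cond,
			unsigned int step_us, unsigned int timeout_us,
			uint32_t *last)
{
	uint64_t waited = 0;

	assert(read_reg && cond && last && step_us > 0);

	for (;;) {
		*last = read_reg(ctx);
		if (cond(*last))
			return 0;
		if (waited >= timeout_us)
			return -ETIMEDOUT;
		/* A real implementation would delay for step_us here. */
		waited += step_us;
	}
}

/* Fake register for demonstration: reads as busy (1) N times, then 0. */
struct fake_reg {
	unsigned int busy_reads;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_reg *r = ctx;

	if (r->busy_reads) {
		r->busy_reads--;
		return 1;
	}
	return 0;
}

static int not_busy(uint32_t val)
{
	return val != 1;
}
```

One property worth noticing in the model: the status is read before the timeout is checked, so the value reported on timeout is always initialized, whereas the removed open-coded loop would never read the register if called with usec equal to 0.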

Re: [PATCH V4] mm/vmstat: Add events for THP migration without split

2020-07-09 Thread Anshuman Khandual



On 07/09/2020 09:04 PM, Randy Dunlap wrote:
> Hi,
> 
> I have a few comments on this.
> 
> a. I reported it very early and should have been Cc-ed.

I should have Cc-ed you on this V4 patch, sorry about that.

> 
> b. A patch that applies to mmotm or linux-next would have been better
> than a full replacement patch.
I have followed that (i.e patch on mmotm/next as fix) only when the
required change is smaller as compared to the series on mmotm/next.
But for others a new patch should be better which can be replaced
on mmotm and next. At least that is my understanding and would like
to be corrected otherwise.

> 
> c. I tried replacing what I believe is the correct/same patch file in mmotm
> and still have build errors.

That should not have happened, all new THP migration events are with
CONFIG_MIGRATION rather than CONFIG_TRANSPARENT_HUGEPAGE previously.

Re: [f2fs-dev] [PATCH] f2fs: change the way of handling range.len in F2FS_IOC_SEC_TRIM_FILE

2020-07-09 Thread Jaegeuk Kim

On 07/10, Chao Yu wrote:
> On 2020/7/10 11:02, Jaegeuk Kim wrote:
> > On 07/10, Daeho Jeong wrote:
> >> From: Daeho Jeong 
> >>
> >> Changed the way of handling range.len of F2FS_IOC_SEC_TRIM_FILE.
> >>  1. Added -1 value support for range.len to signify the end of file.
> >>  2. If the end of the range passes over the end of file, it means until
> >> the end of file.
> >>  3. ignored the case of that range.len is zero to prevent the function
> >> from making end_addr zero and triggering different behaviour of
> >> the function.
> >>
> >> Signed-off-by: Daeho Jeong 
> >> ---
> >>  fs/f2fs/file.c | 16 +++-
> >>  1 file changed, 7 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> >> index 368c80f8e2a1..1c4601f99326 100644
> >> --- a/fs/f2fs/file.c
> >> +++ b/fs/f2fs/file.c
> >> @@ -3813,21 +3813,19 @@ static int f2fs_sec_trim_file(struct file *filp, 
> >> unsigned long arg)
> >>file_start_write(filp);
> >>inode_lock(inode);
> >>  
> >> -  if (f2fs_is_atomic_file(inode) || f2fs_compressed_file(inode)) {
> >> +  if (f2fs_is_atomic_file(inode) || f2fs_compressed_file(inode) ||
> >> +  range.start >= inode->i_size) {
> >>ret = -EINVAL;
> >>goto err;
> >>}
> >>  
> >> -  if (range.start >= inode->i_size) {
> >> -  ret = -EINVAL;
> >> +  if (range.len == 0)
> >>goto err;
> >> -  }
> >>  
> >> -  if (inode->i_size - range.start < range.len) {
> >> -  ret = -E2BIG;
> >> -  goto err;
> >> -  }
> >> -  end_addr = range.start + range.len;
> >> +  if (range.len == (u64)-1 || inode->i_size - range.start < range.len)
> >> +  end_addr = inode->i_size;
> 
> We can remove 'range.len == (u64)-1' condition since later condition can cover
> this?
> 
> > 
> > Hmm, what if there are blocks beyond i_size? Do we need to check i_blocks 
> > for
> 
> The blocks beyond i_size will never be written, there won't be any valid 
> message
> there, so we don't need to worry about that.

I don't think we have a way to guarantee the order of i_size and block
allocation in f2fs. See f2fs_write_begin and f2fs_write_end.

> 
> Thanks,
> 
> > ending criteria?
> > 
> >> +  else
> >> +  end_addr = range.start + range.len;
> >>  
> >>to_end = (end_addr == inode->i_size);
> >>if (!IS_ALIGNED(range.start, F2FS_BLKSIZE) ||
> >> -- 
> >> 2.27.0.383.g050319c2ae-goog
> >>
> >>
> >>
> > 
> > 
> > .
> >

Re: [PATCH 4/4] perf-probe: Warn if the target function is GNU Indirect function

2020-07-09 Thread Masami Hiramatsu

On Thu, 9 Jul 2020 07:36:54 -0700
Andi Kleen  wrote:

> > diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> > index 1e95a336862c..671176d39569 100644
> > --- a/tools/perf/util/probe-event.c
> > +++ b/tools/perf/util/probe-event.c
> > @@ -379,6 +379,11 @@ static int find_alternative_probe_point(struct 
> > debuginfo *dinfo,
> > address = sym->start;
> > else
> > address = map->unmap_ip(map, sym->start) - map->reloc;
> > +   if (sym->type == STT_GNU_IFUNC) {
> > +   pr_warning("Warning: The probe address (0x%lx) is in a 
> > GNU indirect function.\n"
> > +   "This may not work as you expected unless you 
> > intend to probe the indirect function.\n",
> 
> I would say something like this.
> 
> Consider identifying the final function used at run time and set the
> probe directly on that.
> 
> I think that's more useful to the user.

Hmm, would you mean the default function which may be used for the symbol?
Let me check how we can find it.

Thank you,

-- 
Masami Hiramatsu
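
For context on what the quoted check tests: in a raw ELF symbol table the symbol type lives in the low four bits of st_info, and GNU indirect functions use the type STT_GNU_IFUNC. A minimal sketch using the definitions from the system <elf.h> (assumed to be the glibc/Linux header); the helper name is hypothetical, and perf itself works on its own struct symbol rather than on Elf64_Sym directly:

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef STT_GNU_IFUNC
#define STT_GNU_IFUNC 10	/* GNU extension: indirect function */
#endif

/* True if the ELF64 symbol table entry describes a GNU indirect function. */
static bool elf64_sym_is_ifunc(const Elf64_Sym *sym)
{
	assert(sym != NULL);
	return ELF64_ST_TYPE(sym->st_info) == STT_GNU_IFUNC;
}
```

An IFUNC symbol's address is that of the resolver, which picks the real implementation at load time. That is why a probe placed there may not fire where the user expects, and why Andi suggests pointing the user at the final function instead.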

Re: [PATCH v2 5/5] firmware: QCOM_SCM: Allow qcom_scm driver to be loadable as a permenent module

2020-07-09 Thread John Stultz

On Thu, Jul 2, 2020 at 7:18 AM Will Deacon  wrote:
> On Thu, Jun 25, 2020 at 12:10:39AM +, John Stultz wrote:
> > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> > index b510f67dfa49..714893535dd2 100644
> > --- a/drivers/iommu/Kconfig
> > +++ b/drivers/iommu/Kconfig
> > @@ -381,6 +381,7 @@ config SPAPR_TCE_IOMMU
> >  config ARM_SMMU
> >   tristate "ARM Ltd. System MMU (SMMU) Support"
> >   depends on (ARM64 || ARM || (COMPILE_TEST && !GENERIC_ATOMIC64)) && 
> > MMU
> > + depends on QCOM_SCM || !QCOM_SCM #if QCOM_SCM=m this can't be =y
> >   select IOMMU_API
> >   select IOMMU_IO_PGTABLE_LPAE
> >   select ARM_DMA_USE_IOMMU if ARM
>
> This looks like a giant hack. Is there another way to handle this?

Sorry for the slow response here.

So, I agree the syntax looks strange (requiring a comment obviously
isn't a good sign), but it's a fairly common way to ensure drivers
don't get built in if they optionally depend on another driver that
can be built as a module.
  See "RFKILL || !RFKILL", "EXTCON || !EXTCON", or "USB_GADGET ||
!USB_GADGET" in various Kconfig files.

I'm open to using a different method, and in a different thread you
suggested using something like symbol_get(). I need to look into it
more, but that approach looks even more messy and prone to runtime
failures. Blocking the unwanted case at build time seems a bit cleaner
to me, even if the syntax is odd.

thanks
-john

Re: [f2fs-dev] [PATCH] f2fs: change the way of handling range.len in F2FS_IOC_SEC_TRIM_FILE

2020-07-09 Thread Jaegeuk Kim

On 07/09, Eric Biggers wrote:
> On Thu, Jul 09, 2020 at 08:20:35PM -0700, Jaegeuk Kim wrote:
> > On 07/10, Daeho Jeong wrote:
> > > 1. The valid data will be within i_size.
> > > 2. All the trim operations will be done in a unit of block, even if
> > > i_size is not aligned with BLKSIZE like the below.
> > > 
> > > index = F2FS_BYTES_TO_BLK(range.start);
> > > pg_end = DIV_ROUND_UP(end_addr, F2FS_BLKSIZE); <= BLKSIZE 
> > > aligned
> > > 
> > > Are you worried about the case that sudden power-off occurs while a
> > > file is being truncated?
> > > ex) 1GB file is being truncated to 4KB -> sudden power-off ->
> > > i_size(4KB), i_blocks(maybe somewhere between 4KB and 1GB)
> > 
> > Yes. Basically, I believe we can have some data beyond i_size like fsverity.
> > 
> 
> Note that fs-verity files are read-only, and therefore this ioctl can't be 
> used
> on them (since it requires a writable file descriptor).  So that case doesn't
> need to be handled here.

I meant it as an example of valid data beyond i_size.

> 
> - Eric

Re: [PATCH] arm64: topology: Don't support AMU without cpufreq

2020-07-09 Thread Viresh Kumar

On 09-07-20, 13:46, Ionela Voinescu wrote:
> I saw this case during FVP testing, although I acknowledge the 'virtual'
> part of that platform [1]. But allowing this does enable AMU testing on
> an AEM FVP.

In kernel, we only support things that are in mainline, else we don't
care about them. That's the general rule. And yeah I understand that
this is early support for a new hardware, and so it is better to add
code for things we are sure about.

> While I completely understand the reasoning behind avoiding to introduce
> large changes for small corner-case gains,

I think even that is fine, if there is a problem to be solved it needs
to be solved, big or small doesn't really matter. Just that it needs
to be there in mainline.

> the arguments for this
> support was:
>  - (1) AMUs are a new feature and it will take some time until we see the
>real usecases. That's always the case with early support for a
>feature - we want to add it early to enable its use and testing, but
>it will take some time to establish the true usecases.

Exactly, and so people normally prefer to keep things simple until the
time the needs arises for the same. A patch can be added later, its no
big deal. But it should be added when we need it.

>  - (2) It literally needed 2 lines of code + the weak cpufreq function
>to support this.

Yeah, small or big doesn't really matter.

> Given that I can't guarantee what hardware will or won't do, and given
> that AMUs are an optional feature, I controlled the only thing I could:
> the software :). By not making assumptions about the hardware, I ensured
> that the code does not break the interaction between cpufreq use or AMU
> use for frequency invariance.
> 
> This will be nicer in the new code as the control will be at CPU level,
> rather than policy level.

I won't try to force you to remove this piece and will leave it for
you to decide.

But, I don't see a future system in mainline which uses AMU but
doesn't have cpufreq for all its CPUs. And so I won't have kept code
for that, even if it is just 2 lines. We can always add it back when
required.

Thanks for the review again Ionela.

-- 
viresh

Re: [f2fs-dev] [PATCH] f2fs: change the way of handling range.len in F2FS_IOC_SEC_TRIM_FILE

2020-07-09 Thread Eric Biggers

On Thu, Jul 09, 2020 at 08:20:35PM -0700, Jaegeuk Kim wrote:
> On 07/10, Daeho Jeong wrote:
> > 1. The valid data will be within i_size.
> > 2. All the trim operations will be done in a unit of block, even if
> > i_size is not aligned with BLKSIZE like the below.
> > 
> > index = F2FS_BYTES_TO_BLK(range.start);
> > pg_end = DIV_ROUND_UP(end_addr, F2FS_BLKSIZE); <= BLKSIZE 
> > aligned
> > 
> > Are you worried about the case that sudden power-off occurs while a
> > file is being truncated?
> > ex) 1GB file is being truncated to 4KB -> sudden power-off ->
> > i_size(4KB), i_blocks(maybe somewhere between 4KB and 1GB)
> 
> Yes. Basically, I believe we can have some data beyond i_size like fsverity.
> 

Note that fs-verity files are read-only, and therefore this ioctl can't be used
on them (since it requires a writable file descriptor).  So that case doesn't
need to be handled here.

- Eric

Re: [f2fs-dev] [PATCH] f2fs: don't skip writeback of quota data

2020-07-09 Thread Jaegeuk Kim

On 07/10, Chao Yu wrote:
> On 2020/7/10 3:05, Jaegeuk Kim wrote:
> > On 07/09, Chao Yu wrote:
> >> On 2020/7/9 13:30, Jaegeuk Kim wrote:
> >>> It doesn't need to bypass flushing quota data in background.
> >>
> >> The condition is used to flush quota data in batch to avoid random
> >> small-sized update, did you hit any problem here?
> > 
> > I suspect this causes fault injection test being stuck by waiting for inode
> > writeback completion. With this patch, it has been running w/o any issue so 
> > far.
> > I keep an eye on this.
> 
> Hmmm.. so that this patch may not fix the root cause, and it may hiding the
> issue deeper.
> 
> How about just keeping this patch in our private branch to let fault injection
> test not be stuck? until we find the root cause in upstream codes.

Well, I don't think this hides something. When the issue happens, I saw inodes
being stuck due to writeback while only quota has some dirty data. At that time,
there was no dirty data page from other inodes.

More specifically, I suspect __writeback_inodes_sb_nr() gives WB_SYNC_NONE and
waits for wb_wait_for_completion().

> 
> Thanks,
> 
> > 
> > Thanks,
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>> Signed-off-by: Jaegeuk Kim 
> >>> ---
> >>>  fs/f2fs/data.c | 2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> >>> index 44645f4f914b6..72e8b50e588c1 100644
> >>> --- a/fs/f2fs/data.c
> >>> +++ b/fs/f2fs/data.c
> >>> @@ -3148,7 +3148,7 @@ static int __f2fs_write_data_pages(struct address_space *mapping,
> >>>   if (unlikely(is_sbi_flag_set(sbi, SBI_POR_DOING)))
> >>>   goto skip_write;
> >>>  
> >>> - if ((S_ISDIR(inode->i_mode) || IS_NOQUOTA(inode)) &&
> >>> + if (S_ISDIR(inode->i_mode) &&
> >>>   wbc->sync_mode == WB_SYNC_NONE &&
> >>>   get_dirty_pages(inode) < nr_pages_to_skip(sbi, DATA) &&
> >>>   f2fs_available_free_memory(sbi, DIRTY_DENTS))
> >>>
> > .
> >
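
To make the one-line change under discussion concrete, here is a minimal userspace sketch of the skip predicate before and after the patch. The struct, its field names and the function names are hypothetical stand-ins, not f2fs definitions; only the boolean shape of the condition is taken from the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the inputs of the check in
 * __f2fs_write_data_pages(); not a real f2fs structure. */
struct wb_state {
	bool is_dir;          /* S_ISDIR(inode->i_mode) */
	bool is_quota;        /* IS_NOQUOTA(inode) */
	bool sync_none;       /* wbc->sync_mode == WB_SYNC_NONE */
	bool few_dirty_pages; /* get_dirty_pages() < nr_pages_to_skip() */
	bool mem_available;   /* f2fs_available_free_memory(sbi, DIRTY_DENTS) */
};

/* Condition before the patch: directories and quota files may be skipped. */
static bool skip_write_old(const struct wb_state *s)
{
	return (s->is_dir || s->is_quota) && s->sync_none &&
	       s->few_dirty_pages && s->mem_available;
}

/* Condition after the patch: only directories may be skipped. */
static bool skip_write_new(const struct wb_state *s)
{
	return s->is_dir && s->sync_none &&
	       s->few_dirty_pages && s->mem_available;
}

static void skip_write_example(void)
{
	/* A quota file with a little dirty data under background writeback. */
	struct wb_state quota = { false, true, true, true, true };

	assert(skip_write_old(&quota));   /* old code defers the write */
	assert(!skip_write_new(&quota));  /* patched code writes it out */
}
```

With the old predicate, a quota file that never accumulates enough dirty pages can keep being skipped by WB_SYNC_NONE writeback, which is the situation Jaegeuk suspects.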

Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail

2020-07-09 Thread Qian Cai

> On Jul 9, 2020, at 9:38 PM, Feng Tang  wrote:
> 
> Give it a second thought, my previous way has more indents and lines,
> but it is easier to be understood that we have special handling for
> 'write' case. So I would prefer using it. 
> 
> Thoughts?

I don’t feel it is easier to understand. I generally prefer to bail out early
if possible, which also makes the code a bit more robust for future extensions (once the
indentation reaches 3+ levels, we will need to rework it).

But I realize that I have spent far more time debugging than actually writing
code these days, so my taste is probably not all that good. Thus, feel free to
submit whichever style you prefer, so that other people with more coding experience can
review it.

Re: [PATCH v2 2/2] soc: mediatek: add mtk-devapc driver

2020-07-09 Thread Neal Liu

Hi Chun-Kuang,

Thanks for your review.

On Thu, 2020-07-09 at 21:01 +0800, Chun-Kuang Hu wrote:
> Hi, Neal:
> 
> On Thu, Jul 9, 2020 at 5:13 PM, Neal Liu wrote:
> >
> > MediaTek bus fabric provides TrustZone security support and data
> > protection to prevent slaves from being accessed by unexpected
> > masters.
> > The security violation is logged and sent to the processor for
> > further analysis or countermeasures.
> >
> > Any occurrence of security violation would raise an interrupt, and
> > it will be handled by mtk-devapc driver. The violation
> > information is printed in order to find the murderer.
> >
> > Signed-off-by: Neal Liu 
> 
> [snip]
> 
> > +
> > +static u32 get_shift_group(struct mtk_devapc_context *devapc_ctx,
> > +  int slave_type, int vio_idx)
> 
> vio_idx  is useless, so remove it.
> 

Yes, my mistake. I'll remove it in the next patch.

> > +{
> > +   u32 vio_shift_sta;
> > +   void __iomem *reg;
> > +   int bit;
> > +
> > +   reg = mtk_devapc_pd_get(devapc_ctx, slave_type, VIO_SHIFT_STA, 0);
> > +   vio_shift_sta = readl(reg);
> > +
> > +   for (bit = 0; bit < 32; bit++) {
> > +   if ((vio_shift_sta >> bit) & 0x1)
> > +   break;
> > +   }
> > +
> > +   return bit;
> > +}
> > +
> 
> [snip]
> 
> > +
> > +/*
> > + * devapc_violation_irq - the devapc Interrupt Service Routine (ISR) will dump
> > + *   violation information including which master violates
> > + *   access slave.
> > + */
> > +static irqreturn_t devapc_violation_irq(int irq_number,
> > +   struct mtk_devapc_context *devapc_ctx)
> > +{
> > +   const struct mtk_device_info **device_info;
> > +   int slave_type_num;
> > +   int vio_idx = -1;
> > +   int slave_type;
> > +
> > +   slave_type_num = devapc_ctx->slave_type_num;
> > +   device_info = devapc_ctx->device_info;
> > +
> > +   for (slave_type = 0; slave_type < slave_type_num; slave_type++) {
> 
> If slave_type_num is 1, I think the code should be simpler.

slave_type_num depends on DT data; it's not always 1.

> 
> > +   if (!mtk_devapc_dump_vio_dbg(devapc_ctx, slave_type, &vio_idx))
> > +   continue;
> > +
> > +   /* Ensure that violation info are written before
> > +* further operations
> > +*/
> > +   smp_mb();
> > +
> > +   mask_module_irq(devapc_ctx, slave_type, vio_idx, true);
> 
> Why do you mask irq?

The slave's irq has to be masked before the violation status is cleared.
This is required by the hardware design.

> 
> > +
> > +   clear_vio_status(devapc_ctx, slave_type, vio_idx);
> > +
> > +   mask_module_irq(devapc_ctx, slave_type, vio_idx, false);
> > +   }
> > +
> > +   return IRQ_HANDLED;
> > +}
> > +
> > +/*
> > + * start_devapc - initialize devapc status and start receiving interrupt
> > + *   while devapc violation is triggered.
> > + */
> 
> [snip]
> 
> > +
> > +struct mtk_device_info {
> > +   int sys_index;
> 
> Useless, so remove it.

We need to print it as part of our debug information.
I did not do that in this patch, though; I'll add it in the next one.

> 
> > +   int ctrl_index;
> 
> Ditto.
> 
> Regards,
> Chun-Kuang.
> 
> > +   int vio_index;
> > +};
> > +
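
The loop in the quoted get_shift_group() scans VIO_SHIFT_STA for its lowest set bit. As a minimal userspace sketch (the name lowest_set_bit is hypothetical, and the register read is replaced by a plain argument), the same scan looks like this. Note that it returns 32 when no bit is set, so a caller would presumably need to treat 32 as "no group pending".

```c
#include <assert.h>
#include <stdint.h>

/* Return the index of the lowest set bit in val, or 32 if val is 0.
 * Mirrors the for-loop in the quoted get_shift_group(). */
static int lowest_set_bit(uint32_t val)
{
	int bit;

	for (bit = 0; bit < 32; bit++) {
		if ((val >> bit) & 0x1)
			break;
	}

	/* The loop stops at 0..31 (bit found) or runs out at 32 (none set). */
	assert(bit >= 0 && bit <= 32);
	return bit;
}
```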

[PATCH] sysctl: add bound to panic_timeout to prevent overflow

2020-07-09 Thread Changming Liu

Function panic() in kernel/panic.c multiplies panic_timeout
by 1000 and uses the result as a loop boundary. This multiplication
can overflow when panic_timeout is greater than (INT_MAX/1000),
which results in a zero-delay panic instead of the huge
timeout the user intended.

Fix this by adding a bound check so that it is no bigger than
(INT_MAX/1000).

Signed-off-by: Changming Liu 
---
 kernel/sysctl.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index db1ce7a..e60cf04 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -137,6 +137,9 @@ static int minolduid;
 static int ngroups_max = NGROUPS_MAX;
 static const int cap_last_cap = CAP_LAST_CAP;
 
+/* this is needed for setting boundary for panic_timeout to prevent it from overflow */
+static int panic_time_max = INT_MAX / 1000;
+
 /*
  * This is needed for proc_doulongvec_minmax of sysctl_hung_task_timeout_secs
  * and hung_task_check_interval_secs
@@ -1857,7 +1860,8 @@ static struct ctl_table kern_table[] = {
.data   = &panic_timeout,
.maxlen = sizeof(int),
.mode   = 0644,
-   .proc_handler   = proc_dointvec,
+   .proc_handler   = proc_dointvec_minmax,
+   .extra2 = &panic_time_max,
},
 #ifdef CONFIG_COREDUMP
{
-- 
2.7.4
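
To see why the bound is INT_MAX / 1000, here is a small userspace sketch. It does the multiplication in a wider type so that the check itself cannot overflow. The function names are hypothetical and this is not the code in kernel/panic.c; it only illustrates the arithmetic described in the commit message.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Largest timeout in seconds whose millisecond value still fits in an int. */
static int panic_timeout_max(void)
{
	return INT_MAX / 1000;
}

/* True if timeout_secs * 1000 would not fit in an int. */
static bool panic_timeout_ms_overflows(int timeout_secs)
{
	long long ms = (long long)timeout_secs * 1000;

	return ms > INT_MAX || ms < INT_MIN;
}

static void panic_timeout_example(void)
{
	/* The proposed maximum is the last value that is still safe. */
	assert(!panic_timeout_ms_overflows(panic_timeout_max()));
	assert(panic_timeout_ms_overflows(panic_timeout_max() + 1));
}
```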

Re: [f2fs-dev] [PATCH] f2fs: change the way of handling range.len in F2FS_IOC_SEC_TRIM_FILE

2020-07-09 Thread Chao Yu

On 2020/7/10 11:02, Jaegeuk Kim wrote:
> On 07/10, Daeho Jeong wrote:
>> From: Daeho Jeong 
>>
>> Changed the way of handling range.len of F2FS_IOC_SEC_TRIM_FILE.
>>  1. Added -1 value support for range.len to signify the end of file.
>>  2. If the end of the range passes over the end of file, it means until
>> the end of file.
>>  3. ignored the case of that range.len is zero to prevent the function
>> from making end_addr zero and triggering different behaviour of
>> the function.
>>
>> Signed-off-by: Daeho Jeong 
>> ---
>>  fs/f2fs/file.c | 16 +++-
>>  1 file changed, 7 insertions(+), 9 deletions(-)
>>
>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>> index 368c80f8e2a1..1c4601f99326 100644
>> --- a/fs/f2fs/file.c
>> +++ b/fs/f2fs/file.c
>> @@ -3813,21 +3813,19 @@ static int f2fs_sec_trim_file(struct file *filp, 
>> unsigned long arg)
>>  file_start_write(filp);
>>  inode_lock(inode);
>>  
>> -if (f2fs_is_atomic_file(inode) || f2fs_compressed_file(inode)) {
>> +if (f2fs_is_atomic_file(inode) || f2fs_compressed_file(inode) ||
>> +range.start >= inode->i_size) {
>>  ret = -EINVAL;
>>  goto err;
>>  }
>>  
>> -if (range.start >= inode->i_size) {
>> -ret = -EINVAL;
>> +if (range.len == 0)
>>  goto err;
>> -}
>>  
>> -if (inode->i_size - range.start < range.len) {
>> -ret = -E2BIG;
>> -goto err;
>> -}
>> -end_addr = range.start + range.len;
>> +if (range.len == (u64)-1 || inode->i_size - range.start < range.len)
>> +end_addr = inode->i_size;

Can we remove the 'range.len == (u64)-1' condition, since the later condition
already covers this case?

> 
> Hmm, what if there are blocks beyond i_size? Do we need to check i_blocks for

Blocks beyond i_size will never be written, so there won't be any valid data
there and we don't need to worry about that.

Thanks,

> ending criteria?
> 
>> +else
>> +end_addr = range.start + range.len;
>>  
>>  to_end = (end_addr == inode->i_size);
>>  if (!IS_ALIGNED(range.start, F2FS_BLKSIZE) ||
>> -- 
>> 2.27.0.383.g050319c2ae-goog
>>
>>
>>
>> ___
>> Linux-f2fs-devel mailing list
>> linux-f2fs-de...@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> 
> 
> .
>
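
Chao's question about the redundant `(u64)-1` test can be checked with a small userspace sketch of the end_addr computation. The function name is hypothetical; i_size is modelled as a positive signed 64-bit value (as loff_t is), and the caller is assumed to have already rejected range.start >= i_size and range.len == 0, as the patched code does.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp the end of a trim range to i_size.
 * Preconditions (checked earlier in the patched function):
 * start < i_size and len != 0. */
static uint64_t trim_end_addr(uint64_t start, uint64_t len, int64_t i_size)
{
	uint64_t size = (uint64_t)i_size;

	assert(i_size > 0 && start < size && len != 0);

	/* size - start is at most INT64_MAX, so len == UINT64_MAX, that is
	 * (u64)-1, always takes this branch without a separate test. */
	if (size - start < len)
		return size;

	/* len <= size - start here, so the sum cannot wrap. */
	return start + len;
}
```

Under these assumptions the explicit `range.len == (u64)-1` comparison changes nothing in the result; whether to keep it for readability is a style decision.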

Re: [f2fs-dev] [PATCH] f2fs: change the way of handling range.len in F2FS_IOC_SEC_TRIM_FILE

2020-07-09 Thread Jaegeuk Kim

On 07/10, Daeho Jeong wrote:
> 1. The valid data will be within i_size.
> 2. All the trim operations will be done in a unit of block, even if
> i_size is not aligned with BLKSIZE like the below.
> 
> index = F2FS_BYTES_TO_BLK(range.start);
> pg_end = DIV_ROUND_UP(end_addr, F2FS_BLKSIZE); <= BLKSIZE aligned
> 
> Are you worried about the case that sudden power-off occurs while a
> file is being truncated?
> ex) 1GB file is being truncated to 4KB -> sudden power-off ->
> i_size(4KB), i_blocks(maybe somewhere between 4KB and 1GB)

Yes. Basically, I believe we can have some data beyond i_size like fsverity.

> 
> On Fri, Jul 10, 2020 at 12:02 PM, Jaegeuk Kim wrote:
> >
> > On 07/10, Daeho Jeong wrote:
> > > From: Daeho Jeong 
> > >
> > > Changed the way of handling range.len of F2FS_IOC_SEC_TRIM_FILE.
> > >  1. Added -1 value support for range.len to signify the end of file.
> > >  2. If the end of the range passes over the end of file, it means until
> > > the end of file.
> > >  3. ignored the case of that range.len is zero to prevent the function
> > > from making end_addr zero and triggering different behaviour of
> > > the function.
> > >
> > > Signed-off-by: Daeho Jeong 
> > > ---
> > >  fs/f2fs/file.c | 16 +++-
> > >  1 file changed, 7 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> > > index 368c80f8e2a1..1c4601f99326 100644
> > > --- a/fs/f2fs/file.c
> > > +++ b/fs/f2fs/file.c
> > > @@ -3813,21 +3813,19 @@ static int f2fs_sec_trim_file(struct file *filp, unsigned long arg)
> > >   file_start_write(filp);
> > >   inode_lock(inode);
> > >
> > > - if (f2fs_is_atomic_file(inode) || f2fs_compressed_file(inode)) {
> > > + if (f2fs_is_atomic_file(inode) || f2fs_compressed_file(inode) ||
> > > + range.start >= inode->i_size) {
> > >   ret = -EINVAL;
> > >   goto err;
> > >   }
> > >
> > > - if (range.start >= inode->i_size) {
> > > - ret = -EINVAL;
> > > + if (range.len == 0)
> > >   goto err;
> > > - }
> > >
> > > - if (inode->i_size - range.start < range.len) {
> > > - ret = -E2BIG;
> > > - goto err;
> > > - }
> > > - end_addr = range.start + range.len;
> > > + if (range.len == (u64)-1 || inode->i_size - range.start < range.len)
> > > + end_addr = inode->i_size;
> >
> > Hmm, what if there are blocks beyond i_size? Do we need to check i_blocks for
> > ending criteria?
> >
> > > + else
> > > + end_addr = range.start + range.len;
> > >
> > >   to_end = (end_addr == inode->i_size);
> > >   if (!IS_ALIGNED(range.start, F2FS_BLKSIZE) ||
> > > --
> > > 2.27.0.383.g050319c2ae-goog
> > >
> > >
> > >

Re: [f2fs-dev] [PATCH] f2fs: change the way of handling range.len in F2FS_IOC_SEC_TRIM_FILE

2020-07-09 Thread Daeho Jeong

1. The valid data will be within i_size.
2. All trim operations are done in units of blocks, even if
i_size is not aligned to BLKSIZE, as shown below.

index = F2FS_BYTES_TO_BLK(range.start);
pg_end = DIV_ROUND_UP(end_addr, F2FS_BLKSIZE); <= BLKSIZE aligned

Are you worried about the case where a sudden power-off occurs while a
file is being truncated?
ex) 1GB file is being truncated to 4KB -> sudden power-off ->
i_size(4KB), i_blocks(maybe somewhere between 4KB and 1GB)
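
The block arithmetic in point 2 can be sketched in userspace as follows. The 4 KiB block size and the helper names are assumptions for illustration; they stand in for F2FS_BLKSIZE, F2FS_BYTES_TO_BLK() and DIV_ROUND_UP() and are not the kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

#define BLKSIZE 4096ULL /* assumed block size for this sketch */

/* Block index that contains a byte offset (rounds down). */
static uint64_t bytes_to_blk(uint64_t bytes)
{
	return bytes / BLKSIZE;
}

/* One past the last block touched by a byte range ending at end_addr
 * (rounds up). */
static uint64_t end_blk(uint64_t end_addr)
{
	return (end_addr + BLKSIZE - 1) / BLKSIZE;
}

static void blk_example(void)
{
	/* An i_size of 5000 bytes is not block aligned, yet the trim
	 * still covers whole blocks 0 and 1. */
	assert(bytes_to_blk(0) == 0);
	assert(end_blk(5000) == 2);
}
```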

On Fri, Jul 10, 2020 at 12:02 PM, Jaegeuk Kim wrote:
>
> On 07/10, Daeho Jeong wrote:
> > From: Daeho Jeong 
> >
> > Changed the way of handling range.len of F2FS_IOC_SEC_TRIM_FILE.
> >  1. Added -1 value support for range.len to signify the end of file.
> >  2. If the end of the range passes over the end of file, it means until
> > the end of file.
> >  3. ignored the case of that range.len is zero to prevent the function
> > from making end_addr zero and triggering different behaviour of
> > the function.
> >
> > Signed-off-by: Daeho Jeong 
> > ---
> >  fs/f2fs/file.c | 16 +++-
> >  1 file changed, 7 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> > index 368c80f8e2a1..1c4601f99326 100644
> > --- a/fs/f2fs/file.c
> > +++ b/fs/f2fs/file.c
> > @@ -3813,21 +3813,19 @@ static int f2fs_sec_trim_file(struct file *filp, unsigned long arg)
> >   file_start_write(filp);
> >   inode_lock(inode);
> >
> > - if (f2fs_is_atomic_file(inode) || f2fs_compressed_file(inode)) {
> > + if (f2fs_is_atomic_file(inode) || f2fs_compressed_file(inode) ||
> > + range.start >= inode->i_size) {
> >   ret = -EINVAL;
> >   goto err;
> >   }
> >
> > - if (range.start >= inode->i_size) {
> > - ret = -EINVAL;
> > + if (range.len == 0)
> >   goto err;
> > - }
> >
> > - if (inode->i_size - range.start < range.len) {
> > - ret = -E2BIG;
> > - goto err;
> > - }
> > - end_addr = range.start + range.len;
> > + if (range.len == (u64)-1 || inode->i_size - range.start < range.len)
> > + end_addr = inode->i_size;
>
> Hmm, what if there are blocks beyond i_size? Do we need to check i_blocks for
> ending criteria?
>
> > + else
> > + end_addr = range.start + range.len;
> >
> >   to_end = (end_addr == inode->i_size);
> >   if (!IS_ALIGNED(range.start, F2FS_BLKSIZE) ||
> > --
> > 2.27.0.383.g050319c2ae-goog
> >
> >
> >

Re: [PATCH v3 6/9] drm/bridge: ti-sn65dsi86: Use 18-bit DP if we can

2020-07-09 Thread Steev Klimaszewski




On 7/9/20 10:12 PM, Steev Klimaszewski wrote:


On 7/9/20 9:14 PM, Doug Anderson wrote:

Hi,

On Thu, Jul 9, 2020 at 6:38 PM Doug Anderson wrote:

Hi,

On Thu, Jul 9, 2020 at 6:19 PM Steev Klimaszewski wrote:

Hi Doug,

I've been testing 5.8 and linux-next on the Lenovo Yoga C630, and 
with this patch applied, there is really bad banding on the display.


I'm really bad at explaining it, but you can see the differences in 
the following:


24bit (pre-5.8) - https://dev.gentoo.org/~steev/files/image0.jpg

18bit (5.8/linux-next) - 
https://dev.gentoo.org/~steev/files/image1.jpg

Presumably this means that your panel is defined improperly? If the
panel reports that it's a 6 bits per pixel panel but it's actually an
8 bits per pixel panel then you'll run into this problem.

I would have to assume you have a bunch of out of tree patches to
support your hardware since I don't see any device trees in linux-next
(other than cheza) that use this bridge chip.  Otherwise I could try
to check and confirm that was the problem.

Ah, interesting.  Maybe you have the panel:

boe,nv133fhm-n61

As far as I can tell from the datasheet (I have the similar
boe,nv133fhm-n62) this is a 6bpp panel.  ...but if you feed it 8bpp
the banding goes away!  Maybe the panel itself knows how to dither???
...or maybe the datasheet / edid are wrong and this is actually an
8bpp panel.  Seems unlikely...

In any case, one fix is to pick
, 


though right now that patch is only enabled for sc7180.  Maybe you
could figure out how to apply it to your hardware?

...another fix would be to pretend that your panel is 8bpp even though
it's actually 6bpp.  Ironically if anyone ever tried to configure BPP
from the EDID they'd go back to 6bpp.  You can read the EDID of your
panel with this:

bus=$(i2cdetect -l | grep sn65 | sed 's/i2c-\([0-9]*\).*$/\1/')
i2cdump ${bus} 0x50 i

When I do that and then decode it on the "boe,nv133fhm-n62" panel, I 
find:


6 bits per primary color channel

-Doug



Hi Doug,

Decoding it does show it to be boe,nv133fhm-n61 - and yeah, it does say
it's 6-bit according to panelook's specs for it.



I'll take a look at the patch and see what I can come up with... at 
the moment, I'm forcing it to be 8bit and that does "work fine" but 
I'd like it to be fixed properly instead of my hack.


Thanks for your time and work!

-- Steev

For what it's worth - the 5.8 that I'm testing is at 
https://github.com/steev/linux/commits/c630-5.8-rc4-inline-encryption
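
For reference, the "6 bits per primary color channel" line Doug quotes comes from the Video Input Definition byte of the EDID. The following is a minimal sketch of that decode, assuming an EDID 1.4 base block with a digital input (byte 20, bit 7 set, bits 6:4 holding the depth code). The function name is hypothetical and no checksum or version validation is done.

```c
#include <assert.h>
#include <stdint.h>

/* Return bits per primary color from EDID byte 20, or 0 if the input is
 * analog or the depth is undefined/reserved. Assumes EDID 1.4 encoding. */
static int edid_bits_per_color(uint8_t video_input_def)
{
	unsigned int code;

	if (!(video_input_def & 0x80))
		return 0; /* analog input: no depth field */

	code = (video_input_def >> 4) & 0x7;
	if (code == 0 || code == 7)
		return 0; /* undefined or reserved */

	/* Codes 1..6 map to 6, 8, 10, 12, 14, 16 bits. */
	return 4 + 2 * (int)code;
}

static void edid_example(void)
{
	assert(edid_bits_per_color(0x95) == 6); /* digital, depth code 001 */
	assert(edid_bits_per_color(0xA5) == 8); /* digital, depth code 010 */
}
```

A driver that derives bpc from this field would pick 6 for such a panel, which is the behaviour the 18-bit patch relies on.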

[PATCH 1/2] mm/memory_hotplug: introduce default dummy memory_add_physaddr_to_nid()

2020-07-09 Thread Jia He

This is to introduce a general dummy helper. memory_add_physaddr_to_nid()
is a fallback option to get the nid in case NUMA_NO_NODE is detected.

After this patch, arm64/sh/s390 can simply use the general dummy version.
PowerPC/x86/ia64 will still use their specific version.

This is the preparation to set a fallback value for dev_dax->target_node.

Reviewed-by: David Hildenbrand 
Signed-off-by: Jia He 
---
 arch/arm64/mm/numa.c | 10 --
 arch/ia64/mm/numa.c  |  2 --
 arch/sh/mm/init.c|  9 -
 arch/x86/mm/numa.c   |  1 -
 mm/memory_hotplug.c  | 10 ++
 5 files changed, 10 insertions(+), 22 deletions(-)

diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
index aafcee3e3f7e..73f8b49d485c 100644
--- a/arch/arm64/mm/numa.c
+++ b/arch/arm64/mm/numa.c
@@ -461,13 +461,3 @@ void __init arm64_numa_init(void)
 
numa_init(dummy_numa_init);
 }
-
-/*
- * We hope that we will be hotplugging memory on nodes we already know about,
- * such that acpi_get_node() succeeds and we never fall back to this...
- */
-int memory_add_physaddr_to_nid(u64 addr)
-{
-   pr_warn("Unknown node for memory at 0x%llx, assuming node 0\n", addr);
-   return 0;
-}
diff --git a/arch/ia64/mm/numa.c b/arch/ia64/mm/numa.c
index 5e1015eb6d0d..f34964271101 100644
--- a/arch/ia64/mm/numa.c
+++ b/arch/ia64/mm/numa.c
@@ -106,7 +106,5 @@ int memory_add_physaddr_to_nid(u64 addr)
return 0;
return nid;
 }
-
-EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid);
 #endif
 #endif
diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index a70ba0fdd0b3..f75932ba87a6 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -430,15 +430,6 @@ int arch_add_memory(int nid, u64 start, u64 size,
return ret;
 }
 
-#ifdef CONFIG_NUMA
-int memory_add_physaddr_to_nid(u64 addr)
-{
-   /* Node 0 for now.. */
-   return 0;
-}
-EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid);
-#endif
-
 void arch_remove_memory(int nid, u64 start, u64 size,
struct vmem_altmap *altmap)
 {
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 8ee952038c80..2a6e62af4636 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -929,5 +929,4 @@ int memory_add_physaddr_to_nid(u64 start)
nid = numa_meminfo.blk[0].nid;
return nid;
 }
-EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid);
 #endif
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index da374cd3d45b..b49ab743d914 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -350,6 +350,16 @@ int __ref __add_pages(int nid, unsigned long pfn, unsigned long nr_pages,
return err;
 }
 
+#ifdef CONFIG_NUMA
+int __weak memory_add_physaddr_to_nid(u64 start)
+{
+   pr_info_once("Unknown target node for memory at 0x%llx, assuming node 0\n",
+   start);
+   return 0;
+}
+EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid);
+#endif
+
 /* find the smallest valid pfn in the range [start_pfn, end_pfn) */
 static unsigned long find_smallest_section_pfn(int nid, struct zone *zone,
 unsigned long start_pfn,
-- 
2.17.1
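
The __weak annotation is what lets arm64/sh/s390 drop their copies while x86, ia64 and PowerPC keep theirs: a strong definition in another object file overrides the weak default at link time. Below is a minimal userspace sketch, assuming GCC or Clang (`__attribute__((weak))`) and using a hypothetical function name so that it does not pretend to be the kernel symbol.

```c
#include <assert.h>
#include <stdio.h>

/* Weak default: used only if no other object file provides a strong
 * definition of the same symbol. */
__attribute__((weak)) int demo_physaddr_to_nid(unsigned long long start)
{
	fprintf(stderr,
		"Unknown target node for memory at 0x%llx, assuming node 0\n",
		start);
	return 0;
}

static void weak_default_example(void)
{
	/* No strong definition exists in this sketch, so the default runs. */
	assert(demo_physaddr_to_nid(0x100000000ULL) == 0);
}
```

An architecture that knows better would simply define the same function without the weak attribute; no #ifdef is needed at the call sites.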

[PATCH 2/2] mm/memory_hotplug: fix unpaired mem_hotplug_begin/done

2020-07-09 Thread Jia He

When check_memblock_offlined_cb() returns a failing rc (e.g. the memblock is
online at that time), mem_hotplug_begin/done is unpaired.

This triggers a warning:
 Call Trace:
  percpu_up_write+0x33/0x40
  try_remove_memory+0x66/0x120
  ? _cond_resched+0x19/0x30
  remove_memory+0x2b/0x40
  dev_dax_kmem_remove+0x36/0x72 [kmem]
  device_release_driver_internal+0xf0/0x1c0
  device_release_driver+0x12/0x20
  bus_remove_device+0xe1/0x150
  device_del+0x17b/0x3e0
  unregister_dev_dax+0x29/0x60
  devm_action_release+0x15/0x20
  release_nodes+0x19a/0x1e0
  devres_release_all+0x3f/0x50
  device_release_driver_internal+0x100/0x1c0
  driver_detach+0x4c/0x8f
  bus_remove_driver+0x5c/0xd0
  driver_unregister+0x31/0x50
  dax_pmem_exit+0x10/0xfe0 [dax_pmem]

Fixes: f1037ec0cc8a ("mm/memory_hotplug: fix remove_memory() lockdep splat")
Cc: sta...@vger.kernel.org # v5.6+
Signed-off-by: Jia He 
Reviewed-by: David Hildenbrand 
Acked-by: Michal Hocko 
Acked-by: Dan Williams 
---
 mm/memory_hotplug.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b49ab743d914..3e0645387daf 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1752,7 +1752,7 @@ static int __ref try_remove_memory(int nid, u64 start, u64 size)
 */
rc = walk_memory_blocks(start, size, NULL, check_memblock_offlined_cb);
if (rc)
-   goto done;
+   return rc;
 
/* remove memmap entry */
firmware_map_remove(start, start + size, "System RAM");
@@ -1776,9 +1776,8 @@ static int __ref try_remove_memory(int nid, u64 start, u64 size)
 
try_offline_node(nid);
 
-done:
mem_hotplug_done();
-   return rc;
+   return 0;
 }
 
 /**
-- 
2.17.1
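
The imbalance can be modelled in userspace with a counter standing in for the hotplug lock. This is a sketch under the assumption, implied by the fix, that the offline check runs before mem_hotplug_begin() is called; all names below are hypothetical and the error value is arbitrary.

```c
#include <assert.h>

static int lock_depth; /* +1 on begin, -1 on done; 0 means balanced */

static void hotplug_begin(void) { lock_depth++; }
static void hotplug_done(void)  { lock_depth--; }

/* Old flow: a failed check jumps to the label that drops the lock,
 * although the lock has not been taken yet. */
static int remove_old(int check_rc)
{
	int rc = check_rc;

	if (rc)
		goto done;

	hotplug_begin();
	/* ... removal work would go here ... */
done:
	hotplug_done();
	return rc;
}

/* Fixed flow: return before the lock is ever taken. */
static int remove_fixed(int check_rc)
{
	if (check_rc)
		return check_rc;

	hotplug_begin();
	/* ... removal work would go here ... */
	hotplug_done();
	return 0;
}

static void pairing_example(void)
{
	lock_depth = 0;
	assert(remove_old(-16) == -16 && lock_depth == -1);  /* unpaired done */

	lock_depth = 0;
	assert(remove_fixed(-16) == -16 && lock_depth == 0); /* balanced */
}
```

The negative depth in the old flow corresponds to the percpu_up_write() warning in the call trace: the write side is released without having been acquired.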

[PATCH v4 0/2] Fix and enable pmem as RAM device on arm64

2020-07-09 Thread Jia He

This fixes a few issues I hit when trying to enable pmem as a RAM device on arm64.

To use memory_add_physaddr_to_nid as a fallback nid, it would be better to
implement a general version (__weak) in mm/memory_hotplug. After that, arm64/
sh/s390 can simply use the general version, and PowerPC/ia64/x86 will use
their arch-specific versions.

Tested on ThunderX2 host/qemu "-M virt" guest with a nvdimm device. The
memblocks from the dax pmem device can be either hot-added or hot-removed
on arm64 guest. Also passed the compilation test on x86.

Changes:
v4: - remove "device-dax: use fallback nid when numa_node is invalid", wait
  for Dan Williams' phys_addr_to_target_node() patch
- fold v3 patches 1-4 into a single one, no functional changes
v3: https://lkml.org/lkml/2020/7/8/1541
- introduce general version memory_add_physaddr_to_nid, refine the arch
  specific one
- fix an uninitialization bug in v2 device-dax patch
v2: https://lkml.org/lkml/2020/7/7/71
- Drop unnecessary patch to harden try_offline_node
- Use new solution(by David) to fix dev->target_node=-1 during probing
- Refine the mem_hotplug_begin/done patch

v1: https://lkml.org/lkml/2020/7/5/381

Jia He (2):
  mm/memory_hotplug: introduce default dummy
memory_add_physaddr_to_nid()
  mm/memory_hotplug: fix unpaired mem_hotplug_begin/done

 arch/arm64/mm/numa.c | 10 --
 arch/ia64/mm/numa.c  |  2 --
 arch/sh/mm/init.c|  9 -
 arch/x86/mm/numa.c   |  1 -
 mm/memory_hotplug.c  | 15 ---
 5 files changed, 12 insertions(+), 25 deletions(-)

-- 
2.17.1

Re: [PATCH] usb: xhci-mtk: fix the failure of bandwidth allocation

2020-07-09 Thread Nicolas Boichat

On Fri, Jul 10, 2020 at 10:30 AM Chunfeng Yun  wrote:
>
> The wMaxPacketSize field of endpoint descriptor may be zero
> as default value in alternate interface, and they are not
> actually selected when start stream, so skip them when try to
> allocate bandwidth.
>
> Cc: stable 
> Signed-off-by: Chunfeng Yun 

Add this?
Fixes: 0cbd4b34cda9dfd ("xhci: mediatek: support MTK xHCI host controller")

> ---
>  drivers/usb/host/xhci-mtk-sch.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/usb/host/xhci-mtk-sch.c b/drivers/usb/host/xhci-mtk-sch.c
> index fea..45c54d56 100644
> --- a/drivers/usb/host/xhci-mtk-sch.c
> +++ b/drivers/usb/host/xhci-mtk-sch.c
> @@ -557,6 +557,10 @@ static bool need_bw_sch(struct usb_host_endpoint *ep,
> if (is_fs_or_ls(speed) && !has_tt)
> return false;
>
> +   /* skip endpoint with zero maxpkt */
> +   if (usb_endpoint_maxp(&ep->desc) == 0)
> +   return false;
> +
> return true;
>  }
>
> --
> 1.9.1
> ___
> Linux-mediatek mailing list
> linux-media...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-mediatek
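
As a sketch of what the new check tests: as far as I know, usb_endpoint_maxp() in kernels of this era returns the low 11 bits of wMaxPacketSize, so an alternate setting whose descriptor carries 0 there asks for no bandwidth. The userspace helpers below are hypothetical, take the field as a host-endian value, and leave out the speed and TT checks of need_bw_sch().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAXP_MASK 0x07ff /* bits 10:0: max packet size in bytes */

static unsigned int endpoint_maxp(uint16_t w_max_packet_size)
{
	return w_max_packet_size & MAXP_MASK;
}

/* Sketch of the added test only. */
static bool wants_bandwidth(uint16_t w_max_packet_size)
{
	return endpoint_maxp(w_max_packet_size) != 0;
}

static void maxp_example(void)
{
	assert(!wants_bandwidth(0));            /* zero maxpkt: skipped */
	assert(wants_bandwidth(1024));
	assert(endpoint_maxp(0x1400) == 0x400); /* bits above 10 are ignored */
}
```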

Re: [f2fs-dev] [PATCH] f2fs: don't skip writeback of quota data

2020-07-09 Thread Chao Yu

On 2020/7/10 3:05, Jaegeuk Kim wrote:
> On 07/09, Chao Yu wrote:
>> On 2020/7/9 13:30, Jaegeuk Kim wrote:
>>> It doesn't need to bypass flushing quota data in background.
>>
>> The condition is used to flush quota data in batch to avoid random
>> small-sized updates. Did you hit any problem here?
> 
> I suspect this causes the fault injection test to get stuck waiting for inode
> writeback completion. With this patch, it has been running w/o any issue so far.
> I'll keep an eye on this.

Hmmm.. so this patch may not fix the root cause, and it may hide the
issue deeper.

How about just keeping this patch in our private branch so that the fault injection
test does not get stuck, until we find the root cause in upstream code?

Thanks,

> 
> Thanks,
> 
>>
>> Thanks,
>>
>>>
>>> Signed-off-by: Jaegeuk Kim 
>>> ---
>>>  fs/f2fs/data.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>> index 44645f4f914b6..72e8b50e588c1 100644
>>> --- a/fs/f2fs/data.c
>>> +++ b/fs/f2fs/data.c
>>> @@ -3148,7 +3148,7 @@ static int __f2fs_write_data_pages(struct address_space *mapping,
>>> if (unlikely(is_sbi_flag_set(sbi, SBI_POR_DOING)))
>>> goto skip_write;
>>>  
>>> -   if ((S_ISDIR(inode->i_mode) || IS_NOQUOTA(inode)) &&
>>> +   if (S_ISDIR(inode->i_mode) &&
>>> wbc->sync_mode == WB_SYNC_NONE &&
>>> get_dirty_pages(inode) < nr_pages_to_skip(sbi, DATA) &&
>>> f2fs_available_free_memory(sbi, DIRTY_DENTS))
>>>
> .
>

Re: [PATCH v4] mm/hugetlb: avoid hardcoding while checking if cma is enabled

2020-07-09 Thread Mike Kravetz

On 7/9/20 5:57 PM, Barry Song wrote:
> hugetlb_cma[0] can be NULL due to various reasons, for example, node0 has
> no memory. so NULL hugetlb_cma[0] doesn't necessarily mean cma is not
> enabled. gigantic pages might have been reserved on other nodes.
> This patch fixes possible double reservation and CMA leak.
> 
> Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
> Cc: Mike Kravetz 
> Cc: Jonathan Cameron 
> Acked-by: Roman Gushchin 
> Signed-off-by: Barry Song 

Thank you!

Reviewed-by: Mike Kravetz 

-- 
Mike Kravetz
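
The point of the fix is that one empty slot says nothing about the others. Below is a minimal sketch of an any-node check, with a hypothetical fixed-size array standing in for hugetlb_cma[]. The actual patch may implement the test differently; this only illustrates why looking at index 0 alone is not enough.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 4 /* hypothetical node count for this sketch */

/* True if any node has a CMA area, regardless of whether node 0 does. */
static bool cma_reserved_on_any_node(void *const areas[], size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (areas[i])
			return true;
	}
	return false;
}

static void cma_example(void)
{
	int dummy;
	void *areas[MAX_NODES] = { NULL, &dummy, NULL, NULL };

	/* Node 0 is empty, yet CMA is in use on node 1. */
	assert(areas[0] == NULL);
	assert(cma_reserved_on_any_node(areas, MAX_NODES));
}
```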
