Re: [PATCH v2] iommu/iova: silence warnings under memory pressure

2019-11-22 Thread Joe Perches
On Fri, 2019-11-22 at 11:46 -0500, Qian Cai wrote:
> On Fri, 2019-11-22 at 08:28 -0800, Joe Perches wrote:
> > On Fri, 2019-11-22 at 09:59 -0500, Qian Cai wrote:
> > > On Thu, 2019-11-21 at 20:37 -0800, Joe Perches wrote:
> > > > On Thu, 2019-11-21 at 21:55 -0500, Qian Cai wrote:
> > > > > When running heavy memory pressure workloads, this 5+ years old system
> > > > > is throwing endless warnings below because disk IO is too slow to
> > > > > recover from swapping. Since the volume from alloc_iova_fast() could be
> > > > > large, once it calls printk(), it will trigger disk IO (writing to the
> > > > > log files) and pending softirqs, which could cause an infinite loop
> > > > > where the ongoing memory reclaim makes no progress for days. This is
> > > > > the counterpart for Intel; the AMD part has already been merged, see
> > > > > commit 3d708895325b ("iommu/amd: Silence warnings under memory
> > > > > pressure"). Since the allocation failure will be reported in
> > > > > intel_alloc_iova(), just call printk_ratelimited() there and silence
> > > > > the one in alloc_iova_mem() to avoid the expensive warn_alloc().
> > > > 
> > > > []
> > > > > v2: use dev_err_ratelimited() and improve the commit messages.
> > > > 
> > > > []
> > > > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > > > 
> > > > []
> > > > > @@ -3401,7 +3401,8 @@ static unsigned long intel_alloc_iova(struct device *dev,
> > > > >   iova_pfn = alloc_iova_fast(&domain->iovad, nrpages,
> > > > >  IOVA_PFN(dma_mask), true);
> > > > >   if (unlikely(!iova_pfn)) {
> > > > > - dev_err(dev, "Allocating %ld-page iova failed", nrpages);
> > > > > + dev_err_ratelimited(dev, "Allocating %ld-page iova failed",
> > > > > + nrpages);
> > > > 
> > > > Trivia:
> > > > 
> > > > This should really have a \n termination on the format string
> > > > 
> > > > dev_err_ratelimited(dev, "Allocating %ld-page iova failed\n",
> > > > 
> > > > 
> > > 
> > > Why do you say so? It is right now printing with a newline added anyway.
> > > 
> > >  hpsa :03:00.0: DMAR: Allocating 1-page iova failed
> > 
> > If another process uses pr_cont at the same time,
> > it can be interleaved.
> 
> I lean towards fixing that in a separate patch if ever needed, as the
> original dev_err() did not include a "\n" either.

Your choice.

I wrote "trivia:", but touching the same line multiple times
is relatively pointless.
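
For reference, folding the suggested "\n" termination into the hunk quoted
above would give roughly the following (a sketch only, reusing the context
lines from the quoted diff):

        iova_pfn = alloc_iova_fast(&domain->iovad, nrpages,
                                   IOVA_PFN(dma_mask), true);
        if (unlikely(!iova_pfn)) {
                /* rate-limited, and '\n'-terminated so a concurrent
                 * pr_cont() from another context cannot be appended
                 * to this record
                 */
                dev_err_ratelimited(dev, "Allocating %ld-page iova failed\n",
                                    nrpages);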





Re: [PATCH v2] iommu/iova: silence warnings under memory pressure

2019-11-22 Thread Qian Cai
On Fri, 2019-11-22 at 08:28 -0800, Joe Perches wrote:
> On Fri, 2019-11-22 at 09:59 -0500, Qian Cai wrote:
> > On Thu, 2019-11-21 at 20:37 -0800, Joe Perches wrote:
> > > On Thu, 2019-11-21 at 21:55 -0500, Qian Cai wrote:
> > > > When running heavy memory pressure workloads, this 5+ years old system
> > > > is throwing endless warnings below because disk IO is too slow to
> > > > recover from swapping. Since the volume from alloc_iova_fast() could be
> > > > large, once it calls printk(), it will trigger disk IO (writing to the
> > > > log files) and pending softirqs, which could cause an infinite loop
> > > > where the ongoing memory reclaim makes no progress for days. This is
> > > > the counterpart for Intel; the AMD part has already been merged, see
> > > > commit 3d708895325b ("iommu/amd: Silence warnings under memory
> > > > pressure"). Since the allocation failure will be reported in
> > > > intel_alloc_iova(), just call printk_ratelimited() there and silence
> > > > the one in alloc_iova_mem() to avoid the expensive warn_alloc().
> > > 
> > > []
> > > > v2: use dev_err_ratelimited() and improve the commit messages.
> > > 
> > > []
> > > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > > 
> > > []
> > > > @@ -3401,7 +3401,8 @@ static unsigned long intel_alloc_iova(struct device *dev,
> > > > iova_pfn = alloc_iova_fast(&domain->iovad, nrpages,
> > > >IOVA_PFN(dma_mask), true);
> > > > if (unlikely(!iova_pfn)) {
> > > > -   dev_err(dev, "Allocating %ld-page iova failed", nrpages);
> > > > +   dev_err_ratelimited(dev, "Allocating %ld-page iova failed",
> > > > +   nrpages);
> > > 
> > > Trivia:
> > > 
> > > This should really have a \n termination on the format string
> > > 
> > >   dev_err_ratelimited(dev, "Allocating %ld-page iova failed\n",
> > > 
> > > 
> > 
> > Why do you say so? It is right now printing with a newline added anyway.
> > 
> >  hpsa :03:00.0: DMAR: Allocating 1-page iova failed
> 
> If another process uses pr_cont at the same time,
> it can be interleaved.

I lean towards fixing that in a separate patch if ever needed, as the
original dev_err() did not include a "\n" either.


Re: [PATCH v2] iommu/iova: silence warnings under memory pressure

2019-11-22 Thread Joe Perches
On Fri, 2019-11-22 at 09:59 -0500, Qian Cai wrote:
> On Thu, 2019-11-21 at 20:37 -0800, Joe Perches wrote:
> > On Thu, 2019-11-21 at 21:55 -0500, Qian Cai wrote:
> > > When running heavy memory pressure workloads, this 5+ years old system
> > > is throwing endless warnings below because disk IO is too slow to
> > > recover from swapping. Since the volume from alloc_iova_fast() could be
> > > large, once it calls printk(), it will trigger disk IO (writing to the
> > > log files) and pending softirqs, which could cause an infinite loop
> > > where the ongoing memory reclaim makes no progress for days. This is
> > > the counterpart for Intel; the AMD part has already been merged, see
> > > commit 3d708895325b ("iommu/amd: Silence warnings under memory
> > > pressure"). Since the allocation failure will be reported in
> > > intel_alloc_iova(), just call printk_ratelimited() there and silence
> > > the one in alloc_iova_mem() to avoid the expensive warn_alloc().
> > 
> > []
> > > v2: use dev_err_ratelimited() and improve the commit messages.
> > 
> > []
> > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > 
> > []
> > > @@ -3401,7 +3401,8 @@ static unsigned long intel_alloc_iova(struct device *dev,
> > >   iova_pfn = alloc_iova_fast(&domain->iovad, nrpages,
> > >  IOVA_PFN(dma_mask), true);
> > >   if (unlikely(!iova_pfn)) {
> > > - dev_err(dev, "Allocating %ld-page iova failed", nrpages);
> > > + dev_err_ratelimited(dev, "Allocating %ld-page iova failed",
> > > + nrpages);
> > 
> > Trivia:
> > 
> > This should really have a \n termination on the format string
> > 
> > dev_err_ratelimited(dev, "Allocating %ld-page iova failed\n",
> > 
> > 
> 
> Why do you say so? It is right now printing with a newline added anyway.
> 
>  hpsa :03:00.0: DMAR: Allocating 1-page iova failed

If another process uses pr_cont at the same time,
it can be interleaved.
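
A hypothetical illustration of that concern (the calls and the continuation
text below are made up, not taken from the driver):

        /* context A: format string is not '\n'-terminated */
        dev_err_ratelimited(dev, "Allocating %ld-page iova failed", nrpages);

        /* context B: continuing its own earlier message */
        pr_cont(" done\n");

        /*
         * The fragments may come out merged on one log line, e.g.
         * "... DMAR: Allocating 1-page iova failed done", whereas a
         * trailing '\n' closes the record immediately.
         */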




Re: [PATCH v2] iommu/iova: silence warnings under memory pressure

2019-11-22 Thread Qian Cai
On Thu, 2019-11-21 at 20:37 -0800, Joe Perches wrote:
> On Thu, 2019-11-21 at 21:55 -0500, Qian Cai wrote:
> > When running heavy memory pressure workloads, this 5+ years old system
> > is throwing endless warnings below because disk IO is too slow to
> > recover from swapping. Since the volume from alloc_iova_fast() could be
> > large, once it calls printk(), it will trigger disk IO (writing to the
> > log files) and pending softirqs, which could cause an infinite loop
> > where the ongoing memory reclaim makes no progress for days. This is
> > the counterpart for Intel; the AMD part has already been merged, see
> > commit 3d708895325b ("iommu/amd: Silence warnings under memory
> > pressure"). Since the allocation failure will be reported in
> > intel_alloc_iova(), just call printk_ratelimited() there and silence
> > the one in alloc_iova_mem() to avoid the expensive warn_alloc().
> 
> []
> > v2: use dev_err_ratelimited() and improve the commit messages.
> 
> []
> > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> 
> []
> > @@ -3401,7 +3401,8 @@ static unsigned long intel_alloc_iova(struct device *dev,
> > iova_pfn = alloc_iova_fast(&domain->iovad, nrpages,
> >IOVA_PFN(dma_mask), true);
> > if (unlikely(!iova_pfn)) {
> > -   dev_err(dev, "Allocating %ld-page iova failed", nrpages);
> > +   dev_err_ratelimited(dev, "Allocating %ld-page iova failed",
> > +   nrpages);
> 
> Trivia:
> 
> This should really have a \n termination on the format string
> 
>   dev_err_ratelimited(dev, "Allocating %ld-page iova failed\n",
> 
> 

Why do you say so? It is right now printing with a newline added anyway.

 hpsa :03:00.0: DMAR: Allocating 1-page iova failed
 hpsa :03:00.0: DMAR: Allocating 1-page iova failed
 hpsa :03:00.0: DMAR: Allocating 1-page iova failed
 hpsa :03:00.0: DMAR: Allocating 1-page iova failed
 hpsa :03:00.0: DMAR: Allocating 1-page iova failed
 hpsa :03:00.0: DMAR: Allocating 1-page iova failed
 hpsa :03:00.0: DMAR: Allocating 1-page iova failed
 hpsa :03:00.0: DMAR: Allocating 1-page iova failed


Re: [PATCH v2] iommu/iova: silence warnings under memory pressure

2019-11-21 Thread Joe Perches
On Thu, 2019-11-21 at 21:55 -0500, Qian Cai wrote:
> When running heavy memory pressure workloads, this 5+ years old system
> is throwing endless warnings below because disk IO is too slow to
> recover from swapping. Since the volume from alloc_iova_fast() could be
> large, once it calls printk(), it will trigger disk IO (writing to the
> log files) and pending softirqs, which could cause an infinite loop
> where the ongoing memory reclaim makes no progress for days. This is
> the counterpart for Intel; the AMD part has already been merged, see
> commit 3d708895325b ("iommu/amd: Silence warnings under memory
> pressure"). Since the allocation failure will be reported in
> intel_alloc_iova(), just call printk_ratelimited() there and silence
> the one in alloc_iova_mem() to avoid the expensive warn_alloc().
[]
> v2: use dev_err_ratelimited() and improve the commit messages.
[]
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
[]
> @@ -3401,7 +3401,8 @@ static unsigned long intel_alloc_iova(struct device *dev,
>   iova_pfn = alloc_iova_fast(&domain->iovad, nrpages,
>  IOVA_PFN(dma_mask), true);
>   if (unlikely(!iova_pfn)) {
> - dev_err(dev, "Allocating %ld-page iova failed", nrpages);
> + dev_err_ratelimited(dev, "Allocating %ld-page iova failed",
> + nrpages);

Trivia:

This should really have a \n termination on the format string

dev_err_ratelimited(dev, "Allocating %ld-page iova failed\n",




[PATCH v2] iommu/iova: silence warnings under memory pressure

2019-11-21 Thread Qian Cai
When running heavy memory pressure workloads, this 5+ years old system
is throwing endless warnings below because disk IO is too slow to
recover from swapping. Since the volume from alloc_iova_fast() could be
large, once it calls printk(), it will trigger disk IO (writing to the
log files) and pending softirqs, which could cause an infinite loop
where the ongoing memory reclaim makes no progress for days. This is
the counterpart for Intel; the AMD part has already been merged, see
commit 3d708895325b ("iommu/amd: Silence warnings under memory
pressure"). Since the allocation failure will be reported in
intel_alloc_iova(), just call printk_ratelimited() there and silence
the one in alloc_iova_mem() to avoid the expensive warn_alloc().
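
(The iova.c half of the change is not quoted in this excerpt. Silencing
warn_alloc() for that allocation normally means adding __GFP_NOWARN to the
GFP flags, roughly along the lines of the sketch below; the exact hunk and
the current definition of alloc_iova_mem() are assumed, not reproduced:)

        struct iova *alloc_iova_mem(void)
        {
                /* __GFP_NOWARN suppresses the expensive warn_alloc()
                 * report; the failure is reported, rate-limited, by
                 * intel_alloc_iova() instead.
                 */
                return kmem_cache_zalloc(iova_cache, GFP_ATOMIC | __GFP_NOWARN);
        }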

 hpsa :03:00.0: DMAR: Allocating 1-page iova failed
 hpsa :03:00.0: DMAR: Allocating 1-page iova failed
 hpsa :03:00.0: DMAR: Allocating 1-page iova failed
 hpsa :03:00.0: DMAR: Allocating 1-page iova failed
 hpsa :03:00.0: DMAR: Allocating 1-page iova failed
 hpsa :03:00.0: DMAR: Allocating 1-page iova failed
 hpsa :03:00.0: DMAR: Allocating 1-page iova failed
 hpsa :03:00.0: DMAR: Allocating 1-page iova failed
 slab_out_of_memory: 66 callbacks suppressed
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   cache: iommu_iova, object size: 40, buffer size: 448, default order: 0, min order: 0
   node 0: slabs: 1822, objs: 16398, free: 0
   node 1: slabs: 2051, objs: 18459, free: 31
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   cache: iommu_iova, object size: 40, buffer size: 448, default order: 0, min order: 0
   node 0: slabs: 1822, objs: 16398, free: 0
   node 1: slabs: 2051, objs: 18459, free: 31
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   cache: iommu_iova, object size: 40, buffer size: 448, default order: 0, min order: 0
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default order: 0, min order: 0
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default order: 0, min order: 0
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default order: 0, min order: 0
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default order: 0, min order: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 1: slabs: 381, objs: 2286, free: 27
   node 1: slabs: 381, objs: 2286, free: 27
   node 1: slabs: 381, objs: 2286, free: 27
   node 1: slabs: 381, objs: 2286, free: 27
   node 0: slabs: 1822, objs: 16398, free: 0
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default order: 0, min order: 0
   node 1: slabs: 2051, objs: 18459, free: 31
   node 0: slabs: 697, objs: 4182, free: 0
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   node 1: slabs: 381, objs: 2286, free: 27
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default order: 0, min order: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 1: slabs: 381, objs: 2286, free: 27
 hpsa :03:00.0: DMAR: Allocating 1-page iova failed
 warn_alloc: 96 callbacks suppressed
 kworker/11:1H: page allocation failure: order:0, mode:0xa20(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0-1
 CPU: 11 PID: 1642 Comm: kworker/11:1H Tainted: GB
 Hardware name: HP ProLiant XL420 Gen9/ProLiant XL420 Gen9, BIOS U19 12/27/2015
 Workqueue: kblockd blk_mq_run_work_fn
 Call Trace:
  dump_stack+0xa0/0xea
  warn_alloc.cold.94+0x8a/0x12d
  __alloc_pages_slowpath+0x1750/0x1870
  __alloc_pages_nodemask+0x58a/0x710
  alloc_pages_current+0x9c/0x110
  alloc_slab_page+0xc9/0x760
  allocate_slab+0x48f/0x5d0
  new_slab+0x46/0x70
  ___slab_alloc+0x4ab/0x7b0
  __slab_alloc+0x43/0x70
  kmem_cache_alloc+0x2dd/0x450
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
  alloc_iova+0x33/0x210
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default order: 0, min order: 0
   node 0: slabs: 697, objs: 4182, free: 0
  alloc_iova_fast+0x62/0x3d1
   node 1: slabs: 381, objs: 2286, free: 27
  intel_alloc_iova+0xce/0xe0
  intel_map_sg+0xed/0x410
  scsi_dma_map+0xd7/0x160
  scsi_queue_rq+0xbf7/0x1310
  blk_mq_dispatch_rq_list+0x4d9/0xbc0
  blk_mq_sched_dispatch_requests+0x24a/0x300
  __blk_mq_run_hw_queue+0x156/0x230
  blk_mq_run_work_fn+0x3b/0x40
  process_one_work+0x579/0xb90
  worker_thread+0x63/0x5b0
  kthread+0x1e6/0x210
  ret_from_fork+0x3a/0x50
 Mem-Info:
 active_anon:2422723 inactive_anon:361971 isolated_anon:34403
  active_file:2285 inactive_file:1838 isolated_file:0