[PATCH v6 2/4] resource: Use list_head to link sibling resource

2018-07-04 Thread Baoquan He
The struct resource uses singly linked list to link siblings, implemented
by pointer operation. Replace it with list_head for better code readability.

Based on this list_head replacement, it will be very easy to do reverse
iteration on iomem_resource's sibling list in later patch.

Besides, type of member variables of struct resource, sibling and child, are
changed from 'struct resource *' to 'struct list_head'. This brings two
pointers of size increase.

Suggested-by: Andrew Morton 
Signed-off-by: Baoquan He 
Cc: Patrik Jakobsson 
Cc: David Airlie 
Cc: "K. Y. Srinivasan" 
Cc: Haiyang Zhang 
Cc: Stephen Hemminger 
Cc: Dmitry Torokhov 
Cc: Dan Williams 
Cc: Rob Herring 
Cc: Frank Rowand 
Cc: Keith Busch 
Cc: Jonathan Derrick 
Cc: Lorenzo Pieralisi 
Cc: Bjorn Helgaas 
Cc: Thomas Gleixner 
Cc: Brijesh Singh 
Cc: "Jérôme Glisse" 
Cc: Borislav Petkov 
Cc: Tom Lendacky 
Cc: Greg Kroah-Hartman 
Cc: Yaowei Bai 
Cc: Wei Yang 
Cc: de...@linuxdriverproject.org
Cc: linux-in...@vger.kernel.org
Cc: linux-nvd...@lists.01.org
Cc: devicet...@vger.kernel.org
Cc: linux-...@vger.kernel.org
---
 arch/arm/plat-samsung/pm-check.c|   6 +-
 arch/microblaze/pci/pci-common.c|   4 +-
 arch/powerpc/kernel/pci-common.c|   4 +-
 arch/sparc/kernel/ioport.c  |   2 +-
 arch/xtensa/include/asm/pci-bridge.h|   4 +-
 drivers/eisa/eisa-bus.c |   2 +
 drivers/gpu/drm/drm_memory.c|   3 +-
 drivers/gpu/drm/gma500/gtt.c|   5 +-
 drivers/hv/vmbus_drv.c  |  52 +++
 drivers/input/joystick/iforce/iforce-main.c |   4 +-
 drivers/nvdimm/namespace_devs.c |   6 +-
 drivers/nvdimm/nd.h |   5 +-
 drivers/of/address.c|   4 +-
 drivers/parisc/lba_pci.c|   4 +-
 drivers/pci/controller/vmd.c|   8 +-
 drivers/pci/probe.c |   2 +
 drivers/pci/setup-bus.c |   2 +-
 include/linux/ioport.h  |  17 ++-
 kernel/resource.c   | 208 ++--
 19 files changed, 175 insertions(+), 167 deletions(-)

diff --git a/arch/arm/plat-samsung/pm-check.c b/arch/arm/plat-samsung/pm-check.c
index cd2c02c68bc3..5494355b1c49 100644
--- a/arch/arm/plat-samsung/pm-check.c
+++ b/arch/arm/plat-samsung/pm-check.c
@@ -46,8 +46,8 @@ typedef u32 *(run_fn_t)(struct resource *ptr, u32 *arg);
 static void s3c_pm_run_res(struct resource *ptr, run_fn_t fn, u32 *arg)
 {
while (ptr != NULL) {
-   if (ptr->child != NULL)
-   s3c_pm_run_res(ptr->child, fn, arg);
+   if (!list_empty(>child))
+   s3c_pm_run_res(resource_first_child(>child), fn, 
arg);
 
if ((ptr->flags & IORESOURCE_SYSTEM_RAM)
== IORESOURCE_SYSTEM_RAM) {
@@ -57,7 +57,7 @@ static void s3c_pm_run_res(struct resource *ptr, run_fn_t fn, 
u32 *arg)
arg = (fn)(ptr, arg);
}
 
-   ptr = ptr->sibling;
+   ptr = resource_sibling(ptr);
}
 }
 
diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index 7899bafab064..2bf73e27e231 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -533,7 +533,9 @@ void pci_process_bridge_OF_ranges(struct pci_controller 
*hose,
res->flags = range.flags;
res->start = range.cpu_addr;
res->end = range.cpu_addr + range.size - 1;
-   res->parent = res->child = res->sibling = NULL;
+   res->parent = NULL;
+   INIT_LIST_HEAD(>child);
+   INIT_LIST_HEAD(>sibling);
}
}
 
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 926035bb378d..28fbe83c9daf 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -761,7 +761,9 @@ void pci_process_bridge_OF_ranges(struct pci_controller 
*hose,
res->flags = range.flags;
res->start = range.cpu_addr;
res->end = range.cpu_addr + range.size - 1;
-   res->parent = res->child = res->sibling = NULL;
+   res->parent = NULL;
+   INIT_LIST_HEAD(>child);
+   INIT_LIST_HEAD(>sibling);
}
}
 }
diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
index cca9134cfa7d..99efe4e98b16 100644
--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -669,7 +669,7 @@ static int sparc_io_proc_show(struct seq_file *m, void *v)
struct resource *root = m->private, *r;
const char *nm;
 
-   for (r = root->child; r != NULL; r = r->sibling) {
+   list_for_each_entry(r, 

Re: [PATCH v5 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public

2018-07-04 Thread Baoquan He
Hi Andy,

On 06/12/18 at 05:24pm, Andy Shevchenko wrote:
> On Tue, Jun 12, 2018 at 5:20 PM, Andy Shevchenko
>  wrote:
> >> Hmm, I just copied it from arch/powerpc/kernel/pci-common.c. The
> >> function interface expects an integer returned value, not sure what a
> >> real error codes look like, could you give more hints? Will change
> >> accordingly.
> >
> > I briefly looked at the code and error codes we have, so, my proposal
> > is one of the following
> 
> >  - use -ECANCELED (not the best choice for first occurrence here,
> > though I can't find better)
> 
> Actually -ENOTSUPP might suit the first case (although the actual
> would be something like -EOVERLAP, which we don't have)

Sorry for late reply, and many thanks for your great suggestion.

I am fine to use -ENOTSUPP as the first returned value, and -ECANCELED
for the 2nd one. Or define an enum as you suggested inside the function
or in header file.

Or use -EBUSY for the first case because existing resource is
overlapping but not fully contained by 'res'; and -EINVAL for
the 2nd case since didn't find any one resources which is contained by
'res', means we passed in a invalid resource. 

All is fine to me, I can repost with each of them.

Thanks
Baoquan

> 
> >  - use positive integers (or enum), like
> >   #define RES_REPARENTED 0
> >   #define RES_OVERLAPPED 1
> >   #define RES_NOCONFLICT 2
> >
> >
> >>> > +   if (firstpp == NULL)
> >>> > +   firstpp = pp;
> >>> > +   }
> >>>
> >>> > +   if (firstpp == NULL)
> >>> > +   return -1;  /* didn't find any conflicting entries? 
> >>> > */
> >>>
> >>> Ditto.
> >
> > Ditto.
> >
> >>>
> >>> > +}
> >>> > +EXPORT_SYMBOL(reparent_resources);
> >
> > --
> > With Best Regards,
> > Andy Shevchenko
> 
> 
> 
> -- 
> With Best Regards,
> Andy Shevchenko

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH v6 0/4] resource: Use list_head to link sibling resource

2018-07-04 Thread Baoquan He
This patchset is doing:
1) Replace struct resource's sibling list from singly linked list to
list_head. Clearing out those pointer operation within singly linked
list for better code readability.
2) Based on list_head replacement, add a new function
walk_system_ram_res_rev() which can does reversed iteration on
iomem_resource's siblings.
3) Change kexec_file loading to search system RAM top down for kernel
loadin, using walk_system_ram_res_rev().

Note:
This patchset only passed testing on  x86_64 arch with network
enabling. The thing we need pay attetion to is that a root resource's
child member need be initialized specifically with LIST_HEAD_INIT() if
statically defined or INIT_LIST_HEAD() for dynamically definition. Here
Just like we do for iomem_resource/ioport_resource, or the change in
get_pci_domain_busn_res().

v5:
http://lkml.kernel.org/r/20180612032831.29747-1-...@redhat.com

v4:
http://lkml.kernel.org/r/20180507063224.24229-1-...@redhat.com

v3:
http://lkml.kernel.org/r/20180419001848.3041-1-...@redhat.com

v2:
http://lkml.kernel.org/r/20180408024724.16812-1-...@redhat.com

v1:
http://lkml.kernel.org/r/20180322033722.9279-1-...@redhat.com

Changelog:
v5->v6:
  Fix code style problems in reparent_resources() and use existing
  error codes, according to Andy's suggestion.

  Fix bugs test robot reported.
  
v4->v5:
  Add new patch 0001 to move duplicated reparent_resources() to
  kernel/resource.c to make it be shared by different ARCH-es.

  Fix several code bugs reported by test robot on ARCH powerpc and
  microblaze.
v3->v4:
  Fix several bugs test robot reported. Rewrite cover letter and patch
  log according to reviewer's comment.

v2->v3:
  Rename resource functions first_child() and sibling() to
  resource_first_chils() and resource_sibling(). Dan suggested this.

  Move resource_first_chils() and resource_sibling() to linux/ioport.h
  and make them as inline function. Rob suggested this. Accordingly add
  linux/list.h including in linux/ioport.h, please help review if this
  bring efficiency degradation or code redundancy.

  The change on struct resource {} bring two pointers of size increase,
  mention this in git log to make it more specifically, Rob suggested
  this.

v1->v2:
  Use list_head instead to link resource siblings. This is suggested by
  Andrew.

  Rewrite walk_system_ram_res_rev() after list_head is taken to link
  resouce siblings.



Baoquan He (4):
  resource: Move reparent_resources() to kernel/resource.c and make it
public
  resource: Use list_head to link sibling resource
  resource: add walk_system_ram_res_rev()
  kexec_file: Load kernel at top of system RAM if required

 arch/arm/plat-samsung/pm-check.c|   6 +-
 arch/microblaze/pci/pci-common.c|  41 +
 arch/powerpc/kernel/pci-common.c|  39 +
 arch/sparc/kernel/ioport.c  |   2 +-
 arch/xtensa/include/asm/pci-bridge.h|   4 +-
 drivers/eisa/eisa-bus.c |   2 +
 drivers/gpu/drm/drm_memory.c|   3 +-
 drivers/gpu/drm/gma500/gtt.c|   5 +-
 drivers/hv/vmbus_drv.c  |  52 +++---
 drivers/input/joystick/iforce/iforce-main.c |   4 +-
 drivers/nvdimm/namespace_devs.c |   6 +-
 drivers/nvdimm/nd.h |   5 +-
 drivers/of/address.c|   4 +-
 drivers/parisc/lba_pci.c|   4 +-
 drivers/pci/controller/vmd.c|   8 +-
 drivers/pci/probe.c |   2 +
 drivers/pci/setup-bus.c |   2 +-
 include/linux/ioport.h  |  21 ++-
 kernel/kexec_file.c |   2 +
 kernel/resource.c   | 263 ++--
 20 files changed, 248 insertions(+), 227 deletions(-)

-- 
2.13.6


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH v6 3/4] resource: add walk_system_ram_res_rev()

2018-07-04 Thread Baoquan He
This function, being a variant of walk_system_ram_res() introduced in
commit 8c86e70acead ("resource: provide new functions to walk through
resources"), walks through a list of all the resources of System RAM
in reversed order, i.e., from higher to lower.

It will be used in kexec_file code.

Signed-off-by: Baoquan He 
Cc: Andrew Morton 
Cc: Thomas Gleixner 
Cc: Brijesh Singh 
Cc: "Jérôme Glisse" 
Cc: Borislav Petkov 
Cc: Tom Lendacky 
Cc: Wei Yang 
---
 include/linux/ioport.h |  3 +++
 kernel/resource.c  | 40 
 2 files changed, 43 insertions(+)

diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index b7456ae889dd..066cc263e2cc 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -279,6 +279,9 @@ extern int
 walk_system_ram_res(u64 start, u64 end, void *arg,
int (*func)(struct resource *, void *));
 extern int
+walk_system_ram_res_rev(u64 start, u64 end, void *arg,
+   int (*func)(struct resource *, void *));
+extern int
 walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start, u64 
end,
void *arg, int (*func)(struct resource *, void *));
 
diff --git a/kernel/resource.c b/kernel/resource.c
index 6d647a3824b1..4c5fbef4ea24 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -23,6 +23,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 
 
@@ -443,6 +445,44 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
 }
 
 /*
+ * This function, being a variant of walk_system_ram_res(), calls the @func
+ * callback against all memory ranges of type System RAM which are marked as
+ * IORESOURCE_SYSTEM_RAM and IORESOUCE_BUSY in reversed order, i.e., from
+ * higher to lower.
+ */
+int walk_system_ram_res_rev(u64 start, u64 end, void *arg,
+   int (*func)(struct resource *, void *))
+{
+   unsigned long flags;
+   struct resource *res;
+   int ret = -1;
+
+   flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+
+   read_lock(_lock);
+   list_for_each_entry_reverse(res, _resource.child, sibling) {
+   if (start >= end)
+   break;
+   if ((res->flags & flags) != flags)
+   continue;
+   if (res->desc != IORES_DESC_NONE)
+   continue;
+   if (res->end < start)
+   break;
+
+   if ((res->end >= start) && (res->start < end)) {
+   ret = (*func)(res, arg);
+   if (ret)
+   break;
+   }
+   end = res->start - 1;
+
+   }
+   read_unlock(_lock);
+   return ret;
+}
+
+/*
  * This function calls the @func callback against all memory ranges, which
  * are ranges marked as IORESOURCE_MEM and IORESOUCE_BUSY.
  */
-- 
2.13.6


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH v6 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public

2018-07-04 Thread Baoquan He
reparent_resources() is duplicated in arch/microblaze/pci/pci-common.c
and arch/powerpc/kernel/pci-common.c, so move it to kernel/resource.c
so that it's shared.

Signed-off-by: Baoquan He 
---
 arch/microblaze/pci/pci-common.c | 37 -
 arch/powerpc/kernel/pci-common.c | 35 ---
 include/linux/ioport.h   |  1 +
 kernel/resource.c| 39 +++
 4 files changed, 40 insertions(+), 72 deletions(-)

diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index f34346d56095..7899bafab064 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -619,43 +619,6 @@ int pcibios_add_device(struct pci_dev *dev)
 EXPORT_SYMBOL(pcibios_add_device);
 
 /*
- * Reparent resource children of pr that conflict with res
- * under res, and make res replace those children.
- */
-static int __init reparent_resources(struct resource *parent,
-struct resource *res)
-{
-   struct resource *p, **pp;
-   struct resource **firstpp = NULL;
-
-   for (pp = >child; (p = *pp) != NULL; pp = >sibling) {
-   if (p->end < res->start)
-   continue;
-   if (res->end < p->start)
-   break;
-   if (p->start < res->start || p->end > res->end)
-   return -1;  /* not completely contained */
-   if (firstpp == NULL)
-   firstpp = pp;
-   }
-   if (firstpp == NULL)
-   return -1;  /* didn't find any conflicting entries? */
-   res->parent = parent;
-   res->child = *firstpp;
-   res->sibling = *pp;
-   *firstpp = res;
-   *pp = NULL;
-   for (p = res->child; p != NULL; p = p->sibling) {
-   p->parent = res;
-   pr_debug("PCI: Reparented %s [%llx..%llx] under %s\n",
-p->name,
-(unsigned long long)p->start,
-(unsigned long long)p->end, res->name);
-   }
-   return 0;
-}
-
-/*
  *  Handle resources of PCI devices.  If the world were perfect, we could
  *  just allocate all the resource regions and do nothing more.  It isn't.
  *  On the other hand, we cannot just re-allocate all devices, as it would
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index fe9733aa..926035bb378d 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1088,41 +1088,6 @@ resource_size_t pcibios_align_resource(void *data, const 
struct resource *res,
 EXPORT_SYMBOL(pcibios_align_resource);
 
 /*
- * Reparent resource children of pr that conflict with res
- * under res, and make res replace those children.
- */
-static int reparent_resources(struct resource *parent,
-struct resource *res)
-{
-   struct resource *p, **pp;
-   struct resource **firstpp = NULL;
-
-   for (pp = >child; (p = *pp) != NULL; pp = >sibling) {
-   if (p->end < res->start)
-   continue;
-   if (res->end < p->start)
-   break;
-   if (p->start < res->start || p->end > res->end)
-   return -1;  /* not completely contained */
-   if (firstpp == NULL)
-   firstpp = pp;
-   }
-   if (firstpp == NULL)
-   return -1;  /* didn't find any conflicting entries? */
-   res->parent = parent;
-   res->child = *firstpp;
-   res->sibling = *pp;
-   *firstpp = res;
-   *pp = NULL;
-   for (p = res->child; p != NULL; p = p->sibling) {
-   p->parent = res;
-   pr_debug("PCI: Reparented %s %pR under %s\n",
-p->name, p, res->name);
-   }
-   return 0;
-}
-
-/*
  *  Handle resources of PCI devices.  If the world were perfect, we could
  *  just allocate all the resource regions and do nothing more.  It isn't.
  *  On the other hand, we cannot just re-allocate all devices, as it would
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index da0ebaec25f0..dfdcd0bfe54e 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -192,6 +192,7 @@ extern int allocate_resource(struct resource *root, struct 
resource *new,
 struct resource *lookup_resource(struct resource *root, resource_size_t start);
 int adjust_resource(struct resource *res, resource_size_t start,
resource_size_t size);
+int reparent_resources(struct resource *parent, struct resource *res);
 resource_size_t resource_alignment(struct resource *res);
 static inline resource_size_t resource_size(const struct resource *res)
 {
diff --git a/kernel/resource.c b/kernel/resource.c
index 30e1bc68503b..d1cbf4b50e17 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -983,6 +983,45 @@ int 

[PATCH v6 4/4] kexec_file: Load kernel at top of system RAM if required

2018-07-04 Thread Baoquan He
For kexec_file loading, if kexec_buf.top_down is 'true', the memory which
is used to load kernel/initrd/purgatory is supposed to be allocated from
top to down. This is what we have been doing all along in the old kexec
loading interface and the kexec loading is still default setting in some
distributions. However, the current kexec_file loading interface doesn't
do likt this. The function arch_kexec_walk_mem() it calls ignores checking
kexec_buf.top_down, but calls walk_system_ram_res() directly to go through
all resources of System RAM from bottom to up, to try to find memory region
which can contain the specific kexec buffer, then call 
locate_mem_hole_callback()
to allocate memory in that found memory region from top to down. This brings
confusion especially when KASLR is widely supported , users have to make clear
why kexec/kdump kernel loading position is different between these two
interfaces in order to exclude unnecessary noises. Hence these two interfaces
need be unified on behaviour.

Here add checking if kexec_buf.top_down is 'true' in arch_kexec_walk_mem(),
if yes, call the newly added walk_system_ram_res_rev() to find memory region
from top to down to load kernel.

Signed-off-by: Baoquan He 
Cc: Eric Biederman 
Cc: Vivek Goyal 
Cc: Dave Young 
Cc: Andrew Morton 
Cc: Yinghai Lu 
Cc: kexec@lists.infradead.org
---
 kernel/kexec_file.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index c6a3b6851372..75226c1d08ce 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -518,6 +518,8 @@ int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
   IORESOURCE_SYSTEM_RAM | 
IORESOURCE_BUSY,
   crashk_res.start, crashk_res.end,
   kbuf, func);
+   else if (kbuf->top_down)
+   return walk_system_ram_res_rev(0, ULONG_MAX, kbuf, func);
else
return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
 }
-- 
2.13.6


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v5 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public

2018-07-04 Thread Baoquan He
On 07/03/18 at 11:57pm, Andy Shevchenko wrote:
> On Tue, Jul 3, 2018 at 5:55 PM, Baoquan He  wrote:
> > On 06/12/18 at 05:24pm, Andy Shevchenko wrote:
> >> On Tue, Jun 12, 2018 at 5:20 PM, Andy Shevchenko
> >>  wrote:
> 
> >> > I briefly looked at the code and error codes we have, so, my proposal
> >> > is one of the following
> >>
> >> >  - use -ECANCELED (not the best choice for first occurrence here,
> >> > though I can't find better)
> >>
> >> Actually -ENOTSUPP might suit the first case (although the actual
> >> would be something like -EOVERLAP, which we don't have)
> >
> > Sorry for late reply, and many thanks for your great suggestion.
> >
> 
> > I am fine to use -ENOTSUPP as the first returned value, and -ECANCELED
> > for the 2nd one.
> 
> I have no strong opinion, but I like (slightly better) this approach ^^^

Done, post v6 in this way, many thanks.

> 
> > Or define an enum as you suggested inside the function
> > or in header file.
> 
> >
> > Or use -EBUSY for the first case because existing resource is
> > overlapping but not fully contained by 'res'; and -EINVAL for
> > the 2nd case since didn't find any one resources which is contained by
> > 'res', means we passed in a invalid resource.
> >
> > All is fine to me, I can repost with each of them.
> 
> >> >  - use positive integers (or enum), like
> >> >   #define RES_REPARENTED 0
> >> >   #define RES_OVERLAPPED 1
> >> >   #define RES_NOCONFLICT 2
> 
> -- 
> With Best Regards,
> Andy Shevchenko

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v5 4/8] firmware: add call to LSM hook before firmware sysfs fallback

2018-07-04 Thread kbuild test robot
Hi Mimi,

I love your patch! Yet something to improve:

[auto build test ERROR on integrity/next-integrity]
[also build test ERROR on v4.18-rc3 next-20180702]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Mimi-Zohar/kexec-firmware-support-system-wide-policy-requiring-signatures/20180703-04
base:   
https://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity.git 
next-integrity
config: x86_64-randconfig-ws0-07031240 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All errors (new ones prefixed by >>):

>> ERROR: "security_kernel_load_data" 
>> [drivers/base/firmware_loader/firmware_class.ko] undefined!

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip
___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v5 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public

2018-07-04 Thread Andy Shevchenko
On Tue, Jul 3, 2018 at 5:55 PM, Baoquan He  wrote:
> On 06/12/18 at 05:24pm, Andy Shevchenko wrote:
>> On Tue, Jun 12, 2018 at 5:20 PM, Andy Shevchenko
>>  wrote:

>> > I briefly looked at the code and error codes we have, so, my proposal
>> > is one of the following
>>
>> >  - use -ECANCELED (not the best choice for first occurrence here,
>> > though I can't find better)
>>
>> Actually -ENOTSUPP might suit the first case (although the actual
>> would be something like -EOVERLAP, which we don't have)
>
> Sorry for late reply, and many thanks for your great suggestion.
>

> I am fine to use -ENOTSUPP as the first returned value, and -ECANCELED
> for the 2nd one.

I have no strong opinion, but I like (slightly better) this approach ^^^

> Or define an enum as you suggested inside the function
> or in header file.

>
> Or use -EBUSY for the first case because existing resource is
> overlapping but not fully contained by 'res'; and -EINVAL for
> the 2nd case since didn't find any one resources which is contained by
> 'res', means we passed in a invalid resource.
>
> All is fine to me, I can repost with each of them.

>> >  - use positive integers (or enum), like
>> >   #define RES_REPARENTED 0
>> >   #define RES_OVERLAPPED 1
>> >   #define RES_NOCONFLICT 2

-- 
With Best Regards,
Andy Shevchenko

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


makedumpfile-1.6.4: Add support for 5-level paging

2018-07-04 Thread Kazuhito Hagio
Hello,

makedumpfile version 1.6.4 is released.
Your comments/patches are welcome.

I took over the maintainer role from Tachibana-san. Thank you so much
for development and maintenance of makedumpfile for a long time!
And I'm pleased to work on makedumpfile with everyone here. The maintainer
changed to me, but there is no need to change your way of contribution.

Main new features:
o 5-level paging support
  - 5-level paging on x86_64 system is supported from this version.

o --mem-usage support for arm64
  - The makedumpfile --mem-usage option is now supported on arm64 system.
And the option was also tested on ppc64 and s390x by Bhupesh Sharma.

o Support new kernels
  - The supported kernel is updated to 4.17.0 in this version.

Changelog:
o New feature
  - [PATCH v2 0/3] arm64: Add '--mem-usage' support (Bhupesh Sharma) f49ca13
  - [PATCH v2 1/3] Add a new helper file 'tools.c' that provides some useful 
APIs (Bhupesh Sharma) 4d86cc7
  - [PATCH v2 2/3] arm64: Add support to read symbols like _stext from 
'/proc/kallsyms' (Bhupesh Sharma) d1e7805
  - [PATCH v2 3/3] Documentation: Update documentation regarding --mem-usage' 
option (Bhupesh Sharma) 8449bda
  - [PATCH 0/4] Add 5-level paging support (Baoquan He) 342fdab
  - [PATCH 2/4] Add pgtable_l5_enabled to number_table (Baoquan He) 8ae6f36
  - [PATCH 3/4] Add a new function check_5level_paging() (Baoquan He) 33ca808
  - [PATCH 4/4] arch/x86_64: Add 5-level paging support (Dou Liyang) 30a3214
  - [PATCH] handle mem_section as either a pointer or an array (Thadeu Lima de 
Souza Cascardo) 14876c4
  - [PATCH] ppc64: Increase the VA range (Hari Bathini) bd1ae9c

o Bugfix
  - [PATCH] sadump: Fix a problem of PTI enabled kernel (Takao Indoh) 38de12a
  - [PATCH] Fix a bug when multi-threads feature meets enospace (Zhou Wenjian) 
63b05c9
  - [PATCH] Fix array index out of bound exception (Lianbo Jiang) e5f96e7
  - [PATCH] fix for hugepages filtering (Hari Bathini) 246801c
  - [PATCH v2] Use integer arithmetics for the progress bar (Petr Tesarik) 
c6b79cb
  - [PATCH 2/2] Check PG_swapbacked for swap cache pages (Petr Tesarik) ca1f31e
  - [PATCH] Fix -g for change of mem_section (Kazuhito Hagio) f3c87e0
  - [PATCH] Fix -i for KASLR (Kazuhito Hagio) 08db98c
  - [PATCH] Fix validate_mem_section() (Kazuhito Hagio) 8f1aafa
  - [PATCH] arm64: Fix calculation of page_offset in case we are running cases 
other than mem-usage
  (Bhupesh Sharma) cb7af5f

o Cleanup
  - [PATCH v2] Always use bigger SECTION_MAP_MASK (Petr Tesarik) 6c6789f
  - [PATCH 1/2] Add is_cache_page() helper to check if a page belongs to the 
cache (Petr Tesarik) 694f28c
  - [PATCH 1/4] arch/x86_64: Cleanup the address translation of the 4-level 
page tables (Dou Liyang) acab2a2
  - [PATCH] Fix some compilation warnings on x86 system (Kazuhito Hagio) 2148e43
  - [PATCH] Update maintainers (Masaki Tachibana) bc7420b

Explanation of makedumpfile:
  To shorten the size of the dumpfile and the time of creating the
  dumpfile, makedumpfile copies only the necessary pages for analysis
  to the dumpfile from /proc/vmcore. You can specify the kind of
  unnecessary pages with dump_level. If you want to shorten the size
  further, enable the compression of the page data.

Download:
  You can download the latest makedumpfile from the following URL.
  Details of the change are written on the git page of the following site.
  https://sourceforge.net/projects/makedumpfile/

Method of installation:
  You can compile the makedumpfile command as follows;
  1. "tar -zxvf makedumpfile-x.y.z.tar.gz"
  2. "cd makedumpfile-x.y.z"
  3. "make; make install"

Usage:
  makedumpfile [-c|-l|-p] [-E] [-d dump_level] [-x vmlinux] dump_mem dump_file

Example:
  If you want to exclude pages filled by zero, cache pages, user pages
  and free pages and to enable compression, please execute the following
  command.

  # makedumpfile -c -d 31 /proc/vmcore dumpfile


Thanks,
Kazu

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] arm64: Fix calculation of page_offset in case we are running cases other than mem-usage

2018-07-04 Thread Bhupesh Sharma
On Wed, Jul 4, 2018 at 12:15 AM, Kazuhito Hagio  wrote:
> On 7/3/2018 1:23 AM, Bhupesh Sharma wrote:
>> Hello Kazu-san,
>>
>> On Sat, Jun 30, 2018 at 3:18 AM, Bhupesh Sharma  wrote:
>>> Patch f49ca13e5eed5bbdc69e0fd5ef099cb46050cb3d added '--mem-usage'
>>> support for arm64 architecture.
>>>
>>> However, we also need to make sure that the calculation of
>>> 'page_offset' is valid in case we are running cases other than
>>> '--mem-usage'.
>>>
>>> This patch does the same. I tested this patch on my qualcomm and apm
>>> mustang arm64 boards.
>>>
>>> Signed-off-by: Bhupesh Sharma 
>>> ---
>>>  arch/arm64.c | 6 ++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/arch/arm64.c b/arch/arm64.c
>>> index c9dab677f2c9..2fd3e1874376 100644
>>> --- a/arch/arm64.c
>>> +++ b/arch/arm64.c
>>> @@ -222,6 +222,12 @@ get_stext_symbol(void)
>>>  int
>>>  get_machdep_info_arm64(void)
>>>  {
>>> +   /* Check if va_bits is still not initialized. If still 0, call
>>> +* get_versiondep_info() to initialize the same.
>>> +*/
>>> +   if (!va_bits)
>>> +   get_versiondep_info_arm64();
>>> +
>>> if (!calculate_plat_config()) {
>>> ERRMSG("Can't determine platform config values\n");
>>> return FALSE;
>>> --
>>> 2.7.4
>>>
>>
>> This is an important fix for arm64 makedumpfile as it fixes issues
>> reported with Fedora when we use makedumpfile for cases other than
>> '--mem-usage'.
>>
>> So kindly consider this one for the next makedumpfile release.
>
> Hi Bhupesh,
>
> I understood its importance and I think that this problem should be
> fixed before the v1.6.4 release.  Thanks for letting me know it.
>
> I've reviewed the patch and the process flow of arm64 architecture
> around it.  It looks good to me, and applied.

Great. Thanks a lot for your help.

Regards,
Bhupesh

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: panic kexec broken on ARM64?

2018-07-04 Thread James Morse
Hi Akashi,

On 04/07/18 09:41, takahiro.aka...@linaro.org wrote:
> On Tue, Jul 03, 2018 at 09:58:44AM +0100, Marc Zyngier wrote:
>> On 03/07/18 08:01, takahiro.aka...@linaro.org wrote:
>>> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
 On Wed, 06 Jun 2018 12:37:02 +0100 James Morse wrote:,
> Bingo: its the lan78xx driver that is sleeping from the irqchip
> callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
> is why the RPi-3-B doesn't do this.
>
> It may be valid for kdump to only teardown the 'root irqdomain' (if
> that even means anything). I assume these secondary irqchip's would
> have a summary-interrupt that goes to another irqchip. But I can't
> see a way to tell them apart..,

 Overall, I can't think of an easy fix. We have a few options, but none
 of them involve a centralised change:

 1) We provide a reset infrastructure for irqchips, with an opt-in
mechanism. This involves changing the way we teardown irqs at
crash-time, and we'd then need some notion of reset ordering (think
of the layered ITS and GICv3, for example).
>>>
>>> Does this mean that all the irqchips have to be implemented with reset?
>>
>> No. Only those that want to be reset at kexec time.
> 
> I don't get the point yet.

(this stuff is new to me, below terminology is probably wrong:)

It seems there is actually a tree of irqchips, which feed interrupts through the
tree via some summary-interrupt.
The problem is one of these later-irqchips is on the other end of the USB bus,
and requires a fair amount of sleeping in order to reset it.

The trick is everything comes through the root irqchip. So we can reset that to
disable all the other interrupts in this tree.

This only works until the new kernel re-enables the summary-interrupt, it needs
to reset the irqchip that the summary-interrupt leads to first.

(I assume shared summary-interrupts are somehow forbidden).


> Who should have reset interface?
> What is the criteria?

(reset-interface -> be reset at kdump time)

Just the root irchip that all interrupts have to come through.


Thanks,

James

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH 0/1] x86/purgatory: Fix Makefile bug

2018-07-04 Thread Philipp Rudo
Hi everybody,

when i moved the purgatories sha256 implementation to common code, i forgot
to add FORCE to the new Makefile target. This patch fixes it.

Note: There is an equivalent bug in the s390 purgatory. The fix for that
will go upstream via Martin.

Thanks
Philipp

Philipp Rudo (1):
  x86/purgatory: Add missing FORCE to Makefile target

 arch/x86/purgatory/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.16.4


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH 1/1] x86/purgatory: Add missing FORCE to Makefile target

2018-07-04 Thread Philipp Rudo
Without FORCE make does not detect changes only made to the command line
options. So object files might not be re-built even when they should be.
Fix this by adding FORCE where it is missing.

Fixes: df6f2801f511 ("kernel/kexec_file.c: move purgatories sha256 to common 
code")
Cc:  # 4.17
Signed-off-by: Philipp Rudo 
---
 arch/x86/purgatory/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index 2e9ee023e6bc..81a8e33115ad 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -6,7 +6,7 @@ purgatory-y := purgatory.o stack.o setup-x86_$(BITS).o sha256.o 
entry64.o string
 targets += $(purgatory-y)
 PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y))
 
-$(obj)/sha256.o: $(srctree)/lib/sha256.c
+$(obj)/sha256.o: $(srctree)/lib/sha256.c FORCE
$(call if_changed_rule,cc_o_c)
 
 LDFLAGS_purgatory.ro := -e purgatory_start -r --no-undefined -nostdlib -z 
nodefaultlib
-- 
2.16.4


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: panic kexec broken on ARM64?

2018-07-04 Thread Marc Zyngier
On 04/07/18 09:41, takahiro.aka...@linaro.org wrote:
> On Tue, Jul 03, 2018 at 09:58:44AM +0100, Marc Zyngier wrote:
>> On 03/07/18 08:01, takahiro.aka...@linaro.org wrote:
>>> Marc, James,
>>>
>>> I'd like to re-ignite the discussion.
>>>
>>> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
 On Wed, 06 Jun 2018 12:37:02 +0100,
 James Morse wrote:
>
> Hi Stefan,
>
> On 06/06/18 08:02, Stefan Wahren wrote:
>> Am 05.06.2018 um 19:46 schrieb James Morse:
>>> On 05/06/18 09:01, Petr Tesarik wrote:
 I attached a hardware debugger and found
 out that all CPU cores were stopped except one which was stuck in the
 idle thread. It seems that irq_set_irqchip_state() may sleep, which is
 definitely not safe after a kernel panic.
>
>>> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
>>> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just 
>>> poking
>>> around in mmio registers, this should all be safe unless you re-entered 
>>> the same
>>> code.
>
 If I'm right, then this is broken in general, but I have only ever seen
 it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
 be more subtle.
>
>>> Is there a hardware difference around the interrupt controller on these?
>
>> No, but the RPi 3 B has a different USB network chip on board (smsc95xx, 
>> Fast
>> ethernet) instead of lan78xx (Gigabit ethernet).
>
> Bingo: its the lan78xx driver that is sleeping from the irqchip
> callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
> is why the RPi-3-B doesn't do this.
>
> It may be valid for kdump to only teardown the 'root irqdomain' (if
> that even means anything). I assume these secondary irqchip's would
> have a summary-interrupt that goes to another irqchip. But I can't
> see a way to tell them apart..,

 There is none. A cascaded irqchip is just like a root irqchip, just
 that its output line is connected to another irqchip. But we have no
 easy way to identify the parent. Also, this particular driver looks
 quite creative (it reinvents the wheel for chained interrupts -- see
 intr_complete and lan78xx_status), meaning that even if we could have
 a magic way of identify a chained irqchip, we'd miss that one. Broken.

> I think we need to wait until after the merge window for Marc's
> wisdom on this!

 Overall, I can't think of an easy fix. We have a few options, but none
 of them involve a centralised change:

 1) We provide a reset infrastructure for irqchips, with an opt-in
mechanism. This involves changing the way we teardown irqs at
crash-time, and we'd then need some notion of reset ordering (think
of the layered ITS and GICv3, for example).
>>>
>>> Does this mean that all the irqchips have to be implemented with reset?
>>
>> No. Only those that want to be reset at kexec time.
> 
> I don't get the point yet. Who should have reset interface?
> What is the criteria?

The criteria is "this irqchip requires a reset to be safely used in the
secondary kernel". This is a judgement call from the person writing the
driver.

> 

 2) We provide a way to identify interrupts that are ultimately backed
by a root controller, which implies walking down the hierarchy for
>>>
>>> To be clear, from bottom to top (or root), right?
>>
>> I'm not sure I understand your question. The idea is to walk the
>> irq_data chain, until we hit a root irqchip. If we do hit one, we
>> deactivate/eoi/disable this interrupt. If we don't, we do nothing.
> 
> I thought that we would traverse the (chained irq) hierarchy from
> bottom to top and call deactivate or others in that order.
> Am I wrong here?

You *cannot* traverse a hierarchy through a chained irq. At that stage,
irq_data->parent_data is NULL. The only thing you can do is to iterate
over all the interrupts and deactivate those that are directly on the
root interrupt controller. This will have the effect of stopping the
interrupts that are behind a chained controller.

M.
-- 
Jazz is not dead. It just smells funny...

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: panic kexec broken on ARM64?

2018-07-04 Thread takahiro.aka...@linaro.org
On Tue, Jul 03, 2018 at 09:58:44AM +0100, Marc Zyngier wrote:
> On 03/07/18 08:01, takahiro.aka...@linaro.org wrote:
> > Marc, James,
> > 
> > I'd like to re-ignite the discussion.
> > 
> > On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
> >> On Wed, 06 Jun 2018 12:37:02 +0100,
> >> James Morse wrote:
> >>>
> >>> Hi Stefan,
> >>>
> >>> On 06/06/18 08:02, Stefan Wahren wrote:
>  Am 05.06.2018 um 19:46 schrieb James Morse:
> > On 05/06/18 09:01, Petr Tesarik wrote:
> >> I attached a hardware debugger and found
> >> out that all CPU cores were stopped except one which was stuck in the
> >> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
> >> definitely not safe after a kernel panic.
> >>>
> > I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
> > raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just 
> > poking
> > around in mmio registers, this should all be safe unless you re-entered 
> > the same
> > code.
> >>>
> >> If I'm right, then this is broken in general, but I have only ever seen
> >> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
> >> be more subtle.
> >>>
> > Is there a hardware difference around the interrupt controller on these?
> >>>
>  No, but the RPi 3 B has a different USB network chip on board (smsc95xx, 
>  Fast
>  ethernet) instead of lan78xx (Gigabit ethernet).
> >>>
> >>> Bingo: its the lan78xx driver that is sleeping from the irqchip
> >>> callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
> >>> is why the RPi-3-B doesn't do this.
> >>>
> >>> It may be valid for kdump to only teardown the 'root irqdomain' (if
> >>> that even means anything). I assume these secondary irqchip's would
> >>> have a summary-interrupt that goes to another irqchip. But I can't
> >>> see a way to tell them apart..,
> >>
> >> There is none. A cascaded irqchip is just like a root irqchip, just
> >> that its output line is connected to another irqchip. But we have no
> >> easy way to identify the parent. Also, this particular driver looks
> >> quite creative (it reinvents the wheel for chained interrupts -- see
> >> intr_complete and lan78xx_status), meaning that even if we could have
> >> a magic way of identify a chained irqchip, we'd miss that one. Broken.
> >>
> >>> I think we need to wait until after the merge window for Marc's
> >>> wisdom on this!
> >>
> >> Overall, I can't think of an easy fix. We have a few options, but none
> >> of them involve a centralised change:
> >>
> >> 1) We provide a reset infrastructure for irqchips, with an opt-in
> >>mechanism. This involves changing the way we teardown irqs at
> >>crash-time, and we'd then need some notion of reset ordering (think
> >>of the layered ITS and GICv3, for example).
> > 
> > Does this mean that all the irqchips have to be implemented with reset?
> 
> No. Only those that want to be reset at kexec time.

I don't get the point yet. Who should have reset interface?
What is the criteria?

> >>
> >> 2) We provide a way to identify interrupts that are ultimately backed
> >>by a root controller, which implies walking down the hierarchy for
> > 
> > To be clear, from bottom to top (or root), right?
> 
> I'm not sure I understand your question. The idea is to walk the
> irq_data chain, until we hit a root irqchip. If we do hit one, we
> deactivate/eoi/disable this interrupt. If we don't, we do nothing.

I thought that we would traverse the (chained irq) hierarchy from
bottom to top and call deactivate or others in that order.
Am I wrong here?

> This would avoid the above brokenness, and still ensures that no
> interrupt reaches the CPU.
> 
> > 
> >>each one of them. Fairly expensive, but minimal in way of changes
> >>in the crash code. Requires a per-irqchip flag, but ordering comes
> >>in for free.
> >>
> >> 3) We do the same as (2), but at the irqdomain level. Not sure that's
> >>any better, and it may be even more complicated and bring back some
> >>ordering issues.
> > 
> > Do you think that the same thing may happen in case of pci/msi?
> > I have no confidence but MSI has some kind of irq domain hierarchy.
> 
> Anything can happen, as people implement their interrupt infrastructure
> in weird and wonderful ways. So we need to be prepared for the worse.
> 
> I've pushed 3 patches on a branch[1]. It is mostly untested, but it
> should allow the above RPi3 disaster to cope with kexec.

I don't have any hardware that sees this kind of issue and can't test.

-Takahiro AKASHI

>   M.
> 
> [1]: 
> https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=irq/root-irqchip
> 
> -- 
> Jazz is not dead, it just smell funny.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: panic kexec broken on ARM64?

2018-07-04 Thread Matthias Brugger



On 03/07/18 10:58, Marc Zyngier wrote:
> On 03/07/18 08:01, takahiro.aka...@linaro.org wrote:
>> Marc, James,
>>
>> I'd like to re-ignite the discussion.
>>
>> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote:
>>> On Wed, 06 Jun 2018 12:37:02 +0100,
>>> James Morse wrote:

 Hi Stefan,

 On 06/06/18 08:02, Stefan Wahren wrote:
> Am 05.06.2018 um 19:46 schrieb James Morse:
>> On 05/06/18 09:01, Petr Tesarik wrote:
>>> I attached a hardware debugger and found
>>> out that all CPU cores were stopped except one which was stuck in the
>>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is
>>> definitely not safe after a kernel panic.

>> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a
>> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just 
>> poking
>> around in mmio registers, this should all be safe unless you re-entered 
>> the same
>> code.

>>> If I'm right, then this is broken in general, but I have only ever seen
>>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
>>> be more subtle.

>> Is there a hardware difference around the interrupt controller on these?

> No, but the RPi 3 B has a different USB network chip on board (smsc95xx, 
> Fast
> ethernet) instead of lan78xx (Gigabit ethernet).

 Bingo: its the lan78xx driver that is sleeping from the irqchip
 callbacks; The smsc95xx driver doesn't have a struct irq_chip, which
 is why the RPi-3-B doesn't do this.

 It may be valid for kdump to only teardown the 'root irqdomain' (if
 that even means anything). I assume these secondary irqchip's would
 have a summary-interrupt that goes to another irqchip. But I can't
 see a way to tell them apart..,
>>>
>>> There is none. A cascaded irqchip is just like a root irqchip, just
>>> that its output line is connected to another irqchip. But we have no
>>> easy way to identify the parent. Also, this particular driver looks
>>> quite creative (it reinvents the wheel for chained interrupts -- see
>>> intr_complete and lan78xx_status), meaning that even if we could have
>>> a magic way of identify a chained irqchip, we'd miss that one. Broken.
>>>
 I think we need to wait until after the merge window for Marc's
 wisdom on this!
>>>
>>> Overall, I can't think of an easy fix. We have a few options, but none
>>> of them involve a centralised change:
>>>
>>> 1) We provide a reset infrastructure for irqchips, with an opt-in
>>>mechanism. This involves changing the way we teardown irqs at
>>>crash-time, and we'd then need some notion of reset ordering (think
>>>of the layered ITS and GICv3, for example).
>>
>> Does this mean that all the irqchips have to be implemented with reset?
> 
> No. Only those that want to be reset at kexec time.
> 
>>>
>>> 2) We provide a way to identify interrupts that are ultimately backed
>>>by a root controller, which implies walking down the hierarchy for
>>
>> To be clear, from bottom to top (or root), right?
> 
> I'm not sure I understand your question. The idea is to walk the
> irq_data chain, until we hit a root irqchip. If we do hit one, we
> deactivate/eoi/disable this interrupt. If we don't, we do nothing.
> 
> This would avoid the above brokenness, and still ensures that no
> interrupt reaches the CPU.
> 
>>
>>>each one of them. Fairly expensive, but minimal in way of changes
>>>in the crash code. Requires a per-irqchip flag, but ordering comes
>>>in for free.
>>>
>>> 3) We do the same as (2), but at the irqdomain level. Not sure that's
>>>any better, and it may be even more complicated and bring back some
>>>ordering issues.
>>
>> Do you think that the same thing may happen in case of pci/msi?
>> I have no confidence but MSI has some kind of irq domain hierarchy.
> 
> Anything can happen, as people implement their interrupt infrastructure
> in weird and wonderful ways. So we need to be prepared for the worse.
> 
> I've pushed 3 patches on a branch[1]. It is mostly untested, but it
> should allow the above RPi3 disaster to cope with kexec.
> 
>   M.
> 
> [1]: 
> https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=irq/root-irqchip
> 

I threw the kernel on my RPi3+ model but I wasn't able to start the crash
kernel. Unfortunately I don't have a JTAG adapter to check if it hangs for the
same reason.

Regards,
Matthias

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: panic kexec broken on ARM64?

2018-07-04 Thread Marc Zyngier
On Wed, 04 Jul 2018 15:08:38 +0100,
Matthias Brugger  wrote:
> On 03/07/18 10:58, Marc Zyngier wrote:
> > I've pushed 3 patches on a branch[1]. It is mostly untested, but it
> > should allow the above RPi3 disaster to cope with kexec.
> > 
> > M.
> > 
> > [1]: 
> > https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=irq/root-irqchip
> > 
> 
> I threw the kernel on my RPi3+ model but I wasn't able to start the crash
> kernel. Unfortunately I don't have a JTAG adapter to check if it hangs for the
> same reason.

Have you annotated the RPi3 root irq_chip structure so that the arm64
kexec code knows that this is the root interrupt controller? I've only
done so on the GIC drivers.

  M.

-- 
Jazz is not dead, it just smell funny.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v2 3/4] efi/arm: map UEFI memory map earlier on boot

2018-07-04 Thread Ard Biesheuvel
On 4 July 2018 at 19:06, Will Deacon  wrote:
> Hi all,
>
> [Ard -- please can you look at the EFI parts of this patch]
>
> On Tue, Jun 19, 2018 at 03:44:23PM +0900, AKASHI Takahiro wrote:
>> Since arm_enter_runtime_services() was modified to always create a virtual
>> mapping of UEFI memory map in the previous patch, it is now renamed to
>> efi_enter_virtual_mode() and called earlier before acpi_load_tables()
>> in acpi_early_init().
>>
>> This will allow us to use UEFI memory map in acpi_os_ioremap() to create
>> mappings of ACPI tables using memory attributes described in UEFI memory
>> map.
>>
>> See a relevant commit:
>> arm64: acpi: fix alignment fault in accessing ACPI tables
>>
>> Signed-off-by: AKASHI Takahiro 
>> Cc: Ard Biesheuvel 
>> Cc: Andrew Morton 
>> ---
>>  drivers/firmware/efi/arm-runtime.c | 15 ++-
>>  init/main.c|  3 +++
>>  2 files changed, 9 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/firmware/efi/arm-runtime.c 
>> b/drivers/firmware/efi/arm-runtime.c
>> index 30ac5c82051e..566ef0a9edb5 100644
>> --- a/drivers/firmware/efi/arm-runtime.c
>> +++ b/drivers/firmware/efi/arm-runtime.c
>> @@ -106,46 +106,43 @@ static bool __init efi_virtmap_init(void)
>>   * non-early mapping of the UEFI system table and virtual mappings for all
>>   * EFI_MEMORY_RUNTIME regions.
>>   */
>> -static int __init arm_enable_runtime_services(void)
>> +void __init efi_enter_virtual_mode(void)
>>  {
>>   u64 mapsize;
>>
>>   if (!efi_enabled(EFI_BOOT)) {
>>   pr_info("EFI services will not be available.\n");
>> - return 0;
>> + return;
>>   }
>>
>>   mapsize = efi.memmap.desc_size * efi.memmap.nr_map;
>>
>>   if (efi_memmap_init_late(efi.memmap.phys_map, mapsize)) {
>>   pr_err("Failed to remap EFI memory map\n");
>> - return 0;
>> + return;
>>   }
>>
>>   if (efi_runtime_disabled()) {
>>   pr_info("EFI runtime services will be disabled.\n");
>> - return 0;
>> + return;
>>   }
>>
>>   if (efi_enabled(EFI_RUNTIME_SERVICES)) {
>>   pr_info("EFI runtime services access via paravirt.\n");
>> - return 0;
>> + return;
>>   }
>>
>>   pr_info("Remapping and enabling EFI services.\n");
>>
>>   if (!efi_virtmap_init()) {
>>   pr_err("UEFI virtual mapping missing or invalid -- runtime 
>> services will not be available\n");
>> - return -ENOMEM;
>> + return;
>>   }
>>
>>   /* Set up runtime services function pointers */
>>   efi_native_runtime_setup();
>>   set_bit(EFI_RUNTIME_SERVICES, );
>> -
>> - return 0;
>>  }
>> -early_initcall(arm_enable_runtime_services);
>>
>>  void efi_virtmap_load(void)
>>  {
>> diff --git a/init/main.c b/init/main.c
>> index 3b4ada11ed52..532fc0d02353 100644
>> --- a/init/main.c
>> +++ b/init/main.c
>> @@ -694,6 +694,9 @@ asmlinkage __visible void __init start_kernel(void)
>>   debug_objects_mem_init();
>>   setup_per_cpu_pageset();
>>   numa_policy_init();
>> + if (IS_ENABLED(CONFIG_EFI) &&
>> + (IS_ENABLED(CONFIG_ARM64) || IS_ENABLED(CONFIG_ARM)))
>> + efi_enter_virtual_mode();
>
> Hmm, this is ugly as hell. Is there nothing else we can piggy-back off?
> It's also fairly jarring that, on x86, efi_enter_virtual_mode() is called
> a few lines later, *after* acpi_early_init() has been called.
>

Currently, there is a gap where we have already torn down the early
mapping and haven't created the definitive mapping of the UEFI memory
map. There are other reasons why this is an issue, and I recently
proposed [0] myself to address one of them (and I didn't remember this
particular series, or the fact that I actually suggested this approach
IIRC)

Akashi-san, could you please confirm whether the patch below would be
sufficient for you? Apologies for going back and forth on this, but I
agree with Will that we should try to avoid warts like the one above
in generic code.

[0] https://marc.info/?l=linux-efi=152930773507524=2

> The rest of the series looks fine to me, but I'm not comfortable taking
> changes like this via the arm64 tree.
>
> Will

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v2 3/4] efi/arm: map UEFI memory map earlier on boot

2018-07-04 Thread Will Deacon
Hi all,

[Ard -- please can you look at the EFI parts of this patch]

On Tue, Jun 19, 2018 at 03:44:23PM +0900, AKASHI Takahiro wrote:
> Since arm_enter_runtime_services() was modified to always create a virtual
> mapping of UEFI memory map in the previous patch, it is now renamed to
> efi_enter_virtual_mode() and called earlier before acpi_load_tables()
> in acpi_early_init().
> 
> This will allow us to use UEFI memory map in acpi_os_ioremap() to create
> mappings of ACPI tables using memory attributes described in UEFI memory
> map.
> 
> See a relevant commit:
> arm64: acpi: fix alignment fault in accessing ACPI tables
> 
> Signed-off-by: AKASHI Takahiro 
> Cc: Ard Biesheuvel 
> Cc: Andrew Morton 
> ---
>  drivers/firmware/efi/arm-runtime.c | 15 ++-
>  init/main.c|  3 +++
>  2 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/firmware/efi/arm-runtime.c 
> b/drivers/firmware/efi/arm-runtime.c
> index 30ac5c82051e..566ef0a9edb5 100644
> --- a/drivers/firmware/efi/arm-runtime.c
> +++ b/drivers/firmware/efi/arm-runtime.c
> @@ -106,46 +106,43 @@ static bool __init efi_virtmap_init(void)
>   * non-early mapping of the UEFI system table and virtual mappings for all
>   * EFI_MEMORY_RUNTIME regions.
>   */
> -static int __init arm_enable_runtime_services(void)
> +void __init efi_enter_virtual_mode(void)
>  {
>   u64 mapsize;
>  
>   if (!efi_enabled(EFI_BOOT)) {
>   pr_info("EFI services will not be available.\n");
> - return 0;
> + return;
>   }
>  
>   mapsize = efi.memmap.desc_size * efi.memmap.nr_map;
>  
>   if (efi_memmap_init_late(efi.memmap.phys_map, mapsize)) {
>   pr_err("Failed to remap EFI memory map\n");
> - return 0;
> + return;
>   }
>  
>   if (efi_runtime_disabled()) {
>   pr_info("EFI runtime services will be disabled.\n");
> - return 0;
> + return;
>   }
>  
>   if (efi_enabled(EFI_RUNTIME_SERVICES)) {
>   pr_info("EFI runtime services access via paravirt.\n");
> - return 0;
> + return;
>   }
>  
>   pr_info("Remapping and enabling EFI services.\n");
>  
>   if (!efi_virtmap_init()) {
>   pr_err("UEFI virtual mapping missing or invalid -- runtime 
> services will not be available\n");
> - return -ENOMEM;
> + return;
>   }
>  
>   /* Set up runtime services function pointers */
>   efi_native_runtime_setup();
>   set_bit(EFI_RUNTIME_SERVICES, );
> -
> - return 0;
>  }
> -early_initcall(arm_enable_runtime_services);
>  
>  void efi_virtmap_load(void)
>  {
> diff --git a/init/main.c b/init/main.c
> index 3b4ada11ed52..532fc0d02353 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -694,6 +694,9 @@ asmlinkage __visible void __init start_kernel(void)
>   debug_objects_mem_init();
>   setup_per_cpu_pageset();
>   numa_policy_init();
> + if (IS_ENABLED(CONFIG_EFI) &&
> + (IS_ENABLED(CONFIG_ARM64) || IS_ENABLED(CONFIG_ARM)))
> + efi_enter_virtual_mode();

Hmm, this is ugly as hell. Is there nothing else we can piggy-back off?
It's also fairly jarring that, on x86, efi_enter_virtual_mode() is called
a few lines later, *after* acpi_early_init() has been called.

The rest of the series looks fine to me, but I'm not comfortable taking
changes like this via the arm64 tree.

Will

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec