Re: [PATCH] makedumpfile/Makefile: remove -lebl from LIBS

2019-12-05 Thread piliu



On 12/05/2019 06:36 AM, Kazuhito Hagio wrote:
> Hi Pingfan,
> 
> Thank you for the patch.
> 
>> -Original Message-
>> Since the following commit, -lebl has been removed from elfutils:
>> commit b833c731359af12af9f16bcb621b3cdc170eafbc
>> Author: Mark Wielaard 
>> Date:   Thu Aug 29 23:34:11 2019 +0200
>>
>> libebl: Don't install libebl.a, libebl.h and remove backends from spec.
>>
>> All archive members from libebl.a are now in libdw.a. We don't generate
>> separate backend shared libraries anymore. So remove them from the
>> elfutils.spec file.
>>
>> Signed-off-by: Mark Wielaard 
>>
>> So remove it from LIBS for makedumpfile.
> 
> It seems that this is OK with the latest elfutils, but what about older ones?
> Is it possible to remove -lebl only when elfutils does not have libebl.a?
I have no good answer for that yet. The method to check the version depends
on the distribution. Is it doable by checking /usr/lib64/libebl ?
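
For example, something like the following untested Makefile probe might
work (the paths are illustrative; a compile/link test would be more robust):

# Untested sketch: keep -lebl only when the static archive still exists.
LIBS = -ldw -lbz2 -ldl -lelf -lz
ifneq ($(wildcard /usr/lib64/libebl.a /usr/lib/libebl.a),)
LIBS += -lebl
endif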

Thanks,
Pingfan
> 
> Thanks,
> Kazu
> 
>>
>> Signed-off-by: Pingfan Liu 
>> ---
>>  Makefile | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/Makefile b/Makefile
>> index 1fdb628..df21b93 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -50,7 +50,7 @@ OBJ_PART=$(patsubst %.c,%.o,$(SRC_PART))
>>  SRC_ARCH = arch/arm.c arch/arm64.c arch/x86.c arch/x86_64.c arch/ia64.c arch/ppc64.c arch/s390x.c arch/ppc.c arch/sparc64.c
>>  OBJ_ARCH=$(patsubst %.c,%.o,$(SRC_ARCH))
>>
>> -LIBS = -ldw -lbz2 -lebl -ldl -lelf -lz
>> +LIBS = -ldw -lbz2 -ldl -lelf -lz
>>  ifneq ($(LINKTYPE), dynamic)
>>  LIBS := -static $(LIBS)
>>  endif
>> --
>> 2.7.5
>>
> 
> 
> 




Re: [PATCH v4 7/8] linux/log2.h: Fix 64bit calculations in roundup/down_pow_two()

2019-12-05 Thread Bjorn Helgaas
You got the "n" on "down" in the subject, but it's still missing "of" ;)

On Tue, Dec 03, 2019 at 12:47:40PM +0100, Nicolas Saenz Julienne wrote:
> Some users need to make sure their rounding function accepts and returns
> 64bit long variables regardless of the architecture. Sadly
> roundup/rounddown_pow_two() takes and returns unsigned longs. It turns
> out ilog2() already handles 32/64bit calculations properly, and being
> the building block to the round functions we can rework them as a
> wrapper around it.

Missing "of" in the function names here.
s/a wrapper/wrappers/

IIUC the point of this is that roundup_pow_of_two() returned
"unsigned long", which can be either 32 or 64 bits (worth pointing
out, I think), and many callers need something that returns
"unsigned long long" (always 64 bits).

It's a nice simplification to remove the "__" variants.  Just as a
casual reader of this commit message, I'd like to know why we had both
the roundup and the __roundup versions in the first place, and why we
no longer need both.

> -#define roundup_pow_of_two(n)\
> -(\
> - __builtin_constant_p(n) ? ( \
> - (n == 1) ? 1 :  \
> - (1UL << (ilog2((n) - 1) + 1))   \
> -) :  \
> - __roundup_pow_of_two(n) \
> - )
> +#define roundup_pow_of_two(n)  \
> +(  \
> + (__builtin_constant_p(n) && ((n) == 1)) ? \
> + 1 : (1ULL << (ilog2((n) - 1) + 1))\
> +)

Should the resulting type of this expression always be a ULL, even
when n==1, i.e., should it be this?

  1ULL : (1ULL << (ilog2((n) - 1) + 1))\

Or maybe there's no case where that makes a difference?
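
For what it's worth, a quick check suggests the resulting type is the same
either way, because the conditional operator applies the usual arithmetic
conversions to both arms:

#include <stdio.h>

int main(void)
{
	/* Both arms are converted to a common type, so the plain
	 * "1" is promoted to unsigned long long anyway. */
	printf("%zu\n", sizeof(1 ? 1 : 1ULL));  /* prints 8 */
	printf("%zu\n", sizeof(0 ? 1 : 1ULL));  /* prints 8 */
	return 0;
}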

Bjorn



Re: [PATCH] x86/efi: update e820 about reserved EFI boot services data to fix kexec breakage

2019-12-05 Thread Michael Weiser
On Thu, Dec 05, 2019 at 06:55:45PM +0800, Dave Young wrote:

> >esrt: Unsupported ESRT version 2904149718861218184.
> > 
> >  The ESRT memory stays in EFI boot services data, and it was reserved
> >  in kernel via efi_mem_reserve().  The initial purpose of the reservation
> >  is to reuse the EFI boot services data across kexec reboot. For example
> >  the BGRT image data and some ESRT memory like Michael reported.
> > 
> >  But although the memory is reserved it is not updated in the X86 E820 
> > table,
> >  and kexec_file_load() iterates system RAM in the IO resource list to find 
> > places
> >  for kernel, initramfs and other stuff. In Michael's case the kexec loaded
> >  initramfs overwrote the ESRT memory and then the failure happened.
> > 
> >  Since kexec_file_load() depends on the E820 table being updated, just fix 
> > this
> >  by updating the reserved EFI boot services memory as reserved type in E820.
> Thanks for amending it; also thanks to everyone for the review and testing.

Same from me, particularly for everyone's patience with my haphazard
guesswork around an area I clearly know nothing about. :)
-- 
Thanks,
Michael



RE: [PATCH v4 0/4] makedumpfile/arm64: Add support for ARMv8.2 extensions

2019-12-05 Thread Kazuhito Hagio
> -Original Message-
> This is your makedumpfile pulled from sourceforge.
> 
> It would be helpful if you bumped the VERSION and DATE to be certain we are
> using the correct pieces.

Good suggestion.

I wanted the command line that executed makedumpfile in the debug messages
as well, so I'll think about adding them together.
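
Roughly like this untested sketch (the VERSION/RELEASE_DATE names are from
memory, and the exact placement would need some thought):

	/* Untested sketch: dump the invoking command line in debug output. */
	int i;

	DEBUG_MSG("makedumpfile: version %s (released on %s)\n",
		  VERSION, RELEASE_DATE);
	DEBUG_MSG("command line:");
	for (i = 0; i < argc; i++)
		DEBUG_MSG(" %s", argv[i]);
	DEBUG_MSG("\n");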

Thanks,
Kazu


RE: [PATCH v4 3/4] makedumpfile/arm64: Add support for ARMv8.2-LVA (52-bit kernel VA support)

2019-12-05 Thread Kazuhito Hagio
> -Original Message-
> > > > +/*
> > > > + * The linear kernel range starts at the bottom of the virtual address
> > > > + * space. Testing the top bit for the start of the region is a
> > > > + * sufficient check and avoids having to worry about the tag.
> > > > + */
> > > > +#define is_linear_addr(addr)   (!(((unsigned long)addr) & (1UL << (vabits_actual - 1))))
> > >
> > > Does this check cover 5.3 or earlier kernels?
> > > There is no case that vabits_actual is zero?
> 
> We can set vabits_actual as va_bits for older kernels. That shouldn't
> be a big change.
> Will add it in v5. See more below ...
> 
> > As you know, 14c127c957c1 ("arm64: mm: Flip kernel VA space") changed
> > the check for linear address:
> >
> > -#define __is_lm_address(addr)  (!!((addr) & BIT(VA_BITS - 1)))
> > +#define __is_lm_address(addr)  (!((addr) & BIT(VA_BITS - 1)))
> >
> > so if we use the same check as kernel has, I think we will need the
> > former one to support earlier kernels.
> 
> See above, we can use va_bits where vabits_actual is not present.

Yes, but that is not the problem I wanted to point out here.

The problem is that, even if we set vabits_actual to va_bits, we cannot
determine whether an address is in the linear map range with just one macro
for both 5.3 and 5.4 kernels.

Because the bit value to be checked by the macro changed:

5.3 VA_BITS=48
  linear map : 0xffff800000000000 to 0xffffffffffffffff
5.4 VA_BITS=48
  linear map : 0xffff000000000000 to 0xffff7fffffffffff

Or did I miss something?
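
If that's right, makedumpfile may need something like this untested sketch,
where flipped_va would be a hypothetical flag derived from the kernel
version (or from vmcoreinfo):

	/* Untested sketch; flipped_va is TRUE for kernels with the
	 * flipped VA layout (5.4 and later). */
	static int
	is_linear_addr(unsigned long addr)
	{
		if (flipped_va)	/* 5.4+: linear map in the bottom half */
			return !(addr & (1UL << (vabits_actual - 1)));
		else		/* 5.3 and earlier: linear map in the top half */
			return !!(addr & (1UL << (vabits_actual - 1)));
	}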

Thanks,
Kazu

> 
> > > > +
> > > >  static unsigned long long
> > > >  __pa(unsigned long vaddr)
> > > >  {
> > > > if (kimage_voffset == NOT_FOUND_NUMBER ||
> > > > -   (vaddr >= PAGE_OFFSET))
> > > > -   return (vaddr - PAGE_OFFSET + info->phys_base);
> > > > +   is_linear_addr(vaddr))
> > > > +   return (vaddr + info->phys_base - PAGE_OFFSET);
> > > > else
> > > > return (vaddr - kimage_voffset);
> > > >  }
> > > > @@ -253,6 +261,7 @@ static int calculate_plat_config(void)
> > > > (PAGESIZE() == SZ_64K && va_bits == 42)) {
> > > > pgtable_level = 2;
> > > > } else if ((PAGESIZE() == SZ_64K && va_bits == 48) ||
> > > > +   (PAGESIZE() == SZ_64K && va_bits == 52) ||
> > > > (PAGESIZE() == SZ_4K && va_bits == 39) ||
> > > > (PAGESIZE() == SZ_16K && va_bits == 47)) {
> > > > pgtable_level = 3;
> > > > @@ -287,6 +296,16 @@ get_phys_base_arm64(void)
> > > > return TRUE;
> > > > }
> > > >
> > > > +   /* If both vabits_actual and va_bits are now initialized, always
> > > > +* prefer vabits_actual over va_bits to calculate PAGE_OFFSET
> > > > +* value.
> > > > +*/
> > > > +   if (vabits_actual && va_bits && vabits_actual != va_bits) {
> > > > +   info->page_offset = (-(1UL << vabits_actual));
> > > > +   DEBUG_MSG("page_offset: %lx (via vabits_actual)\n",
> > > > +   info->page_offset);
> > > > +   }
> > > > +
> > >
> > > Is this for --mem-usage?
> > > If so, let's drop it from this patch and think about it later because
> > > some additional base functions will be needed for the option, I think.
> 
> Ok.
> 
> > > > if (get_num_pt_loads() && PAGE_OFFSET) {
> > > > for (i = 0;
> > > > get_pt_load(i, _start, NULL, _start, NULL);
> > > > @@ -406,6 +425,73 @@ get_stext_symbol(void)
> > > > return(found ? kallsym : FALSE);
> > > >  }
> > > >
> > > > +static int
> > > > +get_va_bits_from_stext_arm64(void)
> > > > +{
> > > > +   ulong _stext;
> > > > +
> > > > +   _stext = get_stext_symbol();
> > > > +   if (!_stext) {
> > > > +   ERRMSG("Can't get the symbol of _stext.\n");
> > > > +   return FALSE;
> > > > +   }
> > > > +
> > > > +   /* Derive va_bits as per arch/arm64/Kconfig. Note that this is a
> > > > +* best case approximation at the moment, as there can be
> > > > +* inconsistencies in this calculation (for e.g., for
> > > > +* 52-bit kernel VA case, even the 48th bit might be set in
> > > > +* the _stext symbol).
> > > > +*
> > > > +* So, we need to rely on the actual VA_BITS symbol in the
> > > > +* vmcoreinfo for an accurate value.
> > > > +*
> > > > +* TODO: Improve this further once there is a closure with arm64
> > > > +* kernel maintainers on the same.
> > > > +*/
> > > > +   if ((_stext & PAGE_OFFSET_52) == PAGE_OFFSET_52) {
> > > > +   va_bits = 52;
> > > > +   } else if ((_stext & PAGE_OFFSET_48) == PAGE_OFFSET_48) {
> > > > +   va_bits = 48;
> > > > +   } else if ((_stext & PAGE_OFFSET_47) == PAGE_OFFSET_47) {
> > > > +   va_bits = 47;
> > > > +   } else if ((_stext & PAGE_OFFSET_42) == PAGE_OFFSET_42) {
> > > > +   va_bits = 42;
> > > > +   } else if ((_stext & PAGE_OFFSET_39) == PAGE_OFFSET_39) {
> > > > +   va_bits = 39;
> > > > + 

RE: [PATCH v4 2/4] makedumpfile/arm64: Add support for ARMv8.2-LPA (52-bit PA support)

2019-12-05 Thread Kazuhito Hagio
> -Original Message-
> Hi Kazu,
> 
> On Wed, Dec 4, 2019 at 11:07 PM Kazuhito Hagio  wrote:
> >
> > > -Original Message-
> > > ARMv8.2-LPA architecture extension (if available on underlying hardware)
> > > can support 52-bit physical addresses, while the kernel virtual
> > > addresses remain 48-bit.
> > >
> > > Make sure that we read the 52-bit PA address capability from
> > > 'MAX_PHYSMEM_BITS' variable (if available in vmcoreinfo) and
> > > accordingly change the pte_to_phy() mask values and also traverse
> > > the page-table walk accordingly.
> > >
> > > Also make sure that it works well for the existing 48-bit PA address
> > > platforms and also on environments which use newer kernels with 52-bit
> > > PA support but hardware which is not ARMv8.2-LPA compliant.
> > >
> > > I have sent a kernel patch upstream to add 'MAX_PHYSMEM_BITS' to
> > > vmcoreinfo for arm64 (see [0]).
> > >
> > > This patch is in accordance with ARMv8 Architecture Reference Manual
> > > version D.a
> > >
> > > [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023960.html
> > >
> > > Cc: Kazuhito Hagio 
> > > Cc: John Donnelly 
> > > Cc: kexec@lists.infradead.org
> > > Signed-off-by: Bhupesh Sharma 
> > > ---
> > >  arch/arm64.c | 292 
> > > +--
> > >  1 file changed, 204 insertions(+), 88 deletions(-)
> > >
> > > diff --git a/arch/arm64.c b/arch/arm64.c
> > > index 3516b340adfd..ecb19139e178 100644
> > > --- a/arch/arm64.c
> > > +++ b/arch/arm64.c
> > > @@ -39,72 +39,184 @@ typedef struct {
> > >   unsigned long pte;
> > >  } pte_t;
> > >
> >
> > > +#define __pte(x) ((pte_t) { (x) } )
> > > +#define __pmd(x) ((pmd_t) { (x) } )
> > > +#define __pud(x) ((pud_t) { (x) } )
> > > +#define __pgd(x) ((pgd_t) { (x) } )
> >
> > Is it possible to remove these macros?
> 
> Ok, will fix in v5.
> 
> > > +
> > > +static int lpa_52_bit_support_available;
> > >  static int pgtable_level;
> > >  static int va_bits;
> > >  static unsigned long kimage_voffset;
> > >
> > > -#define SZ_4K(4 * 1024)
> > > -#define SZ_16K   (16 * 1024)
> > > -#define SZ_64K   (64 * 1024)
> > > -#define SZ_128M  (128 * 1024 * 1024)
> > > +#define SZ_4K4096
> > > +#define SZ_16K   16384
> > > +#define SZ_64K   65536
> > >
> > > -#define PAGE_OFFSET_36 ((0xffffffffffffffffUL) << 36)
> > > -#define PAGE_OFFSET_39 ((0xffffffffffffffffUL) << 39)
> > > -#define PAGE_OFFSET_42 ((0xffffffffffffffffUL) << 42)
> > > -#define PAGE_OFFSET_47 ((0xffffffffffffffffUL) << 47)
> > > -#define PAGE_OFFSET_48 ((0xffffffffffffffffUL) << 48)
> > > +#define PAGE_OFFSET_36   ((0xffffffffffffffffUL) << 36)
> > > +#define PAGE_OFFSET_39   ((0xffffffffffffffffUL) << 39)
> > > +#define PAGE_OFFSET_42   ((0xffffffffffffffffUL) << 42)
> > > +#define PAGE_OFFSET_47   ((0xffffffffffffffffUL) << 47)
> > > +#define PAGE_OFFSET_48   ((0xffffffffffffffffUL) << 48)
> > > +#define PAGE_OFFSET_52   ((0xffffffffffffffffUL) << 52)
> > >
> > >  #define pgd_val(x)   ((x).pgd)
> > >  #define pud_val(x)   (pgd_val((x).pgd))
> > >  #define pmd_val(x)   (pud_val((x).pud))
> > >  #define pte_val(x)   ((x).pte)
> > >
> > > -#define PAGE_MASK(~(PAGESIZE() - 1))
> > > -#define PGDIR_SHIFT  ((PAGESHIFT() - 3) * pgtable_level + 3)
> > > -#define PTRS_PER_PGD (1 << (va_bits - PGDIR_SHIFT))
> > > -#define PUD_SHIFTget_pud_shift_arm64()
> > > -#define PUD_SIZE (1UL << PUD_SHIFT)
> > > -#define PUD_MASK (~(PUD_SIZE - 1))
> > > -#define PTRS_PER_PTE (1 << (PAGESHIFT() - 3))
> > > -#define PTRS_PER_PUD PTRS_PER_PTE
> > > -#define PMD_SHIFT((PAGESHIFT() - 3) * 2 + 3)
> > > -#define PMD_SIZE (1UL << PMD_SHIFT)
> > > -#define PMD_MASK (~(PMD_SIZE - 1))
> >
> > > +/* See 'include/uapi/linux/const.h' for definitions below */
> > > +#define __AC(X,Y)(X##Y)
> > > +#define _AC(X,Y) __AC(X,Y)
> > > +#define _AT(T,X) ((T)(X))
> > > +
> > > +/* See 'include/asm/pgtable-types.h' for definitions below */
> > > +typedef unsigned long pteval_t;
> > > +typedef unsigned long pmdval_t;
> > > +typedef unsigned long pudval_t;
> > > +typedef unsigned long pgdval_t;
> >
> > Is it possible to remove these macros/typedefs as well?
> > I don't think they make the code easier to read..
> 
> Ok. The idea behind it was to keep the makedumpfile (user-space)
> page-table-walk code as similar to the Linux kernel as possible, so
> that if we identify an issue in user-space code, it is easier to
> correlate it with the corresponding kernel code.
> 
> I will try to see how this can be addressed while making the code
> easier to read/understand.

Thank you for understanding.

Yes, 

RE: [PATCH v4 1/4] tree-wide: Retrieve 'MAX_PHYSMEM_BITS' from vmcoreinfo (if available)

2019-12-05 Thread Kazuhito Hagio
Hi Bhupesh,

> -Original Message-
> > > -Original Message-
> > > This patch adds a common feature for archs (except arm64, for which
> > > similar support is added via subsequent patch) to retrieve
> > > 'MAX_PHYSMEM_BITS' from vmcoreinfo (if available).
> >
> > We already have the calibrate_machdep_info() function, which sets
> > info->max_physmem_bits from vmcoreinfo, so practically we don't need
> > to add this patch for the benefit.

I meant that we already have an arch-independent setter for 
info->max_physmem_bits:

int
calibrate_machdep_info(void)
{
        if (NUMBER(MAX_PHYSMEM_BITS) > 0)
                info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);

        if (NUMBER(SECTION_SIZE_BITS) > 0)
                info->section_size_bits = NUMBER(SECTION_SIZE_BITS);

        return TRUE;
}

so if NUMBER(MAX_PHYSMEM_BITS) appears, it is automatically used in makedumpfile
without this patch 1/4.

Thanks,
Kazu

> 
> Since other user-space tools like crash use the 'MAX_PHYSMEM_BITS' value as
> well, it was agreed with the arm64 maintainers that it would be a good
> approach to export it in vmcoreinfo rather than use different methods to
> determine it in user-space.
> 
> Take the PPC makedumpfile implementation as an example. It uses the
> following complex method of determining 'info->max_physmem_bits':
> int
> set_ppc64_max_physmem_bits(void)
> {
> long array_len = ARRAY_LENGTH(mem_section);
> /*
>  * The older ppc64 kernels uses _MAX_PHYSMEM_BITS as 42 and the
>  * newer kernels 3.7 onwards uses 46 bits.
>  */
> 
> info->max_physmem_bits  = _MAX_PHYSMEM_BITS_ORIG ;
> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> return TRUE;
> 
> info->max_physmem_bits  = _MAX_PHYSMEM_BITS_3_7;
> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> return TRUE;
> 
> info->max_physmem_bits  = _MAX_PHYSMEM_BITS_4_19;
> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> return TRUE;
> 
> info->max_physmem_bits  = _MAX_PHYSMEM_BITS_4_20;
> if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> || (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> return TRUE;
> 
> return FALSE;
> }
> 
> This might need modification and introduction of another
> _MAX_PHYSMEM_BITS_x_y macro when this changes for a newer kernel
> version.
> 
> I think this makes the code error-prone and hard to read. It's much
> better to replace it with:
> /* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
> if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER) {
> info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
> return TRUE;
> } else {
> ..
> }
> 
> I think it will reduce future reworks (as per kernel versions) and
> also reduce issues while backporting makedumpfile to older kernels.
> 
> What do you think?
> 
> Regards,
> Bhupesh
> > > I recently posted a kernel patch (see [0]) which appends
> > > 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than
> > > in arch-specific code, so that user-space code can also benefit from
> > > this addition to the vmcoreinfo and use it as a standard way of
> > > determining 'SECTIONS_SHIFT' value in 'makedumpfile' utility.
> > >
> > > This patch ensures backward compatibility for kernel versions in which
> > > 'MAX_PHYSMEM_BITS' is not available in vmcoreinfo.
> > >
> > > [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023960.html
> > >
> > > Cc: Kazuhito Hagio 
> > > Cc: John Donnelly 
> > > Cc: kexec@lists.infradead.org
> > > Signed-off-by: Bhupesh Sharma 
> > > ---
> > >  arch/arm.c |  8 +++-
> > >  arch/ia64.c|  7 ++-
> > >  arch/ppc.c |  8 +++-
> > >  arch/ppc64.c   | 49 -
> > >  arch/s390x.c   | 29 ++---
> > >  arch/sparc64.c |  9 +++--
> > >  arch/x86.c | 34 --
> > >  arch/x86_64.c  | 27 ---
> > >  8 files changed, 109 insertions(+), 62 deletions(-)
> > >
> > > diff --git a/arch/arm.c b/arch/arm.c
> > > index af7442ac70bf..33536fc4dfc9 100644
> > > --- a/arch/arm.c
> > > +++ b/arch/arm.c
> > > @@ -81,7 +81,13 @@ int
> > >  get_machdep_info_arm(void)
> > >  {
> > >   info->page_offset = SYMBOL(_stext) & 0xffff0000UL;
> > > - info->max_physmem_bits = _MAX_PHYSMEM_BITS;
> > > +
> > > + /* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
> > > + if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
> > > + info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
> > > + 

Re: [PATCH v4 4/4] makedumpfile: Mark --mem-usage option unsupported for arm64

2019-12-05 Thread Bhupesh Sharma
Hi Kazu,

On Wed, Dec 4, 2019 at 11:20 PM Kazuhito Hagio  wrote:
>
> > -Original Message-
> > This patch marks '--mem-usage' option as unsupported for arm64
> > architecture.
> >
> > With the newer arm64 kernels supporting 48-bit/52-bit VA address spaces
> > and keeping a single binary for supporting the same, the address of
> > kernel symbols like _stext, which could earlier be used to determine
> > the VA_BITS value, can no longer be used to determine whether VA_BITS
> > is set to 48 or 52 in the kernel space.
>
> The --mem-usage option works with older arm64 kernels, so we should not
> mark it unsupported for all arm64 kernels.
>
> (If we use ELF note vmcoreinfo in kcore, is it possible to support the
> option?  Let's think about it later..)

Ok, I am in the process of discussing this with the arm64 maintainers in
detail, as the _stext symbol address can no longer be used to distinguish
48-bit vs. 52-bit kernel VA space configurations.

Other user-space utilities like 'kexec-tools' also face a similar
problem with the 52-bit change (vmcore-dmesg stops working).

I am currently caught up with another high priority issue. Will come
back with more thoughts on this in a couple of days.
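
For the record, reading VMCOREINFO out of the ELF notes in /proc/kcore
could look roughly like the untested sketch below (this is just the idea,
not makedumpfile's current code):

	#include <elf.h>
	#include <stddef.h>
	#include <string.h>

	/* Untested sketch: walk a PT_NOTE segment (already read into buf)
	 * and return the VMCOREINFO descriptor ("KEY=value\n" pairs). */
	static char *find_vmcoreinfo(char *buf, size_t len)
	{
		size_t off = 0;

		while (off + sizeof(Elf64_Nhdr) <= len) {
			Elf64_Nhdr *nh = (Elf64_Nhdr *)(buf + off);
			char *name = buf + off + sizeof(*nh);
			char *desc = name + ((nh->n_namesz + 3) & ~3UL);

			if (nh->n_namesz == sizeof("VMCOREINFO") &&
			    !memcmp(name, "VMCOREINFO", nh->n_namesz))
				return desc;

			off = (desc - buf) + ((nh->n_descsz + 3) & ~3UL);
		}
		return NULL;
	}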

Thanks,
Bhupesh

> > Hence for now, it makes sense to mark '--mem-usage' option as
> > unsupported for arm64 architecture until we have more clarity from arm64
> > kernel maintainers on how to manage the same in future
> > kernel/makedumpfile versions.
> >
> > Cc: John Donnelly 
> > Cc: Kazuhito Hagio 
> > Cc: kexec@lists.infradead.org
> > Signed-off-by: Bhupesh Sharma 
> > ---
> >  makedumpfile.c | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/makedumpfile.c b/makedumpfile.c
> > index baf559e4d74e..ae60466a1e9c 100644
> > --- a/makedumpfile.c
> > +++ b/makedumpfile.c
> > @@ -11564,6 +11564,11 @@ main(int argc, char *argv[])
> >   MSG("\n");
> >   MSG("The dmesg log is saved to %s.\n", info->name_dumpfile);
> >   } else if (info->flag_mem_usage) {
> > +#ifdef __aarch64__
> > + MSG("mem-usage not supported for arm64 architecture.\n");
> > + goto out;
> > +#endif
> > +
> >   if (!check_param_for_creating_dumpfile(argc, argv)) {
> >   MSG("Commandline parameter is invalid.\n");
> >   MSG("Try `makedumpfile --help' for more information.\n");
> > --
> > 2.7.4
> >
>
>
>




Re: [PATCH v4 2/4] makedumpfile/arm64: Add support for ARMv8.2-LPA (52-bit PA support)

2019-12-05 Thread Bhupesh Sharma
Hi Kazu,

On Wed, Dec 4, 2019 at 11:07 PM Kazuhito Hagio  wrote:
>
> > -Original Message-
> > ARMv8.2-LPA architecture extension (if available on underlying hardware)
> > can support 52-bit physical addresses, while the kernel virtual
> > addresses remain 48-bit.
> >
> > Make sure that we read the 52-bit PA address capability from
> > 'MAX_PHYSMEM_BITS' variable (if available in vmcoreinfo) and
> > accordingly change the pte_to_phy() mask values and also traverse
> > the page-table walk accordingly.
> >
> > Also make sure that it works well for the existing 48-bit PA address
> > platforms and also on environments which use newer kernels with 52-bit
> > PA support but hardware which is not ARMv8.2-LPA compliant.
> >
> > I have sent a kernel patch upstream to add 'MAX_PHYSMEM_BITS' to
> > vmcoreinfo for arm64 (see [0]).
> >
> > This patch is in accordance with ARMv8 Architecture Reference Manual
> > version D.a
> >
> > [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023960.html
> >
> > Cc: Kazuhito Hagio 
> > Cc: John Donnelly 
> > Cc: kexec@lists.infradead.org
> > Signed-off-by: Bhupesh Sharma 
> > ---
> >  arch/arm64.c | 292 
> > +--
> >  1 file changed, 204 insertions(+), 88 deletions(-)
> >
> > diff --git a/arch/arm64.c b/arch/arm64.c
> > index 3516b340adfd..ecb19139e178 100644
> > --- a/arch/arm64.c
> > +++ b/arch/arm64.c
> > @@ -39,72 +39,184 @@ typedef struct {
> >   unsigned long pte;
> >  } pte_t;
> >
>
> > +#define __pte(x) ((pte_t) { (x) } )
> > +#define __pmd(x) ((pmd_t) { (x) } )
> > +#define __pud(x) ((pud_t) { (x) } )
> > +#define __pgd(x) ((pgd_t) { (x) } )
>
> Is it possible to remove these macros?

Ok, will fix in v5.

> > +
> > +static int lpa_52_bit_support_available;
> >  static int pgtable_level;
> >  static int va_bits;
> >  static unsigned long kimage_voffset;
> >
> > -#define SZ_4K(4 * 1024)
> > -#define SZ_16K   (16 * 1024)
> > -#define SZ_64K   (64 * 1024)
> > -#define SZ_128M  (128 * 1024 * 1024)
> > +#define SZ_4K4096
> > +#define SZ_16K   16384
> > +#define SZ_64K   65536
> >
> > -#define PAGE_OFFSET_36 ((0xffffffffffffffffUL) << 36)
> > -#define PAGE_OFFSET_39 ((0xffffffffffffffffUL) << 39)
> > -#define PAGE_OFFSET_42 ((0xffffffffffffffffUL) << 42)
> > -#define PAGE_OFFSET_47 ((0xffffffffffffffffUL) << 47)
> > -#define PAGE_OFFSET_48 ((0xffffffffffffffffUL) << 48)
> > +#define PAGE_OFFSET_36   ((0xffffffffffffffffUL) << 36)
> > +#define PAGE_OFFSET_39   ((0xffffffffffffffffUL) << 39)
> > +#define PAGE_OFFSET_42   ((0xffffffffffffffffUL) << 42)
> > +#define PAGE_OFFSET_47   ((0xffffffffffffffffUL) << 47)
> > +#define PAGE_OFFSET_48   ((0xffffffffffffffffUL) << 48)
> > +#define PAGE_OFFSET_52   ((0xffffffffffffffffUL) << 52)
> >
> >  #define pgd_val(x)   ((x).pgd)
> >  #define pud_val(x)   (pgd_val((x).pgd))
> >  #define pmd_val(x)   (pud_val((x).pud))
> >  #define pte_val(x)   ((x).pte)
> >
> > -#define PAGE_MASK(~(PAGESIZE() - 1))
> > -#define PGDIR_SHIFT  ((PAGESHIFT() - 3) * pgtable_level + 3)
> > -#define PTRS_PER_PGD (1 << (va_bits - PGDIR_SHIFT))
> > -#define PUD_SHIFTget_pud_shift_arm64()
> > -#define PUD_SIZE (1UL << PUD_SHIFT)
> > -#define PUD_MASK (~(PUD_SIZE - 1))
> > -#define PTRS_PER_PTE (1 << (PAGESHIFT() - 3))
> > -#define PTRS_PER_PUD PTRS_PER_PTE
> > -#define PMD_SHIFT((PAGESHIFT() - 3) * 2 + 3)
> > -#define PMD_SIZE (1UL << PMD_SHIFT)
> > -#define PMD_MASK (~(PMD_SIZE - 1))
>
> > +/* See 'include/uapi/linux/const.h' for definitions below */
> > +#define __AC(X,Y)(X##Y)
> > +#define _AC(X,Y) __AC(X,Y)
> > +#define _AT(T,X) ((T)(X))
> > +
> > +/* See 'include/asm/pgtable-types.h' for definitions below */
> > +typedef unsigned long pteval_t;
> > +typedef unsigned long pmdval_t;
> > +typedef unsigned long pudval_t;
> > +typedef unsigned long pgdval_t;
>
> Is it possible to remove these macros/typedefs as well?
> I don't think they make the code easier to read..

Ok. The idea behind it was to keep the makedumpfile (user-space)
page-table-walk code as similar to the Linux kernel as possible, so
that if we identify an issue in user-space code, it is easier to
correlate it with the corresponding kernel code.

I will try to see how this can be addressed while making the code
easier to read/understand.

Regards,
Bhupesh

> > +
> > +#define PAGE_SHIFT   PAGESHIFT()
> > +
> > +/* See 'arch/arm64/include/asm/pgtable-hwdef.h' for definitions below */
> > +
> > +#define ARM64_HW_PGTABLE_LEVEL_SHIFT(n)  ((PAGE_SHIFT - 3) * (4 - (n)) + 3)
> > +
> > +#define PTRS_PER_PTE (1 << (PAGE_SHIFT - 

Re: [PATCH v4 1/4] tree-wide: Retrieve 'MAX_PHYSMEM_BITS' from vmcoreinfo (if available)

2019-12-05 Thread Bhupesh Sharma
Hi Kazu,

On Wed, Dec 4, 2019 at 11:05 PM Kazuhito Hagio  wrote:
>
> Hi Bhupesh,
>
> Sorry for the late reply.

No problem.

> > -Original Message-
> > This patch adds a common feature for archs (except arm64, for which
> > similar support is added via subsequent patch) to retrieve
> > 'MAX_PHYSMEM_BITS' from vmcoreinfo (if available).
>
> We already have the calibrate_machdep_info() function, which sets
> info->max_physmem_bits from vmcoreinfo, so practically we don't need
> to add this patch for the benefit.

Since other user-space tools like crash use the 'MAX_PHYSMEM_BITS' value as
well, it was agreed with the arm64 maintainers that it would be a good
approach to export it in vmcoreinfo rather than use different methods to
determine it in user-space.

Take the PPC makedumpfile implementation as an example. It uses the
following complex method of determining 'info->max_physmem_bits':
int
set_ppc64_max_physmem_bits(void)
{
long array_len = ARRAY_LENGTH(mem_section);
/*
 * The older ppc64 kernels uses _MAX_PHYSMEM_BITS as 42 and the
 * newer kernels 3.7 onwards uses 46 bits.
 */

info->max_physmem_bits  = _MAX_PHYSMEM_BITS_ORIG ;
if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
return TRUE;

info->max_physmem_bits  = _MAX_PHYSMEM_BITS_3_7;
if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
return TRUE;

info->max_physmem_bits  = _MAX_PHYSMEM_BITS_4_19;
if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
return TRUE;

info->max_physmem_bits  = _MAX_PHYSMEM_BITS_4_20;
if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
return TRUE;

return FALSE;
}

This might need modification and introduction of another
_MAX_PHYSMEM_BITS_x_y macro when this changes for a newer kernel
version.

I think this makes the code error-prone and hard to read. It's much
better to replace it with:
/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER) {
info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
return TRUE;
} else {
..
}

I think it will reduce future reworks (as per kernel versions) and
also reduce issues while backporting makedumpfile to older kernels.

What do you think?

Regards,
Bhupesh
> > I recently posted a kernel patch (see [0]) which appends
> > 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than
> > in arch-specific code, so that user-space code can also benefit from
> > this addition to the vmcoreinfo and use it as a standard way of
> > determining 'SECTIONS_SHIFT' value in 'makedumpfile' utility.
> >
> > This patch ensures backward compatibility for kernel versions in which
> > 'MAX_PHYSMEM_BITS' is not available in vmcoreinfo.
> >
> > [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023960.html
> >
> > Cc: Kazuhito Hagio 
> > Cc: John Donnelly 
> > Cc: kexec@lists.infradead.org
> > Signed-off-by: Bhupesh Sharma 
> > ---
> >  arch/arm.c |  8 +++-
> >  arch/ia64.c|  7 ++-
> >  arch/ppc.c |  8 +++-
> >  arch/ppc64.c   | 49 -
> >  arch/s390x.c   | 29 ++---
> >  arch/sparc64.c |  9 +++--
> >  arch/x86.c | 34 --
> >  arch/x86_64.c  | 27 ---
> >  8 files changed, 109 insertions(+), 62 deletions(-)
> >
> > diff --git a/arch/arm.c b/arch/arm.c
> > index af7442ac70bf..33536fc4dfc9 100644
> > --- a/arch/arm.c
> > +++ b/arch/arm.c
> > @@ -81,7 +81,13 @@ int
> >  get_machdep_info_arm(void)
> >  {
> >   info->page_offset = SYMBOL(_stext) & 0xffff0000UL;
> > - info->max_physmem_bits = _MAX_PHYSMEM_BITS;
> > +
> > + /* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
> > + if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
> > + info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
> > + else
> > + info->max_physmem_bits = _MAX_PHYSMEM_BITS;
> > +
> >   info->kernel_start = SYMBOL(_stext);
> >   info->section_size_bits = _SECTION_SIZE_BITS;
> >
> > diff --git a/arch/ia64.c b/arch/ia64.c
> > index 6c33cc7c8288..fb44dda47172 100644
> > --- a/arch/ia64.c
> > +++ b/arch/ia64.c
> > @@ -85,7 +85,12 @@ get_machdep_info_ia64(void)
> >   }
> >
> >   info->section_size_bits = _SECTION_SIZE_BITS;
> > - info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
> > +
> > + /* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
> > + if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
> > + info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
> > + 

Re: [PATCH v4 3/4] makedumpfile/arm64: Add support for ARMv8.2-LVA (52-bit kernel VA support)

2019-12-05 Thread Bhupesh Sharma
Hi Kazu,

On Thu, Dec 5, 2019 at 9:00 PM Kazuhito Hagio  wrote:
>
> > -Original Message-
> > > -Original Message-
> > > With ARMv8.2-LVA architecture extension availability, arm64 hardware
> > > which supports this extension can support up to 52-bit virtual
> > > addresses. It is especially useful for having a 52-bit user-space virtual
> > > address space while the kernel can still retain 48-bit/52-bit virtual
> > > addressing.
> > >
> > > Since at the moment we enable support for this extension in the
> > > kernel via a CONFIG flag (CONFIG_ARM64_VA_BITS_52), there are
> > > no clear mechanisms in user-space to determine this CONFIG
> > > flag value and use it to determine the kernel-space VA address range
> > > values.
> > >
> > > 'makedumpfile' can instead use 'TCR_EL1.T1SZ' value from vmcoreinfo
> > > which indicates the size offset of the memory region addressed by
> > > TTBR1_EL1 (and hence can be used for determining the
> > > vabits_actual value).
> > >
> > > The user-space computation for determining whether an address lies in
> > > the linear map range is the same as we have in kernel-space:
> > >
> > >   #define __is_lm_address(addr) (!(((u64)addr) & BIT(vabits_actual - 1)))
> > >
> > > I have sent a kernel patch upstream to add 'TCR_EL1.T1SZ' to
> > > vmcoreinfo for arm64 (see [0]).
> > >
> > > This patch is in accordance with ARMv8 Architecture Reference Manual
> > > version D.a
> > >
> > > Note that with these changes the '--mem-usage' option will not work
> > > properly for arm64 (a subsequent patch in this series will address the
> > > same) and there is an ongoing discussion with the arm64 maintainers to
> > > find a way out for the same (via standard kernel symbols like _stext).
> > >
> > > [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023962.html
> > >
> > > Cc: Kazuhito Hagio 
> > > Cc: John Donnelly 
> > > Cc: kexec@lists.infradead.org
> > > Signed-off-by: Bhupesh Sharma 
> > > ---
> > >  arch/arm64.c   | 148 
> > > +
> > >  makedumpfile.c |   2 +
> > >  makedumpfile.h |   3 +-
> > >  3 files changed, 122 insertions(+), 31 deletions(-)
> > >
> > > diff --git a/arch/arm64.c b/arch/arm64.c
> > > index ecb19139e178..094d73b8a60f 100644
> > > --- a/arch/arm64.c
> > > +++ b/arch/arm64.c
> > > @@ -47,6 +47,7 @@ typedef struct {
> > >  static int lpa_52_bit_support_available;
> > >  static int pgtable_level;
> > >  static int va_bits;
> > > +static int vabits_actual;
> > >  static unsigned long kimage_voffset;
> > >
> > >  #define SZ_4K  4096
> > > @@ -218,12 +219,19 @@ pmd_page_paddr(pmd_t pmd)
> > >  #define pte_index(vaddr)   (((vaddr) >> PAGESHIFT()) & (PTRS_PER_PTE - 1))
> > >  #define pte_offset(dir, vaddr) (pmd_page_paddr((*dir)) + pte_index(vaddr) * sizeof(pte_t))
> > >
> > > +/*
> > > + * The linear kernel range starts at the bottom of the virtual address
> > > + * space. Testing the top bit for the start of the region is a
> > > + * sufficient check and avoids having to worry about the tag.
> > > + */
> > > +#define is_linear_addr(addr)   (!(((unsigned long)addr) & (1UL << (vabits_actual - 1))))
> >
> > Does this check cover 5.3 or earlier kernels?
> > There is no case that vabits_actual is zero?

We can set vabits_actual as va_bits for older kernels. That shouldn't
be a big change.
Will add it in v5. See more below ...

> As you know, 14c127c957c1 ("arm64: mm: Flip kernel VA space") changed
> the check for linear address:
>
> -#define __is_lm_address(addr)  (!!((addr) & BIT(VA_BITS - 1)))
> +#define __is_lm_address(addr)  (!((addr) & BIT(VA_BITS - 1)))
>
> so if we use the same check as kernel has, I think we will need the
> former one to support earlier kernels.

See above, we can use va_bits where vabits_actual is not present.
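
Roughly like this untested sketch (the NUMBER(TCR_EL1_T1SZ) vmcoreinfo name
is tentative until the kernel patch is accepted):

	/* Untested sketch: T1SZ gives the TTBR1 region size as
	 * 2^(64 - T1SZ), so vabits_actual = 64 - T1SZ; fall back to
	 * va_bits on kernels that do not export it. */
	if (NUMBER(TCR_EL1_T1SZ) != NOT_FOUND_NUMBER)
		vabits_actual = 64 - NUMBER(TCR_EL1_T1SZ);
	else
		vabits_actual = va_bits;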

> > > +
> > >  static unsigned long long
> > >  __pa(unsigned long vaddr)
> > >  {
> > > if (kimage_voffset == NOT_FOUND_NUMBER ||
> > > -   (vaddr >= PAGE_OFFSET))
> > > -   return (vaddr - PAGE_OFFSET + info->phys_base);
> > > +   is_linear_addr(vaddr))
> > > +   return (vaddr + info->phys_base - PAGE_OFFSET);
> > > else
> > > return (vaddr - kimage_voffset);
> > >  }
> > > @@ -253,6 +261,7 @@ static int calculate_plat_config(void)
> > > (PAGESIZE() == SZ_64K && va_bits == 42)) {
> > > pgtable_level = 2;
> > > } else if ((PAGESIZE() == SZ_64K && va_bits == 48) ||
> > > +   (PAGESIZE() == SZ_64K && va_bits == 52) ||
> > > (PAGESIZE() == SZ_4K && va_bits == 39) ||
> > > (PAGESIZE() == SZ_16K && va_bits == 47)) {
> > > pgtable_level = 3;
> > > @@ -287,6 +296,16 @@ get_phys_base_arm64(void)
> > > return TRUE;
> > > }
> > >
> > > +   /* If both vabits_actual and va_bits are now initialized, always
> > > +* 

Re: [PATCH v4 7/8] linux/log2.h: Fix 64bit calculations in roundup/down_pow_two()

2019-12-05 Thread Robin Murphy

On 03/12/2019 11:47 am, Nicolas Saenz Julienne wrote:

Some users need to make sure their rounding function accepts and returns
64bit long variables regardless of the architecture. Sadly
roundup/rounddown_pow_two() takes and returns unsigned longs. It turns
out ilog2() already handles 32/64bit calculations properly, and being
the building block to the round functions we can rework them as a
wrapper around it.


Neat! Although all the additional ULL casts this introduces seem somewhat
unwelcome - I suppose the (1ULL << (ilog2(n))) makes it effectively always
return unsigned long long now. Might it make sense to cast the return value
to typeof(n) to avoid this slightly non-obvious behaviour (and the
associated churn)?
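
i.e. something along these lines (untested sketch, not the actual patch):

	#define roundup_pow_of_two(n)					\
	(								\
		(__builtin_constant_p(n) && ((n) == 1)) ?		\
			(typeof(n))1 :					\
			(typeof(n))(1ULL << (ilog2((n) - 1) + 1))	\
	)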


Robin.


Suggested-by: Robin Murphy 
Signed-off-by: Nicolas Saenz Julienne 
---
  drivers/clk/clk-divider.c|  8 ++--
  drivers/clk/sunxi/clk-sunxi.c|  2 +-
  drivers/infiniband/hw/hfi1/chip.c|  4 +-
  drivers/infiniband/hw/hfi1/init.c|  4 +-
  drivers/infiniband/hw/mlx4/srq.c |  2 +-
  drivers/infiniband/hw/mthca/mthca_srq.c  |  2 +-
  drivers/infiniband/sw/rxe/rxe_qp.c   |  4 +-
  drivers/iommu/intel-iommu.c  |  4 +-
  drivers/iommu/intel-svm.c|  4 +-
  drivers/iommu/intel_irq_remapping.c  |  2 +-
  drivers/net/ethernet/amd/xgbe/xgbe-ethtool.c |  4 +-
  drivers/net/ethernet/marvell/sky2.c  |  2 +-
  drivers/net/ethernet/rocker/rocker_hw.h  |  4 +-
  drivers/net/ethernet/sfc/ef10.c  |  2 +-
  drivers/net/ethernet/sfc/efx.h   |  2 +-
  drivers/net/ethernet/sfc/falcon/efx.h|  2 +-
  drivers/pci/msi.c|  2 +-
  include/linux/log2.h | 44 +---
  kernel/kexec_core.c  |  3 +-
  lib/rhashtable.c |  2 +-
  net/sunrpc/xprtrdma/verbs.c  |  2 +-
  21 files changed, 41 insertions(+), 64 deletions(-)

diff --git a/drivers/clk/clk-divider.c b/drivers/clk/clk-divider.c
index 098b2b01f0af..ba947e4c8193 100644
--- a/drivers/clk/clk-divider.c
+++ b/drivers/clk/clk-divider.c
@@ -222,7 +222,7 @@ static int _div_round_up(const struct clk_div_table *table,
int div = DIV_ROUND_UP_ULL((u64)parent_rate, rate);
  
  	if (flags & CLK_DIVIDER_POWER_OF_TWO)

-   div = __roundup_pow_of_two(div);
+   div = roundup_pow_of_two(div);
if (table)
div = _round_up_table(table, div);
  
@@ -240,8 +240,8 @@ static int _div_round_closest(const struct clk_div_table *table,

down = parent_rate / rate;
  
  	if (flags & CLK_DIVIDER_POWER_OF_TWO) {

-   up = __roundup_pow_of_two(up);
-   down = __rounddown_pow_of_two(down);
+   up = roundup_pow_of_two(up);
+   down = rounddown_pow_of_two(down);
} else if (table) {
up = _round_up_table(table, up);
down = _round_down_table(table, down);
@@ -278,7 +278,7 @@ static int _next_div(const struct clk_div_table *table, int div,
div++;
  
  	if (flags & CLK_DIVIDER_POWER_OF_TWO)

-   return __roundup_pow_of_two(div);
+   return roundup_pow_of_two(div);
if (table)
return _round_up_table(table, div);
  
diff --git a/drivers/clk/sunxi/clk-sunxi.c b/drivers/clk/sunxi/clk-sunxi.c

index 27201fd26e44..faec99dc09c0 100644
--- a/drivers/clk/sunxi/clk-sunxi.c
+++ b/drivers/clk/sunxi/clk-sunxi.c
@@ -311,7 +311,7 @@ static void sun6i_get_ahb1_factors(struct factors_request *req)
  
  		calcm = DIV_ROUND_UP(div, 1 << calcp);

} else {
-   calcp = __roundup_pow_of_two(div);
+   calcp = roundup_pow_of_two(div);
calcp = calcp > 3 ? 3 : calcp;
}
  
diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c

index 9b1fb84a3d45..96b1d343c32f 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -14199,10 +14199,10 @@ static int qos_rmt_entries(struct hfi1_devdata *dd, unsigned int *mp,
max_by_vl = krcvqs[i];
if (max_by_vl > 32)
goto no_qos;
-   m = ilog2(__roundup_pow_of_two(max_by_vl));
+   m = ilog2(roundup_pow_of_two(max_by_vl));
  
  	/* determine bits for vl */

-   n = ilog2(__roundup_pow_of_two(num_vls));
+   n = ilog2(roundup_pow_of_two(num_vls));
  
  	/* reject if too much is used */

if ((m + n) > 7)
diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index 26b792bb1027..838c789c7cce 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -467,7 +467,7 @@ int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int numa,
 * MTU supported.
 */
if (rcd->egrbufs.size < hfi1_max_mtu) {
-

RE: [PATCH v4 3/4] makedumpfile/arm64: Add support for ARMv8.2-LVA (52-bit kernel VA support)

2019-12-05 Thread Kazuhito Hagio
> -Original Message-
> > -Original Message-
> > With ARMv8.2-LVA architecture extension availability, arm64 hardware
> > which supports this extension can support up to 52-bit virtual
> > addresses. It is especially useful for having a 52-bit user-space virtual
> > address space while the kernel can still retain 48-bit/52-bit virtual
> > addressing.
> >
> > Since at the moment we enable support for this extension in the
> > kernel via a CONFIG flag (CONFIG_ARM64_VA_BITS_52), there are
> > no clear mechanisms in user-space to determine this CONFIG
> > flag value and use it to determine the kernel-space VA address range
> > values.
> >
> > 'makedumpfile' can instead use 'TCR_EL1.T1SZ' value from vmcoreinfo
> > which indicates the size offset of the memory region addressed by
> > TTBR1_EL1 (and hence can be used for determining the
> > vabits_actual value).
> >
> > The user-space computation for determining whether an address lies in
> > the linear map range is the same as we have in kernel-space:
> >
> >   #define __is_lm_address(addr) (!(((u64)addr) & BIT(vabits_actual - 1)))
> >
> > I have sent a kernel patch upstream to add 'TCR_EL1.T1SZ' to
> > vmcoreinfo for arm64 (see [0]).
> >
> > This patch is in accordance with ARMv8 Architecture Reference Manual
> > version D.a
> >
> > Note that with these changes the '--mem-usage' option will not work
> > properly for arm64 (a subsequent patch in this series will address the
> > same) and there is an ongoing discussion with the arm64 maintainers to
> > find a way out for the same (via standard kernel symbols like _stext).
> >
> > [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023962.html
> >
> > Cc: Kazuhito Hagio 
> > Cc: John Donnelly 
> > Cc: kexec@lists.infradead.org
> > Signed-off-by: Bhupesh Sharma 
> > ---
> >  arch/arm64.c   | 148 
> > +
> >  makedumpfile.c |   2 +
> >  makedumpfile.h |   3 +-
> >  3 files changed, 122 insertions(+), 31 deletions(-)
> >
> > diff --git a/arch/arm64.c b/arch/arm64.c
> > index ecb19139e178..094d73b8a60f 100644
> > --- a/arch/arm64.c
> > +++ b/arch/arm64.c
> > @@ -47,6 +47,7 @@ typedef struct {
> >  static int lpa_52_bit_support_available;
> >  static int pgtable_level;
> >  static int va_bits;
> > +static int vabits_actual;
> >  static unsigned long kimage_voffset;
> >
> >  #define SZ_4K  4096
> > @@ -218,12 +219,19 @@ pmd_page_paddr(pmd_t pmd)
> >  #define pte_index(vaddr)   (((vaddr) >> PAGESHIFT()) & (PTRS_PER_PTE - 1))
> >  #define pte_offset(dir, vaddr) (pmd_page_paddr((*dir)) + pte_index(vaddr) * sizeof(pte_t))
> >
> > +/*
> > + * The linear kernel range starts at the bottom of the virtual address
> > + * space. Testing the top bit for the start of the region is a
> > + * sufficient check and avoids having to worry about the tag.
> > + */
> > +#define is_linear_addr(addr)   (!(((unsigned long)addr) & (1UL << (vabits_actual - 1))))
> 
> Does this check cover 5.3 or earlier kernels?
> There is no case that vabits_actual is zero?

As you know, 14c127c957c1 ("arm64: mm: Flip kernel VA space") changed
the check for linear address:

-#define __is_lm_address(addr)  (!!((addr) & BIT(VA_BITS - 1)))
+#define __is_lm_address(addr)  (!((addr) & BIT(VA_BITS - 1)))

so if we use the same check as kernel has, I think we will need the
former one to support earlier kernels.

> 
> > +
> >  static unsigned long long
> >  __pa(unsigned long vaddr)
> >  {
> > if (kimage_voffset == NOT_FOUND_NUMBER ||
> > -   (vaddr >= PAGE_OFFSET))
> > -   return (vaddr - PAGE_OFFSET + info->phys_base);
> > +   is_linear_addr(vaddr))
> > +   return (vaddr + info->phys_base - PAGE_OFFSET);
> > else
> > return (vaddr - kimage_voffset);
> >  }
> > @@ -253,6 +261,7 @@ static int calculate_plat_config(void)
> > (PAGESIZE() == SZ_64K && va_bits == 42)) {
> > pgtable_level = 2;
> > } else if ((PAGESIZE() == SZ_64K && va_bits == 48) ||
> > +   (PAGESIZE() == SZ_64K && va_bits == 52) ||
> > (PAGESIZE() == SZ_4K && va_bits == 39) ||
> > (PAGESIZE() == SZ_16K && va_bits == 47)) {
> > pgtable_level = 3;
> > @@ -287,6 +296,16 @@ get_phys_base_arm64(void)
> > return TRUE;
> > }
> >
> > +   /* If both vabits_actual and va_bits are now initialized, always
> > +* prefer vabits_actual over va_bits to calculate PAGE_OFFSET
> > +* value.
> > +*/
> > +   if (vabits_actual && va_bits && vabits_actual != va_bits) {
> > +   info->page_offset = (-(1UL << vabits_actual));
> > +   DEBUG_MSG("page_offset: %lx (via vabits_actual)\n",
> > +   info->page_offset);
> > +   }
> > +
> 
> Is this for --mem-usage?
> If so, let's drop it from this patch and think about it later because
> some 

Re: [RFC PATCH v5 0/3] printk: new ringbuffer implementation

2019-12-05 Thread John Ogness
Hi Prarit,

On 2019-12-05, Prarit Bhargava  wrote:
> Based on the comments there is going to be a v6 but in any case I am
> starting testing of this patchset on several large core systems across
> multiple architectures (x86_64, ARM64, s390, ppc64le).  Some of those
> systems are known to fail boot due to the large amount of printk output so
> it will be good to see if these changes resolve those issues.

Right now the patches only include the ringbuffer as a separate entity
with a test module. So they do not yet have any effect on printk.

If you apply the patches and then build the "modules" target, you will
have a new test_prb.ko module. Loading that module will start some heavy
testing of the ringbuffer. As long as the testing is successful, the
module will keep testing. During this time the machine will be very
slow, but should still respond.

The test can be stopped by unloading the module. If the test stops on
its own, then a problem was found. The output of the test is put into
the ftrace buffer.
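
For example (assuming tracefs is mounted at the usual location; the exact
commands are illustrative):

	make modules
	sudo insmod kernel/printk/test_prb.ko
	sudo cat /sys/kernel/debug/tracing/trace   # test output lands here
	sudo rmmod test_prb                        # stop the test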

It would be nice if you could run the test module on some fat machines,
at least for a few minutes to see if anything explodes. ARM64 and
ppc64le will probably be the most interesting, due to memory barrier
testing.

Otherwise I will definitely be reaching out to you when we are ready to
perform actual printk testing with the newly agreed-upon semantics
(lockless, per-console printing threads, synchronous panic
consoles). Thanks for your help with this.

John Ogness



Re: [RFC PATCH v5 0/3] printk: new ringbuffer implementation

2019-12-05 Thread Prarit Bhargava
  John Ogness  wrote:
> Hello,
> 
> This is a follow-up RFC on the work to re-implement much of the
> core of printk. The threads for the previous RFC versions are
> here[0][1][2][3].
> 
> This RFC includes only the ringbuffer and a test module. This is
> a rewrite of the proposed ringbuffer, now based on the proof of
> concept[4] from Petr Mladek as agreed at the meeting[5] during
> LPC2019 in Lisbon.
> 
> The internal structure has been reworked such that the printk
> strings are in their own array, each separated by a 32-bit
> integer.
> 
> A 2nd array contains the dictionary strings (also with each
> separated by a 32-bit integer).
> 
> A 3rd array is made up of descriptors that contain all the
> meta-data for each printk record (sequence number, timestamp,
> loglevel, caller, etc.) as well as pointers into the other data
> arrays for the text and dictionary data.
> 
> The writer interface is somewhat similar to v4, but the reader
> interface has changed significantly. Rather than using an
> iterator object, readers just specify the sequence number they
> want to read. In effect, the sequence number acts as the
> iterator.
> 
> I have been communicating with Petr the last couple months to
> make sure this implementation fits his expectations. This RFC is
> mainly to get some feedback from anyone else that may see
> something that Petr and I have missed.
> 
> This series also includes my test module. On a 16-core ARM64
> test machine, the module runs without any errors. I am seeing
> the 15 writing cores each writing about 34500 records per
> second, while the 1 reading core misses only about 15% of the
> total records.
> 

John,

Based on the comments there is going to be a v6 but in any case I am
starting testing of this patchset on several large core systems across
multiple architectures (x86_64, ARM64, s390, ppc64le).  Some of those
systems are known to fail boot due to the large amount of printk output so
it will be good to see if these changes resolve those issues.

Sorry for the delay and I'll report back in a few days.

P.

> John Ogness
> 
> [0] https://lkml.kernel.org/r/20190212143003.48446-1-john.ogn...@linutronix.de
> [1] https://lkml.kernel.org/r/20190607162349.18199-1-john.ogn...@linutronix.de
> [2] https://lkml.kernel.org/r/2019072701.11260-1-john.ogn...@linutronix.de
> [3] https://lkml.kernel.org/r/20190807222634.1723-1-john.ogn...@linutronix.de
> [4] https://lkml.kernel.org/r/20190704103321.10022-1-pmla...@suse.com
> [5] https://lkml.kernel.org/r/87k1acz5rx@linutronix.de
> 
> John Ogness (3):
>   printk-rb: new printk ringbuffer implementation (writer)
>   printk-rb: new printk ringbuffer implementation (reader)
>   printk-rb: add test module
> 
>  kernel/printk/Makefile|   3 +
>  kernel/printk/printk_ringbuffer.c | 910 ++
>  kernel/printk/printk_ringbuffer.h | 249 
>  kernel/printk/test_prb.c  | 347 
>  4 files changed, 1509 insertions(+)
>  create mode 100644 kernel/printk/printk_ringbuffer.c
>  create mode 100644 kernel/printk/printk_ringbuffer.h
>  create mode 100644 kernel/printk/test_prb.c
> 
> -- 
> 2.20.1




Re: [RFC PATCH v5 1/3] printk-rb: new printk ringbuffer implementation (writer)

2019-12-05 Thread John Ogness
On 2019-12-03, Steven Rostedt  wrote:
> +/* Reserve a new descriptor, invalidating the oldest if necessary. */
> +static bool desc_reserve(struct printk_ringbuffer *rb, u32 *id_out)
> +{
> + struct prb_desc_ring *desc_ring = &rb->desc_ring;
> + struct prb_desc *desc;
> + u32 id_prev_wrap;
> + u32 head_id;
> + u32 id;
> +
> + head_id = atomic_read(&desc_ring->head_id);
> +
> + do {
> + desc = to_desc(desc_ring, head_id);
> +
> + id = DESC_ID(head_id + 1);
> + id_prev_wrap = DESC_ID_PREV_WRAP(desc_ring, id);
> +
> + if (id_prev_wrap == atomic_read(&desc_ring->tail_id)) {
> + if (!desc_push_tail(rb, id_prev_wrap))
> + return false;
> + }
> + } while (!atomic_try_cmpxchg(&desc_ring->head_id, &head_id, id));

>>>> Hmm, in theory, ABA problem might cause that we successfully
>>>> move desc_ring->head_id when tail has not been pushed yet.
>>>>
>>>> As a result we would never call desc_push_tail() until
>>>> it overflows again.
>>>>
>>>> I am not sure if we need to take care of it. The code is called
>>>> with interrupts disabled. IMHO, only NMI could cause ABA problem in
>>>> reality. But the game (debugging) is lost anyway when NMI overwrites
>>>> the buffer several times.
>>>>
>>>> Also it should not be a complete failure. We still could bail out
>>>> when the previous state was not reusable. We are on the safe side
>>>> when it was reusable.
>>>>
>>>> Also we could probably detect when desc_ring->tail_id is not
>>>> updated any longer and do something about it.
>>>>
>>>> BTW: If I am counting correctly, the ABA problem would happen when
>>>> exactly 2^30 (1G) messages are written in the meantime.
>>> 
>>> All the ringbuffer code assumes that the use of index numbers
>>> handles the ABA problem (i.e. there must not be 1 billion printk's
>>> within an NMI). If we want to support 1 billion+ printk's within an
>>> NMI, then perhaps the index number should be increased. For 64-bit
>>> systems it would be no problem to go to 62 bits. For 32-bit systems,
>>> I don't know how well the 64-bit atomic operations are supported.
>> 
>> ftrace dumps from NMI (DUMP_ALL type ftrace_dump_on_oops on a $BIG
>> machine)? 1G seems large enough, but who knows.
>
> ftrace dump from NMI is the most likely case to hit this, but when
> that happens, you are in debugging mode, and the system usually
> becomes unreliable at this moment. I agree with Petr, that we should
> not complicate the code more to handle this theoretical condition.

I will change the data block ID size to "unsigned long". Then it is
really only an issue for 32-bit systems.

AFAICT, the only real issue is that the head can be pushed to the
descriptor index that the tail is pointing to. From there, the head can
be advanced beyond and the tail may never move again. So writers just
need to make sure the tail gets pushed beyond the newly reserved head
before making any changes to that descriptor.

Aside from moving to "unsigned long" ID's, I believe this can be handled
with the following 4 changes when reserving a new descriptor:

- also push the tail forward if it falls behind

- after pushing the head, re-check if the tail is still ok

- verify the state of the reserved descriptor before changing it

- use cmpxchg() to change it

Below is the patch snippet I use for this. (Note that the patch is
against a version where ID's have already been changed to unsigned
longs.)

John Ogness


@@ -468,19 +470,53 @@ static bool desc_reserve(struct printk_ringbuffer *rb, unsigned long *id_out)
id = DESC_ID(head_id + 1);
id_prev_wrap = DESC_ID_PREV_WRAP(desc_ring, id);
 
-   if (id_prev_wrap == atomic_long_read(&desc_ring->tail_id)) {
-   if (!desc_push_tail(rb, id_prev_wrap))
+   /*
+* Make space for the new descriptor by pushing the tail
+* beyond the descriptor to be reserved. Normally this only
+* requires pushing the tail once. But due to possible ABA
+* problems with the ID, the tail could be too far behind.
+* Use a loop to push the tail where it needs to be.
+*/
+   tail_id = atomic_long_read(&desc_ring->tail_id);
+   while (DESC_ID(id_prev_wrap - tail_id) <
+  DESCS_COUNT(desc_ring)) {
+
+   if (!desc_push_tail(rb, tail_id))
return false;
+   tail_id = DESC_ID(tail_id + 1);
}
} while (!atomic_long_try_cmpxchg(&desc_ring->head_id, &head_id, id));
 
+   /*
+* Before moving the head, it was ensured that the tail was pushed to
+* where it needs to be. Due to possible ABA problems with the ID, the
+* reserved descriptor may not be what was expected. Re-check the tail
+* to see if it is still where it needs to 

Re: [PATCH] x86/efi: update e820 about reserved EFI boot services data to fix kexec breakage

2019-12-05 Thread Dave Young
On 12/04/19 at 11:14am, Ingo Molnar wrote:
> 
> * Dave Young  wrote:
> 
> > On 12/04/19 at 03:52pm, Dave Young wrote:
> > > Michael Weiser reported he got below error during a kexec rebooting:
> > > esrt: Unsupported ESRT version 2904149718861218184.
> > > 
> > > The ESRT memory stays in EFI boot services data, and it was reserved
> > > in kernel via efi_mem_reserve().  The initial purpose of the reservation
> > > is to reuse the EFI boot services data across kexec reboot. For example
> > > the BGRT image data and some ESRT memory like Michael reported. 
> > > 
> > > But although the memory is reserved it is not updated in X86 e820 table.
> > > And kexec_file_load iterate system ram in io resource list to find places
> > > for kernel, initramfs and other stuff. In Michael's case the kexec loaded
> > > initramfs overwritten the ESRT memory and then the failure happened.
> > 
> > s/overwritten/overwrote :)  If you need a repost, please let me know.
> > 
> > > 
> > > Since kexec_file_load depends on the e820 to be updated, just fix this
> > > by updating the reserved EFI boot services memory as reserved type in 
> > > e820.
> > > 
> > > Originally any memory descriptors with EFI_MEMORY_RUNTIME attribute are
> > > bypassed in the reservation code path because they are assumed as 
> > > reserved.
> > > But the reservation is still needed for multiple kexec reboot.
> > > And it is the only possible case we come here thus just drop the code
> > > chunk then everything works without side effects. 
> > > 
> > > On my machine the ESRT memory sits in an EFI runtime data range, it does
> > > not trigger the problem, but I successfully tested with BGRT instead.
> > > Both kexec_load and kexec_file_load work and kdump works as well.
> > > 
> > > Signed-off-by: Dave Young 
> 
> 
> So I edited this to:
> 
>  From: Dave Young 
> 
>  Michael Weiser reported he got this error during a kexec rebooting:
> 
>esrt: Unsupported ESRT version 2904149718861218184.
> 
>  The ESRT memory stays in EFI boot services data, and it was reserved
>  in kernel via efi_mem_reserve().  The initial purpose of the reservation
>  is to reuse the EFI boot services data across kexec reboot. For example
>  the BGRT image data and some ESRT memory like Michael reported.
> 
>  But although the memory is reserved it is not updated in the X86 E820 table,
>  and kexec_file_load() iterates system RAM in the IO resource list to find 
> places
>  for kernel, initramfs and other stuff. In Michael's case the kexec loaded
>  initramfs overwrote the ESRT memory and then the failure happened.
> 
>  Since kexec_file_load() depends on the E820 table being updated, just fix 
> this
>  by updating the reserved EFI boot services memory as reserved type in E820.
> 
>  Originally any memory descriptors with EFI_MEMORY_RUNTIME attribute are
>  bypassed in the reservation code path because they are assumed as reserved.
> 
>  But the reservation is still needed for multiple kexec reboots,
>  and it is the only possible case we come here thus just drop the code
>  chunk, then everything works without side effects.
> 
>  On my machine the ESRT memory sits in an EFI runtime data range, it does
>  not trigger the problem, but I successfully tested with BGRT instead.
>  Both kexec_load() and kexec_file_load() work and kdump works as well.
> 

Thanks for amending it; also thanks to everyone for the review and testing.
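
For anyone finding this later, the fix boils down to dropping the
EFI_MEMORY_RUNTIME early return in efi_arch_mem_reserve() and marking the
range reserved in E820, roughly (sketch, not the exact hunk):

	/* Sketch only: mark the reserved boot services region as
	 * reserved in E820 so kexec_file_load() will skip it. */
	e820__range_update(addr, size, E820_TYPE_RAM, E820_TYPE_RESERVED);
	e820__update_table(e820_table);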

Dave




Re: [PATCH] efi/memreserve: register reservations as 'reserved' in /proc/iomem

2019-12-05 Thread Ard Biesheuvel
On Wed, 4 Dec 2019 at 20:13, Bhupesh SHARMA  wrote:
>
> Hello Masa,
>
> (+Cc Simon)
>
> On Thu, Dec 5, 2019 at 12:27 AM Masayoshi Mizuma  
> wrote:
> >
> > On Wed, Dec 04, 2019 at 06:17:59PM +, James Morse wrote:
> > > Hi Masa,
> > >
> > > On 04/12/2019 17:17, Masayoshi Mizuma wrote:
> > > > Thank you for sending the patch, but unfortunately it doesn't work for 
> > > > the issue...
> > > >
> > > > After applied your patch, the LPI tables are marked as reserved in
> > > > /proc/iomem like as:
> > > >
> > > > 8030-a1fd : System RAM
> > > >   8048-8134 : Kernel code
> > > >   8135-817b : reserved
> > > >   817c-82ac : Kernel data
> > > >   830f-830f : reserved # Property table
> > > >   8348-83480fff : reserved # Pending table
> > > >   8349-8349 : reserved # Pending table
> > > >
> > > > However, kexec tries to allocate memory from System RAM; it doesn't care
> > > > about the reserved regions inside System RAM.
> > >
> > > > I'm not sure why kexec doesn't care about the reserved regions in System RAM, however,
> > >
> > > Hmm, we added these to fix a problem with the UEFI memory map, and more 
> > > recently ACPI
> > > tables being overwritten by kexec.
> > >
> > > Which version of kexec-tools are you using? Could you try:
> > > https://git.linaro.org/people/takahiro.akashi/kexec-tools.git/commit/?h=arm64/resv_mem
> >
> > Thanks a lot! It worked and the issue is gone with Ard's patch and
> > the linaro kexec (arm64/resv_mem branch).
> >
> > Ard, please feel free to add:
> >
> > Tested-by: Masayoshi Mizuma 
>
> Same results at my side, so:
> Tested-and-Reviewed-by: Bhupesh Sharma 
>

Thank you all. I'll get this queued as a fix with cc:stable for v5.4


> > >
> > > > if the kexec behavior is right, the LPI tables should not belong to
> > > > System RAM.
> > >
> > > > Like as:
> > > >
> > > > 8030-830e : System RAM
> > > >   8048-8134 : Kernel code
> > > >   8135-817b : reserved
> > > >   817c-82ac : Kernel data
> > > > 830f-830f : reserved # Property table
> > > > 8348-83480fff : reserved # Pending table
> > > > 8349-8349 : reserved # Pending table
> > > > 834a-a1fd : System RAM
> > > >
> > > > I don't have any ideas for separating the LPI tables from System RAM... so I tried
> > > > to add a new file to expose the LPI tables to userspace.
> > >
> > > This is how 'nomap' memory appears, we carve it out of System RAM. A side 
> > > effect of this
> > > is kdump can't touch it, as you've told it this isn't memory.
> > >
> > > As these tables are memory, mapped by the linear map, I think Ard's patch 
> > > is the right
> > > thing to do ... I suspect your kexec-tools doesn't have those patches 
> > > from Akashi to make
> > > it honour all second level entries.
> >
> > I used the kexec on the top of master branch:
> > git://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
> >
> > Should we use the linaro kexec for aarch64 machines?
> > Or will the arm64/resv_mem branch be merged into the kexec on
> > git.kernel.org...?
>
> Glad that Ard's patch fixes the issue for you.
> Regarding Akashi's patch, I think it was sent to upstream kexec-tools
> some time ago (see [0}) but  seems not integrated in upstream
> kexec-tools (now I noticed my Tested-by email for the same got bounced
> off due to some gmail msmtp setting issues at my end - sorry for
> that). I have added Simon in Cc list.
>
> Hi Simon,
>
> Can you please help pick up [0] in upstream kexec-tools with the Tested-by
> tags from Masa and myself? Thanks a lot for your help.
>
> [0]. http://lists.infradead.org/pipermail/kexec/2019-January/022201.html
>
> Thanks,
> Bhupesh



Re: [PATCH v4 7/8] linux/log2.h: Fix 64bit calculations in roundup/down_pow_two()

2019-12-05 Thread Leon Romanovsky
On Tue, Dec 03, 2019 at 12:47:40PM +0100, Nicolas Saenz Julienne wrote:
> Some users need to make sure their rounding function accepts and returns
> 64bit long variables regardless of the architecture. Sadly
> roundup/rounddown_pow_two() takes and returns unsigned longs. It turns
> out ilog2() already handles 32/64bit calculations properly, and being
> the building block to the round functions we can rework them as a
> wrapper around it.
>
> Suggested-by: Robin Murphy 
> Signed-off-by: Nicolas Saenz Julienne 
> ---
>  drivers/clk/clk-divider.c|  8 ++--
>  drivers/clk/sunxi/clk-sunxi.c|  2 +-
>  drivers/infiniband/hw/hfi1/chip.c|  4 +-
>  drivers/infiniband/hw/hfi1/init.c|  4 +-
>  drivers/infiniband/hw/mlx4/srq.c |  2 +-
>  drivers/infiniband/hw/mthca/mthca_srq.c  |  2 +-
>  drivers/infiniband/sw/rxe/rxe_qp.c   |  4 +-


Thanks. For the infiniband changes:
Reviewed-by: Leon Romanovsky 
