Re: [PATCH v2] powerpc: add documentation for HWCAPs

2022-07-15 Thread Tulio Magno Quites Machado Filho
Segher Boessenkool  writes:

> That is a usability problem.  Can it be fixed, or will that create its
> own compatibility problems?  In practice I mean.  If it is, the C
> libraries could fix it up, for new programs, and then after a while the
> kernel can do the sane thing?
>
> How big is the problem, anyway?  Is it only 2.05, or also 2.04, 2.03?

PPC_FEATURE_ARCH_2_05 is the first bit referring to an ISA level.
Before that, AT_HWCAP used to have bits for specific processors, e.g.
PPC_FEATURE_CELL and PPC_FEATURE_POWER4.

Notice that glibc creates its own hwcap-based information that is used by
__builtin_cpu_supports().  In this case bits PPC_FEATURE_ARCH_2_05,
PPC_FEATURE_POWER5_PLUS, PPC_FEATURE_POWER5 and PPC_FEATURE_POWER4 are enabled
whenever if the processor is compatible with the features provided by any of
the previous processors [1].
AT_HWCAP and AT_HWCAP2 are kept intact, though.

[1] 
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/powerpc/hwcapinfo.c;h=afde05f86382413ce1f0c38e33c9bdd38d6b7e9d;hb=HEAD#l45

-- 
Tulio Magno


Re: [PATCH v2] powerpc: add documentation for HWCAPs

2022-07-15 Thread Tulio Magno Quites Machado Filho
Nicholas Piggin  writes:

> +PPC_FEATURE_ARCH_2_05
> +The processor supports the v2.05 userlevel architecture. Processors
> +supporting later architectures also set this feature.

I don't think this bit is enabled when processors support later architectures.
In my tests, this behavior started only with v2.06, i.e. processors that
support v2.07 enable bit v2.06, but do not enable bit v2.05.

Otherwise, looks good to me.

-- 
Tulio Magno


Re: [PATCH 2/2] powerpc/signal: Report minimum signal frame size to userspace via AT_MINSIGSTKSZ

2022-05-18 Thread Tulio Magno Quites Machado Filho
Nicholas Piggin  writes:

> Implement the AT_MINSIGSTKSZ AUXV entry, allowing userspace to
> dynamically size stack allocations in a manner forward-compatible with
> new processor state saved in the signal frame
>
> For now these statically find the maximum signal frame size rather than
> doing any runtime testing of features to minimise the size.
>
> glibc 2.34 will take advantage of this, as will applications that use
> use _SC_MINSIGSTKSZ and _SC_SIGSTKSZ.
>
> Cc: Alan Modra 
> Cc: Tulio Magno Quites Machado Filho 
> References: 94b07c1f8c39 ("arm64: signal: Report signal frame size to 
> userspace via auxv")
> Signed-off-by: Nicholas Piggin 

Both patches LGTM from a glibc point of view.

Reviewed-by: Tulio Magno Quites Machado Filho 

Thanks!

-- 
Tulio Magno


Re: Linux powerpc new system call instruction and ABI

2021-05-19 Thread Tulio Magno Quites Machado Filho
Nicholas Piggin via Libc-alpha  writes:

> As a more hacky thing you could make a syscall with -1 and see how
> the error looks, and then assume all syscalls will be the same.

I'm not sure this would work.
Even in glibc, it's expected that early syscalls will use sc while scv is used
later in the execution.

-- 
Tulio Magno


Re: [PATCH v8 5/8] powerpc/vdso: Prepare for switching VDSO to generic C implementation.

2020-07-16 Thread Tulio Magno Quites Machado Filho
Christophe Leroy  writes:

> Michael Ellerman  a écrit :
>
>> Christophe Leroy  writes:
>>> Prepare for switching VDSO to generic C implementation in following
>>> patch. Here, we:
>>> - Modify __get_datapage() to take an offset
>>> - Prepare the helpers to call the C VDSO functions
>>> - Prepare the required callbacks for the C VDSO functions
>>> - Prepare the clocksource.h files to define VDSO_ARCH_CLOCKMODES
>>> - Add the C trampolines to the generic C VDSO functions
>>>
>>> powerpc is a bit special for VDSO as well as system calls in the
>>> way that it requires setting CR SO bit which cannot be done in C.
>>> Therefore, entry/exit needs to be performed in ASM.
>>>
>>> Implementing __arch_get_vdso_data() would clobber the link register,
>>> requiring the caller to save it. As the ASM calling function already
>>> has to set a stack frame and saves the link register before calling
>>> the C vdso function, retriving the vdso data pointer there is lighter.
>> ...
>>
>>> diff --git a/arch/powerpc/include/asm/vdso/gettimeofday.h  
>>> b/arch/powerpc/include/asm/vdso/gettimeofday.h
>>> new file mode 100644
>>> index ..4452897f9bd8
>>> --- /dev/null
>>> +++ b/arch/powerpc/include/asm/vdso/gettimeofday.h
>>> @@ -0,0 +1,175 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>> +#ifndef __ASM_VDSO_GETTIMEOFDAY_H
>>> +#define __ASM_VDSO_GETTIMEOFDAY_H
>>> +
>>> +#include 
>>> +
>>> +#ifdef __ASSEMBLY__
>>> +
>>> +.macro cvdso_call funct
>>> +  .cfi_startproc
>>> +   PPC_STLUr1, -STACK_FRAME_OVERHEAD(r1)
>>> +   mflrr0
>>> +  .cfi_register lr, r0
>>> +   PPC_STL r0, STACK_FRAME_OVERHEAD + PPC_LR_STKOFF(r1)
>>
>> This doesn't work for me on ppc64(le) with glibc.
>>
>> glibc doesn't create a stack frame before making the VDSO call, so the
>> store of r0 (LR) goes into the caller's frame, corrupting the saved LR,
>> leading to an infinite loop.
>
> Where should it be saved if it can't be saved in the standard location ?

As Michael pointed out, userspace doesn't treat the VDSO as a normal function
call.  In order to keep compatibility with existent software, LR would need to
be saved on another stack frame.

-- 
Tulio Magno


Re: [PATCH] powerpc: Add new HWCAP bits

2020-03-31 Thread Tulio Magno Quites Machado Filho
Alistair Popple  writes:

> diff --git a/arch/powerpc/include/uapi/asm/cputable.h 
> b/arch/powerpc/include/uapi/asm/cputable.h
> index 540592034740..c6fe10b2 100644
> --- a/arch/powerpc/include/uapi/asm/cputable.h
> +++ b/arch/powerpc/include/uapi/asm/cputable.h
> @@ -50,6 +50,8 @@
>  #define PPC_FEATURE2_DARN0x0020 /* darn random number insn */
>  #define PPC_FEATURE2_SCV 0x0010 /* scv syscall */
>  #define PPC_FEATURE2_HTM_NO_SUSPEND  0x0008 /* TM w/out suspended state 
> */
> +#define PPC_FEATURE2_ARCH_3_10   0x0004 /* ISA 3.10 */

I think this should have been:

#define PPC_FEATURE2_ARCH_3_1   0x0004 /* ISA 3.1 */

-- 
Tulio Magno


Re: powerpc Linux scv support and scv system call ABI proposal

2020-01-29 Thread Tulio Magno Quites Machado Filho
Nicholas Piggin  writes:

> Adhemerval Zanella's on January 29, 2020 3:26 am:
>> 
>> We already had to push a similar hack where glibc used to abort transactions
>> prior syscalls to avoid some side-effects on kernel (commit 56cf2763819d2f).
>> It was eventually removed from syscall handling by f0458cf4f9ff3d870, where
>> we only enable TLE if kernel suppors PPC_FEATURE2_HTM_NOSC.
>> 
>> The transaction syscall abort used to read a variable directly from TCB,
>> so this could be an option. I would expect that we could optimize it where
>> if glibc is building against a recent kernel and compiler is building
>> for a ISA 3.0+ cpu we could remove the 'sc' code.
>> 
>
> We would just have to be careful of running on ISA 3.0 CPUs on older
> kernels which do not support scv.

Can we assume that, if a syscall is available through sc it's also available
in scv 0?

Because if that's true, I believe your suggestion to interpret PPC_FEATURE2_SCV
as scv 0 support would be helpful to provide this support via IFUNC even
when glibc is built using --with-cpu=power8, which is the most common scenario
in ppc64le.

In that scenario, it seems new HWCAP bits for new vectors wouldn't be too
frequent, which was the only downside of this proposal.

-- 
Tulio Magno


Re: PIE binaries are no longer mapped below 4 GiB on ppc64le

2018-10-31 Thread Tulio Magno Quites Machado Filho
Florian Weimer  writes:

> * Tulio Magno Quites Machado Filho:
>
>> I wonder if this is restricted to linker that Golang uses.
>> Were you able to reproduce the same problem with Binutils' linker?
>
> The example is carefully constructed to use the external linker.  It
> invokes gcc, which then invokes the BFD linker in my case.

Indeed. That question was unnecessary.  :-D

> Based on the relocations, I assume there is only so much the linker can
> do here.  I'm amazed that it produces an executable at all, let alone
> one that runs correctly on some kernel versions!

Agreed.  That isn't expected to work.  Both the compiler and the linker have
to generate PIE for it to work.

> I assume that the Go toolchain simply lacks PIE support on ppc64le.

Maybe the support is there, but it doesn't generate PIC by default?

-- 
Tulio Magno


Re: PIE binaries are no longer mapped below 4 GiB on ppc64le

2018-10-31 Thread Tulio Magno Quites Machado Filho
Florian Weimer  writes:

> * Michal Suchánek:
>
>> On Wed, 31 Oct 2018 18:20:56 +0100
>> Florian Weimer  wrote:
>>
>>> And it needs to be built with:
>>> 
>>>   go build -ldflags=-extldflags=-pie extld.go
>>> 
>>> I'm not entirely sure what to make of this, but I'm worried that this
>>> could be a regression that matters to userspace.
>>
>> I encountered the same when trying to build go on ppc64le. I am not
>> familiar with the internals so I just let it be.
>>
>> It does not seem to matter to any other userspace.
>
> It would matter to C code which returns the address of a global variable
> in the main program through and (implicit) int return value.

I wonder if this is restricted to linker that Golang uses.
Were you able to reproduce the same problem with Binutils' linker?

-- 
Tulio Magno


Re: [PATCH 3/4] powerpc/powernv: Enable TM without suspend if possible

2017-10-19 Thread Tulio Magno Quites Machado Filho
Forwarding some comments from Adhemerval sent to libc-alpha [1]...

Adhemerval Zanella  writes:
>Florian Weimer  writes:
>
>> On 10/12/2017 12:17 PM, Michael Ellerman wrote:
>>> +   pr_info("Enabling TM (Transactional Memory) with Suspend Disabled\n");
>>> +   cur_cpu_spec->cpu_features |= CPU_FTR_TM;
>>> +   cur_cpu_spec->cpu_user_features2 |= PPC_FEATURE2_HTM_NO_SUSPEND;
>>> +   tm_suspend_disabled = true;
>>
>> This doesn't look right because if suspend is not available, you need to 
>> clear the original PPC_FEATURE2_HTM flag because the semantics are not 
>> right, so that applications can use fallback code.  Otherwise, 
>> applications may incorrectly select the HTM code and break if running on 
>> a system which supports HTM, but without the suspend state.
>>
>> The new flag should say that HTM is supported, but without the suspend 
>> state, and it should be always set if PPC_FEATURE2_HTM is set.
>
> Will it also change TEXARS with the abort information?

It should, with a permanent error cause so that old applications entering
suspended state can adopt another technique.
Michael, could you clarify if this is indeed happening, please?

> I completely agree with Florian here, this is as *ABI* change
> and the kernel need to advertise a different TM ABI instead
> of as an extension.

Adhemerval, could you elaborate which problems you're foreseeing, please?

-- 
Tulio Magno

[1] https://sourceware.org/ml/libc-alpha/2017-10/msg00893.html



Re: [PATCH 3/4] powerpc/powernv: Enable TM without suspend if possible

2017-10-19 Thread Tulio Magno Quites Machado Filho
Florian Weimer  writes:

> On 10/12/2017 12:17 PM, Michael Ellerman wrote:
>> +pr_info("Enabling TM (Transactional Memory) with Suspend Disabled\n");
>> +cur_cpu_spec->cpu_features |= CPU_FTR_TM;
>> +cur_cpu_spec->cpu_user_features2 |= PPC_FEATURE2_HTM_NO_SUSPEND;
>> +tm_suspend_disabled = true;
>
> This doesn't look right because if suspend is not available, you need to 
> clear the original PPC_FEATURE2_HTM flag because the semantics are not 
> right, so that applications can use fallback code.  Otherwise, 
> applications may incorrectly select the HTM code and break if running on 
> a system which supports HTM, but without the suspend state.

Just clarifying: it won't break an application, but abort the transaction,
which can cause a performance hit.

> The new flag should say that HTM is supported, but without the suspend 
> state, and it should be always set if PPC_FEATURE2_HTM is set.

If we change the semantics of this bit, old applications that don't suspend
transactions won't run on these processors, even if they're safe to run.

If we adopt the current semantics, only applications that enter in suspend
state will have to be modified in order to not get a performance regression.

-- 
Tulio Magno



Re: [PATCH 9/9] powerpc: A new cache shape aux vectors

2017-01-05 Thread Tulio Magno Quites Machado Filho
Benjamin Herrenschmidt <b...@kernel.crashing.org> writes:

> On Wed, 2017-01-04 at 11:04 -0200, Tulio Magno Quites Machado Filho
>
>> > +#define AT_L1I_CACHESIZE  40
>> > +#define AT_L1I_CACHESHAPE 41
>> > +#define AT_L1D_CACHESIZE  42
>> > +#define AT_L1D_CACHESHAPE 43
>> > +#define AT_L2_CACHESIZE   44
>> > +#define AT_L2_CACHESHAPE  45
>> > +#define AT_L3_CACHESIZE   46
>> > +#define AT_L3_CACHESHAPE  47
>> 
>> These names will clash with the other ones defined by alpha and sh:
>> 
>> /* Shapes of the caches.  Bits 0-3 contains associativity; bits 4-7
>> contains
>>    log2 of line size; mask those to get cache size.  */
>> #define AT_L1I_CACHESHAPE34
>> #define AT_L1D_CACHESHAPE35
>> #define AT_L2_CACHESHAPE 36
>> #define AT_L3_CACHESHAPE 37
>
> Is this a problem though ? In the kernel at least these are defined in
> arch specific headers so there is no clash.

It could become a problem if an architecture tries to use both of these
types.

glibc doesn't distinct between them:
https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/elf.h#l1113

> Otherwise, I can change them to *_CACHEGEOMETRY, is that ok ?

Looks good to me.

-- 
Tulio Magno



Re: [PATCH 9/9] powerpc: A new cache shape aux vectors

2017-01-04 Thread Tulio Magno Quites Machado Filho
Benjamin Herrenschmidt  writes:


> diff --git a/arch/powerpc/include/uapi/asm/auxvec.h 
> b/arch/powerpc/include/uapi/asm/auxvec.h
> index ce17d2c..79183d2 100644
> --- a/arch/powerpc/include/uapi/asm/auxvec.h
> +++ b/arch/powerpc/include/uapi/asm/auxvec.h
> @@ -16,6 +16,37 @@
>   */
>  #define AT_SYSINFO_EHDR  33
>
> -#define AT_VECTOR_SIZE_ARCH 6 /* entries in ARCH_DLINFO */
> +/*
> + * AT_*CACHEBSIZE above represent the cache *block* size which is
> + * the size that is affected by the cache management instructions.
> + *
> + * It doesn't nececssarily matches the cache *line* size which is
> + * more of a performance tuning hint. Additionally the latter can
> + * be different for the different cache levels.
> + *
> + * The set of entries below represent more extensive information
> + * about the caches, in the form of two entry per cache type,
> + * one entry containing the cache size in bytes, and the other
> + * containing the cache line size in bytes in the bottom 16 bits
> + * and the cache associativity in the next 16 bits.
> + *
> + * The associativity is such that if N is the 16-bit value, the
> + * cache is N way set associative. A value if 0x means fully
> + * associative, a value of 1 means directly mapped.
> + *
> + * For all these fields, a value of 0 means that the information
> + * is not known.
> + */
> +
> +#define AT_L1I_CACHESIZE 40
> +#define AT_L1I_CACHESHAPE41
> +#define AT_L1D_CACHESIZE 42
> +#define AT_L1D_CACHESHAPE43
> +#define AT_L2_CACHESIZE  44
> +#define AT_L2_CACHESHAPE 45
> +#define AT_L3_CACHESIZE  46
> +#define AT_L3_CACHESHAPE 47

These names will clash with the other ones defined by alpha and sh:

/* Shapes of the caches.  Bits 0-3 contains associativity; bits 4-7 contains
   log2 of line size; mask those to get cache size.  */
#define AT_L1I_CACHESHAPE   34
#define AT_L1D_CACHESHAPE   35
#define AT_L2_CACHESHAPE36
#define AT_L3_CACHESHAPE37

-- 
Tulio Magno



Re: [PATCH] Add hwcap2 bits for POWER9

2016-01-11 Thread Tulio Magno Quites Machado Filho
Adhemerval Zanella <adhemerval.zane...@linaro.org> writes:

> On 08-01-2016 13:36, Peter Bergner wrote:
>> On Fri, 2016-01-08 at 11:25 -0200, Tulio Magno Quites Machado Filho wrote:
>>> Peter, this solves the issue you reported previously [1].
>>>
>>> [1] https://sourceware.org/ml/libc-alpha/2015-12/msg00522.html
>> 
>> Agreed, thanks.  I'll also add the POWER9 support to the GCC side
>> of the patch now that the glibc code is upstream.
>
> I do not see these bits being added in kernel side yet and GLIBC usual
> only sync these kind of bits *after* they are included in kernel side.
> So I would advise to either get these pieces (kernel support and hwcap
> advertise) in kernel before 2.23 release, otherwise revert the patches.

Ack.
It has just been sent to the correspondent Linux mailing list:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-January/137763.html

-- 
Tulio Magno

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] Add hwcap2 bits for POWER9

2016-01-11 Thread Tulio Magno Quites Machado Filho
"Carlos O'Donell" <car...@redhat.com> writes:

> On 01/11/2016 10:16 AM, Tulio Magno Quites Machado Filho wrote:
>> Adhemerval Zanella <adhemerval.zane...@linaro.org> writes:
>> 
>>> On 08-01-2016 13:36, Peter Bergner wrote:
>>>> On Fri, 2016-01-08 at 11:25 -0200, Tulio Magno Quites Machado Filho wrote:
>>>>> Peter, this solves the issue you reported previously [1].
>>>>>
>>>>> [1] https://sourceware.org/ml/libc-alpha/2015-12/msg00522.html
>>>>
>>>> Agreed, thanks.  I'll also add the POWER9 support to the GCC side
>>>> of the patch now that the glibc code is upstream.
>>>
>>> I do not see these bits being added in kernel side yet and GLIBC usual
>>> only sync these kind of bits *after* they are included in kernel side.
>>> So I would advise to either get these pieces (kernel support and hwcap
>>> advertise) in kernel before 2.23 release, otherwise revert the patches.
>> 
>> Ack.
>> It has just been sent to the correspondent Linux mailing list:
>> https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-January/137763.html
>
> Please revert the changes from glibc until you checkin support to linux
> kernel mainline.
>
> Leaving these bits in increases the risk that someone uses to deploy a glibc
> that then may have the wrong value.

Could you clarify this statement, please?
I fail to see how they could have the wrong value.

These bits are enabled if you're running a kernel on a processor that
supports ISA 3.0 and/or VSX IEEE Float 128-bit.  The kernel must also support
this processor.

I see 3 possible scenarios:
 - Processor that doesn't support these features: as these bits are 0 by
   default, they have the correct value.
 - Both processor and kernel support these features: both bits are set,
   which is the correct value.
 - Processor support these features, but an old kernel doesn't support: these
   bits would be 0.  But, that's a valid scenario.

Am I missing another scenario?

However, I do agree with the concerns raised by Peter and Adhemerval: glibc
should be in sync with the kernel by the time of the release in order to
guarantee both bits are reserved for the exact same goal and we should have
both AT_HWCAP and AT_PLATFORM supporting the new processor.
With that said, I was planning to revert both commits d2de9ef7 and b1f19b8e
if we don't get the kernel patch accepted into the powerpc tree in time for
the release 2.23.

-- 
Tulio Magno

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev