date:20210712

Re: [PATCH v2] x86: Don't enable UINTR in 32-bit mode

2021-07-12 Thread Jakub Jelinek via Gcc-patches

On Mon, Jul 12, 2021 at 06:51:30PM -0700, H.J. Lu wrote:
> @@ -404,9 +404,18 @@ const char *host_detect_local_cpu (int argc, const char 
> **argv)
>if (argc < 1)
>  return NULL;

I think it would be simpler to use 2 arguments instead of one.
So change the above to if (argc < 2)

>  
> -  arch = !strcmp (argv[0], "arch");
> +  arch = !strncmp (argv[0], "arch", 4);
>  
> -  if (!arch && strcmp (argv[0], "tune"))
> +  if (!arch && strncmp (argv[0], "tune", 4))
> +return NULL;

Keep strcmp as is here.

> +
> +  bool codegen_x86_64;
> +
> +  if (!strcmp (argv[0] + 4, "32"))
> +codegen_x86_64 = false;
> +  else if (!strcmp (argv[0] + 4, "64"))
> +codegen_x86_64 = true;
> +  else
>  return NULL;

Check argv[1] here instead.

> @@ -813,7 +826,8 @@ const char *host_detect_local_cpu (int argc, const char 
> **argv)
>  }
>  
>  done:
> -  return concat (cache, "-m", argv[0], "=", cpu, options, NULL);
> +  const char *moption = arch ? "-march=" : "-mtune=";
> +  return concat (cache, moption, cpu, options, NULL);
>  }
>  #else

You don't need this change.

> diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> index 7a35c468da3..7cba655595e 100644
> --- a/gcc/config/i386/i386-options.c
> +++ b/gcc/config/i386/i386-options.c
> @@ -2109,6 +2109,7 @@ ix86_option_override_internal (bool main_args_p,
>  #define DEF_PTA(NAME) \
>   if (((processor_alias_table[i].flags & PTA_ ## NAME) != 0) \
>   && PTA_ ## NAME != PTA_64BIT \
> + && (TARGET_64BIT || PTA_ ## NAME != PTA_UINTR) \
>   && !TARGET_EXPLICIT_ ## NAME ## _P (opts)) \
> SET_TARGET_ ## NAME (opts);
>  #include "i386-isa.def"
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index 8c3eace56da..ae9f455c48d 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -577,9 +577,12 @@ extern const char *host_detect_local_cpu (int argc, 
> const char **argv);
>  #define CC1_CPU_SPEC CC1_CPU_SPEC_1
>  #else
>  #define CC1_CPU_SPEC CC1_CPU_SPEC_1 \
> -"%{march=native:%>march=native %:local_cpu_detect(arch) \
> -  %{!mtune=*:%>mtune=native %:local_cpu_detect(tune)}} \
> -%{mtune=native:%>mtune=native %:local_cpu_detect(tune)}"
> +"%{" OPT_ARCH32 ":%{march=native:%>march=native %:local_cpu_detect(arch32) \
> +   %{!mtune=*:%>mtune=native %:local_cpu_detect(tune32)}}}" \
> +"%{" OPT_ARCH32 ":%{mtune=native:%>mtune=native 
> %:local_cpu_detect(tune32)}}" \
> +"%{" OPT_ARCH64 ":%{march=native:%>march=native %:local_cpu_detect(arch64) \
> +   %{!mtune=*:%>mtune=native %:local_cpu_detect(tune64)}}}" \
> +"%{" OPT_ARCH64 ":%{mtune=native:%>mtune=native %:local_cpu_detect(tune64)}}"

And you can use
#define ARCH_ARG "%{" OPT_ARCH64 ":64;32}"

%:local_cpu_detect(arch, " ARCH_ARG ")
etc.

Jakub

Re: [patch] PR jit/87808: Allow libgccjit to work without an external gcc driver

2021-07-12 Thread Richard Biener via Gcc-patches

On Mon, Jul 12, 2021 at 11:00 PM Matthias Klose  wrote:
>
> On 3/26/19 12:52 PM, Matthias Klose wrote:
> > On 22.03.19 23:00, David Malcolm wrote:
> >> On Thu, 2019-03-21 at 12:26 +0100, Matthias Klose wrote:
> >>> Fix PR jit/87808, the embedded driver still needing the external gcc
> >>> driver to
> >>> find the gcc_lib_dir. This can happen in a packaging context when
> >>> libgccjit
> >>> doesn't depend on the gcc package, but just on binutils and libgcc-
> >>> dev packages.
> >>> libgccjit probably could use /proc/self/maps to find the gcc_lib_dir,
> >>> but that
> >>> doesn't seem to be very portable.
> >>>
> >>> Ok for the trunk and the branches?
> >>>
> >>> Matthias
> >>
> >> [CCing the jit list]
> >>
> >> I've been trying to reproduce this bug in a working copy, and failing.
> >>
> >> Matthias, do you have a recipe you've been using to reproduce this?
> >
> > the JIT debug log shows the driver names that it wants to call.  Are you 
> > sure
> > that this driver isn't available anywhere?  I configure the gcc build with
> > --program-suffix=-8 --program-prefix=x86_64-linux-gnu-, and that one was 
> > only
> > available in one place, /usr/bin.
> >
> > Matthias
>
> David, the bug report now has two more comments from people that the current
> behavior is broken.  Please could you review the patch?

I think libgccjit should use the same strathegy for finding the install location
like the driver does itself.  I couldn't readily decipher its magic but at least
there's STANDARD_EXEC_PREFIX which seems to be used as possible
fallback.

In particular your patch doesn't seem to work with a DESTDIR=
install?

Can we instead add a --with-gccjit-install-dir= or sth like that (whatever
path to whatever files the JIT exactly looks for)?

Richard.

> Thanks, Matthias

Re: [PATCH v2] docs: Add 'S' to Machine Constraints for RISC-V

2021-07-12 Thread Kito Cheng via Gcc-patches

committed to trunk.

On Mon, Jul 12, 2021 at 12:48 PM Fangrui Song  wrote:
>
> On 2021-07-12, Kito Cheng wrote:
> >It was undocument before, but it might used in linux kernel for resolve
> >code model issue, so LLVM community suggest we should document that,
> >so that make it become supported/documented/non-internal machine constraints.
> >
> >gcc/ChangeLog:
> >
> >   PR target/101275
> >   * config/riscv/constraints.md ("S"): Update description and remove
> >   @internal.
> >   * doc/md.texi (Machine Constraints): Document the 'S' constraints
> >   for RISC-V.
> >---
> > gcc/config/riscv/constraints.md | 3 +--
> > gcc/doc/md.texi | 3 +++
> > 2 files changed, 4 insertions(+), 2 deletions(-)
> >
> >diff --git a/gcc/config/riscv/constraints.md 
> >b/gcc/config/riscv/constraints.md
> >index 8c15c6c0486..c87d5b796a5 100644
> >--- a/gcc/config/riscv/constraints.md
> >+++ b/gcc/config/riscv/constraints.md
> >@@ -67,8 +67,7 @@ (define_memory_constraint "A"
> >(match_test "GET_CODE(XEXP(op,0)) == REG")))
> >
> > (define_constraint "S"
> >-  "@internal
> >-   A constant call address."
> >+  "A constraint that matches an absolute symbolic address."
> >   (match_operand 0 "absolute_symbolic_operand"))
> >
> > (define_constraint "U"
> >diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> >index 00caf3844cc..2d120da96cf 100644
> >--- a/gcc/doc/md.texi
> >+++ b/gcc/doc/md.texi
> >@@ -3536,6 +3536,9 @@ A 5-bit unsigned immediate for CSR access instructions.
> > @item A
> > An address that is held in a general-purpose register.
> >
> >+@item S
> >+A constraint that matches an absolute symbolic address.
> >+
> > @end table
> >
> > @item RX---@file{config/rx/constraints.md}
> >--
> >2.31.1
>
> LGTM

Re: [PATCH 10/10] vect: Reuse reduction accumulators between loops

2021-07-12 Thread Richard Biener via Gcc-patches

On Mon, Jul 12, 2021 at 7:55 PM Richard Sandiford
 wrote:
>
> Richard Biener via Gcc-patches  writes:
> > On Fri, Jul 9, 2021 at 3:12 PM Richard Sandiford
> >  wrote:
> >>
> >> Thanks for the review.
> >>
> >> Richard Biener  writes:
> >> >> @@ -588,6 +600,23 @@ public:
> >> >>/* Unrolling factor  */
> >> >>poly_uint64 vectorization_factor;
> >> >>
> >> >> +  /* If this loop is an epilogue loop whose main loop can be skipped,
> >> >> + MAIN_LOOP_EDGE is the edge from the main loop to this loop's
> >> >> + preheader.  SKIP_MAIN_LOOP_EDGE is then the edge that skips the
> >> >> + main loop and goes straight to this loop's preheader.
> >> >> +
> >> >> + Both fields are null otherwise.  */
> >> >> +  edge main_loop_edge;
> >> >> +  edge skip_main_loop_edge;
> >> >> +
> >> >> +  /* If this loop is an epilogue loop that might be skipped after 
> >> >> executing
> >> >> + the main loop, this edge is the one that skips the epilogue.  */
> >> >> +  edge skip_this_loop_edge;
> >> >> +
> >> >> +  /* After vectorization, maps live-out SSA names to information about
> >> >> + the reductions that generated them.  */
> >> >> +  hash_map reusable_accumulators;
> >> >
> >> > Is that the LC PHI node defs or the definition inside of the loop?
> >> > If the latter we could attach the info directly to its stmt-info?
> >>
> >> Ah, yeah, I should improve the comment there.  It's the vectoriser's
> >> replacement for the original LC PHI node, i.e. the final scalar result
> >> after the reduction has taken place.
> >
> > OK
> >
> >> >> @@ -1186,6 +1215,21 @@ public:
> >> >>/* The vector type for performing the actual reduction.  */
> >> >>tree reduc_vectype;
> >> >>
> >> >> +  /* If IS_REDUC_INFO is true and if the reduction is operating on N
> >> >> + elements in parallel, this vector gives the initial values of 
> >> >> these
> >> >> + N elements.  */
> >> >
> >> > That's N scalar elements or N vector elements?  I suppose it's for
> >> > SLP reductions (rather than SLP reduction chains) and never non-SLP
> >> > reductions?
> >>
> >> Yeah, poor wording again, sorry.  I meant something closer to:
> >>
> >>   /* If IS_REDUC_INFO is true and if the vector code is performing
> >>  N scalar reductions in parallel, this vector gives the initial
> >>  scalar values of those N reductions.  */
> >>
> >> >> +  vec reduc_initial_values;
> >> >> +
> >> >> +  /* If IS_REDUC_INFO is true and if the reduction is operating on N
> >> >> + elements in parallel, this vector gives the scalar result of each
> >> >> + reduction.  */
> >> >> +  vec reduc_scalar_results;
> >>
> >> Same change here.
> >>
> >> >> […]
> >> >> diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
> >> >> index 2909e8a0fc3..b7b0523e3c8 100644
> >> >> --- a/gcc/tree-vect-loop-manip.c
> >> >> +++ b/gcc/tree-vect-loop-manip.c
> >> >> @@ -2457,6 +2457,31 @@ vect_update_epilogue_niters (loop_vec_info 
> >> >> epilogue_vinfo,
> >> >>return vect_determine_partial_vectors_and_peeling (epilogue_vinfo, 
> >> >> true);
> >> >>  }
> >> >>
> >> >> +/* LOOP_VINFO is an epilogue loop and MAIN_LOOP_VALUE is available on 
> >> >> exit
> >> >> +   from the corresponding main loop.  Return a value that is available 
> >> >> in
> >> >> +   LOOP_VINFO's preheader, using SKIP_VALUE if the main loop is 
> >> >> skipped.
> >> >> +   Passing a null SKIP_VALUE is equivalent to passing zero.  */
> >> >> +
> >> >> +tree
> >> >> +vect_get_main_loop_result (loop_vec_info loop_vinfo, tree 
> >> >> main_loop_value,
> >> >> +  tree skip_value)
> >> >> +{
> >> >> +  if (!loop_vinfo->main_loop_edge)
> >> >> +return main_loop_value;
> >> >> +
> >> >> +  if (!skip_value)
> >> >> +skip_value = build_zero_cst (TREE_TYPE (main_loop_value));
> >> >
> >> > shouldn't that be the initial value?
> >>
> >> For the current use case, the above two conditions are never true.
> >> I wrote it like this because I had a follow-on patch (which might
> >> not go anywhere) that needed this function for 0-based IVs.
> >>
> >> Maybe that's a bad risk/reward trade-off though.  Not having to pass
> >> zero makes things only slightly simpler for the follow-on patch,
> >> and I guess could be dangerous in other cases.
> >>
> >> Perhaps in that case though I should change loop_vinfo->main_loop_edge
> >> into a gcc_assert as well.
> >
> > Yeah, I think asserts (and comments in case it's because we don't handle
> > some specific cases yet) are better than possibly wrong behavior.
>
> OK.
>
> >> >> +  tree phi_result = make_ssa_name (TREE_TYPE (main_loop_value));
> >> >> +  basic_block bb = loop_vinfo->main_loop_edge->dest;
> >> >> +  gphi *new_phi = create_phi_node (phi_result, bb);
> >> >> +  add_phi_arg (new_phi, main_loop_value, loop_vinfo->main_loop_edge,
> >> >> +  UNKNOWN_LOCATION);
> >> >> +  add_phi_arg (new_phi, skip_value,
> >> >> +  loop_vinfo->skip_main_loop_edge, UNKNOWN_LOCATION);
> >> >>

Re: [Patch, fortran] PR fortran/100906/100907/100911/100914/100915/100916

2021-07-12 Thread Sandra Loosemore


On 6/13/21 12:36 PM, José Rui Faustino de Sousa via Gcc-patches wrote:

Hi All!

Proposed patch to:

Bug 100906 - Bind(c): failure handling character with len/=1
Bug 100907 - Bind(c): failure handling wide character
Bug 100911 - Bind(c): failure handling C_PTR
Bug 100914 - Bind(c): errors handling complex
Bug 100915 - Bind(c): failure handling C_FUNPTR
Bug 100916 - Bind(c): CFI_type_other unimplemented


I've been playing a bit with this patch in conjunction with my TS29113 
testsuite.  It seems to help some things, but it causes a regression in 
library/section-1.f90 that I don't really understand.  :-S  I'm also 
still running up against the long double bug in PR100917; after doing 
some experiments on top of this patch, I suggested some possible 
solutions/workarounds in that issue.



Patch tested only on x86_64-pc-linux-gnu.


In general, I think it is a good idea to test interoperability features 
on a 32-bit host as well.  I've been building i686-pc-linux-gnu and 
testing both -m32 and -m64 multilibs.  ISO_Fortran_binding.h is pretty 
broken on 32-bit host but I'll have a patch for that in the next day or so.


Anyway, I have a few comments on the libgfortran changes, mostly 
cosmetic things.  I'm not at all familiar with the workings of the 
Fortran front end code that produces descriptors, so I don't think I 
have anything useful to say about that part of the patch.



diff --git a/libgfortran/runtime/ISO_Fortran_binding.c 
b/libgfortran/runtime/ISO_Fortran_binding.c
index 20833ad..3a269d7 100644
--- a/libgfortran/runtime/ISO_Fortran_binding.c
+++ b/libgfortran/runtime/ISO_Fortran_binding.c
@@ -36,31 +36,81 @@ export_proto(cfi_desc_to_gfc_desc);
 void
 cfi_desc_to_gfc_desc (gfc_array_void *d, CFI_cdesc_t **s_ptr)
 {
+  signed char type, kind;
+  size_t size;
   int n;
-  index_type kind;
   CFI_cdesc_t *s = *s_ptr;
 
   if (!s)

 return;
 
+  /* Verify descriptor.  */

+  switch(s->attribute)


GNU coding standards:  space before the open paren.


+{
+case CFI_attribute_pointer:
+case CFI_attribute_allocatable:
+  break;
+case CFI_attribute_other:
+  if (s->base_addr)
+   break;
+  /* FALL THROUGH */
+default:
+  internal_error (NULL, "INVALID CFI DESCRIPTOR");


We could have a better message here that doesn't SHOUT.  Maybe "Invalid 
attribute in CFI descriptor"?
 
+  kind = _CFI_DECODE_KIND (s->type);

+  switch(type)


Space before the paren again.


+{
+case BT_INTEGER:
+case BT_LOGICAL:
+case BT_REAL:
+  size = (size_t)kind;
+  break;
+case BT_COMPLEX:
+  size = (size_t)(kind << 1);
+  break;
+case BT_DERIVED:
+case BT_CHARACTER:
+case BT_VOID:
+  size = s->elem_len;
+  break;
+default:
+  if (type != CFI_type_other)
+   internal_error(NULL, "TYPE ERROR");


Space before the paren and better error message again.

Section 18.5.4 of the Fortran 2018 standard says:  "If a C type is not 
interoperable with a Fortran type and kind supported by the Fortran 
processor, its macro shall evaluate to a negative value."  And in fact 
I'll have a patch for PR101305 soon that will use a negative value 
distinct from CFI_type_other for that.  If you construct a descriptor 
for such an object in C and try passing it to Fortran, I think that's a 
user error, not an internal error.



+  if (size <= 0)
+internal_error(NULL, "SIZE ERROR");


Same issues here with fixing formatting and better message.


+  GFC_DESCRIPTOR_SIZE (d) = size;
+  
   d->dtype.version = s->version;

+
+  if ((s->rank < 0) || (s->rank > CFI_MAX_RANK))


You could delete the inner parentheses here.


+internal_error(NULL, "Rank out of range.");


Space before the paren again.  It also looks like other error messages 
in libgfortran don't add punctuation at the end.


There are several more instances of space-before-paren problems and 
fixing the errpr messages, I'm not going to point the rest of them out 
individually.



@@ -74,14 +124,19 @@ cfi_desc_to_gfc_desc (gfc_array_void *d, CFI_cdesc_t 
**s_ptr)
 }
 
   d->offset = 0;

-  for (n = 0; n < GFC_DESCRIPTOR_RANK (d); n++)
-{
-  GFC_DESCRIPTOR_LBOUND(d, n) = (index_type)s->dim[n].lower_bound;
-  GFC_DESCRIPTOR_UBOUND(d, n) = (index_type)(s->dim[n].extent
-   + s->dim[n].lower_bound - 1);
-  GFC_DESCRIPTOR_STRIDE(d, n) = (index_type)(s->dim[n].sm / s->elem_len);
-  d->offset -= GFC_DESCRIPTOR_STRIDE(d, n) * GFC_DESCRIPTOR_LBOUND(d, n);
-}
+  if (GFC_DESCRIPTOR_DATA (d))
+for (n = 0; n < GFC_DESCRIPTOR_RANK (d); n++)
+  {
+   CFI_index_t lb = 1;
+   
+   if (s->attribute != CFI_attribute_other)
+ lb = s->dim[n].lower_bound;
+
+   GFC_DESCRIPTOR_LBOUND(d, n) = (index_type)lb;
+   GFC_DESCRIPTOR_UBOUND(d, n) = (index_type)(s->dim[n].extent + lb - 1);
+   GFC_DESCRIPTOR_STRIDE(d, n) = (index_type)(s->dim[n].sm / s->elem_len);
+   d->offset -= GFC_DESCRIPTOR

Re: Repost: [PATCH] Deal with prefixed loads/stores in tests, PR testsuite/100166

2021-07-12 Thread Michael Meissner via Gcc-patches

On Mon, Jul 12, 2021 at 10:32:47AM -0500, Bill Schmidt wrote:
> Hi Mike,
> >diff --git a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c 
> >b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
> >index 1269fe635c6..d5ed700b9bc 100644
> >--- a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
> >+++ b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
> >@@ -30,5 +30,5 @@ vector signed long long splats4(void)
> >  /* { dg-final { scan-assembler-times {\mvspltis[bhw]\M} 0 } } */
> >  /* { dg-final { scan-assembler-times {\mvsl[bhwd]\M} 0 } } */
> >-/* { dg-final { scan-assembler-times {\mlvx\M|\mlxv\M|\mlxvd2x\M} 2 } } */
> >+/* { dg-final { scan-assembler-times {\mlvx\M|\mlxv\M|\mlxvd2x\M|\mplvx\M} 
> >2 } } */
> 
> Oopsie.  I think you mean "plxv" for this one.
> 
> Otherwise LGTM.  I can't approve, but recommend approval with those changes.

Yes I did.  Good catch. 

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

RE: [llvm-dev] [PATCH] Add optional _Float16 support

2021-07-12 Thread Wang, Pengfei via Gcc-patches

> Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.

Can you please explain the behavior here? Is there difference between _Float16 
and _Complex _Float16 when return? I.e.,
1, In which case will _Float16 values return in both %xmm0 and %xmm1?
2, For a single _Float16 value, are both real part and imaginary part returned 
in %xmm0? Or returned in %xmm0 and %xmm1 respectively?

Thanks
Pengfei

-Original Message-
From: llvm-dev  On Behalf Of H.J. Lu via 
llvm-dev
Sent: Friday, July 2, 2021 6:28 AM
To: Joseph Myers 
Cc: llvm-...@lists.llvm.org; GCC Patches ; GNU C 
Library ; IA32 System V Application Binary Interface 

Subject: Re: [llvm-dev] [PATCH] Add optional _Float16 support

On Thu, Jul 1, 2021 at 3:10 PM Joseph Myers  wrote:
>
> On Thu, 1 Jul 2021, H.J. Lu via Gcc-patches wrote:
>
> > 2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
>
> That restricts use of _Float16 to processors with SSE.  Is that what 
> we want in the ABI, or should _Float16 be available with base 32-bit 
> x86 architecture features only, much like _Float128 and the decimal FP 
> types

Yes, _Float16 requires XMM registers.

> are?  (If it is restricted to SSE, we can of course ensure relevant 
> libgcc functions are built with SSE enabled, and likewise in glibc if 
> that gains
> _Float16 functions, though maybe with some extra complications to get 
> relevant testcases to run whenever possible.)
>

_Float16 functions in libgcc should be compiled with SSE enabled.

BTW, _Float16 software emulation may require more than just SSE since we need 
to do _Float16 load and store with XMM registers.
There is no 16bit load/store for XMM registers without AVX512FP16.

--
H.J.
___
LLVM Developers mailing list
llvm-...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [PATCH] [android] Disable large files when unsupported

2021-07-12 Thread Abraão de Santana via Gcc-patches

This could be better achieved by rewriting the libstdc++-v3/configure file
to fail the test for _FILE_OFFSET_BITS = 64 when fpos is not defined.
But this is kind of an odd spot to be, as this is an oddity from those
specific conditions on Bionic and I don't know if it's worth devising a
universal test for that.

I don't know if android/bionic patches are still relevant for the
community, but I still build recent versions of GCC with NDK r17c for some
old embedded devices, so I felt like sharing.

Em ter., 13 de jul. de 2021 às 00:16, Abraão de Santana <
abraaocsant...@gmail.com> escreveu:

> If C++ is enabled with Bionic on API Level < 24 ( and with NDK >= r15 as
> lower versions just ignore the issue ) you get errors of ::fgetpos and
> ::fsetpos error: 'fsetpos' has not been declared in '::' .
>
> The issue is that API Level < 24 doesn't implement large files (64 bit),
> so when _FILE_OFFSET_BITS is 64 and API Level < 24, the functions fgetpos,
> fsetpos, fseeko, and ftello are not defined.
>
> The behavior described above can be observed by going to
> 'android-ndk-rxxx/sysroot/usr/include/stdio.h' and watching the condition
> when those symbols are defined ( note: the control define
> __USE_FILE_OFFSET64 is defined if _FILE_OFFSET_BITS if 64)
>
> It's a known issue and bionic's docs ask you to stop defining
> _FILE_OFFSET_BITS to 64 to solve it:
> https://android.googlesource.com/platform/bionic/+/master/docs/32-bit-abi.md
>
> Clang builds seem to get around this using libandroid_support on those
> cases ( android/ndk#480 ), which is not available when building the gnu std
> c++.
>
>
> libstdc++-v3/ChangeLog
> * config/os/bionic/os_defines.h: Undefines _FILE_OFFSET_BITS when
> it's 64-bit and Api Level < 24
>
> ---
>  libstdc++-v3/config/os/bionic/os_defines.h | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/libstdc++-v3/config/os/bionic/os_defines.h
> b/libstdc++-v3/config/os/bionic/os_defines.h
> index c534838aea3..27605fa8baa 100644
> --- a/libstdc++-v3/config/os/bionic/os_defines.h
> +++ b/libstdc++-v3/config/os/bionic/os_defines.h
> @@ -33,4 +33,16 @@
>  // System-specific #define, typedefs, corrections, etc, go here.  This
>  // file will come before all others.
>
> +// If _FILE_OFFSET_BITS is 64 and __ANDROID_API__ < 24, there will be
> +// errors of undefined fgetpos, fsetpos, fseeko, and ftello because their
> +//64-bit versions are only available from API Level 24 onwards.
> +//
> +// See https://github.com/android/ndk/issues/480
> +//
> +// See also
> +//
> https://android.googlesource.com/platform/bionic/+/master/docs/32-bit-abi.md
> +#if (_FILE_OFFSET_BITS == 64) && (__ANDROID_API__ < 24)
> +#undef _FILE_OFFSET_BITS
> +#endif
> +
>  #endif
>


-- 


*Abraão C. de Santana*

*Skype: * abraao.c.santana
*Mobile :* +55 (21) 9307-9868

[PATCH] [android] Disable large files when unsupported

2021-07-12 Thread Abraão de Santana via Gcc-patches

If C++ is enabled with Bionic on API Level < 24 ( and with NDK >= r15 as
lower versions just ignore the issue ) you get errors of ::fgetpos and
::fsetpos error: 'fsetpos' has not been declared in '::' .

The issue is that API Level < 24 doesn't implement large files (64 bit), so
when _FILE_OFFSET_BITS is 64 and API Level < 24, the functions fgetpos,
fsetpos, fseeko, and ftello are not defined.

The behavior described above can be observed by going to
'android-ndk-rxxx/sysroot/usr/include/stdio.h' and watching the condition
when those symbols are defined ( note: the control define
__USE_FILE_OFFSET64 is defined if _FILE_OFFSET_BITS if 64)

It's a known issue and bionic's docs ask you to stop defining
_FILE_OFFSET_BITS to 64 to solve it:
https://android.googlesource.com/platform/bionic/+/master/docs/32-bit-abi.md

Clang builds seem to get around this using libandroid_support on those
cases ( android/ndk#480 ), which is not available when building the gnu std
c++.


libstdc++-v3/ChangeLog
* config/os/bionic/os_defines.h: Undefines _FILE_OFFSET_BITS when
it's 64-bit and Api Level < 24

---
 libstdc++-v3/config/os/bionic/os_defines.h | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/config/os/bionic/os_defines.h
b/libstdc++-v3/config/os/bionic/os_defines.h
index c534838aea3..27605fa8baa 100644
--- a/libstdc++-v3/config/os/bionic/os_defines.h
+++ b/libstdc++-v3/config/os/bionic/os_defines.h
@@ -33,4 +33,16 @@
 // System-specific #define, typedefs, corrections, etc, go here.  This
 // file will come before all others.

+// If _FILE_OFFSET_BITS is 64 and __ANDROID_API__ < 24, there will be
+// errors of undefined fgetpos, fsetpos, fseeko, and ftello because their
+//64-bit versions are only available from API Level 24 onwards.
+//
+// See https://github.com/android/ndk/issues/480
+//
+// See also
+//
https://android.googlesource.com/platform/bionic/+/master/docs/32-bit-abi.md
+#if (_FILE_OFFSET_BITS == 64) && (__ANDROID_API__ < 24)
+#undef _FILE_OFFSET_BITS
+#endif
+
 #endif

fix typo in attr_fnspec::verify

2021-07-12 Thread Alexandre Oliva



Odd-numbered indices describing argument access sizes in the fnspec
string can only hold 't' or a digit, as tested in the beginning of the
case.  When checking that the size-supplying argument does not have
additional information associated with it, the test that excludes the
't' possibility looks for it at the even position in the fnspec
string.  Oops.

This might yield false positives and negatives if a function has a
fnspec in which an argument uses a 't' access-size, and ('t' - '1')
happens to be the index of an argument described in an fnspec string.
Assuming ASCII encoding, it would take a function with at least 68
arguments described in fnspec.  Still, probably worth fixing.

Regstrapped on x86_64-linux-gnu.  I'm checking this in as obvious unless
there are objections within some 48 hours.


for  gcc/ChangeLog

* tree-ssa-alias.c (attr_fnspec::verify): Fix index in
non-'t'-sized arg check.
---
 gcc/tree-ssa-alias.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 0421bfac99869..742a95a549e20 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -3895,7 +3895,7 @@ attr_fnspec::verify ()
&& str[idx] != 'w' && str[idx] != 'W'
&& str[idx] != 'o' && str[idx] != 'O')
  err = true;
-   if (str[idx] != 't'
+   if (str[idx + 1] != 't'
/* Size specified is scalar, so it should be described
   by ". " if specified at all.  */
&& (arg_specified_p (str[idx + 1] - '1')


-- 
Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about

sync up new type indices for body adjustments

2021-07-12 Thread Alexandre Oliva



The logic in fill_vector_of_new_param_types may skip some parameters
when pushing into m_new_types, but common_initialization doesn't take
that into account, and may end up attempting to access the vector past
its end when IPA_PARAM_OP_(NEW|SPLIT) operands appear after skipped
_COPY ones.

This patch adjusts the consumer logic to match the indexing in the
producer.  It came up in libstdc++-v3's testsuite, in
std/ranges/adaptors/filter.cc, but only with wrappers introduced by a pass
I'm working on.  The _NEW parameters were reference-typed replacements
for some by-value ones in my function-wrapping logic, and other IPA
transformations cause plenty of unused/irrelevant arguments to be
dropped for certain calls.

Regstrapped on x86_64-linux-gnu.  Ok to install?


for  gcc/ChangeLog

* ipa-param-manipulation.c
(ipa_param_body_adjustments::common_initialization): Adjust
m_new_types indexing to match producer.
---
 gcc/ipa-param-manipulation.c |   10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
index 75b5a47a7ae8b..dbbe547832dc5 100644
--- a/gcc/ipa-param-manipulation.c
+++ b/gcc/ipa-param-manipulation.c
@@ -1102,11 +1102,17 @@ ipa_param_body_adjustments::common_initialization (tree 
old_fndecl,
  corresponding m_id->dst_node->clone.performed_splits entries.  */
 
   m_new_decls.reserve_exact (adj_len);
-  for (unsigned i = 0; i < adj_len ; i++)
+  for (unsigned i = 0, nti = 0; i < adj_len ; i++, nti++)
 {
   ipa_adjusted_param *apm = &(*m_adj_params)[i];
   unsigned prev_index = apm->prev_clone_index;
   tree new_parm;
+  if (apm->op == IPA_PARAM_OP_COPY
+ && prev_index >= otypes.length ())
+   /* Keep nti in sync with the m_new_types indices used in
+  fill_vector_of_new_param_types, for any non-IPA_PARAM_OP_COPY
+  parms.  */
+   nti--;
   if (apm->op == IPA_PARAM_OP_COPY
  || apm->prev_clone_adjustment)
{
@@ -1117,7 +1123,7 @@ ipa_param_body_adjustments::common_initialization (tree 
old_fndecl,
   else if (apm->op == IPA_PARAM_OP_NEW
   || apm->op == IPA_PARAM_OP_SPLIT)
{
- tree new_type = m_new_types[i];
+ tree new_type = m_new_types[nti];
  gcc_checking_assert (new_type);
  new_parm = build_decl (UNKNOWN_LOCATION, PARM_DECL, NULL_TREE,
 new_type);

-- 
Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about

avoid early reference to debug-only symbol

2021-07-12 Thread Alexandre Oliva



If some IPA pass replaces the only reference to a constant non-public
non-automatic variable with its initializer, namely the address of
another such variable, the former becomes unreferenced and it's
discarded by symbol_table::remove_unreachable_nodes.  It calls
debug_hooks->late_global_decl while at that, and this expands the
initializer, which assigs RTL to the latter variable and forces it to
be retained by remove_unreferenced_decls, and eventually be output
despite not being referenced.  Without debug information, it's not
output.

This has caused a bootstrap-debug compare failure in
libdecnumber/decContext.o, while developing a transformation that ends
up enabling the above substitution in constprop.

This patch makes reference_to_unused slightly more conservative about
such variables at the end of IPA passes, falling back onto
expand_debug_expr for expressions referencing symbols that might or
might not be output, avoiding the loss of debug information when the
symbol is output, while avoiding a symbol output only because of debug
info.

Regstrapped on x86_64-linux-gnu.  Ok to install?


for  gcc/ChangeLog

* dwarf2out.c (add_const_value_attribute): Return false if
resolve_one_addr fails.
(reference_to_unused): Don't assume local symbol presence
while it can still be optimized out.
(rtl_for_decl_init): Fallback to expand_debug_expr.
* cfgexpand.c (expand_debug_expr): Export.
* expr.h (expand_debug_expr): Declare.
---
 gcc/cfgexpand.c |   12 +---
 gcc/dwarf2out.c |   15 +--
 gcc/expr.h  |2 ++
 3 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 3edd53c37dcb3..b731a5598230c 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -91,8 +91,6 @@ struct ssaexpand SA;
of comminucating the profile info to the builtin expanders.  */
 gimple *currently_expanding_gimple_stmt;
 
-static rtx expand_debug_expr (tree);
-
 static bool defer_stack_allocation (tree, bool);
 
 static void record_alignment_for_reg_var (unsigned int);
@@ -4413,7 +4411,7 @@ expand_debug_parm_decl (tree decl)
 
 /* Return an RTX equivalent to the value of the tree expression EXP.  */
 
-static rtx
+rtx
 expand_debug_expr (tree exp)
 {
   rtx op0 = NULL_RTX, op1 = NULL_RTX, op2 = NULL_RTX;
@@ -5285,6 +5283,14 @@ expand_debug_expr (tree exp)
   else
goto flag_unsupported;
 
+case COMPOUND_LITERAL_EXPR:
+  exp = COMPOUND_LITERAL_EXPR_DECL_EXPR (exp);
+  /* DECL_EXPR is a tcc_statement, which expand_debug_expr does
+not expect, so instead of recursing we take care of it right
+away.  */
+  exp = DECL_EXPR_DECL (exp);
+  return expand_debug_expr (exp);
+
 case CALL_EXPR:
   /* ??? Maybe handle some builtins?  */
   return NULL;
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 82783c4968b85..bb7e2b8dc4e2c 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -20170,7 +20170,8 @@ add_const_value_attribute (dw_die_ref die, rtx rtl)
   if (dwarf_version >= 4 || !dwarf_strict)
{
  dw_loc_descr_ref loc_result;
- resolve_one_addr (&rtl);
+ if (!resolve_one_addr (&rtl))
+   return false;
rtl_addr:
   loc_result = new_addr_loc_descr (rtl, dtprel_false);
  add_loc_descr (&loc_result, new_loc_descr (DW_OP_stack_value, 0, 0));
@@ -20255,6 +20256,12 @@ reference_to_unused (tree * tp, int * walk_subtrees,
   varpool_node *node = varpool_node::get (*tp);
   if (!node || !node->definition)
return *tp;
+  /* If it's local, it might still be optimized out, unless we've
+already committed to outputting it by assigning RTL to it.  */
+  if (! TREE_PUBLIC (*tp) && ! TREE_ASM_WRITTEN (*tp)
+ && symtab->state <= IPA_SSA_AFTER_INLINING
+ && ! DECL_RTL_SET_P (*tp))
+   return *tp;
 }
   else if (TREE_CODE (*tp) == FUNCTION_DECL
   && (!DECL_EXTERNAL (*tp) || DECL_DECLARED_INLINE_P (*tp)))
@@ -20279,6 +20286,7 @@ static rtx
 rtl_for_decl_init (tree init, tree type)
 {
   rtx rtl = NULL_RTX;
+  bool valid_p = false;
 
   STRIP_NOPS (init);
 
@@ -20322,7 +20330,7 @@ rtl_for_decl_init (tree init, tree type)
   /* If the initializer is something that we know will expand into an
  immediate RTL constant, expand it now.  We must be careful not to
  reference variables which won't be output.  */
-  else if (initializer_constant_valid_p (init, type)
+  else if ((valid_p = initializer_constant_valid_p (init, type))
   && ! walk_tree (&init, reference_to_unused, NULL, NULL))
 {
   /* Convert vector CONSTRUCTOR initializers to VECTOR_CST if
@@ -20367,6 +20375,9 @@ rtl_for_decl_init (tree init, tree type)
   /* If expand_expr returns a MEM, it wasn't immediate.  */
   gcc_assert (!rtl || !MEM_P (rtl));
 }
+  else if (valid_p)
+/* Perhaps we could just use this and skip all of the above?  */
+rtl = expand_de

move unreachable user labels to entry point

2021-07-12 Thread Alexandre Oliva



pr42739.C, complicated by some extra wrappers and cleanups from a
feature I'm working on, got me very confused because a user label
ended up in a cleanup introduced by my pass, where it couldn't
possibly have been initially.

The current logic may move such an unreachable user label multiple
times, if it ends up in blocks that get removed one after the other.
Since it doesn't really matter where it lands (save for omp
constraints), I propose we move it once and for all to a stable, final
location, that we currently use only as a last resort: the entry
point.

Regstrapped on x86_64-linux-gnu.  Ok to install?


for  gcc/ChangeLog

* tree-cfg.c (remove_bb): When preserving an unreachable user
label, use the entry block as the preferred choice for its
surviving location, rather than as a last resort.
---
 gcc/tree-cfg.c |   18 +-
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 1f0f4a2c6eb2c..f6f005f10a9f5 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -2300,13 +2300,11 @@ remove_bb (basic_block bb)
  FORCED_LABEL (gimple_label_label (label_stmt)) = 1;
}
 
- new_bb = bb->prev_bb;
- /* Don't move any labels into ENTRY block.  */
- if (new_bb == ENTRY_BLOCK_PTR_FOR_FN (cfun))
-   {
- new_bb = single_succ (new_bb);
- gcc_assert (new_bb != bb);
-   }
+ /* We have to move the unreachable label somewhere.
+Moving it to the entry block makes sure it's moved at
+most once, and avoids messing with anonymous landing
+pad labels.  */
+ new_bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
  if ((unsigned) bb->index < bb_to_omp_idx.length ()
  && ((unsigned) new_bb->index >= bb_to_omp_idx.length ()
  || (bb_to_omp_idx[bb->index]
@@ -2316,7 +2314,6 @@ remove_bb (basic_block bb)
 into the right OMP region.  */
  unsigned int i;
  int idx;
- new_bb = NULL;
  FOR_EACH_VEC_ELT (bb_to_omp_idx, i, idx)
if (i >= NUM_FIXED_BLOCKS
&& idx == bb_to_omp_idx[bb->index]
@@ -2325,11 +2322,6 @@ remove_bb (basic_block bb)
new_bb = BASIC_BLOCK_FOR_FN (cfun, i);
break;
  }
- if (new_bb == NULL)
-   {
- new_bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
- gcc_assert (new_bb != bb);
-   }
}
  new_gsi = gsi_after_labels (new_bb);
  gsi_remove (&i, false);

-- 
Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about

adjust landing pads when changing main label

2021-07-12 Thread Alexandre Oliva



If an artificial label created for a landing pad ends up being
dropped in favor of a user-supplied label, the user-supplied label
inherits the landing pad index, but the post_landing_pad field is not
adjusted to point to the new label.

This patch fixes the problem, and adds verification that we don't
remove a label that's still used as a landing pad.

The circumstance in which this problem can be hit was unusual: removal
of a block with an unreachable label moves the label to some other
unrelated block, in case its address is taken.  In the case at hand
(pr42739.C, complicated by wrappers and cleanups), the chosen block
happened to be an EH landing pad.  (A followup patch will change that.)

Regstrapped on x86_64-linux-gnu.  Ok to install?


for  gcc/ChangeLog

* tree-cfg.c (cleanup_dead_labels_eh): Update
post_landing_pad label upon change of landing pad block's
primary label.
(cleanup_dead_labels): Check that a removed label is not that
of a landing pad.
---
 gcc/tree-cfg.c |6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index c73e1cbdda6b9..1f0f4a2c6eb2c 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -1481,6 +1481,7 @@ cleanup_dead_labels_eh (label_record *label_for_bb)
if (lab != lp->post_landing_pad)
  {
EH_LANDING_PAD_NR (lp->post_landing_pad) = 0;
+   lp->post_landing_pad = lab;
EH_LANDING_PAD_NR (lab) = lp->index;
  }
   }
@@ -1707,7 +1708,10 @@ cleanup_dead_labels (void)
  || FORCED_LABEL (label))
gsi_next (&i);
  else
-   gsi_remove (&i, true);
+   {
+ gcc_checking_assert (EH_LANDING_PAD_NR (label) == 0);
+ gsi_remove (&i, true);
+   }
}
 }
 

-- 
Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about

drop va_list from formals if requested

2021-07-12 Thread Alexandre Oliva



I'm working on a feature that involves creating wrappers for some
functions, using alternate means to pass variable argument lists to
the wrapped versions.  The split-out (wrapped) function shouldn't be
stdarg, and though comments in m_always_copy_start in
ipa_param_adjustments suggested that making it negative would cause
variable arguments to be dropped, this was not the case, and AFAICT
there was no way to accomplish that.

Perhaps there was no such intent implied by the comments, but that
sounded like a reasonable plan to me, so I went ahead an implemented
it.  With this patch, a negative m_always_copy_start drops the
variable argument list marker from a cloned function's type, and the
stdarg flag from the cloned cfun data structure.

Regstrapped on x86_64-linux-gnu.  Ok to install?


for  gcc/ChangeLog

* ipa-param-manipulation.c (build_adjusted_function_type): Add
skip_valist, obey it.
(ipa_param_adjustments::build_new_function_type): Take
negative m_always_copy_start as skip_valist.
(ipa_param_body_adjustmnets::modify_formal_parameters):
Likewise.
* tee-inline.c (initialize_cfun): Clear stdarg if function
type doesn't support it.
---
 gcc/ipa-param-manipulation.c |   17 -
 gcc/tree-inline.c|2 +-
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
index 26b02d7aa95c9..75b5a47a7ae8b 100644
--- a/gcc/ipa-param-manipulation.c
+++ b/gcc/ipa-param-manipulation.c
@@ -283,14 +283,17 @@ fill_vector_of_new_param_types (vec *new_types, 
vec *otypes,
 
 static tree
 build_adjusted_function_type (tree orig_type, vec *new_param_types,
- bool method2func, bool skip_return)
+ bool method2func, bool skip_return,
+ bool skip_valist)
 {
   tree new_arg_types = NULL;
   if (TYPE_ARG_TYPES (orig_type))
 {
   gcc_checking_assert (new_param_types);
-  bool last_parm_void = (TREE_VALUE (tree_last (TYPE_ARG_TYPES 
(orig_type)))
-== void_type_node);
+  bool last_parm_void = (skip_valist
+|| (TREE_VALUE (tree_last (TYPE_ARG_TYPES
+   (orig_type)))
+== void_type_node));
   unsigned len = new_param_types->length ();
   for (unsigned i = 0; i < len; i++)
new_arg_types = tree_cons (NULL_TREE, (*new_param_types)[i],
@@ -458,8 +461,10 @@ ipa_param_adjustments::build_new_function_type (tree 
old_type,
   else
 new_param_types_p = NULL;
 
+  bool skip_valist = m_always_copy_start < 0;
   return build_adjusted_function_type (old_type, new_param_types_p,
-  method2func_p (old_type), m_skip_return);
+  method2func_p (old_type), m_skip_return,
+  skip_valist);
 }
 
 /* Build variant of function decl ORIG_DECL which has no return value if
@@ -1309,8 +1314,10 @@ ipa_param_body_adjustments::modify_formal_parameters ()
  through tree_function_versioning, not when modifying function body
  directly.  */
   gcc_assert (!m_adjustments || !m_adjustments->m_skip_return);
+  bool skip_valist = m_adjustments && m_adjustments->m_always_copy_start < 0;
   tree new_type = build_adjusted_function_type (orig_type, &m_new_types,
-   m_method2func, false);
+   m_method2func, false,
+   skip_valist);
 
   TREE_TYPE (m_fndecl) = new_type;
   DECL_VIRTUAL_P (m_fndecl) = 0;
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index f605e763f4a61..0e9bb62242177 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -2793,7 +2793,7 @@ initialize_cfun (tree new_fndecl, tree callee_fndecl, 
profile_count count)
   cfun->va_list_fpr_size = src_cfun->va_list_fpr_size;
   cfun->has_nonlocal_label = src_cfun->has_nonlocal_label;
   cfun->calls_eh_return = src_cfun->calls_eh_return;
-  cfun->stdarg = src_cfun->stdarg;
+  cfun->stdarg = src_cfun->stdarg && stdarg_p (TREE_TYPE (new_fndecl));
   cfun->after_inlining = src_cfun->after_inlining;
   cfun->can_throw_non_call_exceptions
 = src_cfun->can_throw_non_call_exceptions;

-- 
Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about

Re: ipa-modref: merge flags when adding escape

2021-07-12 Thread Alexandre Oliva

Ping?  https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573137.html

On Jun 18, 2021, Alexandre Oliva  wrote:

> for  gcc/ChangeLog

>   * ipa-modref.c (modref_lattice::add_escape_point): Merge
>   min_flags into flags.
>   (modref_lattice::dump): Fix escape_point's min_flags dumping.

-- 
Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about

Re: [PATCH] Check type size for doloop iv on BITS_PER_WORD [PR61837]

2021-07-12 Thread guojiufu via Gcc-patches


On 2021-07-12 23:53, guojiufu via Gcc-patches wrote:

On 2021-07-12 22:46, Richard Biener wrote:

On Mon, 12 Jul 2021, guojiufu wrote:


On 2021-07-12 18:02, Richard Biener wrote:
> On Mon, 12 Jul 2021, guojiufu wrote:
>
>> On 2021-07-12 16:57, Richard Biener wrote:
>> > On Mon, 12 Jul 2021, guojiufu wrote:
>> >
>> >> On 2021-07-12 14:20, Richard Biener wrote:
>> >> > On Fri, 9 Jul 2021, Segher Boessenkool wrote:
>> >> >
>> >> >> On Fri, Jul 09, 2021 at 08:43:59AM +0200, Richard Biener wrote:
>> >> >> > I wonder if there's a way to query the target what modes the doloop
>> >> >> > pattern can handle (not being too familiar with the doloop code).
>> >> >>
>> >> >> You can look what modes are allowed for operand 0 of doloop_end,
>> >> >> perhaps?  Although that is a define_expand, not a define_insn, so it
>> >> >> is
>> >> >> hard to introspect.
>> >> >>
>> >> >> > Why do you need to do any checks besides the new type being able to
>> >> >> > represent all IV values?  The original doloop IV will never wrap
>> >> >> > (OTOH if niter is U*_MAX then we compute niter + 1 which will
>> >> >> > become
>> >> >> > zero ... I suppose the doloop might still do the correct thing here
>> >> >> > but it also still will with a IV with larger type).
>> >>
>> >> The issue comes from U*_MAX (original short MAX), as you said: on which
>> >> niter + 1 becomes zero.  And because the step for doloop is -1; then, on
>> >> larger type 'zero - 1' will be a very large number on larger type
>> >> (e.g. 0xff...ff); but on the original short type 'zero - 1' is a small
>> >> value
>> >> (e.g. "0xff").
>> >
>> > But for the larger type the small type MAX + 1 fits and does not yield
>> > zero so it should still work exactly as before, no?  Of course you
>> > have to compute the + 1 in the larger type.
>> >
>> You are right, if compute the "+ 1" in the larger type it is ok, as below
>> code:
>> ```
>>/* Use type in word size may fast.  */
>> if (TYPE_PRECISION (ntype) < BITS_PER_WORD)
>>   {
>> ntype = lang_hooks.types.type_for_size (BITS_PER_WORD, 1);
>> niter = fold_convert (ntype, niter);
>>   }
>>
>> tree base = fold_build2 (PLUS_EXPR, ntype, unshare_expr (niter),
>>  build_int_cst (ntype, 1));
>>
>>
>> add_candidate (data, base, build_int_cst (ntype, -1), true, NULL, NULL,
>> true);
>> ```
>> The issue of this is, this code generates more stmt for doloop.xxx:
>>   _12 = (unsigned int) xx(D);
>>   _10 = _12 + 4294967295;
>>   _24 = (long unsigned int) _10;
>>   doloop.6_8 = _24 + 1;
>>
>> if use previous patch, "+ 1" on original type, then the stmts will looks
>> like:
>>   _12 = (unsigned int) xx(D);
>>   doloop.6_8 = (long unsigned int) _12;
>>
>> This is the reason for checking
>>wi::ltu_p (niter_desc->max, wi::to_widest (TYPE_MAX_VALUE (ntype)))
>
> But this then only works when there's an upper bound on the number
> of iterations.  Note you should not use TYPE_MAX_VALUE here but
> you can instead use
>
>  wi::ltu_p (niter_desc->max, wi::to_widest (wi::max_value
> (TYPE_PRECISION (ntype), TYPE_SIGN (ntype;

Ok, Thanks!
I remember you mentioned that:
widest_int::from (wi::max_value (TYPE_PRECISION (ntype), TYPE_SIGN 
(ntype)),

TYPE_SIGN (ntype))
would be better than
wi::to_widest (TYPE_MAX_VALUE (ntype)).

It seems that:
"TYPE_MAX_VALUE (ntype)" is "NUMERICAL_TYPE_CHECK
(NODE)->type_non_common.maxval"
which do a numerical-check and return the field of maxval.  And then 
call to

wi::to_widest

The other code "widest_int::from (wi::max_value (..,..),..)" calls
wi::max_value
and widest_int::from.

I'm wondering if wi::to_widest (TYPE_MAX_VALUE (ntype)) is cheaper?


TYPE_MAX_VALUE can be "suprising", it does not necessarily match the
underlying modes precision.  At some point we've tried to eliminate
most of its uses, not sure what the situation/position is right now.

Ok, get it, thanks.
I will use "widest_int::from (wi::max_value (..,..),..)".




> I think the -1 above comes from number of latch iterations vs. header
> entries - it's a common source for this kind of issues.  range analysis
> might be able to prove that we can still merge the two adds even with
> the intermediate extension.
Yes, as you mentioned here, it relates to number of latch iterations
For loop looks like : while (l < n) or for (i = 0; i < n; i++)
This kind of loop, the niter is used to be 'n - 1' after transformed
into 'do-while' form.
For this kind of loop, the max value for the number of iteration "n - 
1"

would be "max_value_type(n) - 1" which is wi::ltu than max_value_type.
This kind of loop is already common, and we could use wi::ltu (max,
max_value_type)
to check.

For loop looks like:
  do ;
  while (n-- > 0); /* while  (n-- > low); */

The niter_desc->max will wi::eq to max_value_type, and niter would be 
"n",

and then doloop.xx is 'n+1'.


I would see how to merge these two adds safely at this point
when generating doloop iv. (maybe range info, thanks!



For some cas

[PATCH v2] x86: Don't enable UINTR in 32-bit mode

2021-07-12 Thread H.J. Lu via Gcc-patches

UINTR is available only in 64-bit mode.  Since the codegen target is
unknown when the the gcc driver is processing -march=native, to properly
handle UINTR for -march=native:

1. Pass arch[32|64] and tune[32|64] to host_detect_local_cpu to indicate
32-bit and 64-bit codegen.
2. Change ix86_option_override_internal to enable UINTR only in 64-bit
mode for -march=CPU when PTA_CPU includes PTA_UINTR.

gcc/

PR target/101395
* config/i386/driver-i386.c (host_detect_local_cpu): Check
arch32/tune32 and arch64/tune64 for 32-bit and 64-bit codegen.
Enable UINTR only for 64-bit codegen.
* config/i386/i386-options.c
(ix86_option_override_internal::DEF_PTA): Skip PTA_UINTR if not
in 64-bit mode.
* config/i386/i386.h (CC1_CPU_SPEC): Pass arch32/tune32 for
32-bit codegen and arch64/tune64 for 64-bit codegen.

gcc/testsuite/

PR target/101395
* gcc.target/i386/pr101395-1.c: New test.
* gcc.target/i386/pr101395-2.c: Likewise.
* gcc.target/i386/pr101395-3.c: Likewise.
---
 gcc/config/i386/driver-i386.c  | 30 --
 gcc/config/i386/i386-options.c |  1 +
 gcc/config/i386/i386.h |  9 ---
 gcc/testsuite/gcc.target/i386/pr101395-1.c | 12 +
 gcc/testsuite/gcc.target/i386/pr101395-2.c | 22 
 gcc/testsuite/gcc.target/i386/pr101395-3.c |  6 +
 6 files changed, 69 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101395-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101395-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101395-3.c

diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index dd9236616b4..b84f86d8f43 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -370,9 +370,9 @@ detect_caches_intel (bool xeon_mp, unsigned max_level,
 }
 
 /* This will be called by the spec parser in gcc.c when it sees
-   a %:local_cpu_detect(args) construct.  Currently it will be called
-   with either "arch" or "tune" as argument depending on if -march=native
-   or -mtune=native is to be substituted.
+   a %:local_cpu_detect(args) construct.  Currently it will be
+   called with either "arch[32|64]" or "tune[32|64]" as argument
+   depending on if -march=native or -mtune=native is to be substituted.
 
It returns a string containing new command line parameters to be
put at the place of the above two options, depending on what CPU
@@ -404,9 +404,18 @@ const char *host_detect_local_cpu (int argc, const char 
**argv)
   if (argc < 1)
 return NULL;
 
-  arch = !strcmp (argv[0], "arch");
+  arch = !strncmp (argv[0], "arch", 4);
 
-  if (!arch && strcmp (argv[0], "tune"))
+  if (!arch && strncmp (argv[0], "tune", 4))
+return NULL;
+
+  bool codegen_x86_64;
+
+  if (!strcmp (argv[0] + 4, "32"))
+codegen_x86_64 = false;
+  else if (!strcmp (argv[0] + 4, "64"))
+codegen_x86_64 = true;
+  else
 return NULL;
 
   struct __processor_model cpu_model = { };
@@ -804,8 +813,12 @@ const char *host_detect_local_cpu (int argc, const char 
**argv)
if (isa_names_table[i].option)
  {
if (has_feature (isa_names_table[i].feature))
- options = concat (options, " ",
-   isa_names_table[i].option, NULL);
+ {
+   if (codegen_x86_64
+   || isa_names_table[i].feature != FEATURE_UINTR)
+ options = concat (options, " ",
+   isa_names_table[i].option, NULL);
+ }
else
  options = concat (options, neg_option,
isa_names_table[i].option + 2, NULL);
@@ -813,7 +826,8 @@ const char *host_detect_local_cpu (int argc, const char 
**argv)
 }
 
 done:
-  return concat (cache, "-m", argv[0], "=", cpu, options, NULL);
+  const char *moption = arch ? "-march=" : "-mtune=";
+  return concat (cache, moption, cpu, options, NULL);
 }
 #else
 
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 7a35c468da3..7cba655595e 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -2109,6 +2109,7 @@ ix86_option_override_internal (bool main_args_p,
 #define DEF_PTA(NAME) \
if (((processor_alias_table[i].flags & PTA_ ## NAME) != 0) \
&& PTA_ ## NAME != PTA_64BIT \
+   && (TARGET_64BIT || PTA_ ## NAME != PTA_UINTR) \
&& !TARGET_EXPLICIT_ ## NAME ## _P (opts)) \
  SET_TARGET_ ## NAME (opts);
 #include "i386-isa.def"
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 8c3eace56da..ae9f455c48d 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -577,9 +577,12 @@ extern const char *host_detect_local_cpu (int argc, const 
char **argv);
 #define CC1_CPU_SPEC CC1_CPU_SPEC_1
 #else
 #define CC1_CPU_SPEC CC1_CPU_SPEC_1 \
-"%{march=nativ

Re: [PATCH, rs6000] fix execution failure of parity_1.f90 on P10 [PR100952]

2021-07-12 Thread HAO CHEN GUI via Gcc-patches


Hi,

   I refined the patch according to Segher's advice. Is this okay for 
trunk? Any recommendations? Thanks a lot.


On 6/7/2021 上午 11:01, HAO CHEN GUI wrote:

Hi,

   The patch fixed the wrong "if" fall through in "cstore4" 
expand, which causes comparison pattern expanded twice on P10.


   The attachments are the patch diff and change log file.

    Bootstrapped and tested on powerpc64le-linux with no regressions. 
Is this okay for trunk? Any recommendations? Thanks a lot.


PR target/100952
* config/rs6000/rs6000.md (cstore4): Fix wrong fall through.
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 3f59b544f6a..d7c13d4e79d 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -11623,7 +11623,10 @@ (define_expand "cstore4"
 {
   /* Everything is best done with setbc[r] if available.  */
   if (TARGET_POWER10 && TARGET_ISEL)
-rs6000_emit_int_cmove (operands[0], operands[1], const1_rtx, const0_rtx);
+{
+  rs6000_emit_int_cmove (operands[0], operands[1], const1_rtx, const0_rtx);
+  DONE;
+}
 
   /* Expanding EQ and NE directly to some machine instructions does not help
  but does hurt combine.  So don't.  */

[COMMITTED] Add relation processing to ubsan builtins, and EVRP equality.

2021-07-12 Thread Andrew MacLeod via Gcc-patches

Ubsan builtins use the plus,minus and multiply operations under the 
covers, but they are not relation aware.  This patch queries for a 
relation between the 2 operands and passes that on to range-ops.


This resolves gcc.dg/pr97505.c when operating in ranger-only mode, and 
with that, ranger-only mode runs the gcc testsuite clean on 
x86_64-pc-linux-gnu.



If there are no objections, I'd like to change the default mode to 
ranger only, and see if something shows up on any other architectures, 
or in any other places.  We can easily switch back if need be.


I'll keep running things in hybrid mode locally to ensure we don't start 
losing parity with legacy EVRP.  Does this seem reasonable?


Andrew

commit 9693ecdf7ed5dde9618d06560697ff8ee5e1e6b7
Author: Andrew MacLeod 
Date:   Mon Jul 12 14:38:42 2021 -0400

Add relation processing to ubsan builtins.

Ubsan builtins call the plus/minus/multiple fold routines, but did not
use any relation information between the 2 operands that is available.
query and pass any relations.
This resolves gcc.dg/pr97505.c when operating in ranger-only mode.

* gimple-range-fold.cc (fold_using_range::range_of_builtin_ubsan_call):
Query relation between the 2 operands and use it.

diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc
index 1fa4ace32b9..eff5d1f89f2 100644
--- a/gcc/gimple-range-fold.cc
+++ b/gcc/gimple-range-fold.cc
@@ -825,12 +825,14 @@ fold_using_range::range_of_builtin_ubsan_call (irange &r, gcall *call,
   tree arg1 = gimple_call_arg (call, 1);
   src.get_operand (ir0, arg0);
   src.get_operand (ir1, arg1);
+  // Check for any relation between arg0 and arg1.
+  relation_kind relation = src.query_relation (arg0, arg1);
 
   bool saved_flag_wrapv = flag_wrapv;
   // Pretend the arithmetic is wrapping.  If there is any overflow,
   // we'll complain, but will actually do wrapping operation.
   flag_wrapv = 1;
-  op->fold_range (r, type, ir0, ir1);
+  op->fold_range (r, type, ir0, ir1, relation);
   flag_wrapv = saved_flag_wrapv;
 
   // If for both arguments vrp_valueize returned non-NULL, this should

Re: [PATCH 4/4] rs6000: Add tests for SSE4.1 "blend" intrinsics

2021-07-12 Thread Segher Boessenkool

On Sun, Jul 11, 2021 at 11:19:56AM -0500, Bill Schmidt wrote:
> Please resubmit this when you resubmit 3/4, in case any adjustments are 
> needed.

It is testing if elsewhere-defined functions work according to its
specification -- let's hope that doesn't change ;-)


Segher

Re: [PATCH 2/4] rs6000: Add tests for SSE4.1 "test" intrinsics

2021-07-12 Thread Segher Boessenkool

On Sun, Jul 11, 2021 at 10:49:27AM -0500, Bill Schmidt wrote:
> LGTM.  I can't approve, but recommend approval as is.

Okay for trunk.  Thanks!


Segher


> >2021-06-29  Paul A. Clarke  
> >
> >gcc/testsuite/ChangeLog:
> > * gcc.target/powerpc/sse4_1-ptest.c: Copy from
> > gcc/testsuite/gcc.target/i386.

Re: [PATCH 1/4] rs6000: Add support for SSE4.1 "test" intrinsics

2021-07-12 Thread Segher Boessenkool

Hi!

On Sun, Jul 11, 2021 at 10:45:45AM -0500, Bill Schmidt wrote:
> On 6/29/21 1:08 PM, Paul A. Clarke via Gcc-patches wrote:
> >--- a/gcc/config/rs6000/smmintrin.h
> >+++ b/gcc/config/rs6000/smmintrin.h
> >@@ -116,4 +116,54 @@ _mm_blendv_epi8 (__m128i __A, __m128i __B, __m128i 
> >__mask)
> >return (__m128i) vec_sel ((__v16qu) __A, (__v16qu) __B, __lmask);
> >  }
> >
> >+extern __inline int __attribute__((__gnu_inline__, __always_inline__, 
> >__artificial__))
> Line too long, please fix here and below.  (Existing cases can be left.)

I wouldn't bother in this case.  There is no way to write these
attribute lines in a reasonable way, it doesn't overflow 80 char by that
much, and there isn't anything interesting at the end of line.

You could put it on a line by itself, which helps for now because it
won't get too long until you add another attribute ;-)

There should be a space before (( though, and "extern" on definitions is
superfluous.  But I do not care much about that either -- this isn't a
part of the compiler proper anyway :-)

> LGTM; I can't approve, but recommend approval with line lengths fixed.  

It is okay for trunk with whatever changes you want to do.  Thanks!

Segher

Re: Repost: [PATCH] PR 100167: Fix vector long long multiply/divide tests on power10

2021-07-12 Thread Segher Boessenkool

On Mon, Jul 12, 2021 at 12:47:43PM -0500, Bill Schmidt wrote:
> On 7/12/21 12:16 PM, Michael Meissner wrote:
> >>Shouldn't this last be { lp64 && has_arch_pwr10 } ?
> >Nope.  Because the power10 vector multiply is done in the vector unit, it 
> >can
> >generate the vmulld instruction.
> 
> Please document this, then.

And this really should not use lp64 but powerpc64, *but* that test is
broken still (it does not test if the *compiler* allows generating
these insns, *only* if the machine can execute the insns, and it is
otherwise broken on AIX, etc.)  We really should impprove that :-(

If it had used powerpc64 here, there would not have been this confusion.

The problem in fixing this of course is the historical uses of this
test.  Bah.


Segher

Re: [PATCH 3/4] rs6000: Add support for SSE4.1 "blend" intrinsics

2021-07-12 Thread Paul A. Clarke via Gcc-patches

On Sun, Jul 11, 2021 at 11:29:24AM -0500, Bill Schmidt wrote:
> On 7/11/21 11:17 AM, Bill Schmidt wrote:
> > On 6/29/21 1:08 PM, Paul A. Clarke via Gcc-patches wrote:
> > > _mm_blend_epi16 and _mm_blendv_epi8 were added earlier.
> > > Add these four to complete the set.
> > > 
> > > 2021-06-29  Paul A. Clarke  
> > > 
> > > gcc/ChangeLog:
> > > * config/rs6000/smmintrin.h (_mm_blend_pd, _mm_blendv_pd,
> > > _mm_blend_ps, _mm_blendv_ps): New.
> > > ---
> > >   gcc/config/rs6000/smmintrin.h | 46 +++
> > >   1 file changed, 46 insertions(+)
> > > 
> > > diff --git a/gcc/config/rs6000/smmintrin.h
> > > b/gcc/config/rs6000/smmintrin.h
> > > index 1b8cad135ed0..fa17a8b2f478 100644
> > > --- a/gcc/config/rs6000/smmintrin.h
> > > +++ b/gcc/config/rs6000/smmintrin.h
> > > @@ -116,6 +116,52 @@ _mm_blendv_epi8 (__m128i __A, __m128i __B,
> > > __m128i __mask)
> > >     return (__m128i) vec_sel ((__v16qu) __A, (__v16qu) __B, __lmask);
> > >   }
> > > 
> > > +extern __inline __m128d __attribute__((__gnu_inline__,
> > > __always_inline__, __artificial__))
> > Usual line length complaint. :)  Here and below...
> > > +_mm_blend_pd (__m128d __A, __m128d __B, const int __imm8)
> > > +{
> > > +  const signed char __tmp = (__imm8 & 0b10) * 0b0000 |
> > > +    (__imm8 & 0b01) * 0b;
> > > +  __v16qi __charmask = vec_splats ((signed char) __tmp);
> > > +  __charmask = vec_gb (__charmask);
> > > +  __v8hu __shortmask = (__v8hu) vec_unpackh (__charmask);
> > > +  #ifdef __BIG_ENDIAN__
> > > +  __shortmask = vec_reve (__shortmask);
> > > +  #endif
> > > +  return (__m128d) vec_sel ((__v2du) __A, (__v2du) __B, (__v2du)
> > > __shortmask);
> > 
> > This seems way too complex, and needs commentary to explain what you're
> > doing.  Doesn't this instruction just translate into some form of
> > xxpermdi?  Different ones for BE and LE, but still just xxpermdi, I
> > think.

xxpermdi won't work because the operation here is different.
blend_pd is "for each element, select a value from A or B"
(more like a "select" operation than a permute, such that the result may be
entirely identical to an input), whereas xxpermdi always takes one value from
each input.

Your suggestion, below, to use vperm can also be used here.

> > > +}
> > > +
> > > +extern __inline __m128d __attribute__((__gnu_inline__,
> > > __always_inline__, __artificial__))
> > > +_mm_blendv_pd (__m128d __A, __m128d __B, __m128d __mask)
> > > +{
> > > +  const __v2di __zero = {0};
> > > +  const vector __bool long long __boolmask = vec_cmplt ((__v2di)
> > > __mask, __zero);
> > > +  return (__m128d) vec_sel ((__v2du) __A, (__v2du) __B, (__v2du)
> > > __boolmask);
> > > +}
> > 
> > Okay.
> > 
> > > +
> > > +extern __inline __m128 __attribute__((__gnu_inline__,
> > > __always_inline__, __artificial__))
> > > +_mm_blend_ps (__m128 __A, __m128 __B, const int __imm8)
> > > +{
> > > +  const signed char __mask = (__imm8 & 0b1000) * 0b00011000 |
> > > + (__imm8 & 0b0100) * 0b1100 |
> > > + (__imm8 & 0b0010) * 0b0110 |
> > > + (__imm8 & 0b0001) * 0b0011;
> > > +  __v16qi __charmask = vec_splats ( __mask);
> > > +  __charmask = vec_gb (__charmask);
> > > +  __v8hu __shortmask = (__v8hu) vec_unpackh (__charmask);
> > 
> > This is a good trick, but you need comments to explain what you're
> > doing, including how you build __mask.  I recommend you include
> > alternate code for P10, where you can just use vec_genwm to expand from
> > __mask to a mask of word elements.
> > 
> > I don't understand how you're getting away with a v8hu mask for word
> > elements.  This seems wrong to me.  Adequate testing?

chars unpack to shorts. If you construct the char vector carefully, it all
just works.  In the end, it's a long bitmask.

Regardless, I'll go with your alternate approach, below...

> As an alternate approach, I suppose you could use vec_perm / vec_permr with
> one of sixteen possible masks, which would seem faster than the
> splat/gather/unpack/select approach.  Something to consider.

vperm will work, and is a bit more straightforward, if just a bit more code.

> > > +  #ifdef __BIG_ENDIAN__
> > > +  __shortmask = vec_reve (__shortmask);
> > > +  #endif
> > > +  return (__m128) vec_sel ((__v8hu) __A, (__v8hu) __B, __shortmask);
> > > +}
> > > +
> > > +extern __inline __m128 __attribute__((__gnu_inline__,
> > > __always_inline__, __artificial__))
> > > +_mm_blendv_ps (__m128 __A, __m128 __B, __m128 __mask)
> > > +{
> > > +  const __v4si __zero = {0};
> > > +  const vector __bool int __boolmask = vec_cmplt ((__v4si) __mask,
> > > __zero);
> > > +  return (__m128) vec_sel ((__v4su) __A, (__v4su) __B, (__v4su)
> > > __boolmask);
> > > +}
> > > +
> > 
> > Okay.
> > 
> > Please have a look at the above issues and resubmit.  Thanks!

Will do. Thanks for the reviews!

PC

Re: [PATCH] PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514

2021-07-12 Thread Harald Anlauf via Gcc-patches

Hi Tobias,

> On 10.06.21 20:52, Harald Anlauf via Fortran wrote:
> > +static bool
> > +substring_has_constant_len (gfc_expr *e)
> > +{
> > +  ptrdiff_t istart, iend;
> > +  size_t length;
> > +  bool equal_length = false;
> > +
> > +  if (e->ts.type != BT_CHARACTER
> > +  || !e->ref
> > +  || e->ref->type != REF_SUBSTRING
> 
> Is there a reason why you do not handle:
> 
> type t
>character(len=5) :: str1
>character(len=:), allocatable :: str2
> end type
> type(t) :: x
> 
> allocate(x%str2, source="abd")
> if (len (x%str)) /= 1) ...
> if (len (x%str2(1:2) /= 2) ...
> etc.
> 
> Namely: Search the last_ref = expr->ref->next->next ...?
> and then check that lastref?

I was assuming that the argument passed to LEN() is already the ultimate
component for the case of substrings, and I was unable to find a case which
requires implementing that iteration.  The cases you provided do not seem
to apply here:

- derived type component str1, which is a string of given length, poses no
  problem.  I added a case to the testcase, see attached updated patch.

- derived type component str2 has deferred length.  I do not see that the
  simplification can be applied here, as the allocation could lead to str2
  being too short, and we do not want to simplify invalid code, such as:

type t
   character(len=:), allocatable :: str2
end type
type(t) :: x

allocate(x%str2, source="z")
if (len (x%str2(1:2)) /= 2) stop 1
end

If we want this to be catchable by bounds checking, we need to punt
at simplification of this.  The updated patch skips deferred strings.

>* * *
> 
> Slightly unrelated: I think the following does not violate
> F2018's R916 / C923 – but is rejected, namely:
>R916  type-param-inquiry  is  designator % type-param-name
> the latter is 'len' or 'kind' for intrinsic types. And:
>R901  designator is ...
> or substring
> But
> 
> character(len=5) :: str
> print *, str(1:3)%len
> end
> 
> fails with
> 
>  2 | print *, str(1:3)%len
>|  1
> Error: Syntax error in PRINT statement at (1)
> 
> 
> Assuming you don't want to handle it, can you open a new PR?
> Thanks!

Good point.  I'd rather open a separate PR for this, though.

> > +  istart = gfc_mpz_get_hwi (e->ref->u.ss.start->value.integer);
> > +  iend = gfc_mpz_get_hwi (e->ref->u.ss.end->value.integer);
> > +  length = gfc_mpz_get_hwi (e->ref->u.ss.length->length->value.integer);
> > +
> > +  if (istart <= iend)
> > +{
> > +  if (istart < 1)
> > + {
> > +   gfc_error ("Substring start index (%ld) at %L below 1",
> > +  (long) istart, &e->ref->u.ss.start->where);
> 
> As mentioned by Bernhard, you could use HOST_WIDE_INT_PRINT_DEC.
> 
> (It probably only matters on Windows which uses long == int = 32bit for
> strings longer than INT_MAX.)

I am not familiar enough with Windows.  What is HOST_WIDE_INT
on that system?  (As compared to e.g. size_t, ptrdiff_t).

The (slightly) updated patch regtests fine.

Thanks,
Harald
diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c
index c27b47aa98f..7f22372afec 100644
--- a/gcc/fortran/simplify.c
+++ b/gcc/fortran/simplify.c
@@ -4512,6 +4512,62 @@ gfc_simplify_leadz (gfc_expr *e)
 }


+/* Check for constant length of a substring.  */
+
+static bool
+substring_has_constant_len (gfc_expr *e)
+{
+  ptrdiff_t istart, iend;
+  size_t length;
+  bool equal_length = false;
+
+  if (e->ts.type != BT_CHARACTER
+  || e->ts.deferred
+  || !e->ref
+  || e->ref->type != REF_SUBSTRING
+  || !e->ref->u.ss.start
+  || e->ref->u.ss.start->expr_type != EXPR_CONSTANT
+  || !e->ref->u.ss.end
+  || e->ref->u.ss.end->expr_type != EXPR_CONSTANT
+  || !e->ref->u.ss.length
+  || !e->ref->u.ss.length->length
+  || e->ref->u.ss.length->length->expr_type != EXPR_CONSTANT)
+return false;
+
+  /* Basic checks on substring starting and ending indices.  */
+  if (!gfc_resolve_substring (e->ref, &equal_length))
+return false;
+
+  istart = gfc_mpz_get_hwi (e->ref->u.ss.start->value.integer);
+  iend = gfc_mpz_get_hwi (e->ref->u.ss.end->value.integer);
+  length = gfc_mpz_get_hwi (e->ref->u.ss.length->length->value.integer);
+
+  if (istart <= iend)
+{
+  if (istart < 1)
+	{
+	  gfc_error ("Substring start index (%ld) at %L below 1",
+		 (long) istart, &e->ref->u.ss.start->where);
+	  return false;
+	}
+  if (iend > (ssize_t) length)
+	{
+	  gfc_error ("Substring end index (%ld) at %L exceeds string "
+		 "length", (long) iend, &e->ref->u.ss.end->where);
+	  return false;
+	}
+  length = iend - istart + 1;
+}
+  else
+length = 0;
+
+  /* Fix substring length.  */
+  e->value.character.length = length;
+
+  return true;
+}
+
+
 gfc_expr *
 gfc_simplify_len (gfc_expr *e, gfc_expr *kind)
 {
@@ -4521,7 +4577,11 @@ gfc_simplify_len (gfc_expr *e, gfc_expr *kind)
   if (k == -1)
 return &gfc_bad_expr;

-  if (e->expr_type == EXPR_CONSTANT)
+  if (e->ts.deferred)
+return

Re: [patch] PR jit/87808: Allow libgccjit to work without an external gcc driver

2021-07-12 Thread Matthias Klose

On 3/26/19 12:52 PM, Matthias Klose wrote:
> On 22.03.19 23:00, David Malcolm wrote:
>> On Thu, 2019-03-21 at 12:26 +0100, Matthias Klose wrote:
>>> Fix PR jit/87808, the embedded driver still needing the external gcc
>>> driver to
>>> find the gcc_lib_dir. This can happen in a packaging context when
>>> libgccjit
>>> doesn't depend on the gcc package, but just on binutils and libgcc-
>>> dev packages.
>>> libgccjit probably could use /proc/self/maps to find the gcc_lib_dir,
>>> but that
>>> doesn't seem to be very portable.
>>>
>>> Ok for the trunk and the branches?
>>>
>>> Matthias
>>
>> [CCing the jit list]
>>
>> I've been trying to reproduce this bug in a working copy, and failing.
>>
>> Matthias, do you have a recipe you've been using to reproduce this?
> 
> the JIT debug log shows the driver names that it wants to call.  Are you sure
> that this driver isn't available anywhere?  I configure the gcc build with
> --program-suffix=-8 --program-prefix=x86_64-linux-gnu-, and that one was only
> available in one place, /usr/bin.
> 
> Matthias

David, the bug report now has two more comments from people that the current
behavior is broken.  Please could you review the patch?

Thanks, Matthias

Re: Repost: [PATCH] Deal with prefixed loads/stores in tests, PR testsuite/100166

2021-07-12 Thread Segher Boessenkool

On Wed, Jul 07, 2021 at 04:03:20PM -0400, Michael Meissner wrote:
> This patch updates the various tests in the testsuite to treat plxv
> and pstxv as being vector loads/stores.

(That is a not a very good description)

The reason it is hard to review this patch is there was no rationale
whatsoever for any of these changes, and the approach is different for
every case as well.  This means every case needs separate checking
whether it is correct, which is more work than doing the changes in the
first place.

Mechanically just changing things so the test cases pass is **wrong**.

Segher

Re: Repost: [PATCH] Deal with prefixed loads/stores in tests, PR testsuite/100166

2021-07-12 Thread Segher Boessenkool

On Mon, Jul 12, 2021 at 10:32:47AM -0500, Bill Schmidt wrote:
> > * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c:

There should be something after the ":".  If you want to say the same
thing for many files, you can say somethin like "ditto" for all but the
first.

> >-/* { dg-final { scan-assembler "\t(lvx|lxv|lvsr|stxv)" } } */
> >+/* { dg-final { scan-assembler "\t(lvx|lxv|lvsr|stxv|plxv|pstxv)" } } */

Please use p?lxv etc.

> >-/* { dg-final { scan-assembler-times 
> >{\mlxvw4x\M|\mlxvd2x\M|\mlxvx\M|\mlvx\M} 12 } } */
> >+/* { dg-final { scan-assembler-times 
> >{\mlxvw4x\M|\mlxvd2x\M|\mlxvx\M|\mlvx\M|\mplxv\M} 12 } } */

Same here.  p?lxvx? I suppose.  Unless there is a reason plain lxv
would be bad here?

> >-/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxvx\M|\mlvx\M} 6 } } 
> >*/
> >+/* { dg-final { scan-assembler-times 
> >{\mlxvd2x\M|\mlxvx\M|\mlvx\M|\mplxv\M} 6 } } */

Maybe you even want

/* { dg-final { scan-assembler-times {\m(?:p?lxv|lvx\M)} 6 } } */

or such?

> >-/* { dg-final { scan-assembler-times 
> >{\mlxvw4x\M|\mlxvd2x\M|\mlxvx\M|\mlvx\M} 6 } } */
> >+/* { dg-final { scan-assembler-times 
> >{\mlxvw4x\M|\mlxvd2x\M|\mlxvx\M|\mlvx\M|\mplxv\M} 6 } } */

(Something caused these lines to be wrapped btw, please fix that in the
mail config -- not sure where it happened)

> >--- a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
> >+++ b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
> >@@ -30,5 +30,5 @@ vector signed long long splats4(void)
> >  
> >  /* { dg-final { scan-assembler-times {\mvspltis[bhw]\M} 0 } } */
> >  /* { dg-final { scan-assembler-times {\mvsl[bhwd]\M} 0 } } */
> >-/* { dg-final { scan-assembler-times {\mlvx\M|\mlxv\M|\mlxvd2x\M} 2 } } */
> >+/* { dg-final { scan-assembler-times 
> >{\mlvx\M|\mlxv\M|\mlxvd2x\M|\mplvx\M} 2 } } */
> 
> Oopsie.  I think you mean "plxv" for this one.

So why did testing not catch it?

Thanks for the review Bill!  :-)


Segher

Re: [patch][version 4]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-12 Thread Qing Zhao via Gcc-patches

Hi, Kees,

Thanks a lot for your testing on kernel testing cases. 

I have some question in below:

> On Jul 12, 2021, at 12:56 PM, Kees Cook  wrote:
> 
> On Wed, Jul 07, 2021 at 05:38:02PM +, Qing Zhao wrote:
>> Hi, 
>> 
>> This is the 4th version of the patch for the new security feature for GCC.
>> 
>> I have tested it with bootstrap on both x86 and aarch64, regression testing 
>> on both x86 and aarch64.
>> Also compile and run CPU2017, without any issue.
>> 
>> Please take a look and let me know your comments and suggestions.
> 
> Thanks for the update!
> 
> It looks like padding initialization has regressed to where things where
> in version 1[1] (it was, however, working in version 2[2]). I'm seeing
> these failures again in the kernel self-test:
> 
> test_stackinit: small_hole_static_all FAIL (uninit bytes: 3)
> test_stackinit: big_hole_static_all FAIL (uninit bytes: 61)
> test_stackinit: trailing_hole_static_all FAIL (uninit bytes: 7)
> test_stackinit: small_hole_dynamic_all FAIL (uninit bytes: 3)
> test_stackinit: big_hole_dynamic_all FAIL (uninit bytes: 61)
> test_stackinit: trailing_hole_dynamic_all FAIL (uninit bytes: 7)
 
Are the above failures for -ftrivial-auto-var-init=zero or 
-ftrivial-auto-var-init=pattern?  Or both?

For the current implementation, I believe that all paddings should be 
initialized with this option, 
for -ftrivial-auto-var-init=zero, the padding will be initialized to zero as 
before, however, for
-ftrivial-auto-var-init=pattern, the padding will be initialized to 0xFE 
byte-repeatable patterns.

> 
> In looking at the gcc test cases, I think the wrong thing is
> being checked: we want to verify the padding itself. For example,
> in auto-init-17.c, the actual bytes after "four" need to be checked,
> rather than "four" itself.

**For the current auto-init-17.c

  1 /* Verify zero initialization for array type with structure element with
  2padding.  */
  3 /* { dg-do compile } */
  4 /* { dg-options "-ftrivial-auto-var-init=zero" } */
  5 
  6 struct test_trailing_hole {
  7 int one;
  8 int two;
  9 int three;
 10 char four;
 11 /* "sizeof(unsigned long) - 1" byte padding hole here. */
 12 };
 13 
 14 
 15 int foo ()
 16 {
 17   struct test_trailing_hole var[10];
 18   return var[2].four;
 19 }
 20 
 21 /* { dg-final { scan-assembler "movl\t\\\$0," } } */
 22 /* { dg-final { scan-assembler "movl\t\\\$20," } } */
 23 /* { dg-final { scan-assembler "rep stosq" } } */
~  
**We have the assembly as: (-ftrivial-auto-var-init=zero)

.file   "auto-init-17.c"
.text
.globl  foo
.type   foo, @function
foo:
.LFB0:
.cfi_startproc
pushq   %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq%rsp, %rbp
.cfi_def_cfa_register 6
subq$40, %rsp
leaq-160(%rbp), %rax
movq%rax, %rsi
movl$0, %eax
movl$20, %edx
movq%rsi, %rdi
movq%rdx, %rcx
rep stosq
movzbl  -116(%rbp), %eax
movsbl  %al, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size   foo, .-foo
.section.note.GNU-stack,"",@progbits

From the above, we can see,  “zero” will be used to initialize 8 * 20 = 16 * 10 
bytes of memory starting from the beginning of “var”, that include all the 
padding holes inside
This array of structure. 

I didn’t see issue with padding initialization here.

Do I miss anything here?

For pattern initialization, since we currently initialize all paddings with 
0xFE byte-repeatable patterns, if the testing case still assumes zero 
initialization, then the testing cases
Need to be updated with this fact.

> For example, something like this:
> 
> struct test_trailing_hole {
>int one;
>int two;
>int three;
>char four;
>/* "sizeof(unsigned long) - 1" byte padding hole here. */
> };
> 
> #define offsetofend(STRUCT, MEMBER) \
>  (__builtin_offsetof(STRUCT, MEMBER) + sizeofSTRUCT *)0)->MEMBER)))
> 
> int foo ()
> { 
>  struct test_trailing_hole var[10];
>  unsigned char *ptr = (unsigned char *)&var[2];
>  int i;
> 
>  for (i = 0; i < sizeof(var[2]) - offsetofend(typeof(var[2]), four); i++) {
>if (ptr[i] != 0)
>  return 1;
>  } 
>  return 0;
> }
> 
> But this isn't actually sufficient because they may _accidentally_
> be zero already. The kernel tests specifically make sure to fill the
> about-to-be-used stack with 0xff before calling a function like foo()
> above.
> 
> (And as an aside, it seems like naming the test cases with some details
> about what is being tested in the filename would be nice -- it was
> a little weird having to dig through their numeric names to find the
> padding tests.)

Yes, I will fix the testing names to more reflect the testing details. 

thanks.

Qing
> 
> Otherwise, this looks like it's coming along; I remain very excited!
> Thank y

Re: Repost: [PATCH] Fix vec-splati-runnable.c test.

2021-07-12 Thread David Edelsohn via Gcc-patches

On Wed, Jul 7, 2021 at 4:01 PM Michael Meissner  wrote:
>
> [PATCH] Fix vec-splati-runnable.c test.
>
> I noticed that the vec-splati-runnable.c did not have an abort after one
> of the tests.  If the test was run with optimization, the optimizer could
> delete some of the tests and throw off the count.  However, due to the
> fact that the value being loaded in that test is undefined, I did not
> check what value was loaded, but I just stored it into a volatile global
> variable.
>
> 2021-07-07  Michael Meissner  
>
> gcc/testsuite/
> * gcc.target/powerpc/vec-splati-runnable.c: Run test with -O2
> optimization.  Do not check what XXSPLTIDP generates if the value
> is undefined.

Okay.

Thanks, David

Re: contracts library support (was Re: [PATCH] PING implement pre-c++20 contracts)

2021-07-12 Thread Jonathan Wakely via Gcc-patches

On Mon, 5 Jul 2021 at 20:07, Jason Merrill  wrote:
>
> On 6/26/21 10:23 AM, Andrew Sutton wrote:
> >
> > I ended up taking over this work from Jeff (CC'd on his existing email
> > address). I scraped all the contracts changes into one big patch
> > against master. See attached. The ChangeLog.contracts files list the
> > sum of changes for the patch, not the full history of the work.
>
> Jonathan, can you advise where the library support should go?
>
> In N4820  was part of the language-support clause, which makes
> sense, but it uses string_view, which brings in a lot of the rest of the
> library.  Did LWG talk about this when contracts went in?  How are
> freestanding implementations expected to support contracts?

I don't recall that being discussed, but I think I was in another room
for much of the contracts review.

If necessary we could make the std::char_traits specialization
available freestanding, without the primary template (or the other
specializations). But since C++20 std::string_view also depends on
quite a lot of ranges, which depends on iterators, which is not
freestanding. Some of those dependencies were added more recently than
contracts was reviewed and then yanked out, so maybe wasn't considered
a big problem back then. In any case, depending on std::string_view
(even without the rest of std::basic_string_view) is not currently
possible for freestanding.

> I imagine the header should be  for now.

Agreed.

Re: Repost: [PATCH] PR 100167: Fix vector long long multiply/divide tests on power10

2021-07-12 Thread David Edelsohn via Gcc-patches

On Mon, Jul 12, 2021 at 1:47 PM Bill Schmidt  wrote:
>
> On 7/12/21 12:16 PM, Michael Meissner wrote:
> > On Sun, Jul 11, 2021 at 02:55:04PM -0500, Bill Schmidt wrote:
> >> Hi Mike,
> >>
> >> On 7/7/21 3:04 PM, Michael Meissner wrote:
> >>> [PATCH] PR 100167: Fix vector long long multiply/divide tests on power10.
> >>>
> >>> This patch updates the vector long long multiply and divide tests to
> >>> supply the correct code information if power10 code generation is used.
> >>>
> >>> 2021-07-07  Michael Meissner  
> >>>
> >>> gcc/testsuite/
> >>> PR testsuite/100167
> >>> * gcc.target/powerpc/fold-vec-div-longlong.c:
> >> Missing information after colon.
> > Because all of the changes were the same thing, and the line is long 
> > enough.  I
> > just grouped all of the files together, and put the change line as the last
> > entry.
> But that's not accepted style.  Put it after the first one and use
> "Likewise" is the usual thing.  This looks like an omission.

Mike, that's a very creative idea, but let's stick to the precedent of
using Likewise or Same for repeated changes.

> >
> >>> * gcc.target/powerpc/fold-vec-mult-longlong.c: Fix expected code
> >>> generation on power10.
> >>> ---
> >>>   gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c  | 7 +--
> >>>   gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c | 3 ++-
> >>>   2 files changed, 7 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c 
> >>> b/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c
> >>> index 312e984d3cc..f6a9b290ae5 100644
> >>> --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c
> >>> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c
> >>> @@ -19,5 +19,8 @@ test6 (vector unsigned long long x, vector unsigned 
> >>> long long y)
> >>>   {
> >>> return vec_div (x, y);
> >>>   }
> >>> -/* { dg-final { scan-assembler-times {\mdivd\M} 2 } } */
> >>> -/* { dg-final { scan-assembler-times {\mdivdu\M} 2 } } */
> >>> +
> >>> +/* { dg-final { scan-assembler-times {\mdivd\M}   2 { target { ! 
> >>> has_arch_pwr10 } } } } */
> >>> +/* { dg-final { scan-assembler-times {\mdivdu\M}  2 { target { ! 
> >>> has_arch_pwr10 } } } } */
> >>> +/* { dg-final { scan-assembler-times {\mvdivsd\M} 1 { target {   
> >>> has_arch_pwr10 } } } } */
> >>> +/* { dg-final { scan-assembler-times {\mvdivud\M} 1 { target {   
> >>> has_arch_pwr10 } } } } */
> >>> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c 
> >>> b/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c
> >>> index 38dba9f5023..bd210e34801 100644
> >>> --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c
> >>> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c
> >>> @@ -20,5 +20,6 @@ test6 (vector unsigned long long x, vector unsigned 
> >>> long long y)
> >>> return vec_mul (x, y);
> >>>   }
> >>> -/* { dg-final { scan-assembler-times "\[ \t\]mulld " 4 { target lp64 } } 
> >>> } */
> >>> +/* { dg-final { scan-assembler-times {\mmulld\M}  4 { target { lp64 && { 
> >>> ! has_arch_pwr10 } } } } } */
> >>> +/* { dg-final { scan-assembler-times {\mvmulld\M} 2 { target { 
> >>> has_arch_pwr10 } } } } */
> >> Shouldn't this last be { lp64 && has_arch_pwr10 } ?
> > Nope.  Because the power10 vector multiply is done in the vector unit, it 
> > can
> > generate the vmulld instruction.
>
> Please document this, then.
>
> Thanks,
> Bill
>
> >
> >> Otherwise LGTM.  I can't approve, but recommend approval with those 
> >> changes.

Okay with a correct ChangeLog and a comment on the vmulld line.

Thanks, David

Re: [PATCH] c++: permit deduction guides at class scope [PR79501]

2021-07-12 Thread Jason Merrill via Gcc-patches


On 7/9/21 6:41 PM, Patrick Palka wrote:

On Fri, 9 Jul 2021, Jason Merrill wrote:


On 7/9/21 4:18 PM, Patrick Palka wrote:

On Fri, 9 Jul 2021, Patrick Palka wrote:


On Fri, 9 Jul 2021, Jason Merrill wrote:


On 7/9/21 3:18 PM, Patrick Palka wrote:

This adds support for declaring (class-scope) deduction guides for a
member class template.  Fortunately it seems only a couple of changes
are needed in order for the existing CTAD machinery to handle them
like
any other deduction guide: we need to make sure to give them a
FUNCTION_TYPE instead of a METHOD_TYPE, and we need to avoid using a
BASELINK when looking them up.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for
trunk?

PR c++/79501

gcc/cp/ChangeLog:

* decl.c (grokfndecl): Don't require that deduction guides are
declared at namespace scope.  Check that class-scope deduction
guides have the same access as the member class template.
(grokdeclarator): Pretend class-scope deduction guides are
static.
* name-lookup.c (lookup_qualified_name): Don't use a BASELINK
for class-scope deduction guides.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction92.C: New test.
* g++.dg/cpp1z/class-deduction93.C: New test.
* g++.dg/cpp1z/class-deduction94.C: New test.
---
gcc/cp/decl.c | 17 -
gcc/cp/name-lookup.c  | 11 +---
.../g++.dg/cpp1z/class-deduction92.C  | 16 
.../g++.dg/cpp1z/class-deduction93.C  | 25
+++
.../g++.dg/cpp1z/class-deduction94.C  | 19 ++
5 files changed, 79 insertions(+), 9 deletions(-)
create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction92.C
create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction93.C
create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction94.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index ebe1318d38d..8b8ffb7de83 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -10040,12 +10040,6 @@ grokfndecl (tree ctype,
if (deduction_guide_p (decl))
{
-  if (!DECL_NAMESPACE_SCOPE_P (decl))
-   {
- error_at (location, "deduction guide %qD must be declared at
"
-   "namespace scope", decl);
- return NULL_TREE;
-   }


Do we still reject deduction guides at function scope?


Yes, it looks like the parser doesn't even recognize them at function
scope:

template struct A;

int main() {
  A(int) -> A;
}

:4:4: error: missing template arguments before ‘(’ token
:4:5: error: expected primary-expression before ‘int’
:4:13: error: invalid use of ‘struct A’

Deduction guide templates are also still rejected (as with all templates
at
function scope).




  tree type = TREE_TYPE (DECL_NAME (decl));
  if (in_namespace == NULL_TREE
  && CP_DECL_CONTEXT (decl) != CP_TYPE_CONTEXT (type))
@@ -10055,6 +10049,13 @@ grokfndecl (tree ctype,
  inform (location_of (type), "  declared here");
  return NULL_TREE;
}
+  if (DECL_CLASS_SCOPE_P (decl)
+ && current_access_specifier != declared_access (TYPE_NAME
(type)))
+   {
+ error_at (location, "deduction guide %qD must have the same
access "
+ "as %qT", decl, type);
+ inform (location_of (type), "  declared here");
+   }
  if (funcdef_flag)
error_at (location,
  "deduction guide %qD must not have a function body",
decl);
@@ -12035,6 +12036,10 @@ grokdeclarator (const cp_declarator
*declarator,
  storage_class = declspecs->storage_class;
  if (storage_class == sc_static)
staticp = 1 + (decl_context == FIELD);
+  else if (decl_context == FIELD && sfk == sfk_deduction_guide)
+/* Treat class-scope deduction guides as static member functions
+   so that they get a FUNCTION_TYPE instead of a METHOD_TYPE.  */
+staticp = 2;
if (virtualp)
{
diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 1be5f3da6d5..089bca1d471 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -7110,9 +7110,14 @@ lookup_qualified_name (tree scope, tree name,
LOOK_want want, bool complain)
  else if (cxx_dialect != cxx98 && TREE_CODE (scope) ==
ENUMERAL_TYPE)
t = lookup_enumerator (scope, name);
  else if (is_class_type (scope, complain))
-t = lookup_member (scope, name, 2, bool (want & LOOK_want::TYPE),
-  tf_warning_or_error);
-
+{
+  t = lookup_member (scope, name, 2, bool (want &
LOOK_want::TYPE),
+tf_warning_or_error);
+  if (t && dguide_name_p (name))
+   /* Since class-scope deduction guides aren't really member
functions,
+  don't use a BASELINK for them.  */
+   t = MAYBE_BASELINK_FUNCTIONS (t);
+}


On second thought, this seems to be an awkward spot to do this
adjustment

Re: [PATCH V2] coroutines: Adjust outlined function names [PR95520].

2021-07-12 Thread Jason Merrill via Gcc-patches


On 7/11/21 9:03 AM, Iain Sandoe wrote:

Hi Jason,


On 9 Jul 2021, at 22:40, Jason Merrill  wrote:

On 7/9/21 2:18 PM, Iain Sandoe wrote:



How about handling this in write_encoding, along the lines of the 
devel/c++-contracts branch?


OK, so I took a look at this and implemented as below.


Oh, sorry, I didn't expect it to be such a large change!


  Some small differences from your contracts impl described here.

recalling

the original function becomes the ramp - it is called directly by the user-code.
the resumer (actor) contains the outlined code wrapped in synthesized logic as 
dictated by the std
the destroy function effectively calls the actor with a flag that says “take 
the DTOR path” (since the DTOR path has to be available in the case of resume 
too).

this means that is is possible for the actor to be partially (or completely for 
a generator-style coro) inlined into either the ramp or the destroyer.

1. using DECL_ABSTRACT_ORIGIN didn’t work with optimisation and debug since the 
inlining of the outlining confuses the issue (the actor/destory helpers are not 
real clones).


Hmm, I wonder if that will bite my use in contracts as well.  Can you 
elaborate?



  - there hasn’t been any specific reason to know “which” coroutine function 
was being lowered in the middle or back ends to date - so I had to add some 
book-keeping to allow that to be queried from write_encoding.

2. I had to cater for lambda coroutines; that meant recognising that we have a 
lambda coro helper and picking up the base mangling for the ramp (original 
lambda)

3. I made a minor adjustment to the string handling so that it can account for 
targets that don’t support ‘.’ or ‘$’ in symbols.


Speaking of which, I wonder if you also want to do something similar to what I 
did there to put the ramp/actor/destroyer functions into into the same comdat 
group.


I looked through your code and agree that it should be possible to be more 
restrictive about the interfaces presented by the actor and destroy functions 
in coros.  The ramp obviously has to keep the visiblity with which the user 
wrote it.

As for comdat groups, I’d need to look harder.

please could these things be TODOs - the fix for 95520 doesn’t make them any 
worse (or better), and there are other bugs that are higher priority.


Of course.


tested on x86_64-Linux,Darwin and powerpc64-linux, also with cppcoro (but i 
would plan to test it on folly too before pushing)

OK for master / backports?

=

The mechanism used to date for uniquing the coroutine helper
functions (actor, destroy) was over-complicating things and
leading to the noted PR and also difficulties in setting
breakpoints on these functions (so this will help PR99215 as
well).

This implementation delegates the adjustment to the mangling
to write_encoding() which necessitates some book-keeping so
that it is possible to determine which of the coroutine
helper names is to be mangled.

Signed-off-by: Iain Sandoe 

PR c++/95520 - [coroutines] __builtin_FUNCTION() returns mangled .actor instead 
of original function name

PR c++/95520

gcc/cp/ChangeLog:

* coroutines.cc (struct coroutine_info): Add fields for
actor and destroy function decls.
(to_ramp): New.
(coro_get_ramp_function): New.
(coro_get_actor_function): New.
(coro_get_destroy_function): New.
(act_des_fn): Set up mapping between ramp, actor and
destroy functions.
(morph_fn_to_coro): Adjust interface to the builder for
helper function decls.
* cp-tree.h (DECL_ACTOR_FN, DECL_DESTROY_FN, DECL_RAMP_FN
DECL_IS_CORO_ACTOR_P, DECL_IS_CORO_DESTROY_P JOIN_STR): New.
* mangle.c (write_encoding): Handle coroutine helpers.
(write_unqualified_name): Handle lambda coroutine helpers.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr95520.C: New test.
---
  gcc/cp/coroutines.cc  | 87 +++
  gcc/cp/cp-tree.h  | 30 
  gcc/cp/mangle.c   | 18 -
  gcc/testsuite/g++.dg/coroutines/pr95520.C | 29 
  4 files changed, 150 insertions(+), 14 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95520.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 54ffdc8d062..a75f55427cb 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -82,11 +82,13 @@ static bool coro_promise_type_found_p (tree, location_t);
  struct GTY((for_user)) coroutine_info
  {
tree function_decl; /* The original function decl.  */
-  tree promise_type; /* The cached promise type for this function.  */
-  tree handle_type;  /* The cached coroutine handle for this function.  */
-  tree self_h_proxy; /* A handle instance that is used as the proxy for the
-   one that will eventually be allocated in the coroutine
-   frame.  */
+  tree actor_decl;/* The synthesized actor function

Re: Repost: [PATCH] Change rs6000_const_f32_to_i32 return type.

2021-07-12 Thread David Edelsohn via Gcc-patches

On Mon, Jul 12, 2021 at 12:07 PM Bill Schmidt  wrote:
>
> Hi Mike,
>
> On 7/7/21 2:59 PM, Michael Meissner wrote:
> > [PATCH] Change rs6000_const_f32_to_i32 return type.
> >
> > The function rs6000_const_f32_to_i32 called REAL_VALUE_TO_TARGET_SINGLE
> > with a long long type and returns it.  This patch changes the type to long
> > which is the proper type for REAL_VALUE_TO_TARGET_SINGLE.
> >
> > 2021-07-07  Michael Meissner  
> >
> > gcc/
> >   * config/rs6000/rs6000-protos.h (rs6000_const_f32_to_i32): Change
> >   return type to long.
> >   * config/rs6000/rs6000.c (rs6000_const_f32_to_i32): Change return
> >   type to long.
> > ---
> >   gcc/config/rs6000/rs6000-protos.h | 2 +-
> >   gcc/config/rs6000/rs6000.c| 6 --
> >   2 files changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/gcc/config/rs6000/rs6000-protos.h 
> > b/gcc/config/rs6000/rs6000-protos.h
> > index 9de294d3b28..94bf961c6b7 100644
> > --- a/gcc/config/rs6000/rs6000-protos.h
> > +++ b/gcc/config/rs6000/rs6000-protos.h
> > @@ -281,7 +281,7 @@ extern void rs6000_asm_output_dwarf_pcrel (FILE *file, 
> > int size,
> >  const char *label);
> >   extern void rs6000_asm_output_dwarf_datarel (FILE *file, int size,
> >const char *label);
> > -extern long long rs6000_const_f32_to_i32 (rtx operand);
> > +extern long rs6000_const_f32_to_i32 (rtx operand);
> >
> >   /* Declare functions in rs6000-c.c */
> >
> > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> > index 9a5db63d0ef..de11de5e079 100644
> > --- a/gcc/config/rs6000/rs6000.c
> > +++ b/gcc/config/rs6000/rs6000.c
> > @@ -27936,10 +27936,12 @@ rs6000_invalid_conversion (const_tree fromtype, 
> > const_tree totype)
> > return NULL;
> >   }
> >
> > -long long
> > +/* Convert a SFmode constant to the integer bit pattern.  */
> > +
> > +long
> >   rs6000_const_f32_to_i32 (rtx operand)
> >   {
> > -  long long value;
> > +  long value;
> > const struct real_value *rv = CONST_DOUBLE_REAL_VALUE (operand);
> >
> > gcc_assert (GET_MODE (operand) == SFmode);
>
> These changes look OK.  Can you please also fix the expander for
> xxspltiw_v4sf, which incorrectly expects a long long?
>
> I can't approve, but recommend approval with that also fixed.

This is okay with the fix to xxspltiw_v4sf in altivec.md.  And please
update the ChangeLog appropriately.

Thanks, David

Re: Repost: [PATCH] Deal with prefixed loads/stores in tests, PR testsuite/100166

2021-07-12 Thread David Edelsohn via Gcc-patches

On Wed, Jul 7, 2021 at 4:03 PM Michael Meissner  wrote:
>
> [PATCH] Deal with prefixed loads/stores in tests, PR testsuite/100166
>
> This patch updates the various tests in the testsuite to treat plxv
> and pstxv as being vector loads/stores.  This shows up if you run the
> testsuite with a compiler configured with the option: --with-cpu=power10.
>
> I have verified that these tests now all pass when I build and test a compiler
> on a power10 system using --with-cpu=power10.  I have verified that they
> continue to run on power9 little endian and power8 big endian systems.
>
> Can I check this into the master branch?
>
> 2021-07-07  Michael Meissner  
>
> gcc/testsuite/
> PR testsuite/100166
> * 
> gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c:
> * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c:
> * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c:
> * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c:
> * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-int.c:
> * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-longlong.c:
> * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-short.c:
> * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c:
> * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c:
> * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c:
> * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-int.c:
> * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-longlong.c:
> * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-short.c:
> * gcc.target/powerpc/fold-vec-load-vec_xl-char.c:
> * gcc.target/powerpc/fold-vec-load-vec_xl-double.c:
> * gcc.target/powerpc/fold-vec-load-vec_xl-float.c:
> * gcc.target/powerpc/fold-vec-load-vec_xl-int.c:
> * gcc.target/powerpc/fold-vec-load-vec_xl-longlong.c:
> * gcc.target/powerpc/fold-vec-load-vec_xl-short.c:
> * gcc.target/powerpc/fold-vec-splat-floatdouble.c:
> * gcc.target/powerpc/fold-vec-splat-longlong.c:
> * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c:
> * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-double.c:
> * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-float.c:
> * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-int.c:
> * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-longlong.c:
> * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-short.c:
> * gcc.target/powerpc/fold-vec-store-vec_vsx_st-char.c:
> * gcc.target/powerpc/fold-vec-store-vec_vsx_st-double.c:
> * gcc.target/powerpc/fold-vec-store-vec_vsx_st-float.c:
> * gcc.target/powerpc/fold-vec-store-vec_vsx_st-int.c:
> * gcc.target/powerpc/fold-vec-store-vec_vsx_st-longlong.c:
> * gcc.target/powerpc/fold-vec-store-vec_vsx_st-short.c:
> * gcc.target/powerpc/fold-vec-store-vec_xst-char.c:
> * gcc.target/powerpc/fold-vec-store-vec_xst-double.c:
> * gcc.target/powerpc/fold-vec-store-vec_xst-float.c:
> * gcc.target/powerpc/fold-vec-store-vec_xst-int.c:
> * gcc.target/powerpc/fold-vec-store-vec_xst-longlong.c:
> * gcc.target/powerpc/fold-vec-store-vec_xst-short.c:
> * gcc.target/powerpc/lvsl-lvsr.c:
> * gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c:
> Update insn counts to account for power10 prefixed loads and
> stores.

This is okay, modulo the one plvx -> plxv typo mentioned by Bill.

Thanks, David

Ping [PATCH] PR fortran/101084 - [10/11/12 Regression] ICE in gfc_typenode_for_spec, at fortran/trans-types.c:1124

2021-07-12 Thread Harald Anlauf via Gcc-patches

*Ping*

> Gesendet: Dienstag, 15. Juni 2021 um 21:31 Uhr
> Von: "Harald Anlauf" 
> An: "fortran" , "gcc-patches" 
> Betreff: [PATCH] PR fortran/101084 - [10/11/12 Regression] ICE in 
> gfc_typenode_for_spec, at fortran/trans-types.c:1124
>
> A recent change to the checking of legacy FORMAT tags did not handle
> cases where the type is not set.  Adjust the check.
>
> Regtested on x86_64-pc-linux-gnu.
>
> OK for mainline / 11- / 10-branch?
>
> Thanks,
> Harald
>
>
> Fortran: reject FORMAT tag of unknown type.
>
> gcc/fortran/ChangeLog:
>
>   PR fortran/101084
>   * io.c (resolve_tag_format): Extend FORMAT check to unknown type.
>
> gcc/testsuite/ChangeLog:
>
>   PR fortran/101084
>   * gfortran.dg/fmt_nonchar_3.f90: New test.
>
>

[PATCH v3] IBM Z: Use @PLT symbols for local functions in 64-bit mode

2021-07-12 Thread Ilya Leoshkevich via Gcc-patches

Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?

v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573614.html
v1 -> v2: Do not use UNSPEC_PLT in 64-bit code and rename it to
  UNSPEC_PLT31 (Ulrich, Andreas).  Do not append @PLT only to
  weak symbols in non-PIC code (Ulrich).  Add TLS tests.

v2: https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574646.html
v2 -> v3: Use %K in function_profiler() and s390_output_mi_thunk(),
  add tests for these cases.



This helps with generating code for kernel hotpatches, which contain
individual functions and are loaded more than 2G away from vmlinux.
This should not create performance regressions for the normal use
cases, because for local functions ld replaces @PLT calls with direct
calls.

gcc/ChangeLog:

* config/s390/predicates.md (bras_sym_operand): Accept all
functions in 64-bit mode, use UNSPEC_PLT31.
(larl_operand): Use UNSPEC_PLT31.
* config/s390/s390.c (s390_loadrelative_operand_p): Likewise.
(legitimize_pic_address): Likewise.
(s390_emit_tls_call_insn): Mark __tls_get_offset as function,
use UNSPEC_PLT31.
(s390_delegitimize_address): Use UNSPEC_PLT31.
(s390_output_addr_const_extra): Likewise.
(print_operand): Add @PLT to TLS calls, handle %K.
(s390_function_profiler): Mark __fentry__/_mcount as function,
use %K, use UNSPEC_PLT31.
(s390_output_mi_thunk): Use only UNSPEC_GOT, use %K.
(s390_emit_call): Use UNSPEC_PLT31.
(s390_emit_tpf_eh_return): Mark __tpf_eh_return as function.
* config/s390/s390.md (UNSPEC_PLT31): Rename from UNSPEC_PLT.
(*movdi_64): Use %K.
(reload_base_64): Likewise.
(*sibcall_brc): Likewise.
(*sibcall_brcl): Likewise.
(*sibcall_value_brc): Likewise.
(*sibcall_value_brcl): Likewise.
(*bras): Likewise.
(*brasl): Likewise.
(*bras_r): Likewise.
(*brasl_r): Likewise.
(*bras_tls): Likewise.
(*brasl_tls): Likewise.
(main_base_64): Likewise.
(reload_base_64): Likewise.
(@split_stack_call): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/visibility/noPLT.C: Skip on s390x.
* g++.target/s390/mi-thunk.C: New test.
* gcc.target/s390/nodatarel-1.c: Move foostatic to the new
tests.
* gcc.target/s390/pr80080-4.c: Allow @PLT suffix.
* gcc.target/s390/risbg-ll-3.c: Likewise.
* gcc.target/s390/call.h: Common code for the new tests.
* gcc.target/s390/call-z10-pic-nodatarel.c: New test.
* gcc.target/s390/call-z10-pic.c: New test.
* gcc.target/s390/call-z10.c: New test.
* gcc.target/s390/call-z9-pic-nodatarel.c: New test.
* gcc.target/s390/call-z9-pic.c: New test.
* gcc.target/s390/call-z9.c: New test.
* gcc.target/s390/mfentry-m64-pic.c: New test.
* gcc.target/s390/tls.h: Common code for the new TLS tests.
* gcc.target/s390/tls-pic.c: New test.
* gcc.target/s390/tls.c: New test.
---
 gcc/config/s390/predicates.md |  9 ++-
 gcc/config/s390/s390.c| 81 +--
 gcc/config/s390/s390.md   | 32 
 gcc/testsuite/g++.dg/ext/visibility/noPLT.C   |  2 +-
 gcc/testsuite/g++.target/s390/mi-thunk.C  | 23 ++
 .../gcc.target/s390/call-z10-pic-nodatarel.c  | 20 +
 gcc/testsuite/gcc.target/s390/call-z10-pic.c  | 20 +
 gcc/testsuite/gcc.target/s390/call-z10.c  | 20 +
 .../gcc.target/s390/call-z9-pic-nodatarel.c   | 18 +
 gcc/testsuite/gcc.target/s390/call-z9-pic.c   | 18 +
 gcc/testsuite/gcc.target/s390/call-z9.c   | 20 +
 gcc/testsuite/gcc.target/s390/call.h  | 40 +
 .../gcc.target/s390/mfentry-m64-pic.c |  9 +++
 gcc/testsuite/gcc.target/s390/nodatarel-1.c   | 26 +-
 gcc/testsuite/gcc.target/s390/pr80080-4.c |  2 +-
 gcc/testsuite/gcc.target/s390/risbg-ll-3.c|  6 +-
 gcc/testsuite/gcc.target/s390/tls-pic.c   | 14 
 gcc/testsuite/gcc.target/s390/tls.c   | 10 +++
 gcc/testsuite/gcc.target/s390/tls.h   | 23 ++
 19 files changed, 320 insertions(+), 73 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/s390/mi-thunk.C
 create mode 100644 gcc/testsuite/gcc.target/s390/call-z10-pic-nodatarel.c
 create mode 100644 gcc/testsuite/gcc.target/s390/call-z10-pic.c
 create mode 100644 gcc/testsuite/gcc.target/s390/call-z10.c
 create mode 100644 gcc/testsuite/gcc.target/s390/call-z9-pic-nodatarel.c
 create mode 100644 gcc/testsuite/gcc.target/s390/call-z9-pic.c
 create mode 100644 gcc/testsuite/gcc.target/s390/call-z9.c
 create mode 100644 gcc/testsuite/gcc.target/s390/call.h
 create mode 100644 gcc/testsuite/gcc.target/s390/mfentry-m64-pic.c
 create mode 100644 gcc/testsuite/gcc.target/s390/tls-pic.c
 create mode 100644 gcc/testsuite/gcc.target/s390/tls.c
 create mode 1006

[PATCH] i386: Fix vec_set expanders [PR101424]

2021-07-12 Thread Uros Bizjak via Gcc-patches

AVX does not support 32-byte integer compares, required by
ix86_expand_vector_set_var.  The following patch fixes vec_set
expanders by introducing new vec_setm_avx2_operand predicate for AVX
vector modes.

gcc/

2021-07-12  Uroš Bizjak  

PR target/101424
* config/i386/predicates.md (vec_setm_sse41_operand):
Rename from vec_setm_operand.
(vec_setm_avx2_operand): New predicate.
* config/i386/sse.md (vec_set): Use V_128 mode iterator.
Use vec_setm_sse41_operand as operand 2 predicate.
(vec_set

PR target/101424
* gcc.target/i386/pr101424.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 986b758396a..0984f7cc44d 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -3604,7 +3604,7 @@ (define_insn "*pextrb_zext"
 (define_expand "vec_setv2hi"
   [(match_operand:V2HI 0 "register_operand")
(match_operand:HI 1 "register_operand")
-   (match_operand 2 "vec_setm_operand")]
+   (match_operand 2 "vec_setm_sse41_operand")]
   "TARGET_SSE2"
 {
   if (CONST_INT_P (operands[2]))
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 9488632ce24..6aa1ea32627 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1021,11 +1021,16 @@ (define_predicate "incdec_operand"
 })
 
 ;; True for registers, or const_int_operand, used to vec_setm expander.
-(define_predicate "vec_setm_operand"
+(define_predicate "vec_setm_sse41_operand"
   (ior (and (match_operand 0 "register_operand")
(match_test "TARGET_SSE4_1"))
(match_code "const_int")))
 
+(define_predicate "vec_setm_avx2_operand"
+  (ior (and (match_operand 0 "register_operand")
+   (match_test "TARGET_AVX2"))
+   (match_code "const_int")))
+
 (define_predicate "vec_setm_mmx_operand"
   (ior (and (match_operand 0 "register_operand")
(match_test "TARGET_SSE4_1")
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 17c9e571d5d..ab2023d 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -8486,9 +8486,9 @@ (define_insn "vec_setv2df_0"
(set_attr "mode" "DF")])
 
 (define_expand "vec_set"
-  [(match_operand:V 0 "register_operand")
+  [(match_operand:V_128 0 "register_operand")
(match_operand: 1 "register_operand")
-   (match_operand 2 "vec_setm_operand")]
+   (match_operand 2 "vec_setm_sse41_operand")]
   "TARGET_SSE"
 {
   if (CONST_INT_P (operands[2]))
@@ -8499,6 +8499,20 @@ (define_expand "vec_set"
   DONE;
 })
 
+(define_expand "vec_set"
+  [(match_operand:V_256_512 0 "register_operand")
+   (match_operand: 1 "register_operand")
+   (match_operand 2 "vec_setm_avx2_operand")]
+  "TARGET_AVX"
+{
+  if (CONST_INT_P (operands[2]))
+ix86_expand_vector_set (false, operands[0], operands[1],
+   INTVAL (operands[2]));
+  else
+ix86_expand_vector_set_var (operands[0], operands[1], operands[2]);
+  DONE;
+})
+
 (define_insn_and_split "*vec_extractv4sf_0"
   [(set (match_operand:SF 0 "nonimmediate_operand" "=v,m,f,r")
(vec_select:SF
diff --git a/gcc/testsuite/gcc.target/i386/pr101424.c 
b/gcc/testsuite/gcc.target/i386/pr101424.c
new file mode 100644
index 000..28bb7230e47
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101424.c
@@ -0,0 +1,15 @@
+/* PR target/101424 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx" } */
+
+typedef int v4df __attribute__((vector_size(32)));
+
+int foo_v4df_b, foo_v4df_c;
+
+v4df
+__attribute__foo_v4df ()
+{
+  v4df a;
+  a[foo_v4df_c] = foo_v4df_b;
+  return a;
+}

[COMMITTED] tree-optimization/101335 - Do not register a cast as an equivalence.

2021-07-12 Thread Andrew MacLeod via Gcc-patches

Registering an equivalence between objects of the same size in a cast 
can cause other registered relations to be incorrect. Detailed in the 
PR.  This was an older attempt to solve a problem which has since been 
resolved by recomputation in the GORI engine.


Bootstrapped on  x86_64-pc-linux-gnu with no regressions. Pushed.

Andrew

commit a1539b797a06e03b08e1f1de28ad0d19a3956616
Author: Andrew MacLeod 
Date:   Mon Jul 12 11:38:17 2021 -0400

Do not register a cast as an equivalence.

Registering an equivalence between objects of the same size in a cast can
cause other relations to be incorrect.

gcc/
PR tree-optimization/101335
* range-op.cc (operator_cast::lhs_op1_relation): Delete.

gcc/testsuite/
* gcc.dg/tree-ssa/pr101335.c: New.

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index f8e4c6d4e49..08000465fd9 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -2159,10 +2159,6 @@ public:
 			  const irange &lhs,
 			  const irange &op2,
 			  relation_kind rel = VREL_NONE) const;
-  virtual enum tree_code lhs_op1_relation (const irange &lhs,
-	   const irange &op1,
-	   const irange &op2) const;
-
 private:
   bool truncating_cast_p (const irange &inner, const irange &outer) const;
   bool inside_domain_p (const wide_int &min, const wide_int &max,
@@ -2171,27 +2167,6 @@ private:
 			   const irange &outer) const;
 } op_convert;
 
-// Determine if there is a relationship between LHS and OP1.
-
-enum tree_code
-operator_cast::lhs_op1_relation (const irange &lhs,
- const irange &op1,
- const irange &op2 ATTRIBUTE_UNUSED) const
-{
-  if (op1.undefined_p ())
-return VREL_NONE;
-  // We can't make larger types equivalent to smaller types because we can
-  // miss sign extensions in a chain of casts.
-  // u32 = 0xf
-  // s32 = (s32) u32
-  // s64 = (s64) s32
-  // we cant simply "convert" s64 = (s64)u32  or we get positive 0x
-  // value instead of sign extended negative value.
-  if (TYPE_PRECISION (lhs.type ()) == TYPE_PRECISION (op1.type ()))
-return EQ_EXPR;
-  return VREL_NONE;
-}
-
 // Return TRUE if casting from INNER to OUTER is a truncating cast.
 
 inline bool
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr101335.c b/gcc/testsuite/gcc.dg/tree-ssa/pr101335.c
new file mode 100644
index 000..921362c2954
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr101335.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+unsigned a = 0x;
+int b;
+int main()
+{
+  int c = ~a;
+  unsigned d = c - 10;
+  if (d > c)
+c = 20;
+  b = -(c | 0);
+  if (b > -8)
+__builtin_abort ();
+  return 0;
+}
+

Re: [PATCH] [wwwdocs] Update description of GM2 and document branch

2021-07-12 Thread Gerald Pfeifer

On Mon, 12 Jul 2021, Gaius Mulley wrote:
>> Usually I'd just say "subject", which is a header in our mail systems;
>> the term "subject line" isn't widely used.
> feel free to overrule and use "subject".  I copied the text from other
> branch descriptions :-) (there are 38 uses).  I guess there should be
> consistency on the web page - perhaps they could all be changed though -
> what do you think?

Ah, in that case suggestion withdrawn, and I'll separately see whether
we can simplify/unify anything there.

> thanks for the suggestions and maintaining the pages.  Below are the
> proposed updated patches

As maintainer, you don't need to seek approval for doc or web patches 
related to your area of maintainership, though of course I'm always 
happy to have a look.

> +is fully operational with the GCC 10 and GCC 11 (on

Here I'd omit "the", though I cannot (linguistically) explain why
and have to refer to established practice.

Cheers,
Gerald

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Koning, Paul via Gcc-patches

> On Jul 12, 2021, at 12:36 PM, David Malcolm via Gcc-patches 
>  wrote:
> 
> On Mon, 2021-07-12 at 15:25 +0200, Martin Liška wrote:
>> ...
> 
> I think the output formats we need to support are:
> - HTML
> - PDF
> - man page (hardly "modern", but still used)

Also info format (for the Emacs info reader).  And ebook formats (epub and/or 
mobi).  Having good quality ebook output is a major benefit in my view; it 
would be very good for the standard makefiles to offer make targets for these 
formats.

paul

Re: [patch][version 4]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-12 Thread Qing Zhao via Gcc-patches

Hi, Martin,

Thanks a lot for your experiments and examples, they are really helpful.

So, based on your study, I will delete the code that handle 
grp_to_be_debug_replaced accesses in generate_subtree_deferred_init.

Let me know if you have further comments on this.

Qing


> On Jul 12, 2021, at 12:06 PM, Martin Jambor  wrote:
> 
> Hi,
> 
> On Mon, Jul 12 2021, Qing Zhao wrote:
>>> On Jul 12, 2021, at 2:51 AM, Richard Sandiford  
>>> wrote:
>>> 
>>> Martin Jambor  writes:
 On Thu, Jul 08 2021, Qing Zhao wrote:
> (Resend this email since the previous one didn’t quote, I changed one
> setting in my mail client, hopefully that can fix this issue).
> 
> Hi, Martin,
> 
> Thank you for the review and comment.
> 
>> On Jul 8, 2021, at 8:29 AM, Martin Jambor  wrote:
>>> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
>>> index c05d22f3e8f1..35051d7c6b96 100644
>>> --- a/gcc/tree-sra.c
>>> +++ b/gcc/tree-sra.c
>>> @@ -384,6 +384,13 @@ static struct
>>> 
>>> /* Numbber of components created when splitting aggregate parameters.  
>>> */
>>> int param_reductions_created;
>>> +
>>> +  /* Number of deferred_init calls that are modified.  */
>>> +  int deferred_init;
>>> +
>>> +  /* Number of deferred_init calls that are created by
>>> + generate_subtree_deferred_init.  */
>>> +  int subtree_deferred_init;
>>> } sra_stats;
>>> 
>>> static void
>>> @@ -4096,6 +4103,110 @@ get_repl_default_def_ssa_name (struct access 
>>> *racc, tree reg_type)
>>> return get_or_create_ssa_default_def (cfun, racc->replacement_decl);
>>> }
>>> 
>>> +
>>> +/* Generate statements to call .DEFERRED_INIT to initialize scalar 
>>> replacements
>>> +   of accesses within a subtree ACCESS; all its children, siblings and 
>>> their
>>> +   children are to be processed.
>>> +   GSI is a statement iterator used to place the new statements.  */
>>> +static void
>>> +generate_subtree_deferred_init (struct access *access,
>>> +   tree init_type,
>>> +   tree is_vla,
>>> +   gimple_stmt_iterator *gsi,
>>> +   location_t loc)
>>> +{
>>> +  do
>>> +{
>>> +  if (access->grp_to_be_replaced)
>>> +   {
>>> + tree repl = get_access_replacement (access);
>>> + gimple *call
>>> +   = gimple_build_call_internal (IFN_DEFERRED_INIT, 3,
>>> + TYPE_SIZE_UNIT (TREE_TYPE 
>>> (repl)),
>>> + init_type, is_vla);
>>> + gimple_call_set_lhs (call, repl);
>>> + gsi_insert_before (gsi, call, GSI_SAME_STMT);
>>> + update_stmt (call);
>>> + gimple_set_location (call, loc);
>>> + sra_stats.subtree_deferred_init++;
>>> +   }
>>> +  else if (access->grp_to_be_debug_replaced)
>>> +   {
>>> + tree drepl = get_access_replacement (access);
>>> + tree call = build_call_expr_internal_loc
>>> +(UNKNOWN_LOCATION, IFN_DEFERRED_INIT,
>>> + TREE_TYPE (drepl), 3,
>>> + TYPE_SIZE_UNIT (TREE_TYPE (drepl)),
>>> + init_type, is_vla);
>>> + gdebug *ds = gimple_build_debug_bind (drepl, call,
>>> +   gsi_stmt (*gsi));
>>> + gsi_insert_before (gsi, ds, GSI_SAME_STMT);
>> 
>> Is handling of grp_to_be_debug_replaced accesses necessary here?  If so,
>> why?  grp_to_be_debug_replaced accesses are there only to facilitate
>> debug information about a part of an aggregate decl is that is likely
>> going to be entirely removed - so that debuggers can sometimes show to
>> users information about what they would contain had they not removed.
>> It seems strange you need to mark them as uninitialized because they
>> should not have any consumers.  (But perhaps it is also harmless.)
> 
> This part has been discussed during the 2nd version of the patch, but
> I think that more discussion might be necessary.
> 
> In the previous discussion, Richard Sandiford mentioned:
> (https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568620.html):
> 
> =
> 
> I guess the thing we need to decide here is whether 
> -ftrivial-auto-var-init
> should affect debug-only constructs too.  If it doesn't, exmaining removed
> components in a debugger might show uninitialised values in cases where
> the user was expecting initialised ones.  There would be no security
> concern, but it might be surprising.
> 
> I think in principle the DRHS can contain a call to DEFERRED_INIT.
> Doing that would probably require

Re: Repost: [PATCH] PR 100167: Fix vector long long multiply/divide tests on power10

2021-07-12 Thread Bill Schmidt via Gcc-patches


On 7/12/21 12:47 PM, Bill Schmidt via Gcc-patches wrote:


On 7/12/21 12:16 PM, Michael Meissner wrote:

On Sun, Jul 11, 2021 at 02:55:04PM -0500, Bill Schmidt wrote:

Hi Mike,

On 7/7/21 3:04 PM, Michael Meissner wrote:
[PATCH] PR 100167: Fix vector long long multiply/divide tests on 
power10.


This patch updates the vector long long multiply and divide tests to
supply the correct code information if power10 code generation is 
used.


2021-07-07  Michael Meissner  

gcc/testsuite/
PR testsuite/100167
* gcc.target/powerpc/fold-vec-div-longlong.c:

Missing information after colon.
Because all of the changes were the same thing, and the line is long 
enough.  I
just grouped all of the files together, and put the change line as 
the last

entry.
But that's not accepted style.  Put it after the first one and use 
"Likewise" is the usual thing.  This looks like an omission.



* gcc.target/powerpc/fold-vec-mult-longlong.c: Fix expected code
generation on power10.
---
  gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c | 7 +--
  gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c | 3 ++-
  2 files changed, 7 insertions(+), 3 deletions(-)

diff --git 
a/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c

index 312e984d3cc..f6a9b290ae5 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c
@@ -19,5 +19,8 @@ test6 (vector unsigned long long x, vector 
unsigned long long y)

  {
    return vec_div (x, y);
  }
-/* { dg-final { scan-assembler-times {\mdivd\M} 2 } } */
-/* { dg-final { scan-assembler-times {\mdivdu\M} 2 } } */
+
+/* { dg-final { scan-assembler-times {\mdivd\M}   2 { target { ! 
has_arch_pwr10 } } } } */
+/* { dg-final { scan-assembler-times {\mdivdu\M}  2 { target { ! 
has_arch_pwr10 } } } } */
+/* { dg-final { scan-assembler-times {\mvdivsd\M} 1 { target {   
has_arch_pwr10 } } } } */
+/* { dg-final { scan-assembler-times {\mvdivud\M} 1 { target {   
has_arch_pwr10 } } } } */
diff --git 
a/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c

index 38dba9f5023..bd210e34801 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c
@@ -20,5 +20,6 @@ test6 (vector unsigned long long x, vector 
unsigned long long y)

    return vec_mul (x, y);
  }
-/* { dg-final { scan-assembler-times "\[ \t\]mulld " 4 { target 
lp64 } } } */
+/* { dg-final { scan-assembler-times {\mmulld\M}  4 { target { 
lp64 && { ! has_arch_pwr10 } } } } } */
+/* { dg-final { scan-assembler-times {\mvmulld\M} 2 { target { 
has_arch_pwr10 } } } } */

Shouldn't this last be { lp64 && has_arch_pwr10 } ?
Nope.  Because the power10 vector multiply is done in the vector 
unit, it can

generate the vmulld instruction.


Please document this, then.

Well, never mind, that's relatively obvious, sorry. :-)
Bill


Thanks,
Bill



Otherwise LGTM.  I can't approve, but recommend approval with those 
changes.


Thanks,
Bill

PING [PATCH] handle sanitizer built-ins in -Wuninitialized (PR 101300)

2021-07-12 Thread Martin Sebor via Gcc-patches


Ping: https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574385.html

On 7/2/21 1:21 PM, Martin Sebor wrote:

To avoid a class of false negatives for sanitized code
-Wuninitialized recognizes the ASAN_MARK internal function
doesn't modify its argument.  But the warning code doesn't do
the same for any sanitizer built-ins even though they don't
modify user-supplied arguments either.  This leaves another
class of false negatives unresolved.

The attached fix enhances the warning logic to recognize all
sanitizer built-ins as well and treat them as non-modifying.

Tested on x86_64-linux.

Martin

Re: [patch][version 4]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-12 Thread Kees Cook via Gcc-patches

On Wed, Jul 07, 2021 at 05:38:02PM +, Qing Zhao wrote:
> Hi, 
> 
> This is the 4th version of the patch for the new security feature for GCC.
> 
> I have tested it with bootstrap on both x86 and aarch64, regression testing 
> on both x86 and aarch64.
> Also compile and run CPU2017, without any issue.
> 
> Please take a look and let me know your comments and suggestions.

Thanks for the update!

It looks like padding initialization has regressed to where things where
in version 1[1] (it was, however, working in version 2[2]). I'm seeing
these failures again in the kernel self-test:

test_stackinit: small_hole_static_all FAIL (uninit bytes: 3)
test_stackinit: big_hole_static_all FAIL (uninit bytes: 61)
test_stackinit: trailing_hole_static_all FAIL (uninit bytes: 7)
test_stackinit: small_hole_dynamic_all FAIL (uninit bytes: 3)
test_stackinit: big_hole_dynamic_all FAIL (uninit bytes: 61)
test_stackinit: trailing_hole_dynamic_all FAIL (uninit bytes: 7)

In looking at the gcc test cases, I think the wrong thing is
being checked: we want to verify the padding itself. For example,
in auto-init-17.c, the actual bytes after "four" need to be checked,
rather than "four" itself. For example, something like this:

struct test_trailing_hole {
int one;
int two;
int three;
char four;
/* "sizeof(unsigned long) - 1" byte padding hole here. */
};

#define offsetofend(STRUCT, MEMBER) \
  (__builtin_offsetof(STRUCT, MEMBER) + sizeofSTRUCT *)0)->MEMBER)))

int foo ()
{ 
  struct test_trailing_hole var[10];
  unsigned char *ptr = (unsigned char *)&var[2];
  int i;

  for (i = 0; i < sizeof(var[2]) - offsetofend(typeof(var[2]), four); i++) {
if (ptr[i] != 0)
  return 1;
  } 
  return 0;
}

But this isn't actually sufficient because they may _accidentally_
be zero already. The kernel tests specifically make sure to fill the
about-to-be-used stack with 0xff before calling a function like foo()
above.

(And as an aside, it seems like naming the test cases with some details
about what is being tested in the filename would be nice -- it was
a little weird having to dig through their numeric names to find the
padding tests.)

Otherwise, this looks like it's coming along; I remain very excited!
Thank you for sticking with it. :)

-Kees

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-February/565840.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2021-April/567754.html

-- 
Kees Cook

Re: [PATCH 10/10] vect: Reuse reduction accumulators between loops

2021-07-12 Thread Richard Sandiford via Gcc-patches

Richard Biener via Gcc-patches  writes:
> On Fri, Jul 9, 2021 at 3:12 PM Richard Sandiford
>  wrote:
>>
>> Thanks for the review.
>>
>> Richard Biener  writes:
>> >> @@ -588,6 +600,23 @@ public:
>> >>/* Unrolling factor  */
>> >>poly_uint64 vectorization_factor;
>> >>
>> >> +  /* If this loop is an epilogue loop whose main loop can be skipped,
>> >> + MAIN_LOOP_EDGE is the edge from the main loop to this loop's
>> >> + preheader.  SKIP_MAIN_LOOP_EDGE is then the edge that skips the
>> >> + main loop and goes straight to this loop's preheader.
>> >> +
>> >> + Both fields are null otherwise.  */
>> >> +  edge main_loop_edge;
>> >> +  edge skip_main_loop_edge;
>> >> +
>> >> +  /* If this loop is an epilogue loop that might be skipped after 
>> >> executing
>> >> + the main loop, this edge is the one that skips the epilogue.  */
>> >> +  edge skip_this_loop_edge;
>> >> +
>> >> +  /* After vectorization, maps live-out SSA names to information about
>> >> + the reductions that generated them.  */
>> >> +  hash_map reusable_accumulators;
>> >
>> > Is that the LC PHI node defs or the definition inside of the loop?
>> > If the latter we could attach the info directly to its stmt-info?
>>
>> Ah, yeah, I should improve the comment there.  It's the vectoriser's
>> replacement for the original LC PHI node, i.e. the final scalar result
>> after the reduction has taken place.
>
> OK
>
>> >> @@ -1186,6 +1215,21 @@ public:
>> >>/* The vector type for performing the actual reduction.  */
>> >>tree reduc_vectype;
>> >>
>> >> +  /* If IS_REDUC_INFO is true and if the reduction is operating on N
>> >> + elements in parallel, this vector gives the initial values of these
>> >> + N elements.  */
>> >
>> > That's N scalar elements or N vector elements?  I suppose it's for
>> > SLP reductions (rather than SLP reduction chains) and never non-SLP
>> > reductions?
>>
>> Yeah, poor wording again, sorry.  I meant something closer to:
>>
>>   /* If IS_REDUC_INFO is true and if the vector code is performing
>>  N scalar reductions in parallel, this vector gives the initial
>>  scalar values of those N reductions.  */
>>
>> >> +  vec reduc_initial_values;
>> >> +
>> >> +  /* If IS_REDUC_INFO is true and if the reduction is operating on N
>> >> + elements in parallel, this vector gives the scalar result of each
>> >> + reduction.  */
>> >> +  vec reduc_scalar_results;
>>
>> Same change here.
>>
>> >> […]
>> >> diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
>> >> index 2909e8a0fc3..b7b0523e3c8 100644
>> >> --- a/gcc/tree-vect-loop-manip.c
>> >> +++ b/gcc/tree-vect-loop-manip.c
>> >> @@ -2457,6 +2457,31 @@ vect_update_epilogue_niters (loop_vec_info 
>> >> epilogue_vinfo,
>> >>return vect_determine_partial_vectors_and_peeling (epilogue_vinfo, 
>> >> true);
>> >>  }
>> >>
>> >> +/* LOOP_VINFO is an epilogue loop and MAIN_LOOP_VALUE is available on 
>> >> exit
>> >> +   from the corresponding main loop.  Return a value that is available in
>> >> +   LOOP_VINFO's preheader, using SKIP_VALUE if the main loop is skipped.
>> >> +   Passing a null SKIP_VALUE is equivalent to passing zero.  */
>> >> +
>> >> +tree
>> >> +vect_get_main_loop_result (loop_vec_info loop_vinfo, tree 
>> >> main_loop_value,
>> >> +  tree skip_value)
>> >> +{
>> >> +  if (!loop_vinfo->main_loop_edge)
>> >> +return main_loop_value;
>> >> +
>> >> +  if (!skip_value)
>> >> +skip_value = build_zero_cst (TREE_TYPE (main_loop_value));
>> >
>> > shouldn't that be the initial value?
>>
>> For the current use case, the above two conditions are never true.
>> I wrote it like this because I had a follow-on patch (which might
>> not go anywhere) that needed this function for 0-based IVs.
>>
>> Maybe that's a bad risk/reward trade-off though.  Not having to pass
>> zero makes things only slightly simpler for the follow-on patch,
>> and I guess could be dangerous in other cases.
>>
>> Perhaps in that case though I should change loop_vinfo->main_loop_edge
>> into a gcc_assert as well.
>
> Yeah, I think asserts (and comments in case it's because we don't handle
> some specific cases yet) are better than possibly wrong behavior.

OK.

>> >> +  tree phi_result = make_ssa_name (TREE_TYPE (main_loop_value));
>> >> +  basic_block bb = loop_vinfo->main_loop_edge->dest;
>> >> +  gphi *new_phi = create_phi_node (phi_result, bb);
>> >> +  add_phi_arg (new_phi, main_loop_value, loop_vinfo->main_loop_edge,
>> >> +  UNKNOWN_LOCATION);
>> >> +  add_phi_arg (new_phi, skip_value,
>> >> +  loop_vinfo->skip_main_loop_edge, UNKNOWN_LOCATION);
>> >> +  return phi_result;
>> >> +}
>> >> +
>> >>  /* Function vect_do_peeling.
>> >>
>> >> Input:
>> >> […]
>> >> @@ -4823,6 +4842,100 @@ info_for_reduction (vec_info *vinfo, 
>> >> stmt_vec_info stmt_info)
>> >>return stmt_info;
>> >>  }
>> >>
>> >> +/* PHI is a reduction in LOOP_VINFO that we are going to vectorize

[r12-2245 Regression] FAIL: g++.dg/vect/slp-pr87105.cc -std=c++2a scan-tree-dump-times slp2 "optimized: basic block part" 1 on Linux/x86_64

2021-07-12 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

c03cae4e066066278c8435c409829a9bf851e49f is the first bad commit
commit c03cae4e066066278c8435c409829a9bf851e49f
Author: Richard Biener 
Date:   Wed Jul 7 11:45:43 2021 +0200

Display the number of components BB vectorized

caused

FAIL: g++.dg/vect/slp-pr87105.cc  -std=c++14  scan-tree-dump-times slp2 
"optimized: basic block part" 1
FAIL: g++.dg/vect/slp-pr87105.cc  -std=c++17  scan-tree-dump-times slp2 
"optimized: basic block part" 1
FAIL: g++.dg/vect/slp-pr87105.cc  -std=c++2a  scan-tree-dump-times slp2 
"optimized: basic block part" 1

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-2245/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=g++.dg/vect/slp-pr87105.cc --target_board='unix{-m32\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=g++.dg/vect/slp-pr87105.cc --target_board='unix{-m64\ 
-march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)

Re: Repost: [PATCH] PR 100167: Fix vector long long multiply/divide tests on power10

2021-07-12 Thread Bill Schmidt via Gcc-patches




On 7/12/21 12:16 PM, Michael Meissner wrote:

On Sun, Jul 11, 2021 at 02:55:04PM -0500, Bill Schmidt wrote:

Hi Mike,

On 7/7/21 3:04 PM, Michael Meissner wrote:

[PATCH] PR 100167: Fix vector long long multiply/divide tests on power10.

This patch updates the vector long long multiply and divide tests to
supply the correct code information if power10 code generation is used.

2021-07-07  Michael Meissner  

gcc/testsuite/
PR testsuite/100167
* gcc.target/powerpc/fold-vec-div-longlong.c:

Missing information after colon.

Because all of the changes were the same thing, and the line is long enough.  I
just grouped all of the files together, and put the change line as the last
entry.
But that's not accepted style.  Put it after the first one and use 
"Likewise" is the usual thing.  This looks like an omission.



* gcc.target/powerpc/fold-vec-mult-longlong.c: Fix expected code
generation on power10.
---
  gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c  | 7 +--
  gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c | 3 ++-
  2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c
index 312e984d3cc..f6a9b290ae5 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c
@@ -19,5 +19,8 @@ test6 (vector unsigned long long x, vector unsigned long long 
y)
  {
return vec_div (x, y);
  }
-/* { dg-final { scan-assembler-times {\mdivd\M} 2 } } */
-/* { dg-final { scan-assembler-times {\mdivdu\M} 2 } } */
+
+/* { dg-final { scan-assembler-times {\mdivd\M}   2 { target { ! 
has_arch_pwr10 } } } } */
+/* { dg-final { scan-assembler-times {\mdivdu\M}  2 { target { ! 
has_arch_pwr10 } } } } */
+/* { dg-final { scan-assembler-times {\mvdivsd\M} 1 { target {   
has_arch_pwr10 } } } } */
+/* { dg-final { scan-assembler-times {\mvdivud\M} 1 { target {   
has_arch_pwr10 } } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c
index 38dba9f5023..bd210e34801 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c
@@ -20,5 +20,6 @@ test6 (vector unsigned long long x, vector unsigned long long 
y)
return vec_mul (x, y);
  }
-/* { dg-final { scan-assembler-times "\[ \t\]mulld " 4 { target lp64 } } } */
+/* { dg-final { scan-assembler-times {\mmulld\M}  4 { target { lp64 && { ! 
has_arch_pwr10 } } } } } */
+/* { dg-final { scan-assembler-times {\mvmulld\M} 2 { target { has_arch_pwr10  
   } } } } */

Shouldn't this last be { lp64 && has_arch_pwr10 } ?

Nope.  Because the power10 vector multiply is done in the vector unit, it can
generate the vmulld instruction.


Please document this, then.

Thanks,
Bill




Otherwise LGTM.  I can't approve, but recommend approval with those changes.

Thanks,
Bill

[committed] libstdc++: Constrain std::as_writable_bytes [PR101411]

2021-07-12 Thread Jonathan Wakely via Gcc-patches

The std::as_writable_bytes function should be constrained to only accept
writable spans. Currently it can be called but then gives an error in
the function body.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

PR libstdc++/101411
* include/std/span (as_writable_bytes): Add requires-clause.
* testsuite/23_containers/span/101411.cc: New test.

Tested powerpc64le-linux. Committed to trunk.

I'll backport this too.

commit 9d4393af9d2b37b78eb5b1f84f5d4da3a6f7fba6
Author: Jonathan Wakely 
Date:   Mon Jul 12 16:09:34 2021

libstdc++: Constrain std::as_writable_bytes [PR101411]

The std::as_writable_bytes function should be constrained to only accept
writable spans. Currently it can be called but then gives an error in
the function body.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

PR libstdc++/101411
* include/std/span (as_writable_bytes): Add requires-clause.
* testsuite/23_containers/span/101411.cc: New test.

diff --git a/libstdc++-v3/include/std/span b/libstdc++-v3/include/std/span
index 63f0a8f6279..21d8f6a43a6 100644
--- a/libstdc++-v3/include/std/span
+++ b/libstdc++-v3/include/std/span
@@ -425,6 +425,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
   template
+requires (!is_const_v<_Type>)
 inline
 span
diff --git a/libstdc++-v3/testsuite/23_containers/span/101411.cc 
b/libstdc++-v3/testsuite/23_containers/span/101411.cc
new file mode 100644
index 000..05bdd3badbd
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/span/101411.cc
@@ -0,0 +1,15 @@
+// { dg-options "-std=gnu++20" }
+// { dg-do compile { xfail c++20 } }
+#include 
+
+// PR libstdc++/101411
+
+void f(std::span s)
+{
+  std::as_writable_bytes(s); // { dg-error "no matching function" }
+}
+
+void f1(std::span s)
+{
+  std::as_writable_bytes(s); // { dg-error "no matching function" }
+}

Re: Repost: [PATCH] Fix vec-splati-runnable.c test.

2021-07-12 Thread Bill Schmidt via Gcc-patches


On 7/12/21 12:11 PM, Michael Meissner wrote:

On Mon, Jul 12, 2021 at 10:49:26AM -0500, Bill Schmidt wrote:

Hi Mike,

On 7/7/21 3:00 PM, Michael Meissner wrote:

[PATCH] Fix vec-splati-runnable.c test.

I noticed that the vec-splati-runnable.c did not have an abort after one
of the tests.  If the test was run with optimization, the optimizer could
delete some of the tests and throw off the count.  However, due to the
fact that the value being loaded in that test is undefined, I did not
check what value was loaded, but I just stored it into a volatile global
variable.

2021-07-07  Michael Meissner  

gcc/testsuite/
* gcc.target/powerpc/vec-splati-runnable.c: Run test with -O2
optimization.  Do not check what XXSPLTIDP generates if the value
is undefined.
---
  .../gcc.target/powerpc/vec-splati-runnable.c  | 29 ++-
  1 file changed, 9 insertions(+), 20 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
index e84ce77a21d..a135279b1d7 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
@@ -1,7 +1,7 @@
  /* { dg-do run { target { power10_hw } } } */
  /* { dg-do link { target { ! power10_hw } } } */
  /* { dg-require-effective-target power10_ok } */
-/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
+/* { dg-options "-mdejagnu-cpu=power10 -save-temps -O2" } */

Why did you restrict optimization here?  The tests should be run at
various opt levels if you don't specify this, right?

The test changes are otherwise okay, but I'd like to understand this first.

When doing tests with instruction counts, you always want to specify the
optimization level (with -O2 being the standard).  Otherwise, depending on the
other optimizations, the instruction counts might not line up.

In this particular case, because it didn't have an optimization flag, the
original code was compiled with -O0.  If you compile it with -O1, it deletes
the if statement with the empty then statement, and deletes the corresponding
XXSPLTIDP instruction.

Sorry, yes.  I thought this set of tests was run with all optimization 
flags, but it is not.  Objection withdrawn. :-)


I can't approve, but recommend approval as is.

Thanks,
Bill

Re: [PATCH] rs6000: Fix restored rs6000_long_double_type_size.

2021-07-12 Thread Segher Boessenkool

On Mon, Jun 28, 2021 at 02:19:03PM +0200, Martin Liška wrote:
> On 6/24/21 12:46 AM, Segher Boessenkool wrote:
> >>As mentioned in the "Fallout: save/restore target options in
> >>handle_optimize_attribute"
> >>thread, we need to support target option restore of
> >>rs6000_long_double_type_size == FLOAT_PRECISION_TFmode.
> >
> >I have no idea?  Could you explain please?
> 
> Sure. Few weeks ago, we started using cl_target_option_{save,restore} calls
> even for optimize attributes (and pragma). Motivation was that optimize 
> options
> can influence target options (and vice versa).
> 
> Doing that, FLOAT_PRECISION_TFmode must be accepted as a valid option value
> for rs6000_long_double_type_size.


> >>--- /dev/null
> >>+++ b/gcc/testsuite/gcc.target/powerpc/pragma-optimize.c
> >>@@ -0,0 +1,14 @@
> >>+/* { dg-do compile { target { powerpc*-*-linux* } } } */
> >
> >Why on Linux only?  That doesn't sound right.  Do you need some other
> >selector(s)?
> 
> Sorry, I copied the test-case.

Ugh.  Yes, the status quo is no good either :-(

> >>+/* { dg-options "-O2 -mlong-double-128 -mabi=ibmlongdouble" } */
> >>+
> >>+extern unsigned long int x;
> >>+extern float f (float);
> >>+extern __typeof (f) f_power8;
> >>+extern __typeof (f) f_power9;
> >>+extern __typeof (f) f __attribute__ ((ifunc ("f_ifunc")));
> >>+static __attribute__ ((optimize ("-fno-stack-protector"))) __typeof (f) *
> >
> >-fno-stack-protector is default.
> 
> Yes, but one needs an optimize attribute in order to trigger 
> cl_target_option_save/restore
> mechanism.

So it behaves differently if you select the default than if you do not
select anything?  That is wrong, no?

> >From 1632939853fbf193f72ace3d1024a137d549fef4 Mon Sep 17 00:00:00 2001
> From: Martin Liska 
> Date: Tue, 1 Jun 2021 15:39:14 +0200
> Subject: [PATCH] rs6000: Fix restored rs6000_long_double_type_size.

(No full stop at end of subject please)

Missing patch description here.  This should be suitable as commit
message when you eventually commit the patch.

Please send with that, as a separate mail, not as attachment to another
thread.

> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.c (rs6000_option_override_internal): When
>   a target option is restored, it can have
>   rs6000_long_double_type_size set to FLOAT_PRECISION_TFmode.

That does not say what changed?

> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -4185,6 +4185,8 @@ rs6000_option_override_internal (bool global_init_p)
>else
>   rs6000_long_double_type_size = default_long_double_size;
>  }
> +  else if (rs6000_long_double_type_size == FLOAT_PRECISION_TFmode)
> +; /* The option can be restored with cl_target_option_restore.  */
>else if (rs6000_long_double_type_size == 128)
>  rs6000_long_double_type_size = FLOAT_PRECISION_TFmode;
>else if (global_options_set.x_rs6000_ieeequad)

"The option can be restored" is more confusing than helpful.  *Will* be
restored by it, maybe?  Not that I understand what that means :-/

Does it make more sense to merge the 128 and FLOAT_PRECISION_TFmode
cases?

> diff --git a/gcc/testsuite/gcc.target/powerpc/pragma-optimize.c 
> b/gcc/testsuite/gcc.target/powerpc/pragma-optimize.c
> new file mode 100644
> index 000..2455fb57138
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pragma-optimize.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile { target { powerpc*-*-* } } } */

No target powerpc*-*-* in gcc.target/powerpc please.  This is enforced
for everything in there by powerpc.exp already.

Thanks,


Segher

Re: [PATCH] Port GCC documentation to Sphinx

2021-07-12 Thread Martin Sebor via Gcc-patches


On 6/29/21 4:09 AM, Martin Liška wrote:

On 6/28/21 5:33 PM, Joseph Myers wrote:

Are formatted manuals (HTML, PDF, man, info) corresponding to this patch
version also available for review?


I've just uploaded them here:
https://splichal.eu/gccsphinx-final/

Martin



I think listing the -Wfoo and -Wno-foo (and analogously the -fbar
and -fno-bar) options is an improvement over prior revisions but when
the positive form is the default the text reads funny.  For example:

  -fno-inline

  Do not expand any functions inline apart from those marked
  with the always_inline attribute. This is the default when
  not optimizing.

  Single functions can be exempted from inlining by marking
  them with the noinline attribute.

  -finline

  Default option value for -fno-inline.


I.e., -finline is not what I would describe as a default value for
-fno-inline.

I would suggest to drop the option name from the text describing
the default, and also replace value with setting (it's really not
a value).  It could be as simple as:

  -finline

  Default setting.

Alternatively, to preserve the connection to the alternate setting:

  -finline

  Default setting; overrides -fno-inline.

At some point we talked about also making attributes hyperlinks
(like always_inline and noinline above) but I don't remember
the conclusion.  Are you planning to do that?  (Would handling
it as part of the transition be easier than doing it later?)

Martin

Re: Repost: [PATCH] PR 100167: Fix vector long long multiply/divide tests on power10

2021-07-12 Thread Michael Meissner via Gcc-patches

On Sun, Jul 11, 2021 at 02:55:04PM -0500, Bill Schmidt wrote:
> Hi Mike,
> 
> On 7/7/21 3:04 PM, Michael Meissner wrote:
> >[PATCH] PR 100167: Fix vector long long multiply/divide tests on power10.
> >
> >This patch updates the vector long long multiply and divide tests to
> >supply the correct code information if power10 code generation is used.
> >
> >2021-07-07  Michael Meissner  
> >
> >gcc/testsuite/
> > PR testsuite/100167
> > * gcc.target/powerpc/fold-vec-div-longlong.c:
> Missing information after colon.

Because all of the changes were the same thing, and the line is long enough.  I
just grouped all of the files together, and put the change line as the last
entry.

> > * gcc.target/powerpc/fold-vec-mult-longlong.c: Fix expected code
> > generation on power10.
> >---
> >  gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c  | 7 +--
> >  gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c | 3 ++-
> >  2 files changed, 7 insertions(+), 3 deletions(-)
> >
> >diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c 
> >b/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c
> >index 312e984d3cc..f6a9b290ae5 100644
> >--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c
> >+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c
> >@@ -19,5 +19,8 @@ test6 (vector unsigned long long x, vector unsigned long 
> >long y)
> >  {
> >return vec_div (x, y);
> >  }
> >-/* { dg-final { scan-assembler-times {\mdivd\M} 2 } } */
> >-/* { dg-final { scan-assembler-times {\mdivdu\M} 2 } } */
> >+
> >+/* { dg-final { scan-assembler-times {\mdivd\M}   2 { target { ! 
> >has_arch_pwr10 } } } } */
> >+/* { dg-final { scan-assembler-times {\mdivdu\M}  2 { target { ! 
> >has_arch_pwr10 } } } } */
> >+/* { dg-final { scan-assembler-times {\mvdivsd\M} 1 { target {   
> >has_arch_pwr10 } } } } */
> >+/* { dg-final { scan-assembler-times {\mvdivud\M} 1 { target {   
> >has_arch_pwr10 } } } } */
> >diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c 
> >b/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c
> >index 38dba9f5023..bd210e34801 100644
> >--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c
> >+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c
> >@@ -20,5 +20,6 @@ test6 (vector unsigned long long x, vector unsigned long 
> >long y)
> >return vec_mul (x, y);
> >  }
> >-/* { dg-final { scan-assembler-times "\[ \t\]mulld " 4 { target lp64 } } } 
> >*/
> >+/* { dg-final { scan-assembler-times {\mmulld\M}  4 { target { lp64 && { ! 
> >has_arch_pwr10 } } } } } */
> >+/* { dg-final { scan-assembler-times {\mvmulld\M} 2 { target { 
> >has_arch_pwr10 } } } } */
> 
> Shouldn't this last be { lp64 && has_arch_pwr10 } ?

Nope.  Because the power10 vector multiply is done in the vector unit, it can
generate the vmulld instruction.

> Otherwise LGTM.  I can't approve, but recommend approval with those changes.
> 
> Thanks,
> Bill
> 

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Eli Zaretskii via Gcc-patches

> From: Jonathan Wakely 
> Date: Mon, 12 Jul 2021 15:54:49 +0100
> Cc: Martin Liška , 
>   "g...@gcc.gnu.org" , gcc-patches 
> , 
>   "Joseph S. Myers" 
> 
> You like texinfo. We get it.

Would you please drop the attitude?

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Eli Zaretskii via Gcc-patches

> Cc: h...@bitrange.com, g...@gcc.gnu.org, gcc-patches@gcc.gnu.org,
>  jos...@codesourcery.com
> From: Martin Liška 
> Date: Mon, 12 Jul 2021 16:37:00 +0200
> 
> >   4) The need to learn yet another markup language.
> >  While this is not a problem for simple text, it does require a
> >  serious study of RST and Sphinx to use the more advanced features.
> 
> No, majority of the documentation is pretty simple: basic formatting, links, 
> tables and
> code examples.

We also have documentation of APIs (a.k.a. "functions").  I actually
tried to find in the Sphinx docs how to do that and got lost.  So, not
really "very simple".

Re: Repost: [PATCH] Fix vec-splati-runnable.c test.

2021-07-12 Thread Michael Meissner via Gcc-patches

On Mon, Jul 12, 2021 at 10:49:26AM -0500, Bill Schmidt wrote:
> Hi Mike,
> 
> On 7/7/21 3:00 PM, Michael Meissner wrote:
> >[PATCH] Fix vec-splati-runnable.c test.
> >
> >I noticed that the vec-splati-runnable.c did not have an abort after one
> >of the tests.  If the test was run with optimization, the optimizer could
> >delete some of the tests and throw off the count.  However, due to the
> >fact that the value being loaded in that test is undefined, I did not
> >check what value was loaded, but I just stored it into a volatile global
> >variable.
> >
> >2021-07-07  Michael Meissner  
> >
> >gcc/testsuite/
> > * gcc.target/powerpc/vec-splati-runnable.c: Run test with -O2
> > optimization.  Do not check what XXSPLTIDP generates if the value
> > is undefined.
> >---
> >  .../gcc.target/powerpc/vec-splati-runnable.c  | 29 ++-
> >  1 file changed, 9 insertions(+), 20 deletions(-)
> >
> >diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c 
> >b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
> >index e84ce77a21d..a135279b1d7 100644
> >--- a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
> >+++ b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
> >@@ -1,7 +1,7 @@
> >  /* { dg-do run { target { power10_hw } } } */
> >  /* { dg-do link { target { ! power10_hw } } } */
> >  /* { dg-require-effective-target power10_ok } */
> >-/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
> >+/* { dg-options "-mdejagnu-cpu=power10 -save-temps -O2" } */
> 
> Why did you restrict optimization here?  The tests should be run at
> various opt levels if you don't specify this, right?
> 
> The test changes are otherwise okay, but I'd like to understand this first.

When doing tests with instruction counts, you always want to specify the
optimization level (with -O2 being the standard).  Otherwise, depending on the
other optimizations, the instruction counts might not line up.

In this particular case, because it didn't have an optimization flag, the
original code was compiled with -O0.  If you compile it with -O1, it deletes
the if statement with the empty then statement, and deletes the corresponding
XXSPLTIDP instruction.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Eli Zaretskii via Gcc-patches

> Cc: g...@gcc.gnu.org, gcc-patches@gcc.gnu.org, jos...@codesourcery.com
> From: Martin Liška 
> Date: Mon, 12 Jul 2021 16:34:11 +0200
> 
> > "Texinfo must go" is one possible conclusion from your description.
> > But it isn't the only one.  An alternative is "the Texinfo source of
> > the GCC manual must be improved to fix this problem."  And yes, this
> > problem does have a solution in Texinfo.
> 
> No, the alternative is more powerful output given by Texinfo, in particular
> more modern HTML pages.

Please see the response by Gavin: it sounds like at least some of that
was resolved in Texinfo, sometimes long ago.

Re: [patch][version 4]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-12 Thread Martin Jambor

Hi,

On Mon, Jul 12 2021, Qing Zhao wrote:
>> On Jul 12, 2021, at 2:51 AM, Richard Sandiford  
>> wrote:
>>
>> Martin Jambor  writes:
>>> On Thu, Jul 08 2021, Qing Zhao wrote:
 (Resend this email since the previous one didn’t quote, I changed one
 setting in my mail client, hopefully that can fix this issue).

 Hi, Martin,

 Thank you for the review and comment.

> On Jul 8, 2021, at 8:29 AM, Martin Jambor  wrote:
>> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
>> index c05d22f3e8f1..35051d7c6b96 100644
>> --- a/gcc/tree-sra.c
>> +++ b/gcc/tree-sra.c
>> @@ -384,6 +384,13 @@ static struct
>>
>>  /* Numbber of components created when splitting aggregate parameters.  
>> */
>>  int param_reductions_created;
>> +
>> +  /* Number of deferred_init calls that are modified.  */
>> +  int deferred_init;
>> +
>> +  /* Number of deferred_init calls that are created by
>> + generate_subtree_deferred_init.  */
>> +  int subtree_deferred_init;
>> } sra_stats;
>>
>> static void
>> @@ -4096,6 +4103,110 @@ get_repl_default_def_ssa_name (struct access 
>> *racc, tree reg_type)
>>  return get_or_create_ssa_default_def (cfun, racc->replacement_decl);
>> }
>>
>> +
>> +/* Generate statements to call .DEFERRED_INIT to initialize scalar 
>> replacements
>> +   of accesses within a subtree ACCESS; all its children, siblings and 
>> their
>> +   children are to be processed.
>> +   GSI is a statement iterator used to place the new statements.  */
>> +static void
>> +generate_subtree_deferred_init (struct access *access,
>> +tree init_type,
>> +tree is_vla,
>> +gimple_stmt_iterator *gsi,
>> +location_t loc)
>> +{
>> +  do
>> +{
>> +  if (access->grp_to_be_replaced)
>> +{
>> +  tree repl = get_access_replacement (access);
>> +  gimple *call
>> += gimple_build_call_internal (IFN_DEFERRED_INIT, 3,
>> +  TYPE_SIZE_UNIT (TREE_TYPE 
>> (repl)),
>> +  init_type, is_vla);
>> +  gimple_call_set_lhs (call, repl);
>> +  gsi_insert_before (gsi, call, GSI_SAME_STMT);
>> +  update_stmt (call);
>> +  gimple_set_location (call, loc);
>> +  sra_stats.subtree_deferred_init++;
>> +}
>> +  else if (access->grp_to_be_debug_replaced)
>> +{
>> +  tree drepl = get_access_replacement (access);
>> +  tree call = build_call_expr_internal_loc
>> + (UNKNOWN_LOCATION, IFN_DEFERRED_INIT,
>> +  TREE_TYPE (drepl), 3,
>> +  TYPE_SIZE_UNIT (TREE_TYPE (drepl)),
>> +  init_type, is_vla);
>> +  gdebug *ds = gimple_build_debug_bind (drepl, call,
>> +gsi_stmt (*gsi));
>> +  gsi_insert_before (gsi, ds, GSI_SAME_STMT);
>
> Is handling of grp_to_be_debug_replaced accesses necessary here?  If so,
> why?  grp_to_be_debug_replaced accesses are there only to facilitate
> debug information about a part of an aggregate decl is that is likely
> going to be entirely removed - so that debuggers can sometimes show to
> users information about what they would contain had they not removed.
> It seems strange you need to mark them as uninitialized because they
> should not have any consumers.  (But perhaps it is also harmless.)

 This part has been discussed during the 2nd version of the patch, but
 I think that more discussion might be necessary.

 In the previous discussion, Richard Sandiford mentioned:
 (https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568620.html):

 =

 I guess the thing we need to decide here is whether -ftrivial-auto-var-init
 should affect debug-only constructs too.  If it doesn't, exmaining removed
 components in a debugger might show uninitialised values in cases where
 the user was expecting initialised ones.  There would be no security
 concern, but it might be surprising.

 I think in principle the DRHS can contain a call to DEFERRED_INIT.
 Doing that would probably require further handling elsewhere though.

 =

 I am still not very confident now for this part of the change.
>>>
>>> I see.  I still tend to think that with or without the generation of
>>> gimple_build_debug_binds, the debugger would still not display any value
>>> for the component in question.  Without it there would be no information
>>> about the component at a any place in code affected by this, with it the
>

Re: [PATCH] rs6000: Fix restored rs6000_long_double_type_size.

2021-07-12 Thread Segher Boessenkool

On Mon, Jul 12, 2021 at 06:19:28AM +0200, Martin Liška wrote:
> PING^1

I did not notice you attached a new patch.  It works a lot better if
every patch series is a new thread.


Segher

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread David Malcolm via Gcc-patches

On Mon, 2021-07-12 at 15:25 +0200, Martin Liška wrote:
> Hello.
> 
> Let's make it a separate sub-thread where we can discuss motivation
> why
> do I want moving to Sphinx format.
> 
> Benefits:
> 1) modern looking HTML output (before: [1], after: [2]):

"modern looking" is rather subjective; I'd rate Sphinx's output as
looking like it's from 2010s (last decade), whereas Texinfos' looks
like it's from the 1990s.  In theory this ought not to matter, but
every time I look at our documentation it gives me a depressing
feeling, reminiscent of a graveyard, that discourages me from fixing
things.

>     a) syntax highlighting for examples (code, shell commands, etc.)

...with support for multiple programming languages, potentially on the
same page.  For example, in the libgccjit docs:
  https://gcc.gnu.org/onlinedocs/jit/intro/tutorial02.html
we can have a mixture of C, assembler and shell on one page, and each
example is syntax-highlighted accordingly.  It's not clear to me how to
do that in texinfo, since there needs to be a way to express what
language an example is in.

>     b) precise anchors, the current Texinfo anchors are not displayed
> (start with first line of an option)

...and the URLs are sane and stable (so e.g. there is a reliable,
guessable, readable URL for the docs for say, "-Wall").

>     c) one can easily copy a link to an anchor (displayed as ¶)
>     d) internal links are working, e.g. one can easily jump from
> listing of options
>     e) left menu navigation provides better orientation in the manual
>     f) Sphinx provides internal search capability: [3]

...also (quoting myself in places here from 2015
  https://gcc.gnu.org/pipermail/gcc-patches/2015-November/434055.html 
):

* the ability to include fragments of files: libgccjit's documentation
uses directives to include code from the test suite, so that all of the
code examples are also part of the test suite, and are thus known to
compile), allowing for (almost) literate programming.  [That said, the
build of libgccjit's docs on gcc.gnu.org seems to be missing those
fragments; I wonder if there's a path or version issue?]

* a page-splitting structure that make sense, to me, at least (I have
never fathomed the way texinfo's navigation works, for HTML, at least,
and I believe I'm not the only one; I generally pick the all-in-one-
HTML-page option when viewing texinfo-html docs and do textual
searches, since otherwise I usually can't find the thing I'm looking
for (or have to resort to a brute-force depth-first search of clicking
through the links).)

* much more use of markup, with restrained and well-chosen CSS
(texinfo's HTML seems to ignore much of the inline markup in
the .texinfo file)

> 2) internal links are also provided in PDF version of the manual
> 3) some existing GCC manuals are already written in Sphinx (GNAT
> manuals and libgccjit)
> 4) support for various output formats, some people are interested in
> ePUB format
> 5) Sphinx is using RST which is quite minimal semantic markup language

Sphinx is also used by many high-profile FLOSS projects (e.g. the Linux
kernel, LLVM, and the Python community), so it reduces the barrier to
entry for new contributors, relative to texinfo.

> 6) TOC is automatically generated - no need for manual navigation
> like seen here: [5]
> 
> Disadvantages:
> 
> 1) info pages are currently missing Page description in TOC
> 2) rich formatting is leading to extra wrapping in info output -
> beings partially addresses in [4]
> 3) one needs e.g. Emacs support for inline links (rendered as notes)
> 
> I'm willing to address issue 1) in next weeks and I tend to skip
> emission of links as mentioned in 3).
> Generally speaking, I'm aware that some people still use Info, but I
> think we should more focus
> on more modern documentation formats. That's HTML (and partially
> PDF).

I think the output formats we need to support are:
- HTML
- PDF
- man page (hardly "modern", but still used)

I regared "info" as merely "nice to have" - I don't know anyone who
uses it other than some core GNU contributors.

Dave

> 
> Martin
> 
> [1]
> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fstrict-aliasing
> [2]
> https://splichal.eu/gccsphinx-final/html/gcc/gcc-command-options/options-that-control-optimization.html#cmdoption-fstrict-aliasing
> [3]
> https://splichal.eu/gccsphinx-final/html/gcc/search.html?q=-fipa-icf&check_keywords=yes&area=default#
> [4] https://github.com/sphinx-doc/sphinx/pull/9391
> [5] @comment node-name, next,  previous, up
>  @node    Installing GCC, Binaries, , Top
>

Re: [RFA] Some libgcc headers are missing the runtime exception

2021-07-12 Thread Richard Sandiford via Gcc-patches

David Edelsohn  writes:
> On Mon, Jul 12, 2021 at 11:58 AM Richard Sandiford
>  wrote:
>>
>> David Edelsohn  writes:
>> > On Fri, Jul 9, 2021 at 1:31 PM Richard Sandiford
>> >  wrote:
>> >>
>> >> David Edelsohn  writes:
>> >> > On Fri, Jul 9, 2021 at 12:53 PM Richard Sandiford via Gcc
>> >> >  wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> It was pointed out to me off-list that config/aarch64/value-unwind.h
>> >> >> is missing the runtime exception.  It looks like a few other files
>> >> >> are too; a fuller list is:
>> >> >>
>> >> >> libgcc/config/aarch64/value-unwind.h
>> >> >> libgcc/config/frv/frv-abi.h
>> >> >> libgcc/config/i386/value-unwind.h
>> >> >> libgcc/config/pa/pa64-hpux-lib.h
>> >> >>
>> >> >> Certainly for the aarch64 file this was simply a mistake;
>> >> >> it seems to have been copied from the i386 version, both of which
>> >> >> reference the runtime exception but don't actually include it.
>> >> >>
>> >> >> What's the procedure for fixing this?  Can we treat it as a textual
>> >> >> error or do the files need to be formally relicensed?
>> >> >
>> >> > I'm unsure what you mean by "formally relicensed".
>> >>
>> >> It seemed like there were two possibilities: the licence of the files
>> >> is actually GPL + exception despite what the text says (the textual
>> >> error case), or the licence of the files is plain GPL because the text
>> >> has said so since the introduction of the files.  In the latter case
>> >> I'd have imagined that someone would need to relicense the code so
>> >> that it is GPL + exception.
>> >>
>> >> > It generally is considered a textual omission.  The runtime library
>> >> > components of GCC are intended to be licensed under the runtime
>> >> > exception, which was granted and approved at the time of introduction.
>> >>
>> >> OK, thanks.  So would a patch to fix at least the i386 and aarch64 header
>> >> files be acceptable?  (I'm happy to fix the other two as well if that's
>> >> definitely the right thing to do.  It's just that there's more history
>> >> involved there…)
>> >
>> > Please correct the text in the files. The files in libgcc used in the
>> > GCC runtime are intended to be licensed with the runtime exception and
>> > GCC previously was granted approval for that licensing and purpose.
>> >
>> > As you are asking the question, I sincerely doubt that ARM and Cavium
>> > intended to apply a license without the exception to those files.  And
>> > similarly for Intel and FRV.
>>
>> FTR, I think only Linaro (rather than Arm) touched the aarch64 file.
>>
>> > The runtime exception explicitly was intended for this purpose and
>> > usage at the time that GCC received approval to apply the exception.
>>
>> Ack.  Is the patch below OK for trunk and branches?
>
> I'm not certain whom you are asking for approval,

I was assuming it would need a global reviewer.

> but it looks good to me.

Thanks.

> It would be nice to add SPDX License Identifier at the top of the
> files as well, but that's not required.

Yeah, I agree that might a good thing to have, but TBH I try to keep
my involvement with licensing stuff to the bare minimum :-)

Richard

Re: disable -Warray-bounds in libgo (PR 101374)

2021-07-12 Thread Martin Sebor via Gcc-patches


On 7/9/21 7:26 AM, Rainer Orth wrote:

Hi Martin,


Yesterday's enhancement to -Warray-bounds has exposed a couple of
issues in libgo where the code writes into an invalid constant
address that the warning is designed to flag.

On the assumption that those invalid addresses are deliberate,
the attached patch suppresses these instances by using #pragma
GCC diagnostic but I don't think I'm supposed to commit it (at
least Git won't let me).  To avoid Go bootstrap failures please
either apply the patch or otherwise suppress the warning (e.g.,
by using a volatile pointer temporary).


while this patch does fix the libgo bootstrap failure, Go is completely
broken: almost 1000 go.test failures and all libgo tests FAIL as well.
Seen on both i386-pc-solaris2.11 and sparc-sun-solaris2.11.


FWIW, I see exactly the same failures on x86_64-pc-linux-gnu, so nothing
Solaris-specific here.


I don't normally test Go because of PR 91992, but I see just
the three test failures below on x86_64-linux with the latest trunk:

FAIL: go.test/test/fixedbugs/issue10441.go   -O  (test for excess errors)
FAIL: ./index0-out.go execution,  -O0 -g -fno-var-tracking-assignments
FAIL: runtime/pprof

The excess errors don't look related to my changes:

FAIL: go.test/test/fixedbugs/issue10441.go   -O  (test for excess errors)
Excess errors:
/usr/bin/ld: 
/ssd/test/build/gcc-trunk/x86_64-pc-linux-gnu/./libgo/.libs/libgo.so: 
undefined reference to `__go_init_main'
/usr/bin/ld: 
/ssd/test/build/gcc-trunk/x86_64-pc-linux-gnu/./libgo/.libs/libgo.so: 
undefined reference to `main.main'


My libgo.log shows the FAILs below.  I don't know how to interpret
that but nothing in the file suggests that my change is the cause
of these failures

--- FAIL: ExampleFrames (0.00s)
FAIL
FAIL: runtime
--- FAIL: TestConvertCPUProfile (0.00s)
--- FAIL: TestConvertMemProfile (0.00s)
--- FAIL: TestConvertMemProfile/heap (0.00s)
--- FAIL: TestConvertMemProfile/allocs (0.00s)
FAIL
FAIL: runtime/pprof
--- FAIL: ExampleFrames (0.00s)
FAIL
FAIL: runtime
--- FAIL: TestDurationSeconds (0.00s)
--- FAIL: ExampleParseDuration (0.00s)
FAIL
FAIL: time

If you see different failures in your build that look like they
might be caused by them then please show what those are.

Martin

Re: [RFA] Some libgcc headers are missing the runtime exception

2021-07-12 Thread David Edelsohn via Gcc-patches

On Mon, Jul 12, 2021 at 11:58 AM Richard Sandiford
 wrote:
>
> David Edelsohn  writes:
> > On Fri, Jul 9, 2021 at 1:31 PM Richard Sandiford
> >  wrote:
> >>
> >> David Edelsohn  writes:
> >> > On Fri, Jul 9, 2021 at 12:53 PM Richard Sandiford via Gcc
> >> >  wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> It was pointed out to me off-list that config/aarch64/value-unwind.h
> >> >> is missing the runtime exception.  It looks like a few other files
> >> >> are too; a fuller list is:
> >> >>
> >> >> libgcc/config/aarch64/value-unwind.h
> >> >> libgcc/config/frv/frv-abi.h
> >> >> libgcc/config/i386/value-unwind.h
> >> >> libgcc/config/pa/pa64-hpux-lib.h
> >> >>
> >> >> Certainly for the aarch64 file this was simply a mistake;
> >> >> it seems to have been copied from the i386 version, both of which
> >> >> reference the runtime exception but don't actually include it.
> >> >>
> >> >> What's the procedure for fixing this?  Can we treat it as a textual
> >> >> error or do the files need to be formally relicensed?
> >> >
> >> > I'm unsure what you mean by "formally relicensed".
> >>
> >> It seemed like there were two possibilities: the licence of the files
> >> is actually GPL + exception despite what the text says (the textual
> >> error case), or the licence of the files is plain GPL because the text
> >> has said so since the introduction of the files.  In the latter case
> >> I'd have imagined that someone would need to relicense the code so
> >> that it is GPL + exception.
> >>
> >> > It generally is considered a textual omission.  The runtime library
> >> > components of GCC are intended to be licensed under the runtime
> >> > exception, which was granted and approved at the time of introduction.
> >>
> >> OK, thanks.  So would a patch to fix at least the i386 and aarch64 header
> >> files be acceptable?  (I'm happy to fix the other two as well if that's
> >> definitely the right thing to do.  It's just that there's more history
> >> involved there…)
> >
> > Please correct the text in the files. The files in libgcc used in the
> > GCC runtime are intended to be licensed with the runtime exception and
> > GCC previously was granted approval for that licensing and purpose.
> >
> > As you are asking the question, I sincerely doubt that ARM and Cavium
> > intended to apply a license without the exception to those files.  And
> > similarly for Intel and FRV.
>
> FTR, I think only Linaro (rather than Arm) touched the aarch64 file.
>
> > The runtime exception explicitly was intended for this purpose and
> > usage at the time that GCC received approval to apply the exception.
>
> Ack.  Is the patch below OK for trunk and branches?

I'm not certain whom you are asking for approval, but it looks good to me.

It would be nice to add SPDX License Identifier at the top of the
files as well, but that's not required.

Thanks, David

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Jonathan Wakely via Gcc-patches

On Mon, 12 Jul 2021 at 17:01, Gavin Smith wrote:
>
> On Mon, Jul 12, 2021 at 4:04 PM Jonathan Wakely via Gcc  
> wrote:
> > GNU Hello has the same problem with its docs:
> > https://www.gnu.org/software/hello/manual/hello.html#index-_002dg
> > That URL is garbage because of the URL-encoded %2d character, and the
> > fact it links to the wrong place (the description of the option, not
> > the option itself). The former is no longer an issue for GCC (it was
> > for many years) but the latter is still a problem.
> >
> > If you don't know where to find it yourself, the source is visible here:
> > https://github.com/yugui/example/blob/master/doc/hello.texi#L208
>
> I downloaded the source for the "hello" manual and recreated it with
> Texinfo 6.8 (running " texi2any --html hello.texi --no-split"). I've
> attached the results. The current output doesn't exhibit the problem
> with the scrolling being at the wrong place - this problem has
> evidently resolved itself since the time when the online "hello"
> manual was generated. (I don't remember many complaints about it on
> the mailing list, though: if we don't know about problems, we can't
> fix them.)

The "copyable link" does work as I would expect. The #index-_002dg
anchor still seems to be in the "wrong" place, i.e. in the 
element not the  element. But the addition of the copyable link
nicely solves the problem of needing to easily obtain a link to the
right position.

> The URL is mangled because index entries can have more characters in
> them than what is suitable for a URL. A space character becomes a "-",
> so a "-" has to become something else.

Yes, I understand the reason.

Re: Repost: [PATCH] Change rs6000_const_f32_to_i32 return type.

2021-07-12 Thread Bill Schmidt via Gcc-patches


Hi Mike,

On 7/7/21 2:59 PM, Michael Meissner wrote:

[PATCH] Change rs6000_const_f32_to_i32 return type.

The function rs6000_const_f32_to_i32 called REAL_VALUE_TO_TARGET_SINGLE
with a long long type and returns it.  This patch changes the type to long
which is the proper type for REAL_VALUE_TO_TARGET_SINGLE.

2021-07-07  Michael Meissner  

gcc/
* config/rs6000/rs6000-protos.h (rs6000_const_f32_to_i32): Change
return type to long.
* config/rs6000/rs6000.c (rs6000_const_f32_to_i32): Change return
type to long.
---
  gcc/config/rs6000/rs6000-protos.h | 2 +-
  gcc/config/rs6000/rs6000.c| 6 --
  2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-protos.h 
b/gcc/config/rs6000/rs6000-protos.h
index 9de294d3b28..94bf961c6b7 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -281,7 +281,7 @@ extern void rs6000_asm_output_dwarf_pcrel (FILE *file, int 
size,
   const char *label);
  extern void rs6000_asm_output_dwarf_datarel (FILE *file, int size,
 const char *label);
-extern long long rs6000_const_f32_to_i32 (rtx operand);
+extern long rs6000_const_f32_to_i32 (rtx operand);
  
  /* Declare functions in rs6000-c.c */
  
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c

index 9a5db63d0ef..de11de5e079 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -27936,10 +27936,12 @@ rs6000_invalid_conversion (const_tree fromtype, 
const_tree totype)
return NULL;
  }
  
-long long

+/* Convert a SFmode constant to the integer bit pattern.  */
+
+long
  rs6000_const_f32_to_i32 (rtx operand)
  {
-  long long value;
+  long value;
const struct real_value *rv = CONST_DOUBLE_REAL_VALUE (operand);
  
gcc_assert (GET_MODE (operand) == SFmode);


These changes look OK.  Can you please also fix the expander for 
xxspltiw_v4sf, which incorrectly expects a long long?


I can't approve, but recommend approval with that also fixed.

Thanks!
Bill

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Gavin Smith via Gcc-patches

On Mon, Jul 12, 2021 at 4:04 PM Jonathan Wakely via Gcc  
wrote:
> GNU Hello has the same problem with its docs:
> https://www.gnu.org/software/hello/manual/hello.html#index-_002dg
> That URL is garbage because of the URL-encoded %2d character, and the
> fact it links to the wrong place (the description of the option, not
> the option itself). The former is no longer an issue for GCC (it was
> for many years) but the latter is still a problem.
>
> If you don't know where to find it yourself, the source is visible here:
> https://github.com/yugui/example/blob/master/doc/hello.texi#L208

I downloaded the source for the "hello" manual and recreated it with
Texinfo 6.8 (running " texi2any --html hello.texi --no-split"). I've
attached the results. The current output doesn't exhibit the problem
with the scrolling being at the wrong place - this problem has
evidently resolved itself since the time when the online "hello"
manual was generated. (I don't remember many complaints about it on
the mailing list, though: if we don't know about problems, we can't
fix them.)

The URL is mangled because index entries can have more characters in
them than what is suitable for a URL. A space character becomes a "-",
so a "-" has to become something else. They have to be distinguished
because there may be two separate index entries in different places
which wouldn't be distinguishable otherwise.

However, I find that adding an extra index entry means you can use
hello.html#index-greeting instead:

@item --greeting=@var{text}
@itemx -g @var{text}
@opindex greeting
@opindex --greeting
@opindex -g
Output @var{text} instead of the default greeting.

Re: [PATCH] Check type size for doloop iv on BITS_PER_WORD [PR61837]

2021-07-12 Thread guojiufu via Gcc-patches


On 2021-07-12 23:53, guojiufu via Gcc-patches wrote:

On 2021-07-12 22:46, Richard Biener wrote:

On Mon, 12 Jul 2021, guojiufu wrote:


On 2021-07-12 18:02, Richard Biener wrote:
> On Mon, 12 Jul 2021, guojiufu wrote:
>
>> On 2021-07-12 16:57, Richard Biener wrote:
>> > On Mon, 12 Jul 2021, guojiufu wrote:
>> >
>> >> On 2021-07-12 14:20, Richard Biener wrote:
>> >> > On Fri, 9 Jul 2021, Segher Boessenkool wrote:
>> >> >
>> >> >> On Fri, Jul 09, 2021 at 08:43:59AM +0200, Richard Biener wrote:
>> >> >> > I wonder if there's a way to query the target what modes the doloop
>> >> >> > pattern can handle (not being too familiar with the doloop code).
>> >> >>
>> >> >> You can look what modes are allowed for operand 0 of doloop_end,
>> >> >> perhaps?  Although that is a define_expand, not a define_insn, so it
>> >> >> is
>> >> >> hard to introspect.
>> >> >>
>> >> >> > Why do you need to do any checks besides the new type being able to
>> >> >> > represent all IV values?  The original doloop IV will never wrap
>> >> >> > (OTOH if niter is U*_MAX then we compute niter + 1 which will
>> >> >> > become
>> >> >> > zero ... I suppose the doloop might still do the correct thing here
>> >> >> > but it also still will with a IV with larger type).
>> >>
>> >> The issue comes from U*_MAX (original short MAX), as you said: on which
>> >> niter + 1 becomes zero.  And because the step for doloop is -1; then, on
>> >> larger type 'zero - 1' will be a very large number on larger type
>> >> (e.g. 0xff...ff); but on the original short type 'zero - 1' is a small
>> >> value
>> >> (e.g. "0xff").
>> >
>> > But for the larger type the small type MAX + 1 fits and does not yield
>> > zero so it should still work exactly as before, no?  Of course you
>> > have to compute the + 1 in the larger type.
>> >
>> You are right, if compute the "+ 1" in the larger type it is ok, as below
>> code:
>> ```
>>/* Use type in word size may fast.  */
>> if (TYPE_PRECISION (ntype) < BITS_PER_WORD)
>>   {
>> ntype = lang_hooks.types.type_for_size (BITS_PER_WORD, 1);
>> niter = fold_convert (ntype, niter);
>>   }
>>
>> tree base = fold_build2 (PLUS_EXPR, ntype, unshare_expr (niter),
>>  build_int_cst (ntype, 1));
>>
>>
>> add_candidate (data, base, build_int_cst (ntype, -1), true, NULL, NULL,
>> true);
>> ```
>> The issue of this is, this code generates more stmt for doloop.xxx:
>>   _12 = (unsigned int) xx(D);
>>   _10 = _12 + 4294967295;
>>   _24 = (long unsigned int) _10;
>>   doloop.6_8 = _24 + 1;
>>
>> if use previous patch, "+ 1" on original type, then the stmts will looks
>> like:
>>   _12 = (unsigned int) xx(D);
>>   doloop.6_8 = (long unsigned int) _12;
>>
>> This is the reason for checking
>>wi::ltu_p (niter_desc->max, wi::to_widest (TYPE_MAX_VALUE (ntype)))
>
> But this then only works when there's an upper bound on the number
> of iterations.  Note you should not use TYPE_MAX_VALUE here but
> you can instead use
>
>  wi::ltu_p (niter_desc->max, wi::to_widest (wi::max_value
> (TYPE_PRECISION (ntype), TYPE_SIGN (ntype;

Ok, Thanks!
I remember you mentioned that:
widest_int::from (wi::max_value (TYPE_PRECISION (ntype), TYPE_SIGN 
(ntype)),

TYPE_SIGN (ntype))
would be better than
wi::to_widest (TYPE_MAX_VALUE (ntype)).

It seems that:
"TYPE_MAX_VALUE (ntype)" is "NUMERICAL_TYPE_CHECK
(NODE)->type_non_common.maxval"
which do a numerical-check and return the field of maxval.  And then 
call to

wi::to_widest

The other code "widest_int::from (wi::max_value (..,..),..)" calls
wi::max_value
and widest_int::from.

I'm wondering if wi::to_widest (TYPE_MAX_VALUE (ntype)) is cheaper?


TYPE_MAX_VALUE can be "suprising", it does not necessarily match the
underlying modes precision.  At some point we've tried to eliminate
most of its uses, not sure what the situation/position is right now.

Ok, get it, thanks.
I will use "widest_int::from (wi::max_value (..,..),..)".




> I think the -1 above comes from number of latch iterations vs. header
> entries - it's a common source for this kind of issues.  range analysis
> might be able to prove that we can still merge the two adds even with
> the intermediate extension.
Yes, as you mentioned here, it relates to number of latch iterations
For loop looks like : while (l < n) or for (i = 0; i < n; i++)
This kind of loop, the niter is used to be 'n - 1' after transformed
into 'do-while' form.
For this kind of loop, the max value for the number of iteration "n - 
1"

would be "max_value_type(n) - 1" which is wi::ltu than max_value_type.
This kind of loop is already common, and we could use wi::ltu (max,
max_value_type)
to check.

For loop looks like:
  do ;
  while (n-- > 0); /* while  (n-- > low); */

The niter_desc->max will wi::eq to max_value_type, and niter would be 
"n",

and then doloop.xx is 'n+1'.


I would see how to merge these two adds safely at this point
when generating doloop iv. (maybe range info, thanks!

>
> Is this pr

[RFA] Some libgcc headers are missing the runtime exception

2021-07-12 Thread Richard Sandiford via Gcc-patches

David Edelsohn  writes:
> On Fri, Jul 9, 2021 at 1:31 PM Richard Sandiford
>  wrote:
>>
>> David Edelsohn  writes:
>> > On Fri, Jul 9, 2021 at 12:53 PM Richard Sandiford via Gcc
>> >  wrote:
>> >>
>> >> Hi,
>> >>
>> >> It was pointed out to me off-list that config/aarch64/value-unwind.h
>> >> is missing the runtime exception.  It looks like a few other files
>> >> are too; a fuller list is:
>> >>
>> >> libgcc/config/aarch64/value-unwind.h
>> >> libgcc/config/frv/frv-abi.h
>> >> libgcc/config/i386/value-unwind.h
>> >> libgcc/config/pa/pa64-hpux-lib.h
>> >>
>> >> Certainly for the aarch64 file this was simply a mistake;
>> >> it seems to have been copied from the i386 version, both of which
>> >> reference the runtime exception but don't actually include it.
>> >>
>> >> What's the procedure for fixing this?  Can we treat it as a textual
>> >> error or do the files need to be formally relicensed?
>> >
>> > I'm unsure what you mean by "formally relicensed".
>>
>> It seemed like there were two possibilities: the licence of the files
>> is actually GPL + exception despite what the text says (the textual
>> error case), or the licence of the files is plain GPL because the text
>> has said so since the introduction of the files.  In the latter case
>> I'd have imagined that someone would need to relicense the code so
>> that it is GPL + exception.
>>
>> > It generally is considered a textual omission.  The runtime library
>> > components of GCC are intended to be licensed under the runtime
>> > exception, which was granted and approved at the time of introduction.
>>
>> OK, thanks.  So would a patch to fix at least the i386 and aarch64 header
>> files be acceptable?  (I'm happy to fix the other two as well if that's
>> definitely the right thing to do.  It's just that there's more history
>> involved there…)
>
> Please correct the text in the files. The files in libgcc used in the
> GCC runtime are intended to be licensed with the runtime exception and
> GCC previously was granted approval for that licensing and purpose.
>
> As you are asking the question, I sincerely doubt that ARM and Cavium
> intended to apply a license without the exception to those files.  And
> similarly for Intel and FRV.

FTR, I think only Linaro (rather than Arm) touched the aarch64 file.

> The runtime exception explicitly was intended for this purpose and
> usage at the time that GCC received approval to apply the exception.

Ack.  Is the patch below OK for trunk and branches?

Thanks,
Richard

>From a601ac8ea9be14a898215456c22cd826e8fd92d9 Mon Sep 17 00:00:00 2001
From: Richard Sandiford 
Date: Mon, 12 Jul 2021 13:04:56 +0100
Subject: [PATCH] libgcc: Add missing runtime exception notices
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Quoting from https://gcc.gnu.org/pipermail/gcc/2021-July/236716.html:


It was pointed out to me off-list that config/aarch64/value-unwind.h
is missing the runtime exception.  It looks like a few other files
are too; a fuller list is:

libgcc/config/aarch64/value-unwind.h
libgcc/config/frv/frv-abi.h
libgcc/config/i386/value-unwind.h
libgcc/config/pa/pa64-hpux-lib.h

Certainly for the aarch64 file this was simply a mistake;
it seems to have been copied from the i386 version, both of which
reference the runtime exception but don't actually include it.


Similarly, frv-abi.h referenced the exception but didn't include it.
pa64-hpux-lib.h was missing any reference to the exception.

The decision was that this was simply a mistake
[https://gcc.gnu.org/pipermail/gcc/2021-July/236717.html]:


[…] It generally is
considered a textual omission.  The runtime library components of GCC
are intended to be licensed under the runtime exception, which was
granted and approved at the time of introduction.


and that we should simply change all of the files above
[https://gcc.gnu.org/pipermail/gcc/2021-July/236719.html]:


Please correct the text in the files. The files in libgcc used in the
GCC runtime are intended to be licensed with the runtime exception and
GCC previously was granted approval for that licensing and purpose.

[…]

The runtime exception explicitly was intended for this purpose and
usage at the time that GCC received approval to apply the exception.


libgcc/
	* config/aarch64/value-unwind.h: Add missing runtime exception
	paragraph.
	* config/frv/frv-abi.h: Likewise.
	* config/i386/value-unwind.h: Likewise.
	* config/pa/pa64-hpux-lib.h: Likewise.
---
 libgcc/config/aarch64/value-unwind.h | 4 
 libgcc/config/frv/frv-abi.h  | 4 
 libgcc/config/i386

Re: [PATCH] Check type size for doloop iv on BITS_PER_WORD [PR61837]

2021-07-12 Thread guojiufu via Gcc-patches


On 2021-07-12 22:46, Richard Biener wrote:

On Mon, 12 Jul 2021, guojiufu wrote:


On 2021-07-12 18:02, Richard Biener wrote:
> On Mon, 12 Jul 2021, guojiufu wrote:
>
>> On 2021-07-12 16:57, Richard Biener wrote:
>> > On Mon, 12 Jul 2021, guojiufu wrote:
>> >
>> >> On 2021-07-12 14:20, Richard Biener wrote:
>> >> > On Fri, 9 Jul 2021, Segher Boessenkool wrote:
>> >> >
>> >> >> On Fri, Jul 09, 2021 at 08:43:59AM +0200, Richard Biener wrote:
>> >> >> > I wonder if there's a way to query the target what modes the doloop
>> >> >> > pattern can handle (not being too familiar with the doloop code).
>> >> >>
>> >> >> You can look what modes are allowed for operand 0 of doloop_end,
>> >> >> perhaps?  Although that is a define_expand, not a define_insn, so it
>> >> >> is
>> >> >> hard to introspect.
>> >> >>
>> >> >> > Why do you need to do any checks besides the new type being able to
>> >> >> > represent all IV values?  The original doloop IV will never wrap
>> >> >> > (OTOH if niter is U*_MAX then we compute niter + 1 which will
>> >> >> > become
>> >> >> > zero ... I suppose the doloop might still do the correct thing here
>> >> >> > but it also still will with a IV with larger type).
>> >>
>> >> The issue comes from U*_MAX (original short MAX), as you said: on which
>> >> niter + 1 becomes zero.  And because the step for doloop is -1; then, on
>> >> larger type 'zero - 1' will be a very large number on larger type
>> >> (e.g. 0xff...ff); but on the original short type 'zero - 1' is a small
>> >> value
>> >> (e.g. "0xff").
>> >
>> > But for the larger type the small type MAX + 1 fits and does not yield
>> > zero so it should still work exactly as before, no?  Of course you
>> > have to compute the + 1 in the larger type.
>> >
>> You are right, if compute the "+ 1" in the larger type it is ok, as below
>> code:
>> ```
>>/* Use type in word size may fast.  */
>> if (TYPE_PRECISION (ntype) < BITS_PER_WORD)
>>   {
>> ntype = lang_hooks.types.type_for_size (BITS_PER_WORD, 1);
>> niter = fold_convert (ntype, niter);
>>   }
>>
>> tree base = fold_build2 (PLUS_EXPR, ntype, unshare_expr (niter),
>>  build_int_cst (ntype, 1));
>>
>>
>> add_candidate (data, base, build_int_cst (ntype, -1), true, NULL, NULL,
>> true);
>> ```
>> The issue of this is, this code generates more stmt for doloop.xxx:
>>   _12 = (unsigned int) xx(D);
>>   _10 = _12 + 4294967295;
>>   _24 = (long unsigned int) _10;
>>   doloop.6_8 = _24 + 1;
>>
>> if use previous patch, "+ 1" on original type, then the stmts will looks
>> like:
>>   _12 = (unsigned int) xx(D);
>>   doloop.6_8 = (long unsigned int) _12;
>>
>> This is the reason for checking
>>wi::ltu_p (niter_desc->max, wi::to_widest (TYPE_MAX_VALUE (ntype)))
>
> But this then only works when there's an upper bound on the number
> of iterations.  Note you should not use TYPE_MAX_VALUE here but
> you can instead use
>
>  wi::ltu_p (niter_desc->max, wi::to_widest (wi::max_value
> (TYPE_PRECISION (ntype), TYPE_SIGN (ntype;

Ok, Thanks!
I remember you mentioned that:
widest_int::from (wi::max_value (TYPE_PRECISION (ntype), TYPE_SIGN 
(ntype)),

TYPE_SIGN (ntype))
would be better than
wi::to_widest (TYPE_MAX_VALUE (ntype)).

It seems that:
"TYPE_MAX_VALUE (ntype)" is "NUMERICAL_TYPE_CHECK
(NODE)->type_non_common.maxval"
which do a numerical-check and return the field of maxval.  And then 
call to

wi::to_widest

The other code "widest_int::from (wi::max_value (..,..),..)" calls
wi::max_value
and widest_int::from.

I'm wondering if wi::to_widest (TYPE_MAX_VALUE (ntype)) is cheaper?


TYPE_MAX_VALUE can be "suprising", it does not necessarily match the
underlying modes precision.  At some point we've tried to eliminate
most of its uses, not sure what the situation/position is right now.

Ok, get it, thanks.
I will use "widest_int::from (wi::max_value (..,..),..)".




> I think the -1 above comes from number of latch iterations vs. header
> entries - it's a common source for this kind of issues.  range analysis
> might be able to prove that we can still merge the two adds even with
> the intermediate extension.
Yes, as you mentioned here, it relates to number of latch iterations
For loop looks like : while (l < n) or for (i = 0; i < n; i++)
This kind of loop, the niter is used to be 'n - 1' after transformed
into 'do-while' form.

For this kind of loop, the max value for the number of iteration "n - 1"
would be "max_value_type(n) - 1" which is wi::ltu than max_value_type.
This kind of loop is already common, and we could use wi::ltu (max, 
max_value_type)

to check.

For loop looks like:
  do ;
  while (n-- > 0); /* while  (n-- > low); */

The niter_desc->max will wi::eq to max_value_type, and niter would be 
"n",

and then doloop.xx is 'n+1'.


I would see how to merge these two adds safely at this point
when generating doloop iv. (maybe range info, thanks!

>
> Is this pre-loop extra add really offsetting the in-loop doloop

Re: Repost: [PATCH] Fix vec-splati-runnable.c test.

2021-07-12 Thread Bill Schmidt via Gcc-patches


Hi Mike,

On 7/7/21 3:00 PM, Michael Meissner wrote:

[PATCH] Fix vec-splati-runnable.c test.

I noticed that the vec-splati-runnable.c did not have an abort after one
of the tests.  If the test was run with optimization, the optimizer could
delete some of the tests and throw off the count.  However, due to the
fact that the value being loaded in that test is undefined, I did not
check what value was loaded, but I just stored it into a volatile global
variable.

2021-07-07  Michael Meissner  

gcc/testsuite/
* gcc.target/powerpc/vec-splati-runnable.c: Run test with -O2
optimization.  Do not check what XXSPLTIDP generates if the value
is undefined.
---
  .../gcc.target/powerpc/vec-splati-runnable.c  | 29 ++-
  1 file changed, 9 insertions(+), 20 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
index e84ce77a21d..a135279b1d7 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
@@ -1,7 +1,7 @@
  /* { dg-do run { target { power10_hw } } } */
  /* { dg-do link { target { ! power10_hw } } } */
  /* { dg-require-effective-target power10_ok } */
-/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
+/* { dg-options "-mdejagnu-cpu=power10 -save-temps -O2" } */


Why did you restrict optimization here?  The tests should be run at 
various opt levels if you don't specify this, right?


The test changes are otherwise okay, but I'd like to understand this first.

Thanks,
Bill


  #include 
  
  #define DEBUG 0

@@ -12,6 +12,8 @@
  
  extern void abort (void);
  
+volatile vector double vresult_d_undefined;

+
  int
  main (int argc, char *argv [])
  {
@@ -85,25 +87,12 @@ main (int argc, char *argv [])
  #endif
}
  
-  /* This test will generate a "note" to the user that the argument

- is subnormal.  It is not an error, but results are not defined.  */
-  vresult_d = (vector double) { 2.0, 3.0 };
-  expected_vresult_d = (vector double) { 6.6E-42f, 6.6E-42f };
-
-  vresult_d = vec_splatid (6.6E-42f);
-
-  /* Although the instruction says the results are not defined, it does seem
- to work, at least on Mambo.  But no guarentees!  */
-  if (!vec_all_eq (vresult_d,  expected_vresult_d)) {
-#if DEBUG
-printf("ERROR, vec_splati (6.6E-42f)\n");
-for(i = 0; i < 2; i++)
-  printf(" vresult_d[%i] = %e, expected_vresult_d[%i] = %e\n",
-i, vresult_d[i], i, expected_vresult_d[i]);
-#else
-;
-#endif
-  }
+  /* This test will generate a "note" to the user that the argument is
+ subnormal.  It is not an error, but results are not defined.  Because this
+ is undefined, we cannot check that any value is correct.  Just store it in
+ a volatile variable so the XXSPLTIDP instruction gets generated and the
+ warning message printed. */
+  vresult_d_undefined = vec_splatid (6.6E-42f);
  
/* Vector splat immediate */

vsrc_a_int = (vector int) { 2, 3, 4, 5 };

Re: Repost: [PATCH] Deal with prefixed loads/stores in tests, PR testsuite/100166

2021-07-12 Thread Bill Schmidt via Gcc-patches


Hi Mike,

On 7/7/21 3:03 PM, Michael Meissner wrote:

[PATCH] Deal with prefixed loads/stores in tests, PR testsuite/100166

This patch updates the various tests in the testsuite to treat plxv
and pstxv as being vector loads/stores.  This shows up if you run the
testsuite with a compiler configured with the option: --with-cpu=power10.

I have verified that these tests now all pass when I build and test a compiler
on a power10 system using --with-cpu=power10.  I have verified that they
continue to run on power9 little endian and power8 big endian systems.

Can I check this into the master branch?

2021-07-07  Michael Meissner  

gcc/testsuite/
PR testsuite/100166
* gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c:

Please drop the gcc/testsuite/ part from this line.

* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c:
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c:
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c:
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-int.c:
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-longlong.c:
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-short.c:
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c:
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c:
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c:
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-int.c:
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-longlong.c:
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-short.c:
* gcc.target/powerpc/fold-vec-load-vec_xl-char.c:
* gcc.target/powerpc/fold-vec-load-vec_xl-double.c:
* gcc.target/powerpc/fold-vec-load-vec_xl-float.c:
* gcc.target/powerpc/fold-vec-load-vec_xl-int.c:
* gcc.target/powerpc/fold-vec-load-vec_xl-longlong.c:
* gcc.target/powerpc/fold-vec-load-vec_xl-short.c:
* gcc.target/powerpc/fold-vec-splat-floatdouble.c:
* gcc.target/powerpc/fold-vec-splat-longlong.c:
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c:
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-double.c:
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-float.c:
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-int.c:
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-longlong.c:
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-short.c:
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-char.c:
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-double.c:
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-float.c:
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-int.c:
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-longlong.c:
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-short.c:
* gcc.target/powerpc/fold-vec-store-vec_xst-char.c:
* gcc.target/powerpc/fold-vec-store-vec_xst-double.c:
* gcc.target/powerpc/fold-vec-store-vec_xst-float.c:
* gcc.target/powerpc/fold-vec-store-vec_xst-int.c:
* gcc.target/powerpc/fold-vec-store-vec_xst-longlong.c:
* gcc.target/powerpc/fold-vec-store-vec_xst-short.c:
* gcc.target/powerpc/lvsl-lvsr.c:
* gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c:
Update insn counts to account for power10 prefixed loads and
stores.

Also here.

---
  .../vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c   | 2 +-
  .../gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c | 2 +-
  .../powerpc/fold-vec-load-builtin_vec_xl-double.c  | 2 +-
  .../powerpc/fold-vec-load-builtin_vec_xl-float.c   | 2 +-
  .../gcc.target/powerpc/fold-vec-load-builtin_vec_xl-int.c  | 2 +-
  .../powerpc/fold-vec-load-builtin_vec_xl-longlong.c| 2 +-
  .../powerpc/fold-vec-load-builtin_vec_xl-short.c   | 2 +-
  .../gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c | 2 +-
  .../gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c   | 2 +-
  .../gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c| 2 +-
  .../gcc.target/powerpc/fold-vec-load-vec_vsx_ld-int.c  | 2 +-
  .../gcc.target/powerpc/fold-vec-load-vec_vsx_ld-longlong.c | 2 +-
  .../gcc.target/powerpc/fold-vec-load-vec_vsx_ld-short.c| 2 +-
  .../gcc.target/powerpc/fold-vec-load-vec_xl-char.c | 2 +-
  .../gcc.target/powerpc/fold-vec-load-vec_xl-double.c   | 2 +-
  .../gcc.target/powerpc/fold-vec-load-vec_xl-float.c| 2 +-
  .../gcc.target/powerpc/fold-vec-load-vec_xl-int.c  | 2 +-
  .../gcc.target/powerpc/fold-vec-load-vec_xl-longlong.c | 2 +-
  .../gcc.target/powerpc/fold-vec-load-vec_xl-short.c| 2 +-
  .../gcc.target/powerpc/fold-vec-splat-floatdouble.c| 7 ---
  gcc/testsuite/gcc.target/powerpc/fold-vec-splat-longlong.c | 2 +-
  .../powerpc/fold-vec-store-builtin_vec_xst-char.c  | 2 +-
  .../powerpc/fold-vec-store-builtin_vec_xst-doubl

Re: [patch][version 4]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-12 Thread Qing Zhao via Gcc-patches



> On Jul 12, 2021, at 2:51 AM, Richard Sandiford  
> wrote:
> 
> Martin Jambor  writes:
>> On Thu, Jul 08 2021, Qing Zhao wrote:
>>> (Resend this email since the previous one didn’t quote, I changed one
>>> setting in my mail client, hopefully that can fix this issue).
>>> 
>>> Hi, Martin,
>>> 
>>> Thank you for the review and comment.
>>> 
 On Jul 8, 2021, at 8:29 AM, Martin Jambor  wrote:
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index c05d22f3e8f1..35051d7c6b96 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -384,6 +384,13 @@ static struct
> 
>  /* Numbber of components created when splitting aggregate parameters.  */
>  int param_reductions_created;
> +
> +  /* Number of deferred_init calls that are modified.  */
> +  int deferred_init;
> +
> +  /* Number of deferred_init calls that are created by
> + generate_subtree_deferred_init.  */
> +  int subtree_deferred_init;
> } sra_stats;
> 
> static void
> @@ -4096,6 +4103,110 @@ get_repl_default_def_ssa_name (struct access 
> *racc, tree reg_type)
>  return get_or_create_ssa_default_def (cfun, racc->replacement_decl);
> }
> 
> +
> +/* Generate statements to call .DEFERRED_INIT to initialize scalar 
> replacements
> +   of accesses within a subtree ACCESS; all its children, siblings and 
> their
> +   children are to be processed.
> +   GSI is a statement iterator used to place the new statements.  */
> +static void
> +generate_subtree_deferred_init (struct access *access,
> + tree init_type,
> + tree is_vla,
> + gimple_stmt_iterator *gsi,
> + location_t loc)
> +{
> +  do
> +{
> +  if (access->grp_to_be_replaced)
> + {
> +   tree repl = get_access_replacement (access);
> +   gimple *call
> + = gimple_build_call_internal (IFN_DEFERRED_INIT, 3,
> +   TYPE_SIZE_UNIT (TREE_TYPE (repl)),
> +   init_type, is_vla);
> +   gimple_call_set_lhs (call, repl);
> +   gsi_insert_before (gsi, call, GSI_SAME_STMT);
> +   update_stmt (call);
> +   gimple_set_location (call, loc);
> +   sra_stats.subtree_deferred_init++;
> + }
> +  else if (access->grp_to_be_debug_replaced)
> + {
> +   tree drepl = get_access_replacement (access);
> +   tree call = build_call_expr_internal_loc
> +  (UNKNOWN_LOCATION, IFN_DEFERRED_INIT,
> +   TREE_TYPE (drepl), 3,
> +   TYPE_SIZE_UNIT (TREE_TYPE (drepl)),
> +   init_type, is_vla);
> +   gdebug *ds = gimple_build_debug_bind (drepl, call,
> + gsi_stmt (*gsi));
> +   gsi_insert_before (gsi, ds, GSI_SAME_STMT);
 
 Is handling of grp_to_be_debug_replaced accesses necessary here?  If so,
 why?  grp_to_be_debug_replaced accesses are there only to facilitate
 debug information about a part of an aggregate decl is that is likely
 going to be entirely removed - so that debuggers can sometimes show to
 users information about what they would contain had they not removed.
 It seems strange you need to mark them as uninitialized because they
 should not have any consumers.  (But perhaps it is also harmless.)
>>> 
>>> This part has been discussed during the 2nd version of the patch, but
>>> I think that more discussion might be necessary.
>>> 
>>> In the previous discussion, Richard Sandiford mentioned:
>>> (https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568620.html):
>>> 
>>> =
>>> 
>>> I guess the thing we need to decide here is whether -ftrivial-auto-var-init
>>> should affect debug-only constructs too.  If it doesn't, exmaining removed
>>> components in a debugger might show uninitialised values in cases where
>>> the user was expecting initialised ones.  There would be no security
>>> concern, but it might be surprising.
>>> 
>>> I think in principle the DRHS can contain a call to DEFERRED_INIT.
>>> Doing that would probably require further handling elsewhere though.
>>> 
>>> =
>>> 
>>> I am still not very confident now for this part of the change.
>> 
>> I see.  I still tend to think that with or without the generation of
>> gimple_build_debug_binds, the debugger would still not display any value
>> for the component in question.  Without it there would be no information
>> about the component at a any place in code affected by this, with it the
>> component would be explicitely uninitialized.  But OK.
> 
> FTR, I don't have a strong opinion here.  You know the code better
> than I do, so if you think not generating debug binds is better then
> let's do that.

I am okay with not generating debug binds here. 

Then I will just delete the part of code that guarded with  if 
(

Re: [PATCH V2] gcc: Add vec_select -> subreg RTL simplification

2021-07-12 Thread Richard Sandiford via Gcc-patches

Jonathan Wright  writes:
> Hi,
>
> Version 2 of this patch adds more code generation tests to show the
> benefit of this RTL simplification as well as adding a new helper function
> 'rtx_vec_series_p' to reduce code duplication.
>
> Patch tested as version 1 - ok for master?

Sorry for the slow reply.

> Regression tested and bootstrapped on aarch64-none-linux-gnu,
> x86_64-unknown-linux-gnu, arm-none-linux-gnueabihf and
> aarch64_be-none-linux-gnu - no issues.

I've also tested this on powerpc64le-unknown-linux-gnu, no issues again.

> diff --git a/gcc/combine.c b/gcc/combine.c
> index 
> 6476812a21268e28219d1e302ee1c979d528a6ca..0ff6ca87e4432cfeff1cae1dd219ea81ea0b73e4
>  100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -6276,6 +6276,26 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, 
> int in_dest,
> - 1,
> 0));
>break;
> +case VEC_SELECT:
> +  {
> + rtx trueop0 = XEXP (x, 0);
> + mode = GET_MODE (trueop0);
> + rtx trueop1 = XEXP (x, 1);
> + int nunits;
> + /* If we select a low-part subreg, return that.  */
> + if (GET_MODE_NUNITS (mode).is_constant (&nunits)
> + && targetm.can_change_mode_class (mode, GET_MODE (x), ALL_REGS))
> +   {
> + int offset = BYTES_BIG_ENDIAN ? nunits - XVECLEN (trueop1, 0) : 0;
> +
> + if (rtx_vec_series_p (trueop1, offset))
> +   {
> + rtx new_rtx = lowpart_subreg (GET_MODE (x), trueop0, mode);
> + if (new_rtx != NULL_RTX)
> +   return new_rtx;
> +   }
> +   }
> +  }

Since this occurs three times, I think it would be worth having
a new predicate:

/* Return true if, for all OP of mode OP_MODE:

 (vec_select:RESULT_MODE OP SEL)

   is equivalent to the lowpart RESULT_MODE of OP.  */

bool
vec_series_lowpart_p (machine_mode result_mode, machine_mode op_mode, rtx sel)

containing the GET_MODE_NUNITS (…).is_constant, can_change_mode_class
and rtx_vec_series_p tests.

I think the function belongs in rtlanal.[hc], even though subreg_lowpart_p
is in emit-rtl.c.

> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 
> aef6da9732d45b3586bad5ba57dafa438374ac3c..f12a0bebd3d6dd3381ac8248cd3fa3f519115105
>  100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -1884,15 +1884,16 @@
>  )
>  
>  (define_insn "*zero_extend2_aarch64"
> -  [(set (match_operand:GPI 0 "register_operand" "=r,r,w")
> -(zero_extend:GPI (match_operand:SHORT 1 "nonimmediate_operand" 
> "r,m,m")))]
> +  [(set (match_operand:GPI 0 "register_operand" "=r,r,w,r")
> +(zero_extend:GPI (match_operand:SHORT 1 "nonimmediate_operand" 
> "r,m,m,w")))]
>""
>"@
> and\t%0, %1, 
> ldr\t%w0, %1
> -   ldr\t%0, %1"
> -  [(set_attr "type" "logic_imm,load_4,f_loads")
> -   (set_attr "arch" "*,*,fp")]
> +   ldr\t%0, %1
> +   umov\t%w0, %1.[0]"
> +  [(set_attr "type" "logic_imm,load_4,f_loads,neon_to_gp")
> +   (set_attr "arch" "*,*,fp,fp")]

FTR (just to show I thought about it): I don't know whether the umov
can really be considered an fp operation rather than a simd operation,
but since we don't support fp without simd, this is already a distinction
without a difference.  So the pattern is IMO OK as-is.

> diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
> index 
> 55b6c1ac585a4cae0789c3afc0fccfc05a6d3653..93e963696dad30f29a76025696670f8b31bf2c35
>  100644
> --- a/gcc/config/arm/vfp.md
> +++ b/gcc/config/arm/vfp.md
> @@ -224,7 +224,7 @@
>  ;; problems because small constants get converted into adds.
>  (define_insn "*arm_movsi_vfp"
>[(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,r,r,rk,m 
> ,*t,r,*t,*t, *Uv")
> -  (match_operand:SI 1 "general_operand" "rk, 
> I,K,j,mi,rk,r,*t,*t,*Uvi,*t"))]
> +  (match_operand:SI 1 "general_operand" "rk, 
> I,K,j,mi,rk,r,t,*t,*Uvi,*t"))]
>"TARGET_ARM && TARGET_HARD_FLOAT
> && (   s_register_operand (operands[0], SImode)
> || s_register_operand (operands[1], SImode))"

I'll assume that an Arm maintainer would have spoken up by now if
they didn't want this for some reason.

> diff --git a/gcc/rtl.c b/gcc/rtl.c
> index 
> aaee882f5ca3e37b59c9829e41d0864070c170eb..3e8b3628b0b76b41889b77bb0019f582ee6f5aaa
>  100644
> --- a/gcc/rtl.c
> +++ b/gcc/rtl.c
> @@ -736,6 +736,19 @@ rtvec_all_equal_p (const_rtvec vec)
>  }
>  }
>  
> +/* Return true if element-selection indices in VEC are in series.  */
> +
> +bool
> +rtx_vec_series_p (const_rtx vec, int start)

I think rtvec_series_p would be better, for consistency with
rtvec_all_equal_p.  Also, let's generalise it to:

/* Return true if VEC contains a linear series of integers
   { START, START+1, START+2, ... }.  */

bool
rtvec_series_p (rtvec vec, int start)
{
}

> +{
> +  for (int i = 0; i < XVECLEN (vec, 0); i++)
> +{
> +  if (i + start != INTVAL (XVECEXP (vec, 0, i)))
> + return false;
> +}

Re: [PATCH] libgomp: Include early to avoid link failure with glibc 2.34

2021-07-12 Thread Florian Weimer via Gcc-patches

* Florian Weimer:

> * Florian Weimer:
>
>> I tested this on csky-linux-gnuabiv2 with the glibc version that failed
>> before, and it works.  So I guess your version is fine, too.
>
> Build on powerpc64-linux-gnu and other targets now fails with:
>
> /home/bmg/src/gcc/libgomp/config/linux/affinity.c: In function 
> ‘gomp_init_affini
> ty’:
> /home/bmg/src/gcc/libgomp/config/linux/affinity.c:53:41: error: ‘ULONG_MAX’ 
> unde
> clared (first use in this function)
>53 |   if (!gomp_affinity_init_level (1, ULONG_MAX, true))
>   | ^
>
> So affinity.c will need to include .

I verifed that this change on top successfully builds GCC for all glibc
targets:

diff --git a/libgomp/config/linux/affinity.c b/libgomp/config/linux/affinity.c
index c5abdce23..1b636c613 100644
--- a/libgomp/config/linux/affinity.c
+++ b/libgomp/config/linux/affinity.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef HAVE_PTHREAD_AFFINITY_NP
 

Thanks,
Florian

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Jonathan Wakely via Gcc-patches

On Mon, 12 Jul 2021 at 15:52, Jonathan Wakely  wrote:
>
> On Mon, 12 Jul 2021 at 15:13, Eli Zaretskii  wrote:
> > I get it that you dislike the HTML produced by Texinfo, but without
> > some examples of such bad HTML it is impossible to know what exactly
> > do you dislike and why.
> >
> > > You can't find out the anchors without inspecting (and searching)
> > > the HTML source. That's utterly stupid.
> >
> > I don't think I follow: find out the anchors with which means and for
> > what purposes?
>
> I want to point a user at the documentation for the -c option. I can't
> do that without examining the HTML source to find the anchor, then
> manually editing the URL to append the anchor. It's a tedious process,
> and the result is an anchor that doesn't even point to the option but
> to the text following it. The process is unnecessarily difficult and
> the results are bad.
>
> You participated in a discussion about this very topic previously:
> https://lists.gnu.org/archive/html/help-texinfo/2019-02/msg0.html
>
> >
> > > And even after you do that, the anchor
> > > is at the wrong place:
> > > https://gcc.gnu.org/onlinedocs/gcc/Overall-Options.html#index-c
> >
> > IME, the anchor is where you put it.  If you show me the source of
> > that HTMl, maybe we can have a more useful discussion of the issue.
>
> @item -c
> @opindex c
> Compile or assemble the source files, but do not link.  The linking
> stage simply is not done.  The ultimate output is in the form of an
> object file for each source file.
>
> Putting the @opindex before the @item causes the anchor to be placed
> on the previous item, which is not desirable.

GNU Hello has the same problem with its docs:
https://www.gnu.org/software/hello/manual/hello.html#index-_002dg
That URL is garbage because of the URL-encoded %2d character, and the
fact it links to the wrong place (the description of the option, not
the option itself). The former is no longer an issue for GCC (it was
for many years) but the latter is still a problem.

If you don't know where to find it yourself, the source is visible here:
https://github.com/yugui/example/blob/master/doc/hello.texi#L208

If GNU Hello and GCC can't get this right using texinfo, maybe texinfo
is not fit for purpose?

Re: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-07-12 Thread Richard Sandiford via Gcc-patches

Tamar Christina  writes:
>> -Original Message-
>> From: Richard Sandiford 
>> Sent: Monday, July 12, 2021 11:26 AM
>> To: Tamar Christina 
>> Cc: Richard Biener ; nd ; gcc-
>> patc...@gcc.gnu.org
>> Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product
>> where the sign for the multiplicant changes.
>> 
>> Tamar Christina  writes:
>> >> -Original Message-
>> >> From: Richard Sandiford 
>> >> Sent: Monday, July 12, 2021 10:39 AM
>> >> To: Tamar Christina 
>> >> Cc: Richard Biener ; nd ; gcc-
>> >> patc...@gcc.gnu.org
>> >> Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product
>> >> where the sign for the multiplicant changes.
>> >>
>> >> Tamar Christina  writes:
>> >> > Hi,
>> >> >
>> >> >> Richard Sandiford  writes:
>> >> >> >> @@ -992,21 +1029,27 @@ vect_recog_dot_prod_pattern (vec_info
>> >> >> *vinfo,
>> >> >> >>/* FORNOW.  Can continue analyzing the def-use chain when
>> >> >> >> this stmt in
>> >> >> a phi
>> >> >> >>   inside the loop (in case we are analyzing an outer-loop).  */
>> >> >> >>vect_unpromoted_value unprom0[2];
>> >> >> >> +  enum optab_subtype subtype = optab_vector;
>> >> >> >>if (!vect_widened_op_tree (vinfo, mult_vinfo, MULT_EXPR,
>> >> >> WIDEN_MULT_EXPR,
>> >> >> >> -false, 2, unprom0, &half_type))
>> >> >> >> +false, 2, unprom0, &half_type, &subtype))
>> >> >> >> +return NULL;
>> >> >> >> +
>> >> >> >> +  if (subtype == optab_vector_mixed_sign
>> >> >> >> +  && TYPE_UNSIGNED (unprom_mult.type)
>> >> >> >> +  && TYPE_PRECISION (half_type) * 4 > TYPE_PRECISION
>> >> >> >> + (unprom_mult.type))
>> >> >> >>  return NULL;
>> >> >> >
>> >> >> > Isn't the final condition here instead that TYPE1 is narrower than
>> TYPE2?
>> >> >> > I.e. we need to reject the case in which we multiply a signed
>> >> >> > and an unsigned value to get a (logically) signed result, but
>> >> >> > then zero-extend it (rather than sign-extend it) to the
>> >> >> > precision of the
>> >> addition.
>> >> >> >
>> >> >> > That would make the test:
>> >> >> >
>> >> >> >   if (subtype == optab_vector_mixed_sign
>> >> >> >   && TYPE_UNSIGNED (unprom_mult.type)
>> >> >> >   && TYPE_PRECISION (unprom_mult.type) < TYPE_PRECISION
>> (type))
>> >> >> > return NULL;
>> >> >> >
>> >> >> > instead.
>> >> >>
>> >> >> And folding that into the existing test gives:
>> >> >>
>> >> >>   /* If there are two widening operations, make sure they agree on
>> >> >> the
>> >> sign
>> >> >>  of the extension.  The result of an optab_vector_mixed_sign
>> operation
>> >> >>  is signed; otherwise, the result has the same sign as the 
>> >> >> operands.
>> */
>> >> >>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
>> >> >>   && (subtype == optab_vector_mixed_sign
>> >> >>  ? TYPE_UNSIGNED (unprom_mult.type)
>> >> >>  : TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type)))
>> >> >> return NULL;
>> >> >>
>> >> >
>> >> > I went with the first one which doesn't add the extra constraints
>> >> > for the normal dotproduct as that makes it too restrictive. It's
>> >> > the type of the multiplication that determines the operation so
>> >> > dotproduct can be used a bit more than where we currently do.
>> >> >
>> >> > This was relaxed in an earlier patch.
>> >>
>> >> I didn't mean that we should add extra constraints to the normal case
>> though.
>> >> The existing test I was referring to above was:
>> >>
>> >>   /* If there are two widening operations, make sure they agree on
>> >>  the sign of the extension.  */
>> >>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
>> >>   && TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type))
>> >> return NULL;
>> >
>> > But as I mentioned, this restriction is unneeded and has been removed
>> hence why it's not in my patchset's diff.
>> > It's removed by
>> > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569851.html which
>> Richi conditioned on the rest of these patches being approved.
>> >
>> > This change needlessly blocks test vect-reduc-dot-[2,3,6,7].c from
>> > being dotproducts for instance
>> >
>> > It's also part of the deficiency between GCC codegen and Clang
>> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88492#c6
>> 
>> Hmm, OK.  Just removing the check regresses:
>> 
>> unsigned long __attribute__ ((noipa))
>> f (signed short *x, signed short *y)
>> {
>>   unsigned long res = 0;
>>   for (int i = 0; i < 100; ++i)
>> res += (unsigned int) x[i] * (unsigned int) y[i];
>>   return res;
>> }
>> 
>> int
>> main (void)
>> {
>>   signed short x[100], y[100];
>>   for (int i = 0; i < 100; ++i)
>> {
>>   x[i] = -1;
>>   y[i] = 1;
>> }
>>   if (f (x, y) != 0x64ULL - 100)
>> __builtin_abort ();
>>   return 0;
>> }
>> 
>> on SVE.  We then use SDOT even though the result of the multiplication is
>> zero- rather than sign-extended to 64 bits.  Does something el

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Jonathan Wakely via Gcc-patches

On Mon, 12 Jul 2021 at 15:52, Jonathan Wakely  wrote:
>
> On Mon, 12 Jul 2021 at 15:13, Eli Zaretskii  wrote:
> >
> > > From: Jonathan Wakely 
> > > This is a problem with texinfo too.
> >
> > Not for someone who already knows Texinfo.  We are talking about
> > switching away of it, so I'm thinking about people who contributed
> > patches for the manual in the past.  They already know Texinfo, at
> > least to some extent, and some of them know it very well.
>
> I've contributed dozens of patches to the manual, and I don't want to
> have to use texinfo to do it in future.

And some of the people already know sphinx, and some know it very
well. And it seems likely that future contributors are more likely to
know a more modern tool than they are to know texinfo.

You like texinfo. We get it.

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Jonathan Wakely via Gcc-patches

On Mon, 12 Jul 2021 at 15:13, Eli Zaretskii  wrote:
>
> > From: Jonathan Wakely 
> > Date: Mon, 12 Jul 2021 14:53:44 +0100
> > Cc: Martin Liška ,
> >   "g...@gcc.gnu.org" , gcc-patches 
> > ,
> >   "Joseph S. Myers" 
> >
> > For me, these items are enough justification to switch away from
> > texinfo, which produces crap HTML pages with crap anchors.
>
> If we want to have a serious discussion with useful conclusions, I
> suggest to avoid "loaded" terminology.

But the results *are* crap.

>
> I get it that you dislike the HTML produced by Texinfo, but without
> some examples of such bad HTML it is impossible to know what exactly
> do you dislike and why.
>
> > You can't find out the anchors without inspecting (and searching)
> > the HTML source. That's utterly stupid.
>
> I don't think I follow: find out the anchors with which means and for
> what purposes?

I want to point a user at the documentation for the -c option. I can't
do that without examining the HTML source to find the anchor, then
manually editing the URL to append the anchor. It's a tedious process,
and the result is an anchor that doesn't even point to the option but
to the text following it. The process is unnecessarily difficult and
the results are bad.

You participated in a discussion about this very topic previously:
https://lists.gnu.org/archive/html/help-texinfo/2019-02/msg0.html

>
> > And even after you do that, the anchor
> > is at the wrong place:
> > https://gcc.gnu.org/onlinedocs/gcc/Overall-Options.html#index-c
>
> IME, the anchor is where you put it.  If you show me the source of
> that HTMl, maybe we can have a more useful discussion of the issue.

@item -c
@opindex c
Compile or assemble the source files, but do not link.  The linking
stage simply is not done.  The ultimate output is in the form of an
object file for each source file.

Putting the @opindex before the @item causes the anchor to be placed
on the previous item, which is not desirable.

>
> > As somebody who spends a lot of time helping users on the mailing
> > list, IRC, stackoverflow, and elsewhere, this "feature" of the texinfo
> > HTML has angered me for many years.
>
> As somebody who spends a lot of time helping users on every possible
> forum, and as someone who has wrote a lot of Texinfo, I don't
> understand what angers you.  Please elaborate.

I don't know what part of my email you don't understand. The HTML
anchors that texinfo creates are in the wrong place, and not
"discoverable". If you don't understand that, then you're clearly not
using the GCC HTML docs, and so I'm not surprised you think there's no
reason to ditch texinfo. As a regular user of the HTML (for myself and
end users of GCC), the HTML output has major usability problems.

> > Yes, some people like texinfo, but some people also dislike it and
> > there are serious usability problems with the output. I support
> > replacing texinfo with anything that isn't texinfo.
>
> "Anything"?  Even plain text?  I hope not.

Plain text with a tool to generate good HTML might be better than texinfo.

> See, such "arguments" don't help to have a useful discussion.

Your insistence that texinfo is fine doesn't either. It's not fine.

> > >  4) The need to learn yet another markup language.
> > > While this is not a problem for simple text, it does require a
> > > serious study of RST and Sphinx to use the more advanced features.
> >
> > This is a problem with texinfo too.
>
> Not for someone who already knows Texinfo.  We are talking about
> switching away of it, so I'm thinking about people who contributed
> patches for the manual in the past.  They already know Texinfo, at
> least to some extent, and some of them know it very well.

I've contributed dozens of patches to the manual, and I don't want to
have to use texinfo to do it in future.

> > >  5) Lack of macros.
> > > AFAIK, only simple textual substitution is available, no macros
> > > with arguments.
> >
> > Is this a problem for GCC docs though?
>
> I don't know.  It could be, even if it isn't now.

So not a problem then.

Re: [PATCH] Check type size for doloop iv on BITS_PER_WORD [PR61837]

2021-07-12 Thread Richard Biener

On Mon, 12 Jul 2021, guojiufu wrote:

> On 2021-07-12 18:02, Richard Biener wrote:
> > On Mon, 12 Jul 2021, guojiufu wrote:
> > 
> >> On 2021-07-12 16:57, Richard Biener wrote:
> >> > On Mon, 12 Jul 2021, guojiufu wrote:
> >> >
> >> >> On 2021-07-12 14:20, Richard Biener wrote:
> >> >> > On Fri, 9 Jul 2021, Segher Boessenkool wrote:
> >> >> >
> >> >> >> On Fri, Jul 09, 2021 at 08:43:59AM +0200, Richard Biener wrote:
> >> >> >> > I wonder if there's a way to query the target what modes the doloop
> >> >> >> > pattern can handle (not being too familiar with the doloop code).
> >> >> >>
> >> >> >> You can look what modes are allowed for operand 0 of doloop_end,
> >> >> >> perhaps?  Although that is a define_expand, not a define_insn, so it
> >> >> >> is
> >> >> >> hard to introspect.
> >> >> >>
> >> >> >> > Why do you need to do any checks besides the new type being able to
> >> >> >> > represent all IV values?  The original doloop IV will never wrap
> >> >> >> > (OTOH if niter is U*_MAX then we compute niter + 1 which will
> >> >> >> > become
> >> >> >> > zero ... I suppose the doloop might still do the correct thing here
> >> >> >> > but it also still will with a IV with larger type).
> >> >>
> >> >> The issue comes from U*_MAX (original short MAX), as you said: on which
> >> >> niter + 1 becomes zero.  And because the step for doloop is -1; then, on
> >> >> larger type 'zero - 1' will be a very large number on larger type
> >> >> (e.g. 0xff...ff); but on the original short type 'zero - 1' is a small
> >> >> value
> >> >> (e.g. "0xff").
> >> >
> >> > But for the larger type the small type MAX + 1 fits and does not yield
> >> > zero so it should still work exactly as before, no?  Of course you
> >> > have to compute the + 1 in the larger type.
> >> >
> >> You are right, if compute the "+ 1" in the larger type it is ok, as below
> >> code:
> >> ```
> >>/* Use type in word size may fast.  */
> >> if (TYPE_PRECISION (ntype) < BITS_PER_WORD)
> >>   {
> >> ntype = lang_hooks.types.type_for_size (BITS_PER_WORD, 1);
> >> niter = fold_convert (ntype, niter);
> >>   }
> >> 
> >> tree base = fold_build2 (PLUS_EXPR, ntype, unshare_expr (niter),
> >>  build_int_cst (ntype, 1));
> >> 
> >> 
> >> add_candidate (data, base, build_int_cst (ntype, -1), true, NULL, NULL,
> >> true);
> >> ```
> >> The issue of this is, this code generates more stmt for doloop.xxx:
> >>   _12 = (unsigned int) xx(D);
> >>   _10 = _12 + 4294967295;
> >>   _24 = (long unsigned int) _10;
> >>   doloop.6_8 = _24 + 1;
> >> 
> >> if use previous patch, "+ 1" on original type, then the stmts will looks
> >> like:
> >>   _12 = (unsigned int) xx(D);
> >>   doloop.6_8 = (long unsigned int) _12;
> >> 
> >> This is the reason for checking
> >>wi::ltu_p (niter_desc->max, wi::to_widest (TYPE_MAX_VALUE (ntype)))
> > 
> > But this then only works when there's an upper bound on the number
> > of iterations.  Note you should not use TYPE_MAX_VALUE here but
> > you can instead use
> > 
> >  wi::ltu_p (niter_desc->max, wi::to_widest (wi::max_value
> > (TYPE_PRECISION (ntype), TYPE_SIGN (ntype;
> 
> Ok, Thanks!
> I remember you mentioned that:
> widest_int::from (wi::max_value (TYPE_PRECISION (ntype), TYPE_SIGN (ntype)),
> TYPE_SIGN (ntype))
> would be better than
> wi::to_widest (TYPE_MAX_VALUE (ntype)).
> 
> It seems that:
> "TYPE_MAX_VALUE (ntype)" is "NUMERICAL_TYPE_CHECK
> (NODE)->type_non_common.maxval"
> which do a numerical-check and return the field of maxval.  And then call to
> wi::to_widest
> 
> The other code "widest_int::from (wi::max_value (..,..),..)" calls
> wi::max_value
> and widest_int::from.
> 
> I'm wondering if wi::to_widest (TYPE_MAX_VALUE (ntype)) is cheaper?

TYPE_MAX_VALUE can be "suprising", it does not necessarily match the
underlying modes precision.  At some point we've tried to eliminate
most of its uses, not sure what the situation/position is right now.

> > I think the -1 above comes from number of latch iterations vs. header
> > entries - it's a common source for this kind of issues.  range analysis
> > might be able to prove that we can still merge the two adds even with
> > the intermediate extension.
> Yes, as you mentioned here, it relates to number of latch iterations
> For loop looks like : while (l < n) or for (i = 0; i < n; i++)
> This kind of loop, the niter is used to be 'n - 1' after transformed
> into 'do-while' form.
> I would see how to merge these two adds safely at this point
> when generating doloop iv. (maybe range info, thanks!
>
> > 
> > Is this pre-loop extra add really offsetting the in-loop doloop
> > improvements?
> I'm not catching this question too much, sorry.  I guess your concern
> is if the "+1" is an offset: it may not, "+1" may be just that doloop.xx
> is decreasing niter until 0 (all number >0).
> If misunderstand,  thanks for point out.

I'm questioning the argument that not being able to eliminate the +1-1
pair effect

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Martin Liška

On 7/12/21 3:39 PM, Eli Zaretskii wrote:

Cc: g...@gcc.gnu.org, gcc-patches@gcc.gnu.org, jos...@codesourcery.com
From: Martin Liška 
Date: Mon, 12 Jul 2021 15:25:47 +0200

Let's make it a separate sub-thread where we can discuss motivation why
do I want moving to Sphinx format.

Thanks for starting this discussion.

Benefits:
1) modern looking HTML output (before: [1], after: [2]):
 a) syntax highlighting for examples (code, shell commands, etc.)
 b) precise anchors, the current Texinfo anchors are not displayed (start 
with first line of an option)
 c) one can easily copy a link to an anchor (displayed as ¶)
 d) internal links are working, e.g. one can easily jump from listing of 
options
 e) left menu navigation provides better orientation in the manual
 f) Sphinx provides internal search capability: [3]
2) internal links are also provided in PDF version of the manual

How is this different from Texinfo?

Texinfo does not emit them. See e.g. links in option listing (-O2, -Os, ...).

3) some existing GCC manuals are already written in Sphinx (GNAT manuals and 
libgccjit)
4) support for various output formats, some people are interested in ePUB format

Texinfo likewise supports many output formats.  Someone presented a
very simple package to produce epub format from it.

Good to know.

5) Sphinx is using RST which is quite minimal semantic markup language

Is it more minimal than Texinfo?

I would say that's pretty easy to learn, similarly complex as Texinfo.

6) TOC is automatically generated - no need for manual navigation like seen 
here: [5]

That is not needed in Texinfo as well, since long ago.  Nowadays, you
just say

   @node Whatever

and the rest is done automatically, as long as the manual's structure
is a proper tree (which it normally is, I know of only one manual that
is an exception).

All right, then we likely do an extra work right now.

Disadvantages:

1) info pages are currently missing Page description in TOC
2) rich formatting is leading to extra wrapping in info output - beings 
partially addresses in [4]
3) one needs e.g. Emacs support for inline links (rendered as notes)

  4) The need to learn yet another markup language.
 While this is not a problem for simple text, it does require a
 serious study of RST and Sphinx to use the more advanced features.

No, majority of the documentation is pretty simple: basic formatting, links, 
tables and
code examples.

Martin

  5) Lack of macros.
 AFAIK, only simple textual substitution is available, no macros
 with arguments.

Re: [PATCH] libgomp: Include early to avoid link failure with glibc 2.34

2021-07-12 Thread Florian Weimer via Gcc-patches

* Florian Weimer:

> I tested this on csky-linux-gnuabiv2 with the glibc version that failed
> before, and it works.  So I guess your version is fine, too.

Build on powerpc64-linux-gnu and other targets now fails with:

/home/bmg/src/gcc/libgomp/config/linux/affinity.c: In function ‘gomp_init_affini
ty’:
/home/bmg/src/gcc/libgomp/config/linux/affinity.c:53:41: error: ‘ULONG_MAX’ unde
clared (first use in this function)
   53 |   if (!gomp_affinity_init_level (1, ULONG_MAX, true))
  | ^

So affinity.c will need to include .

Thanks,
Florian

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Martin Liška

On 7/12/21 4:16 PM, Eli Zaretskii wrote:

From: Jonathan Wakely 
Date: Mon, 12 Jul 2021 15:05:11 +0100
Cc: Martin Liška ,
"g...@gcc.gnu.org" , gcc-patches 
,
"Joseph S. Myers" 

To be clear, I give links to users frequently (several times a week,
every week, for decades) and prefer to give them a link to specific
options. Obviously I link to the online HTML docs rather than telling
them an 'info' command to run, because most people don't use info
pages or know how to navigate them. That means I can't provide decent
links, because the actual option name I'm trying to link to is always
off the top of the page. This is simply unacceptable IMHO. Texinfo
must go.

"Texinfo must go" is one possible conclusion from your description.
But it isn't the only one.  An alternative is "the Texinfo source of
the GCC manual must be improved to fix this problem."  And yes, this
problem does have a solution in Texinfo.

No, the alternative is more powerful output given by Texinfo, in particular
more modern HTML pages.

Martin

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Martin Liška

On 7/12/21 4:12 PM, Eli Zaretskii wrote:

From: Jonathan Wakely 
Date: Mon, 12 Jul 2021 14:53:44 +0100
Cc: Martin Liška ,
"g...@gcc.gnu.org" , gcc-patches 
,
"Joseph S. Myers" 

For me, these items are enough justification to switch away from
texinfo, which produces crap HTML pages with crap anchors.

If we want to have a serious discussion with useful conclusions, I
suggest to avoid "loaded" terminology.

I get it that you dislike the HTML produced by Texinfo, but without
some examples of such bad HTML it is impossible to know what exactly
do you dislike and why.

Please follow my 1) from Benefits and *read* bullet points a) to f). That will
give you an answer.

You can't find out the anchors without inspecting (and searching)
the HTML source. That's utterly stupid.

I don't think I follow: find out the anchors with which means and for
what purposes?

Benefits, 1c).

And even after you do that, the anchor
is at the wrong place:
https://gcc.gnu.org/onlinedocs/gcc/Overall-Options.html#index-c

IME, the anchor is where you put it.  If you show me the source of
that HTMl, maybe we can have a more useful discussion of the issue.

Problem is that Texinfo emits poor HTML where # link points to a wrong place.
Open the given page, view source and search for .

As somebody who spends a lot of time helping users on the mailing
list, IRC, stackoverflow, and elsewhere, this "feature" of the texinfo
HTML has angered me for many years.

As somebody who spends a lot of time helping users on every possible
forum, and as someone who has wrote a lot of Texinfo, I don't
understand what angers you.  Please elaborate.

You can't point to an option documentation.

Yes, some people like texinfo, but some people also dislike it and
there are serious usability problems with the output. I support
replacing texinfo with anything that isn't texinfo.

"Anything"?  Even plain text?  I hope not.

See, such "arguments" don't help to have a useful discussion.

  4) The need to learn yet another markup language.
 While this is not a problem for simple text, it does require a
 serious study of RST and Sphinx to use the more advanced features.

This is a problem with texinfo too.

Not for someone who already knows Texinfo.  We are talking about
switching away of it, so I'm thinking about people who contributed
patches for the manual in the past.  They already know Texinfo, at
least to some extent, and some of them know it very well.

Yes, people will have to learn a new syntax. Similarly to transition of SVN,
people also had to learn with a more modern tool.

  5) Lack of macros.
 AFAIK, only simple textual substitution is available, no macros
 with arguments.

Is this a problem for GCC docs though?

I don't know.  It could be, even if it isn't now.

Then it's not an argument, sorry.

Martin

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Eli Zaretskii via Gcc-patches

> From: Jonathan Wakely 
> Date: Mon, 12 Jul 2021 15:05:11 +0100
> Cc: Martin Liška , 
>   "g...@gcc.gnu.org" , gcc-patches 
> , 
>   "Joseph S. Myers" 
> 
> To be clear, I give links to users frequently (several times a week,
> every week, for decades) and prefer to give them a link to specific
> options. Obviously I link to the online HTML docs rather than telling
> them an 'info' command to run, because most people don't use info
> pages or know how to navigate them. That means I can't provide decent
> links, because the actual option name I'm trying to link to is always
> off the top of the page. This is simply unacceptable IMHO. Texinfo
> must go.

"Texinfo must go" is one possible conclusion from your description.
But it isn't the only one.  An alternative is "the Texinfo source of
the GCC manual must be improved to fix this problem."  And yes, this
problem does have a solution in Texinfo.

Re: [PATCH] [wwwdocs] Update description of GM2 and document branch

2021-07-12 Thread Gaius Mulley via Gcc-patches

Gerald Pfeifer  writes:

> I realize this predates your patch (which merely changes version numbers),
> but a reference to back ends could be misunderstood. I assume GNU Modula-2
> doesn't just use the back ends (x86, aarch64,...), but also the middle-end
> and tree optimizers etc.?
>
> What do you think about just saying "with GCC 10 and GCC 11".

Hi Gerald,

yes indeed this sounds more accurate.

>>  Work is in progress to move the front end to
>> +the GCC trunk.  The front end is mostly written in Modula-2, but
>> +includes a bootstrap procedure using mc.
>
> On my system mc refers to Midnight Commander :-), whereas I guess mc
> here is about "Modula Compiler"?  Can you rephrase this for the sake
> of those not so closely involved?

ah yes will do!

> Usually I'd just say "subject", which is a header in our mail systems;
> the term "subject line" isn't widely used.

feel free to overrule and use "subject".  I copied the text from other
branch descriptions :-) (there are 38 uses).  I guess there should be
consistency on the web page - perhaps they could all be changed though -
what do you think?

> Thanks (and okay considering the above),
> Gerald

thanks for the suggestions and maintaining the pages.  Below are the
proposed updated patches

regards,
Gaius



possible ChangeLog/commit entry:

htdocs/frontends.html: Update the description of GNU Modula-2.
htdocs/git.html: Document the new devel/modula-2 branch.

updated patches:

diff --git a/htdocs/frontends.html b/htdocs/frontends.html
index bec33b7b..7c8d84bc 100644
--- a/htdocs/frontends.html
+++ b/htdocs/frontends.html
@@ -42,10 +42,10 @@ has a back end that generates assembler directly, using the 
GCC back end.

 http://www.nongnu.org/gm2/";>GNU Modula-2 implements
 the PIM2, PIM3, PIM4 and ISO dialects of the language.  The compiler
-is fully operational with the GCC 4.1.2 back end (on GNU/Linux x86
-systems).  Work is in progress to move the front end to the GCC trunk.
-The front end is mostly written in Modula-2, but includes a bootstrap
-procedure via a heavily modified version of p2c.
+is fully operational with the GCC 10 and GCC 11 (on
+GNU/Linux x86 systems).  Work is in progress to move the front end to
+the GCC trunk.  The front end is mostly written in Modula-2 and it
+includes a bootstrap tool which translates Modula-2 into C/C++.

 Modula-3 (for links see http://www.modula3.org/";>www.modula3.org); SRC M3 is based on an old
diff --git a/htdocs/git.html b/htdocs/git.html
index 2bbfc334..c112980b 100644
--- a/htdocs/git.html
+++ b/htdocs/git.html
@@ -471,6 +471,17 @@ in Git.
   Further information can be found on the
   https://github.com/Intrepid/GUPC";>GNU UPC page.

+  modula-2
+  This branch is for the
+http://nongnu.org/gm2/homepage.html";>GNU Modula-2
+front end to GCC prior to its integration with the mainline.  The
+branch will be regularly rebased against the mainline.  It is
+maintained by
+mailto:gaius.mul...@southwales.ac.uk";>Gaius Mulley.
+Patches should be
+prefixed with [modula-2] in the subject line.
+  
+
   pph
   This branch implements https://gcc.gnu.org/wiki/pph";> Pre-Parsed
   Headers for C++.  It is maintained by

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Eli Zaretskii via Gcc-patches

> From: Jonathan Wakely 
> Date: Mon, 12 Jul 2021 14:53:44 +0100
> Cc: Martin Liška , 
>   "g...@gcc.gnu.org" , gcc-patches 
> , 
>   "Joseph S. Myers" 
> 
> For me, these items are enough justification to switch away from
> texinfo, which produces crap HTML pages with crap anchors.

If we want to have a serious discussion with useful conclusions, I
suggest to avoid "loaded" terminology.

I get it that you dislike the HTML produced by Texinfo, but without
some examples of such bad HTML it is impossible to know what exactly
do you dislike and why.

> You can't find out the anchors without inspecting (and searching)
> the HTML source. That's utterly stupid.

I don't think I follow: find out the anchors with which means and for
what purposes?

> And even after you do that, the anchor
> is at the wrong place:
> https://gcc.gnu.org/onlinedocs/gcc/Overall-Options.html#index-c

IME, the anchor is where you put it.  If you show me the source of
that HTMl, maybe we can have a more useful discussion of the issue.

> As somebody who spends a lot of time helping users on the mailing
> list, IRC, stackoverflow, and elsewhere, this "feature" of the texinfo
> HTML has angered me for many years.

As somebody who spends a lot of time helping users on every possible
forum, and as someone who has wrote a lot of Texinfo, I don't
understand what angers you.  Please elaborate.

> Yes, some people like texinfo, but some people also dislike it and
> there are serious usability problems with the output. I support
> replacing texinfo with anything that isn't texinfo.

"Anything"?  Even plain text?  I hope not.

See, such "arguments" don't help to have a useful discussion.

> >  4) The need to learn yet another markup language.
> > While this is not a problem for simple text, it does require a
> > serious study of RST and Sphinx to use the more advanced features.
> 
> This is a problem with texinfo too.

Not for someone who already knows Texinfo.  We are talking about
switching away of it, so I'm thinking about people who contributed
patches for the manual in the past.  They already know Texinfo, at
least to some extent, and some of them know it very well.

> >  5) Lack of macros.
> > AFAIK, only simple textual substitution is available, no macros
> > with arguments.
> 
> Is this a problem for GCC docs though?

I don't know.  It could be, even if it isn't now.

Re: [PATCH] Check type size for doloop iv on BITS_PER_WORD [PR61837]

2021-07-12 Thread guojiufu via Gcc-patches

On 2021-07-12 18:02, Richard Biener wrote:

On Mon, 12 Jul 2021, guojiufu wrote:

On 2021-07-12 16:57, Richard Biener wrote:
> On Mon, 12 Jul 2021, guojiufu wrote:
>
>> On 2021-07-12 14:20, Richard Biener wrote:
>> > On Fri, 9 Jul 2021, Segher Boessenkool wrote:
>> >
>> >> On Fri, Jul 09, 2021 at 08:43:59AM +0200, Richard Biener wrote:
>> >> > I wonder if there's a way to query the target what modes the doloop
>> >> > pattern can handle (not being too familiar with the doloop code).
>> >>
>> >> You can look what modes are allowed for operand 0 of doloop_end,
>> >> perhaps?  Although that is a define_expand, not a define_insn, so it is
>> >> hard to introspect.
>> >>
>> >> > Why do you need to do any checks besides the new type being able to
>> >> > represent all IV values?  The original doloop IV will never wrap
>> >> > (OTOH if niter is U*_MAX then we compute niter + 1 which will become
>> >> > zero ... I suppose the doloop might still do the correct thing here
>> >> > but it also still will with a IV with larger type).
>>
>> The issue comes from U*_MAX (original short MAX), as you said: on which
>> niter + 1 becomes zero.  And because the step for doloop is -1; then, on
>> larger type 'zero - 1' will be a very large number on larger type
>> (e.g. 0xff...ff); but on the original short type 'zero - 1' is a small
>> value
>> (e.g. "0xff").
>
> But for the larger type the small type MAX + 1 fits and does not yield
> zero so it should still work exactly as before, no?  Of course you
> have to compute the + 1 in the larger type.
>
You are right, if compute the "+ 1" in the larger type it is ok, as 
below

code:
```
   /* Use type in word size may fast.  */
if (TYPE_PRECISION (ntype) < BITS_PER_WORD)
  {
ntype = lang_hooks.types.type_for_size (BITS_PER_WORD, 1);
niter = fold_convert (ntype, niter);
  }

tree base = fold_build2 (PLUS_EXPR, ntype, unshare_expr (niter),
 build_int_cst (ntype, 1));

add_candidate (data, base, build_int_cst (ntype, -1), true, NULL, 
NULL,

true);
```
The issue of this is, this code generates more stmt for doloop.xxx:
  _12 = (unsigned int) xx(D);
  _10 = _12 + 4294967295;
  _24 = (long unsigned int) _10;
  doloop.6_8 = _24 + 1;

if use previous patch, "+ 1" on original type, then the stmts will 
looks like:

  _12 = (unsigned int) xx(D);
  doloop.6_8 = (long unsigned int) _12;

This is the reason for checking
   wi::ltu_p (niter_desc->max, wi::to_widest (TYPE_MAX_VALUE (ntype)))

But this then only works when there's an upper bound on the number
of iterations.  Note you should not use TYPE_MAX_VALUE here but
you can instead use

 wi::ltu_p (niter_desc->max, wi::to_widest (wi::max_value
(TYPE_PRECISION (ntype), TYPE_SIGN (ntype;

Ok, Thanks!
I remember you mentioned that:
widest_int::from (wi::max_value (TYPE_PRECISION (ntype), TYPE_SIGN 
(ntype)), TYPE_SIGN (ntype))

would be better than
wi::to_widest (TYPE_MAX_VALUE (ntype)).

It seems that:
"TYPE_MAX_VALUE (ntype)" is "NUMERICAL_TYPE_CHECK 
(NODE)->type_non_common.maxval"
which do a numerical-check and return the field of maxval.  And then 
call to

wi::to_widest

The other code "widest_int::from (wi::max_value (..,..),..)" calls 
wi::max_value

and widest_int::from.

I'm wondering if wi::to_widest (TYPE_MAX_VALUE (ntype)) is cheaper?

I think the -1 above comes from number of latch iterations vs. header
entries - it's a common source for this kind of issues.  range analysis
might be able to prove that we can still merge the two adds even with
the intermediate extension.

Yes, as you mentioned here, it relates to number of latch iterations
For loop looks like : while (l < n) or for (i = 0; i < n; i++)
This kind of loop, the niter is used to be 'n - 1' after transformed
into 'do-while' form.
I would see how to merge these two adds safely at this point
when generating doloop iv. (maybe range info, thanks!

Is this pre-loop extra add really offsetting the in-loop doloop
improvements?

I'm not catching this question too much, sorry.  I guess your concern
is if the "+1" is an offset: it may not, "+1" may be just that doloop.xx
is decreasing niter until 0 (all number >0).
If misunderstand,  thanks for point out.

>> >>
>> >> doloop_valid_p guarantees it is simple and doesn't wrap.
>> >>
>> >> > I'd have expected sth like
>> >> >
>> >> >ntype = lang_hooks.types.type_for_mode (word_mode, TYPE_UNSIGNED
>> >> > (ntype));
>> >> >
>> >> > thus the decision made using a mode - which is also why I wonder
>> >> > if there's a way to query the target for this.  As you say,
>> >> > it _may_ be fast, so better check (somehow).
>>
>>
>> I was also thinking of using hooks like type_for_size/type_for_mode.
>> /* Use type in word size may fast.  */
>> if (TYPE_PRECISION (ntype) < BITS_PER_WORD
>> && Wi::ltu_p (niter_desc->max, wi::to_widest (TYPE_MAX_VALUE
>> (ntype
>>   {
>> ntype = lang_hooks.types.type_for_size (BITS_PER_WORD, 1);

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Jonathan Wakely via Gcc-patches

On Mon, 12 Jul 2021 at 14:53, Jonathan Wakely wrote:
> For me, these items are enough justification to switch away from
> texinfo, which produces crap HTML pages with crap anchors. You can't
> find out the anchors without inspecting (and searching) the HTML
> source. That's utterly stupid. And even after you do that, the anchor
> is at the wrong place:
> https://gcc.gnu.org/onlinedocs/gcc/Overall-Options.html#index-c
> As somebody who spends a lot of time helping users on the mailing
> list, IRC, stackoverflow, and elsewhere, this "feature" of the texinfo
> HTML has angered me for many years.

To be clear, I give links to users frequently (several times a week,
every week, for decades) and prefer to give them a link to specific
options. Obviously I link to the online HTML docs rather than telling
them an 'info' command to run, because most people don't use info
pages or know how to navigate them. That means I can't provide decent
links, because the actual option name I'm trying to link to is always
off the top of the page. This is simply unacceptable IMHO. Texinfo
must go.

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Jonathan Wakely via Gcc-patches

On Mon, 12 Jul 2021 at 14:41, Eli Zaretskii via Gcc  wrote:
>
> > Cc: g...@gcc.gnu.org, gcc-patches@gcc.gnu.org, jos...@codesourcery.com
> > From: Martin Liška 
> > Date: Mon, 12 Jul 2021 15:25:47 +0200
> >
> > Let's make it a separate sub-thread where we can discuss motivation why
> > do I want moving to Sphinx format.
>
> Thanks for starting this discussion.
>
> > Benefits:
> > 1) modern looking HTML output (before: [1], after: [2]):
> > a) syntax highlighting for examples (code, shell commands, etc.)
> > b) precise anchors, the current Texinfo anchors are not displayed 
> > (start with first line of an option)
> > c) one can easily copy a link to an anchor (displayed as ¶)
> > d) internal links are working, e.g. one can easily jump from listing of 
> > options

For me, these items are enough justification to switch away from
texinfo, which produces crap HTML pages with crap anchors. You can't
find out the anchors without inspecting (and searching) the HTML
source. That's utterly stupid. And even after you do that, the anchor
is at the wrong place:
https://gcc.gnu.org/onlinedocs/gcc/Overall-Options.html#index-c
As somebody who spends a lot of time helping users on the mailing
list, IRC, stackoverflow, and elsewhere, this "feature" of the texinfo
HTML has angered me for many years.

Yes, some people like texinfo, but some people also dislike it and
there are serious usability problems with the output. I support
replacing texinfo with anything that isn't texinfo.


> > e) left menu navigation provides better orientation in the manual
> > f) Sphinx provides internal search capability: [3]
> > 2) internal links are also provided in PDF version of the manual
>
> How is this different from Texinfo?
>
> > 3) some existing GCC manuals are already written in Sphinx (GNAT manuals 
> > and libgccjit)
> > 4) support for various output formats, some people are interested in ePUB 
> > format
>
> Texinfo likewise supports many output formats.  Someone presented a
> very simple package to produce epub format from it.
>
> > 5) Sphinx is using RST which is quite minimal semantic markup language
>
> Is it more minimal than Texinfo?
>
> > 6) TOC is automatically generated - no need for manual navigation like seen 
> > here: [5]
>
> That is not needed in Texinfo as well, since long ago.  Nowadays, you
> just say
>
>   @node Whatever
>
> and the rest is done automatically, as long as the manual's structure
> is a proper tree (which it normally is, I know of only one manual that
> is an exception).
>
> > Disadvantages:
> >
> > 1) info pages are currently missing Page description in TOC
> > 2) rich formatting is leading to extra wrapping in info output - beings 
> > partially addresses in [4]
> > 3) one needs e.g. Emacs support for inline links (rendered as notes)
>
>  4) The need to learn yet another markup language.
> While this is not a problem for simple text, it does require a
> serious study of RST and Sphinx to use the more advanced features.

This is a problem with texinfo too.

>
>  5) Lack of macros.
> AFAIK, only simple textual substitution is available, no macros
> with arguments.

Is this a problem for GCC docs though?

Re: [PATCH] libgomp: Include early to avoid link failure with glibc 2.34

2021-07-12 Thread Florian Weimer via Gcc-patches

* Jakub Jelinek:

> On Mon, Jul 12, 2021 at 10:26:47AM +0200, Florian Weimer via Gcc-patches 
> wrote:
>>  is included indirectly in the #pragma GCC visibility hidden
>> block.  With glibc 2.34,  needs a declaration of the sysconf
>> function, and including it under hidden visibility turns other calls
>> to sysconf into hidden references, leading to a linker failure.
>> 
>> libgomp/ChangeLog:
>> 
>>  * libgomp.h: Include .
>
> If this is because of the config/linux/sem.h #include ,
> I'd prefer not to include that header instead, we rely on being compiled
> by GCC anyway (and clang/icc support __INT_MAX__ anyway).
>
> Or e.g. config/posix/sem.h uses
> #ifdef HAVE_ATTRIBUTE_VISIBILITY
> # pragma GCC visibility push(default)
> #endif
>
> #include 
>
> #ifdef HAVE_ATTRIBUTE_VISIBILITY
> # pragma GCC visibility pop
> #endif
>
> 2021-07-12  Jakub Jelinek  
>   Florian Weimer  
>
>   * config/linux/sem.h: Don't include limits.h.
>   (SEM_WAIT): Define to -__INT_MAX__ - 1 instead of INT_MIN.
>
> --- libgomp/config/linux/sem.h.jj 2021-01-18 07:18:42.360339646 +0100
> +++ libgomp/config/linux/sem.h2021-07-12 15:18:10.121178404 +0200
> @@ -33,10 +33,8 @@
>  #ifndef GOMP_SEM_H
>  #define GOMP_SEM_H 1
>  
> -#include  /* For INT_MIN */
> -
>  typedef int gomp_sem_t;
> -#define SEM_WAIT INT_MIN
> +#define SEM_WAIT (-__INT_MAX__ - 1)
>  #define SEM_INC 1
>  
>  extern void gomp_sem_wait_slow (gomp_sem_t *, int);

I tested this on csky-linux-gnuabiv2 with the glibc version that failed
before, and it works.  So I guess your version is fine, too.

Thanks,
Florian

Re: Benefits of using Sphinx documentation format

2021-07-12 Thread Eli Zaretskii via Gcc-patches

> Cc: g...@gcc.gnu.org, gcc-patches@gcc.gnu.org, jos...@codesourcery.com
> From: Martin Liška 
> Date: Mon, 12 Jul 2021 15:25:47 +0200
> 
> Let's make it a separate sub-thread where we can discuss motivation why
> do I want moving to Sphinx format.

Thanks for starting this discussion.

> Benefits:
> 1) modern looking HTML output (before: [1], after: [2]):
> a) syntax highlighting for examples (code, shell commands, etc.)
> b) precise anchors, the current Texinfo anchors are not displayed (start 
> with first line of an option)
> c) one can easily copy a link to an anchor (displayed as ¶)
> d) internal links are working, e.g. one can easily jump from listing of 
> options
> e) left menu navigation provides better orientation in the manual
> f) Sphinx provides internal search capability: [3]
> 2) internal links are also provided in PDF version of the manual

How is this different from Texinfo?

> 3) some existing GCC manuals are already written in Sphinx (GNAT manuals and 
> libgccjit)
> 4) support for various output formats, some people are interested in ePUB 
> format

Texinfo likewise supports many output formats.  Someone presented a
very simple package to produce epub format from it.

> 5) Sphinx is using RST which is quite minimal semantic markup language

Is it more minimal than Texinfo?

> 6) TOC is automatically generated - no need for manual navigation like seen 
> here: [5]

That is not needed in Texinfo as well, since long ago.  Nowadays, you
just say

  @node Whatever

and the rest is done automatically, as long as the manual's structure
is a proper tree (which it normally is, I know of only one manual that
is an exception).

> Disadvantages:
> 
> 1) info pages are currently missing Page description in TOC
> 2) rich formatting is leading to extra wrapping in info output - beings 
> partially addresses in [4]
> 3) one needs e.g. Emacs support for inline links (rendered as notes)

 4) The need to learn yet another markup language.
While this is not a problem for simple text, it does require a
serious study of RST and Sphinx to use the more advanced features.

 5) Lack of macros.
AFAIK, only simple textual substitution is available, no macros
with arguments.

[PATCH] produce simple DOT graphs from SLP trees

2021-07-12 Thread Richard Biener

This adds a dot_slp_tree debug function producing a simple DOT
graph from a starting node down the graph.  There's no fancy
direct invocation of dot but the output is directed to a specified
file.  It re-uses vect_print_slp_tree, naming nodes as their
address.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

2021-07-12  Richard Biener  

* dump-context.h (debug_dump_context::debug_dump_context):
Add FILE * parameter defaulted to stderr.
* dumpfile.c (debug_dump_context::debug_dump_context): Adjust.
* tree-vect-slp.c (dot_slp_tree): New functions.
---
 gcc/dump-context.h  |  2 +-
 gcc/dumpfile.c  |  4 ++--
 gcc/tree-vect-slp.c | 38 ++
 3 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/gcc/dump-context.h b/gcc/dump-context.h
index e8ed3743b7b..1a6bf5eb513 100644
--- a/gcc/dump-context.h
+++ b/gcc/dump-context.h
@@ -204,7 +204,7 @@ private:
 class debug_dump_context
 {
  public:
-  debug_dump_context ();
+  debug_dump_context (FILE *f = stderr);
   ~debug_dump_context ();
 
  private:
diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c
index 2457df2df0e..8169daf7f59 100644
--- a/gcc/dumpfile.c
+++ b/gcc/dumpfile.c
@@ -2098,14 +2098,14 @@ enable_rtl_dump_file (void)
 /* debug_dump_context's ctor.  Temporarily override the dump_context
(to forcibly enable output to stderr).  */
 
-debug_dump_context::debug_dump_context ()
+debug_dump_context::debug_dump_context (FILE *f)
 : m_context (),
   m_saved (&dump_context::get ()),
   m_saved_flags (dump_flags),
   m_saved_pflags (pflags),
   m_saved_file (dump_file)
 {
-  set_dump_file (stderr);
+  set_dump_file (f);
   dump_context::s_current = &m_context;
   pflags = dump_flags = MSG_ALL_KINDS | MSG_ALL_PRIORITIES;
   dump_context::get ().refresh_dumps_are_enabled ();
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index cd002b3fb7c..86fa3c1b349 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2552,6 +2552,44 @@ debug (slp_tree node)
   node);
 }
 
+/* Recursive helper for the dot producer below.  */
+
+static void
+dot_slp_tree (FILE *f, slp_tree node, hash_set &visited)
+{
+  if (visited.add (node))
+return;
+
+  fprintf (f, "\"%p\" [label=\"", (void *)node);
+  vect_print_slp_tree (MSG_NOTE,
+  dump_location_t::from_location_t (UNKNOWN_LOCATION),
+  node);
+  fprintf (f, "\"];\n");
+
+
+  for (slp_tree child : SLP_TREE_CHILDREN (node))
+fprintf (f, "\"%p\" -> \"%p\";", (void *)node, (void *)child);
+
+  for (slp_tree child : SLP_TREE_CHILDREN (node))
+dot_slp_tree (f, child, visited);
+}
+
+DEBUG_FUNCTION void
+dot_slp_tree (const char *fname, slp_tree node)
+{
+  FILE *f = fopen (fname, "w");
+  fprintf (f, "digraph {\n");
+  fflush (f);
+{
+  debug_dump_context ctx (f);
+  hash_set visited;
+  dot_slp_tree (f, node, visited);
+}
+  fflush (f);
+  fprintf (f, "}\n");
+  fclose (f);
+}
+
 /* Dump a slp tree NODE using flags specified in DUMP_KIND.  */
 
 static void
-- 
2.26.2

Re: [PATCH] libgomp: Include early to avoid link failure with glibc 2.34

2021-07-12 Thread Jakub Jelinek via Gcc-patches

On Mon, Jul 12, 2021 at 10:26:47AM +0200, Florian Weimer via Gcc-patches wrote:
>  is included indirectly in the #pragma GCC visibility hidden
> block.  With glibc 2.34,  needs a declaration of the sysconf
> function, and including it under hidden visibility turns other calls
> to sysconf into hidden references, leading to a linker failure.
> 
> libgomp/ChangeLog:
> 
>   * libgomp.h: Include .

If this is because of the config/linux/sem.h #include ,
I'd prefer not to include that header instead, we rely on being compiled
by GCC anyway (and clang/icc support __INT_MAX__ anyway).

Or e.g. config/posix/sem.h uses
#ifdef HAVE_ATTRIBUTE_VISIBILITY
# pragma GCC visibility push(default)
#endif

#include 

#ifdef HAVE_ATTRIBUTE_VISIBILITY
# pragma GCC visibility pop
#endif

2021-07-12  Jakub Jelinek  
Florian Weimer  

* config/linux/sem.h: Don't include limits.h.
(SEM_WAIT): Define to -__INT_MAX__ - 1 instead of INT_MIN.

--- libgomp/config/linux/sem.h.jj   2021-01-18 07:18:42.360339646 +0100
+++ libgomp/config/linux/sem.h  2021-07-12 15:18:10.121178404 +0200
@@ -33,10 +33,8 @@
 #ifndef GOMP_SEM_H
 #define GOMP_SEM_H 1
 
-#include  /* For INT_MIN */
-
 typedef int gomp_sem_t;
-#define SEM_WAIT INT_MIN
+#define SEM_WAIT (-__INT_MAX__ - 1)
 #define SEM_INC 1
 
 extern void gomp_sem_wait_slow (gomp_sem_t *, int);


Jakub

Benefits of using Sphinx documentation format

2021-07-12 Thread Martin Liška


Hello.

Let's make it a separate sub-thread where we can discuss motivation why
do I want moving to Sphinx format.

Benefits:
1) modern looking HTML output (before: [1], after: [2]):
   a) syntax highlighting for examples (code, shell commands, etc.)
   b) precise anchors, the current Texinfo anchors are not displayed (start 
with first line of an option)
   c) one can easily copy a link to an anchor (displayed as ¶)
   d) internal links are working, e.g. one can easily jump from listing of 
options
   e) left menu navigation provides better orientation in the manual
   f) Sphinx provides internal search capability: [3]
2) internal links are also provided in PDF version of the manual
3) some existing GCC manuals are already written in Sphinx (GNAT manuals and 
libgccjit)
4) support for various output formats, some people are interested in ePUB format
5) Sphinx is using RST which is quite minimal semantic markup language
6) TOC is automatically generated - no need for manual navigation like seen 
here: [5]

Disadvantages:

1) info pages are currently missing Page description in TOC
2) rich formatting is leading to extra wrapping in info output - beings 
partially addresses in [4]
3) one needs e.g. Emacs support for inline links (rendered as notes)

I'm willing to address issue 1) in next weeks and I tend to skip emission of 
links as mentioned in 3).
Generally speaking, I'm aware that some people still use Info, but I think we 
should more focus
on more modern documentation formats. That's HTML (and partially PDF).

Martin

[1] 
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fstrict-aliasing
[2] 
https://splichal.eu/gccsphinx-final/html/gcc/gcc-command-options/options-that-control-optimization.html#cmdoption-fstrict-aliasing
[3] 
https://splichal.eu/gccsphinx-final/html/gcc/search.html?q=-fipa-icf&check_keywords=yes&area=default#
[4] https://github.com/sphinx-doc/sphinx/pull/9391
[5] @comment node-name, next,  previous, up
@nodeInstalling GCC, Binaries, , Top

Re: [PATCH] libgomp: Include early to avoid link failure with glibc 2.34

2021-07-12 Thread Florian Weimer via Gcc-patches

* Florian Weimer:

>  is included indirectly in the #pragma GCC visibility hidden
> block.  With glibc 2.34,  needs a declaration of the sysconf
> function, and including it under hidden visibility turns other calls
> to sysconf into hidden references, leading to a linker failure.
>
> libgomp/ChangeLog:
>
>   * libgomp.h: Include .
>
> ---
>  libgomp/libgomp.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
> index 8d25dc8e2a8..1fe209429d1 100644
> --- a/libgomp/libgomp.h
> +++ b/libgomp/libgomp.h
> @@ -46,6 +46,7 @@
>  #include "libgomp-plugin.h"
>  #include "gomp-constants.h"
>  
> +#include 
>  #ifdef HAVE_PTHREAD_H
>  #include 
>  #endif

I think this is a real libgomp bug, but if this glibc patch is accepted,
at least libgomp will build again:

  Reduce  pollution due to dynamic PTHREAD_STACK_MIN
  

So while I still think libgomp should be fixed, it won't need
backporting to release branches (assuming the glibc workaround goes in,
of course).

Thanks,
Florian

[Ada] Implement support for unconstrained array types with FLB

2021-07-12 Thread Pierre-Marie de Rodat

The fixed lower bound also makes it possible to simplify the formula of
the upper bound used for unconstrained array types.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* gcc-interface/decl.c (gnat_to_gnu_entity) : Use a
fixed lower bound if the index subtype is marked so, as well as a
more efficient formula for the upper bound if the array cannot be
superflat.
(flb_cannot_be_superflat): New predicate.
(cannot_be_superflat): Rename into...
(range_cannot_be_superfla): ...this.  Minor tweak.diff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c
--- a/gcc/ada/gcc-interface/decl.c
+++ b/gcc/ada/gcc-interface/decl.c
@@ -217,7 +217,8 @@ static void set_reverse_storage_order_on_array_type (tree);
 static bool same_discriminant_p (Entity_Id, Entity_Id);
 static bool array_type_has_nonaliased_component (tree, Entity_Id);
 static bool compile_time_known_address_p (Node_Id);
-static bool cannot_be_superflat (Node_Id);
+static bool flb_cannot_be_superflat (Node_Id);
+static bool range_cannot_be_superflat (Node_Id);
 static bool constructor_address_p (tree);
 static bool allocatable_size_p (tree, bool);
 static bool initial_value_needs_conversion (tree, tree);
@@ -2238,13 +2239,15 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 	 index += (convention_fortran_p ? - 1 : 1),
 	 gnat_index = Next_Index (gnat_index))
 	  {
-	char field_name[16];
+	const bool is_flb
+	  = Is_Fixed_Lower_Bound_Index_Subtype (Etype (gnat_index));
 	tree gnu_index_type = get_unpadded_type (Etype (gnat_index));
 	tree gnu_orig_min = TYPE_MIN_VALUE (gnu_index_type);
 	tree gnu_orig_max = TYPE_MAX_VALUE (gnu_index_type);
 	tree gnu_index_base_type = get_base_type (gnu_index_type);
 	tree gnu_lb_field, gnu_hb_field;
 	tree gnu_min, gnu_max, gnu_high;
+	char field_name[16];
 
 	/* Update the maximum size of the array in elements.  */
 	if (gnu_max_size)
@@ -2278,25 +2281,38 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 
 	/* We can't use build_component_ref here since the template type
 	   isn't complete yet.  */
-	gnu_orig_min = build3 (COMPONENT_REF, TREE_TYPE (gnu_lb_field),
-   gnu_template_reference, gnu_lb_field,
-   NULL_TREE);
+	if (!is_flb)
+	  {
+		gnu_orig_min = build3 (COMPONENT_REF, TREE_TYPE (gnu_lb_field),
+   gnu_template_reference, gnu_lb_field,
+   NULL_TREE);
+		TREE_READONLY (gnu_orig_min) = 1;
+	  }
+
 	gnu_orig_max = build3 (COMPONENT_REF, TREE_TYPE (gnu_hb_field),
    gnu_template_reference, gnu_hb_field,
    NULL_TREE);
-	TREE_READONLY (gnu_orig_min) = TREE_READONLY (gnu_orig_max) = 1;
+	TREE_READONLY (gnu_orig_max) = 1;
 
 	gnu_min = convert (sizetype, gnu_orig_min);
 	gnu_max = convert (sizetype, gnu_orig_max);
 
 	/* Compute the size of this dimension.  See the E_Array_Subtype
 	   case below for the rationale.  */
-	gnu_high
-	  = build3 (COND_EXPR, sizetype,
-			build2 (GE_EXPR, boolean_type_node,
-gnu_orig_max, gnu_orig_min),
-			gnu_max,
-			size_binop (MINUS_EXPR, gnu_min, size_one_node));
+	if (is_flb
+		&& Nkind (gnat_index) == N_Subtype_Indication
+	&& flb_cannot_be_superflat (gnat_index))
+	  gnu_high = gnu_max;
+
+	else
+	  gnu_high
+		= build3 (COND_EXPR, sizetype,
+			  build2 (GE_EXPR, boolean_type_node,
+  gnu_orig_max, gnu_orig_min),
+			  gnu_max,
+			  TREE_CODE (gnu_min) == INTEGER_CST
+			  ? int_const_binop (MINUS_EXPR, gnu_min, size_one_node)
+			  : size_binop (MINUS_EXPR, gnu_min, size_one_node));
 
 	/* Make a range type with the new range in the Ada base type.
 	   Then make an index type with the size range in sizetype.  */
@@ -2595,7 +2611,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 		 this.  If we can prove that the array can never be superflat,
 		 we can just use the high bound of the index type.  */
 	  else if ((Nkind (gnat_index) == N_Range
-		&& cannot_be_superflat (gnat_index))
+		&& range_cannot_be_superflat (gnat_index))
 		   /* Bit-Packed Array Impl. Types are never superflat.  */
 		   || (Is_Packed_Array_Impl_Type (gnat_entity)
 			   && Is_Bit_Packed_Array
@@ -6414,33 +6430,81 @@ compile_time_known_address_p (Node_Id gnat_address)
   return Compile_Time_Known_Value (gnat_address);
 }
 
+/* Return true if GNAT_INDIC, a N_Subtype_Indication node for the index of a
+   FLB, cannot yield superflat objects, i.e. if the inequality HB >= LB - 1
+   is true for these objects.  LB and HB are the low and high bounds.  */
+
+static bool
+flb_cannot_be_superflat (Node_Id gnat_indic)
+{
+  const Entity_Id gnat_type = Entity (Subtype_Mark (gnat_indic));
+  const Entity_Id gnat_subtype = Etype (gnat_indic);
+  Node_Id gnat_scalar_range, gnat_lb, gnat_hb;
+  tree gnu_lb, gnu_hb, gnu_lb_minus_one;

[Ada] Clean up Uint fields

2021-07-12 Thread Pierre-Marie de Rodat

We add new field types Valid_Uint, Unat, Upos, Nonzero_Uint,
which have predicates that assert the value is a proper
Uint value (i.e. not No_Uint), and that the value is
appropriate. It is not clear that Nonzero_Uint is needed,
but it is useful in testing; we can always remove it later.

We use the new field types for Alignment (which requires
changes) and a few others (which were easy). We intend to
use these for other fields as well. For example, Esize should
be of type Valid_Uint.

Fields of these new subtypes have no default (unlike Uint fields, which
still default to Uint_0); it is required to set them before calling
the getter. This patch fixes various places where that was not true
(so far, mainly for Alignment).

The "unknown" state of Alignment is now represented by the initial zero
value (instead of Uint_0). Unfortunately, we often set Alignment to some
value, and then set it back to unknown, so we need Init_Alignment to
call Reinit_Field_To_Zero. We intend to change other fields, such as
Esize, in a similar way.

Note that "initial zero value" and Uint_0 are two different things.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* uintp.ads, types.h: New subtypes of Uint: Valid_Uint, Unat,
Upos, Nonzero_Uint with predicates. These correspond to new
field types in Gen_IL.
* gen_il-types.ads (Valid_Uint, Unat, Upos, Nonzero_Uint): New
field types.
* einfo-utils.ads, einfo-utils.adb, fe.h (Known_Alignment,
Init_Alignment): Use the initial zero value to represent
"unknown". This will ensure that if Alignment is called before
Set_Alignment, the compiler will blow up (if assertions are
enabled).
* atree.ads, atree.adb, atree.h, gen_il-gen.adb
(Get_Valid_32_Bit_Field): New generic low-level getter for
subtypes of Uint.
(Copy_Alignment): New procedure to copy Alignment field even
when Unknown.
(Init_Object_Size_Align, Init_Size_Align): Do not bypass the
Init_ procedures.
* exp_pakd.adb, freeze.adb, layout.adb, repinfo.adb,
sem_util.adb: Protect calls to Alignment with Known_Alignment.
Use Copy_Alignment when it might be unknown.
* gen_il-gen-gen_entities.adb (Alignment,
String_Literal_Length): Use type Unat instead of Uint, to ensure
that the field is always Set_ before we get it, and that it is
set to a nonnegative value.
(Enumeration_Pos): Unat.
(Enumeration_Rep): Valid_Uint. Can be negative, but must be
valid before fetching.
(Discriminant_Number): Upos.
(Renaming_Map): Remove.
* gen_il-gen-gen_nodes.adb (Char_Literal_Value, Reason): Unat.
(Intval, Corresponding_Integer_Value): Valid_Uint.
* gen_il-internals.ads: New functions for dealing with special
defaults and new subtypes of Uint.
* scans.ads: Correct comments.
* scn.adb (Post_Scan): Do not set Intval to No_Uint; that is no
longer allowed.
* sem_ch13.adb (Analyze_Enumeration_Representation_Clause): Do
not set Enumeration_Rep to No_Uint; that is no longer allowed.
(Offset_Value): Protect calls to Alignment with Known_Alignment.
* sem_prag.adb (Set_Atomic_VFA): Do not use Uint_0 to mean
"unknown"; call Init_Alignment instead.
* sinfo.ads: Minor comment fix.
* treepr.adb: Deal with printing of new field types.
* einfo.ads, gen_il-fields.ads (Renaming_Map): Remove.
* gcc-interface/decl.c (gnat_to_gnu_entity): Use Known_Alignment
before calling Alignment. This preserve some probably buggy
behavior: if the alignment is not set, it previously defaulted
to Uint_0; we now make that explicit.  Use Copy_Alignment,
because "Set_Alignment (Y, Alignment (X));" no longer works when
the Alignment of X has not yet been set.
* gcc-interface/trans.c (process_freeze_entity): Use
Copy_Alignment.

patch.diff.gz
Description: application/gzip

[Ada] Add DWARF 5 support to System.Dwarf_Line

2021-07-12 Thread Pierre-Marie de Rodat

The encoding of the debugging line information has substantially changed
in DWARF 5, so this adds the support for it alongside the existing code.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/s-dwalin.ads: Adjust a few comments left and right.
(Line_Info_Register): Comment out unused components.
(Line_Info_Header): Add DWARF 5 support.
(Dwarf_Context): Likewise.  Rename "prologue" into "header".
* libgnat/s-dwalin.adb: Alphabetize "with" clauses.
(DWARF constants): Add DWARF 5 support and reorder.
(For_Each_Row): Adjust.
(Initialize_Pass): Likewise.
(Initialize_State_Machine): Likewise and fix typo.
(Open): Add DWARF 5 support.
(Parse_Prologue): Rename into...
(Parse_Header): ...this and add DWARF 5 support.
(Read_And_Execute_Isn): Rename into...
(Read_And_Execute_Insn): ...this and adjust.
(To_File_Name): Change parameter name and add DWARF 5 support.
(Read_Entry_Format_Array): New procedure.
(Skip_Form): Add DWARF 5 support and reorder.
(Seek_Abbrev): Do not count entries and add DWARF 5 support.
(Debug_Info_Lookup): Add DWARF 5 support.
(Symbolic_Address.Set_Result): Likewise.
(Symbolic_Address): Adjust.

patch.diff.gz
Description: application/gzip

1 2 >

1 - 100 of 135 matches

Mail list logo