from:"Gary Oblock"

Re: Question on updating function body on specialized functions

2022-03-08 Thread Gary Oblock via Gcc

Erick my friend,

That's exactly why I'm such a big fan of creating things
anew each time I mess with them.  

Later,

Gary


From: Erick Ochoa 
Sent: Tuesday, March 8, 2022 7:29 AM
To: Martin Jambor 
Cc: gcc@gcc.gnu.org 
Subject: Re: Question on updating function body on specialized functions

Hi Martin!

Thanks for replying. While I was writing my reply to you I was able to
find the answer: there is indeed one tree node that is shared across the
two functions, namely

TREE_OPERAND (MEM_REF, 1).

When I was assigning to

TREE_TYPE ( TREE_OPERAND (MEM_REF, 1) ) in one function, I was modifying
the other. The solution was to create a new tree and assign it directly to
TREE_OPERAND (MEM_REF, 1) in both functions.

Thanks!
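
The sharing problem described above can be reproduced outside GCC. The sketch below is a toy model, not GCC code: `toy_node`, `toy_operand_holder`, `retype_in_place` and `retype_with_fresh_node` are hypothetical names chosen for illustration. It assumes two references hold the same operand node, and shows that mutating the node's type leaks into the other holder while storing a freshly built node does not.

```cpp
#include <memory>
#include <string>

// Toy stand-in for a tree node that carries a type (hypothetical, not GCC's tree).
struct toy_node {
  std::string type;  // plays the role of TREE_TYPE
  long value;        // plays the role of the constant offset
};

// Toy stand-in for a MEM_REF whose operand 1 may be shared between functions.
struct toy_operand_holder {
  std::shared_ptr<toy_node> op1;
};

// Buggy approach: mutates the node in place, so every sharer sees the change.
inline void retype_in_place(toy_operand_holder &ref, const std::string &new_type) {
  ref.op1->type = new_type;
}

// Safer approach: build a new node and store it in this holder only.
inline void retype_with_fresh_node(toy_operand_holder &ref, const std::string &new_type) {
  ref.op1 = std::make_shared<toy_node>(toy_node{new_type, ref.op1->value});
}
```

Calling `retype_with_fresh_node` on one holder leaves every other holder of the old node untouched, which corresponds to the fix of assigning a new tree to the operand rather than changing the type of the existing one.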

Re: What replaces FOR_EACH_LOOP_FN

2022-03-02 Thread Gary Oblock via Gcc

Andrew,

That's super! I was dreading an answer along the lines
of "we don't do that anymore so why would you ever want
to do that?" 

Many thanks,

Gary

From: Andrew Pinski 
Sent: Wednesday, March 2, 2022 2:09 PM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: What replaces FOR_EACH_LOOP_FN

[EXTERNAL EMAIL NOTICE: This email originated from an external sender. Please 
be mindful of safe email handling and proprietary information protection 
practices.]


On Wed, Mar 2, 2022 at 2:05 PM Gary Oblock via Gcc  wrote:
>
> Guys,
>
> I've been working on an optimization for quite a bit of time and
> in an attempt to move it to GCC 12 I found that FOR_EACH_LOOP_FN
> no longer exists. I poked around in the archives and tried a Google
> search but found nothing on it.
>
> It suited my needs and I'd hate to have to rewrite a bunch of stuff.
> What replaces it and how do I use it?

This changed with r12-2605-ge41ba804ba5f5c. The replacement is just simply:

for (auto loop : loops_list (function, 0))

Thanks,
Andrew Pinski
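
To illustrate the idea behind the replacement, here is a self-contained toy model of how a class with `begin()`/`end()` can take over from an iteration macro. It is only a sketch: `toy_loop`, `toy_loops_list` and the `TOY_*` flags are hypothetical and do not reproduce the semantics of GCC's `loops_list` or its flags.

```cpp
#include <vector>

// Toy loop-tree node (hypothetical; not GCC's loop class).
struct toy_loop {
  int num;
  std::vector<toy_loop *> inner;
};

// Hypothetical flags for this model only.
enum toy_flags { TOY_DEFAULT = 0, TOY_FROM_INNERMOST = 1 };

// Collects the loops once in the constructor, then exposes begin()/end()
// so callers can write a range-based for instead of a FOR_EACH macro.
class toy_loops_list {
public:
  toy_loops_list(toy_loop *root, unsigned flags) { collect(root, flags); }
  std::vector<toy_loop *>::const_iterator begin() const { return order_.begin(); }
  std::vector<toy_loop *>::const_iterator end() const { return order_.end(); }

private:
  void collect(toy_loop *l, unsigned flags) {
    if (!(flags & TOY_FROM_INNERMOST)) order_.push_back(l);  // outer loops first
    for (toy_loop *child : l->inner) collect(child, flags);
    if (flags & TOY_FROM_INNERMOST) order_.push_back(l);     // inner loops first
  }
  std::vector<toy_loop *> order_;
};
```

With this model, `for (auto loop : toy_loops_list (&root, TOY_DEFAULT))` visits each loop in turn, which has the same shape as the one-line replacement quoted above.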


>
> Thanks,
>
> Gary
>
>
>
>
> CONFIDENTIALITY NOTICE: This e-mail message, including any attachments, is 
> for the sole use of the intended recipient(s) and contains information that 
> is confidential and proprietary to Ampere Computing or its subsidiaries. It 
> is to be used solely for the purpose of furthering the parties' business 
> relationship. Any unauthorized review, copying, or distribution of this email 
> (or any attachments thereto) is strictly prohibited. If you are not the 
> intended recipient, please contact the sender immediately and permanently 
> delete the original and any copies of this email and any attachments thereto.

What replaces FOR_EACH_LOOP_FN

2022-03-02 Thread Gary Oblock via Gcc

Guys,

I've been working on an optimization for quite a bit of time and
in an attempt to move it to GCC 12 I found that FOR_EACH_LOOP_FN
no longer exists. I poked around in the archives and tried a Google
search but found nothing on it.

It suited my needs and I'd hate to have to rewrite a bunch of stuff.
What replaces it and how do I use it?

Thanks,

Gary





Re: Benchmark recommendations needed

2022-02-22 Thread Gary Oblock via Gcc

Andras,

The whole point of benchmarks is to judge a processor's performance.
That being said, just crippling GCC is not reasonable because
processors must be judged in the appropriate context and that
includes the current state of the art compiler technology. If you have
a new processor I'd benchmark it using the applications you built it
for.

Gary

From: Andras Tantos 
Sent: Monday, February 21, 2022 9:22 PM
To: Gary Oblock ; gcc@gcc.gnu.org 
Subject: Re: Benchmark recommendations needed


That's true, I did notice GCC being rather ... peculiar about
Dhrystone. Is there a way to make it less clever about the benchmark?

Or is there some alteration to the benchmark I can make to not trigger
the special behavior in GCC?

Andras

On Mon, 2022-02-21 at 03:19 +, Gary Oblock via Gcc wrote:
> Trying to use Dhrystone isn't going to be very useful. It has
> many downsides, not the least of which is that gcc's optimizer can
> run rings around it.
>
> Gary
>
> 
> From: Gcc  on
> behalf of gcc-requ...@gcc.gnu.org 
> Sent: Tuesday, February 15, 2022 6:25 AM
> To: gcc@gcc.gnu.org 
> Subject: Re:
>
>
> Send Gcc mailing list submissions to
> gcc@gcc.gnu.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://gcc.gnu.org/mailman/listinfo/gcc
> or, via email, send a message with subject or body 'help' to
> gcc-requ...@gcc.gnu.org
>
> You can reach the person managing the list at
> gcc-ow...@gcc.gnu.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Gcc digest..."

Re: Benchmark recommendations needed

2022-02-20 Thread Gary Oblock via Gcc

Trying to use Dhrystone isn't going to be very useful. It has many
downsides, not the least of which is that gcc's optimizer can run rings
around it.

Gary
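
As a rough illustration of what "running rings around" a benchmark can mean, the sketch below contrasts a kernel whose result depends only on constants with one whose input is read at run time. It is a minimal example under stated assumptions, not Dhrystone itself; `foldable_kernel` and `runtime_kernel` are hypothetical names, and whether a given compiler actually folds the first loop depends on its version and flags.

```cpp
#include <cstdint>

// A benchmark-style kernel whose result depends only on constants; an
// optimizing compiler is free to replace the whole loop with its result.
inline std::uint32_t foldable_kernel() {
  std::uint32_t sum = 0;
  for (std::uint32_t i = 0; i < 1000; ++i) sum += i * 3u;
  return sum;
}

// The same work driven by a value read through a volatile object. The load
// must happen at run time, so the result cannot be a compile-time constant
// (the compiler may still simplify the arithmetic around it).
inline std::uint32_t runtime_kernel(volatile const std::uint32_t *scale) {
  std::uint32_t s = *scale;
  std::uint32_t sum = 0;
  for (std::uint32_t i = 0; i < 1000; ++i) sum += i * s;
  return sum;
}
```

Both functions compute the same number for a scale of 3; the difference is only in how much of the work the compiler is allowed to do ahead of time, which is the kind of effect that makes small synthetic benchmarks hard to interpret.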

From: Gcc  on behalf of 
gcc-requ...@gcc.gnu.org 
Sent: Tuesday, February 15, 2022 6:25 AM
To: gcc@gcc.gnu.org 
Subject: Re:


Re: Issue with a flag that I defined getting set to zero

2022-01-10 Thread Gary Oblock via Gcc

Richard,

That's nice to know, but I added the option itself months ago.
Also, it's on the lto1 command line and the cc1 command line, and it
shows up in COLLECT_GCC_OPTIONS, so I assume
it is universally applied.

Thanks,

Gary

From: Richard Biener 
Sent: Monday, January 10, 2022 12:36 AM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: Issue with a flag that I defined getting set to zero


On Fri, Jan 7, 2022 at 9:12 AM Gary Oblock via Gcc  wrote:
>
> An optimization flag that I recently added is being
> set to zero in push_cfun (which, after a couple of
> levels of calls, invokes cl_optimization_restore to do this).
>
> The flag is defined like this:
>
> finterleaving-index-32-bits
> Common Var(flag_interleaving_index_32_bits) Init(0) Optimization
> Structure reorganization optimization, instance interleaving.
>
> Note, I'm working around this, but I'd really like
> not to have to, so I'm wondering if somebody
> could explain what's happening and what I'd need
> to do instead.

Did you rebuild all of GCC after adding the option?  Note that when you
look at the option from LTO and from within an IPA pass, you
have to use opt_for_fn (..), since the "global" option will not be set
at link time (unless you specify it again at link time); it will only be
present on the functions of the TUs that were compiled with it set globally.
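
The difference between the global option and the per-function option can be pictured with a small self-contained model. This is a sketch only: every `toy_*` name is hypothetical, and the model merely assumes, as the thread describes, that switching the current function restores that function's saved options over the global copy. In GCC itself the per-function query would presumably be spelled along the lines of `opt_for_fn (fndecl, flag_interleaving_index_32_bits)`.

```cpp
// Toy option set holding the one flag from the thread (model only).
struct toy_options {
  int flag_interleaving_index_32_bits = 0;
};

// Plays the role of the "global" options seen at link time.
inline toy_options &toy_global_options() {
  static toy_options opts;
  return opts;
}

// Each function remembers the options of the translation unit it came from.
struct toy_function {
  toy_options saved;
};

// Models switching the current function: the saved per-function options
// overwrite the global copy, which is why a global read can suddenly be 0.
inline void toy_push_cfun(const toy_function &fn) {
  toy_global_options() = fn.saved;
}

// Models a per-function query: read the function's own options instead.
inline int toy_opt_for_fn(const toy_function &fn) {
  return fn.saved.flag_interleaving_index_32_bits;
}
```

In this model a pass that reads the global after `toy_push_cfun` sees whatever the current function was compiled with, while `toy_opt_for_fn` always answers for the function being asked about.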

>
> Thanks,
>
> Gary
>
>

Re: Issue with a flag that I defined getting set to zero

2022-01-07 Thread Gary Oblock via Gcc

Gabriel,

Yes, indeed, thank you.

Note, it is a reminder to those who are receiving proprietary
information, and it is considered a legal obligation on the part of the
company transmitting it, because they must make an effort to
protect their proprietary information.

I'm not a lawyer either but I feel like I'm being forced to
act like one. 

Now, can anybody answer my question?

Sincerely

Gary


From: Gabriel Ravier 
Sent: Friday, January 7, 2022 12:56 AM
To: Martin Liška ; Gary Oblock ; 
gcc@gcc.gnu.org 
Subject: Re: Issue with a flag that I defined getting set to zero



On 1/7/22 09:38, Martin Liška wrote:
> On 1/7/22 09:30, Gary Oblock wrote:
>> Regarding the corporate legal gibberish: it's automatic
>> and not under my control. Also, we're not supposed to
>> use private emails for work...
>
> I respect that. But please respect me that I won't reply to your
> emails any longer. I don't want to follow the conditions in the NOTICE!
>
> Cheers,
> Martin
>
As far as I know the notice has no legal significance at all (which Gary
should probably point out to his management. Really, pretty much the
only thing the disclaimer will do is that _some_ people _might_ read it
and _some_ of those people _might_ adhere to the terms given there,
which is basically meaningless compared to the general annoyance
resulting from disclaimers being at the end of e-mails everywhere). You
can't just magically establish an agreement that results in a duty of
nondisclosure like this without agreement, and just receiving an email
obviously isn't that.

(although ironically, I guess I should add a disclaimer of my own: I
ain't a lawyer and this isn't legal advice)

Re: Issue with a flag that I defined getting set to zero

2022-01-07 Thread Gary Oblock via Gcc

Martin,

Regarding the corporate legal gibberish: it's automatic
and not under my control. Also, we're not supposed to
use private emails for work...

Gary

From: Martin Liška 
Sent: Friday, January 7, 2022 12:20 AM
To: Gary Oblock ; gcc@gcc.gnu.org 
Subject: Re: Issue with a flag that I defined getting set to zero



On 1/7/22 09:10, Gary Oblock via Gcc wrote:
> An optimization flag that I recently added is being
> set to zero in push_cfun (which, after a couple of
> levels of calls, invokes cl_optimization_restore to do this).

Question is: what's the value of the flag in your IPA pass
if you set -finterleaving-index-32-bits? It should not really be zero.

>
> The flag is defined like this:
>
> finterleaving-index-32-bits
> Common Var(flag_interleaving_index_32_bits) Init(0) Optimization
> Structure reorganization optimization, instance interleaving.
>
> Note, I'm working around this, but I'd really like
> not to have to, so I'm wondering if somebody
> could explain what's happening and what I'd need
> to do instead.

You defined the flag well.

>
> Thanks,
>
> Gary
>
>
> CONFIDENTIALITY NOTICE: This e-mail message, including any attachments, is 
> for the sole use of the intended recipient(s) and contains information that 
> is confidential and proprietary to Ampere Computing or its subsidiaries. It 
> is to be used solely for the purpose of furthering the parties' business 
> relationship. Any unauthorized review, copying, or distribution of this email 
> (or any attachments thereto) is strictly prohibited. If you are not the 
> intended recipient, please contact the sender immediately and permanently 
> delete the original and any copies of this email and any attachments thereto.

Can you please remove this ugly notice? It's completely misleading. If not,
I would then recommend creating a private email account.

Martin

Issue with a flag that I defined getting set to zero

2022-01-07 Thread Gary Oblock via Gcc

An optimization flag that I recently added is being
set to zero in push_cfun (which, after a couple of
levels of calls, invokes cl_optimization_restore to do this).

The flag is defined like this:

finterleaving-index-32-bits
Common Var(flag_interleaving_index_32_bits) Init(0) Optimization
Structure reorganization optimization, instance interleaving.

Note, I'm working around this, but I'd really like
not to have to, so I'm wondering if somebody
could explain what's happening and what I'd need
to do instead.

Thanks,

Gary



Re: Why do these two trees print differently

2022-01-04 Thread Gary Oblock via Gcc

Richard,

Well, during my exploration of pretty printing I spent
a bunch of time looking at mem refs, hence the question.

I understand GIMPLE is not an AST. I worked on an
in-house compiler someplace where the only IR was
an AST and it was a really horrible IR, so I'm glad GIMPLE
isn't an AST. But the tree expressions seem to be chunks
of AST (with some other stuff thrown in that could be thought of
as node attributes.) Am I wrong to think of them that way?

Thanks,

Gary



From: Richard Biener 
Sent: Monday, January 3, 2022 11:28 PM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: Why do these two trees print differently



On Mon, Jan 3, 2022 at 9:16 PM Gary Oblock  wrote:
>
> Richard,
>
> I was able to figure it out by looking for "MEM" in
> tree-pretty-print.c. The condition included
> at the end of this email (mostly to provoke a chuckle)
> is necessary for the "p->f" format. If it's not true, then
> the MEM form is emitted.

Yes, there's some loss of information so we don't
"pretty" print the MEM.

>
> What is most interesting from this whole exercise
> is the question of why I am seeing offsets in
> the GIMPLE form. I'm seeing offsets where
> the symbolic form using the field seems to make
> more sense. I'm also seeing accesses with
> offsets that are multiples of the structure size.
> That kind of idiom seems more appropriate at the
> RTL level.

That seems to be an unrelated question?  Note that GIMPLE
is much closer to RTL than you think - GIMPLE is _not_ an AST.
You see offsets whenever the symbolic form (COMPONENT_REF, I suppose)
would not be semantically correct.

Richard.

>
> Thanks,
>
> Gary
>
> TREE_CODE (node) == MEM_REF
>   && integer_zerop (TREE_OPERAND (node, 1))
>   /* Dump the types of INTEGER_CSTs explicitly, for we can't
>      infer them and MEM_ATTR caching will share MEM_REFs
>      with differently-typed op0s.  */
>   && TREE_CODE (TREE_OPERAND (node, 0)) != INTEGER_CST
>   /* Released SSA_NAMES have no TREE_TYPE.  */
>   && TREE_TYPE (TREE_OPERAND (node, 0)) != NULL_TREE
>   /* Same pointer types, but ignoring POINTER_TYPE vs.
>      REFERENCE_TYPE.  */
>   && (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 0)))
>       == TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1))))
>   && (TYPE_MODE (TREE_TYPE (TREE_OPERAND (node, 0)))
>       == TYPE_MODE (TREE_TYPE (TREE_OPERAND (node, 1))))
>   && (TYPE_REF_CAN_ALIAS_ALL (TREE_TYPE (TREE_OPERAND (node, 0)))
>       == TYPE_REF_CAN_ALIAS_ALL (TREE_TYPE (TREE_OPERAND (node, 1))))
>   /* Same value types ignoring qualifiers.  */
>   && (TYPE_MAIN_VARIANT (TREE_TYPE (node))
>       == TYPE_MAIN_VARIANT
>            (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1)))))
>   && (!(flags & TDF_ALIAS)
>       || MR_DEPENDENCE_CLIQUE (node) == 0)
>
> 
> From: Richard Biener 
> Sent: Monday, January 3, 2022 5:49 AM
> To: Gary Oblock 
> Cc: gcc@gcc.gnu.org 
> Subject: Re: Why do these two trees print differently
>
>
>
> On Wed, Dec 15, 2021 at 7:10 AM Gary Oblock via Gcc  wrote:
> >
> > This is one of those things that has always puzzled
> > me, so I thought I'd break down and finally ask.
> >
> > There are two ways a memory reference (tree) prints:
> >
> > MEM[(struct arc_t *)_684].flow
> >
> > and
> >
> > _684->flow
> >
> > Poking under the hood of them, the tree codes and
> > operands are identical so what am I missing?
>
> Try dumping with -gimple, that should show you the difference.
>
> >
> > Thanks,
> >
> > Gary
> >
> >

Re: Why do these two trees print differently

2022-01-03 Thread Gary Oblock via Gcc

Richard,

I was able to figure it out by looking for "MEM" in
tree-pretty-print.c. The condition included
at the end of this email (mostly to provoke a chuckle)
is necessary for the "p->f" format. If it's not true, then
the MEM form is emitted.

What is most interesting from this whole exercise
is the question of why I am seeing offsets in
the GIMPLE form. I'm seeing offsets where
the symbolic form using the field seems to make
more sense. I'm also seeing accesses with
offsets that are multiples of the structure size.
That kind of idiom seems more appropriate at the
RTL level.

Thanks,

Gary

TREE_CODE (node) == MEM_REF
  && integer_zerop (TREE_OPERAND (node, 1))
  /* Dump the types of INTEGER_CSTs explicitly, for we can't
     infer them and MEM_ATTR caching will share MEM_REFs
     with differently-typed op0s.  */
  && TREE_CODE (TREE_OPERAND (node, 0)) != INTEGER_CST
  /* Released SSA_NAMES have no TREE_TYPE.  */
  && TREE_TYPE (TREE_OPERAND (node, 0)) != NULL_TREE
  /* Same pointer types, but ignoring POINTER_TYPE vs.
     REFERENCE_TYPE.  */
  && (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 0)))
      == TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1))))
  && (TYPE_MODE (TREE_TYPE (TREE_OPERAND (node, 0)))
      == TYPE_MODE (TREE_TYPE (TREE_OPERAND (node, 1))))
  && (TYPE_REF_CAN_ALIAS_ALL (TREE_TYPE (TREE_OPERAND (node, 0)))
      == TYPE_REF_CAN_ALIAS_ALL (TREE_TYPE (TREE_OPERAND (node, 1))))
  /* Same value types ignoring qualifiers.  */
  && (TYPE_MAIN_VARIANT (TREE_TYPE (node))
      == TYPE_MAIN_VARIANT
           (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1)))))
  && (!(flags & TDF_ALIAS)
      || MR_DEPENDENCE_CLIQUE (node) == 0)
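
The gist of that condition can be sketched with a much smaller, self-contained model. It keeps only two of the checks (a zero offset and matching pointer types) and is not the pretty-printer's code; `toy_ref` and `print_field_access` are hypothetical names, and the type strings are placeholders.

```cpp
#include <string>

// Toy memory reference: a base pointer, a constant offset, and the two
// pointer types the real condition compares (all names are hypothetical).
struct toy_ref {
  std::string base;         // e.g. "_684"
  long offset;              // value of operand 1
  std::string base_type;    // pointer type of operand 0
  std::string offset_type;  // pointer type carried by operand 1
};

// Returns the printed form of a field access: the short p->f form only when
// the offset is zero and both pointer types agree, otherwise the MEM form.
inline std::string print_field_access(const toy_ref &ref, const std::string &field) {
  bool simple = ref.offset == 0 && ref.base_type == ref.offset_type;
  if (simple)
    return ref.base + "->" + field;
  std::string out = "MEM[(" + ref.offset_type + ")" + ref.base;
  if (ref.offset != 0)
    out += " + " + std::to_string(ref.offset) + "B";
  return out + "]." + field;
}
```

In this model two references with the same base and field print differently as soon as the type attached to the offset operand differs from the type of the base pointer, which is one way identical-looking operands can still yield the two forms.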

________
From: Richard Biener 
Sent: Monday, January 3, 2022 5:49 AM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: Why do these two trees print differently


On Wed, Dec 15, 2021 at 7:10 AM Gary Oblock via Gcc  wrote:
>
> This is one of those things that has always puzzled
> me, so I thought I'd break down and finally ask.
>
> There are two ways a memory reference (tree) prints:
>
> MEM[(struct arc_t *)_684].flow
>
> and
>
> _684->flow
>
> Poking under the hood of them, the tree codes and
> operands are identical so what am I missing?

Try dumping with -gimple, that should show you the difference.

>
> Thanks,
>
> Gary
>
>

Why do these two trees print differently

2021-12-14 Thread Gary Oblock via Gcc

This is one of those things that has always puzzled
me, so I thought I'd break down and finally ask.

There are two ways a memory reference (tree) prints:

MEM[(struct arc_t *)_684].flow

and

_684->flow

Poking under the hood of them, the tree codes and
operands are identical so what am I missing?

Thanks,

Gary



Re: odd internal failure

2021-12-03 Thread Gary Oblock via Gcc

David,

Thanks, I've bookmarked your advice. I do use gdb but I've always
found the macros in common use are the biggest hurdle. In addition
C++ has its own associated difficulties.

Note, in the past working on other compilers I've always tried to have
a function version of the macros available.

#if USE_FUNCTIONS
foo_t
MUMBLE (grumble_t *g)
{
  return FU (BAR (g));
}
#else
#define MUMBLE(g) FU (BAR (g))
#endif

There are many advantages to this, including better type checking and
being able to step into them and invoke them in gdb with "p MUMBLE(x)".
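
A self-contained version of this pattern might look like the sketch below. Only the names `foo_t`, `grumble_t`, `FU`, `BAR`, `MUMBLE` and `USE_FUNCTIONS` come from the message; the type definitions and the bodies of `FU` and `BAR` are placeholders invented so that the example compiles.

```cpp
// Placeholder types and accessors; only the names come from the message.
typedef int foo_t;
struct grumble_t { int bar; };

#define BAR(g) ((g)->bar)
#define FU(x) ((x) + 1)

#ifndef USE_FUNCTIONS
#define USE_FUNCTIONS 1  // debugging build: use real functions
#endif

#if USE_FUNCTIONS
// A real function: arguments are type-checked, it can be stepped into,
// and a debugger can call it by name.
foo_t MUMBLE(grumble_t *g) { return FU(BAR(g)); }
#else
// Non-debugging build: the same accessor as a macro.
#define MUMBLE(g) FU(BAR(g))
#endif
```

Building with `-DUSE_FUNCTIONS=0` selects the macro form, so the same source can serve both debugging and regular builds.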

Thanks again,

Gary





From: David Malcolm 
Sent: Thursday, December 2, 2021 6:04 AM
To: Richard Biener ; Gary Oblock 

Cc: gcc@gcc.gnu.org 
Subject: Re: odd internal failure



On Thu, 2021-12-02 at 12:40 +0100, Richard Biener via Gcc wrote:
> On Wed, Dec 1, 2021 at 9:56 PM Gary Oblock 
> wrote:
> >
> > Richard,
> >
> > I rebuilt at "-O0" and that particular call now works but on a call
> > to
> > the same function with a different offset it fails. 
>
> use a debugger to see why

In case you haven't seen them, I put together some tips on debugging
GCC here:
https://dmalcolm.fedorapeople.org/gcc/newbies-guide/debugging.html
https://github.com/davidmalcolm/gcc-newbies-guide/blob/master/debugging.rst

Inserting print statements only gets you so far; at some point you
really need a debugger.

Dave

>
> > Thanks,
> >
> > Gary
> >
> >
> > ________
> > From: Richard Biener 
> > Sent: Wednesday, December 1, 2021 1:09 AM
> > To: Gary Oblock 
> > Cc: gcc@gcc.gnu.org 
> > Subject: Re: odd internal failure
> >
> >
> >
> > On Wed, Dec 1, 2021 at 8:46 AM Gary Oblock via Gcc 
> > wrote:
> > >
> > > What is happening should be trivial to determine but for some
> > > reason it's
> > > not. I'd normally bounce this off a coworker but given the pandemic
> > > and modern dispersed hiring practices it's not even remotely
> > > possible.
> > >
> > > I'm making this call and tree_to_uhwi is failing on an internal
> > > error.
> > > That's normally easy to fix, but here is where the weirdness kicks
> > > in.
> > >
> > >   unsigned HOST_WIDE_INT wi_offset = tree_to_uhwi (offset);
> > >
> > > tree_to_uhwi from tree.h is:
> > >
> > > extern inline __attribute__ ((__gnu_inline__)) unsigned
> > > HOST_WIDE_INT
> > > tree_to_uhwi (const_tree t)
> > > {
> > >   gcc_assert (tree_fits_uhwi_p (t));
> > >   return TREE_INT_CST_LOW (t);
> > > }
> > >
> > > and
> > >
> > > tree_fits_uhwi_p from tree.c is
> > >
> > > bool
> > > tree_fits_uhwi_p (const_tree t)
> > > {
> > >   return (t != NULL_TREE
> > >  && TREE_CODE (t) == INTEGER_CST
> > >  && wi::fits_uhwi_p (wi::to_widest (t)));
> > > }
> > >
> > > Here's what this instrumentation shows (DEBUG_A is an indenting
> > > fprintf to
> > > stderr.)
> > >
> > >   DEBUG_A ("TREE_CODE(offset) = %s  && ", code_str (TREE_CODE
> > > (offset)));
> > >   DEBUG_A ("fits %s\n", wi::fits_uhwi_p (wi::to_widest (offset)) ?
> > > "true" : "false");
> > >   DEBUG_A ("tree_fits_uhwi_p(offset) %s\n",tree_fits_uhwi_p
> > > (offset) ? "true" : "false");
> > >
> > >TREE_CODE(offset) = INTEGER_CST  && fits true
> > >tree_fits_uhwi_p(offset) true
> > >
> > > By the way, offset is:
> > >
> > > _Literal (struct BASKET * *) 8
> > >
> > > And it's an operand of:
> > >
> > > MEM[(struct BASKET * *) + 8B]
> > >
> > > Any clues on what's going on here?
> >
> > it should just work.
> >
> > > Thanks,
> > >
> > > Gary
> > >
> >
> > Btw, try to set things up so you don't spam the stuff below to public
> > mailing lists.
> >
> > > CONFIDENTIALITY NOTICE: This e-mail message, including any
> > > attachments, is for the sole use of the intended recipient(s) and
> > > contains information that is confidential and proprietary to Ampere
> > > Computing or its subsidiaries. It is to be used solely for the
> > > purpose of furthering the parties' business relationship. Any
> > > unauthorized review, copying, or distribution of this email (or any
> > > attachments thereto) is strictly prohibited. If you are not the
> > > intended recipient, please contact the sender immediately and
> > > permanently delete the original and any copies of this email and
> > > any attachments thereto.
>

Re: odd internal failure

2021-12-01 Thread Gary Oblock via Gcc

Richard,

I rebuilt at "-O0" and that particular call now works but on a call to
the same function with a different offset it fails. 

Thanks,

Gary



From: Richard Biener 
Sent: Wednesday, December 1, 2021 1:09 AM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: odd internal failure



On Wed, Dec 1, 2021 at 8:46 AM Gary Oblock via Gcc  wrote:
>
> What is happening should be trivial to determine but for some reason it's
> not. I'd normally bounce this off a coworker but given the pandemic
> and modern dispersed hiring practices it's not even remotely possible.
>
> I'm making this call and tree_to_uhwi is failing on an internal error.
> That's normally easy to fix, but here is where the weirdness kicks in.
>
>   unsigned HOST_WIDE_INT wi_offset = tree_to_uhwi (offset);
>
> tree_to_uhwi from tree.h is:
>
> extern inline __attribute__ ((__gnu_inline__)) unsigned HOST_WIDE_INT
> tree_to_uhwi (const_tree t)
> {
>   gcc_assert (tree_fits_uhwi_p (t));
>   return TREE_INT_CST_LOW (t);
> }
>
> and
>
> tree_fits_uhwi_p from tree.c is
>
> bool
> tree_fits_uhwi_p (const_tree t)
> {
>   return (t != NULL_TREE
>  && TREE_CODE (t) == INTEGER_CST
>  && wi::fits_uhwi_p (wi::to_widest (t)));
> }
>
> Here's what this instrumentation shows (DEBUG_A is an indenting fprintf to
> stderr.)
>
>   DEBUG_A ("TREE_CODE(offset) = %s  && ", code_str (TREE_CODE (offset)));
>   DEBUG_A ("fits %s\n", wi::fits_uhwi_p (wi::to_widest (offset)) ? "true" : 
> "false");
>   DEBUG_A ("tree_fits_uhwi_p(offset) %s\n",tree_fits_uhwi_p (offset) ? "true" 
> : "false");
>
>TREE_CODE(offset) = INTEGER_CST  && fits true
>tree_fits_uhwi_p(offset) true
>
> By the way, offset is:
>
> _Literal (struct BASKET * *) 8
>
> And it's an operand of:
>
> MEM[(struct BASKET * *) + 8B]
>
> Any clues on what's going on here?

it should just work.

> Thanks,
>
> Gary
>

Btw, try to set things up so you don't spam the stuff below to public mailing lists.

> CONFIDENTIALITY NOTICE: This e-mail message, including any attachments, is 
> for the sole use of the intended recipient(s) and contains information that 
> is confidential and proprietary to Ampere Computing or its subsidiaries. It 
> is to be used solely for the purpose of furthering the parties' business 
> relationship. Any unauthorized review, copying, or distribution of this email 
> (or any attachments thereto) is strictly prohibited. If you are not the 
> intended recipient, please contact the sender immediately and permanently 
> delete the original and any copies of this email and any attachments thereto.

odd internal failure

2021-11-30 Thread Gary Oblock via Gcc

What is happening should be trivial to determine but for some reason it's
not. I'd normally bounce this off a coworker but given the pandemic
and modern dispersed hiring practices it's not even remotely possible.

I'm making this call and tree_to_uhwi is failing on an internal error.
That's normally easy to fix, but here is where the weirdness kicks in.

  unsigned HOST_WIDE_INT wi_offset = tree_to_uhwi (offset);

tree_to_uhwi from tree.h is:

extern inline __attribute__ ((__gnu_inline__)) unsigned HOST_WIDE_INT
tree_to_uhwi (const_tree t)
{
  gcc_assert (tree_fits_uhwi_p (t));
  return TREE_INT_CST_LOW (t);
}

and

tree_fits_uhwi_p from tree.c is

bool
tree_fits_uhwi_p (const_tree t)
{
  return (t != NULL_TREE
 && TREE_CODE (t) == INTEGER_CST
 && wi::fits_uhwi_p (wi::to_widest (t)));
}

Here's what this instrumentation shows (DEBUG_A is an indenting fprintf to
stderr.)

  DEBUG_A ("TREE_CODE(offset) = %s  && ", code_str (TREE_CODE (offset)));
  DEBUG_A ("fits %s\n", wi::fits_uhwi_p (wi::to_widest (offset)) ? "true" : 
"false");
  DEBUG_A ("tree_fits_uhwi_p(offset) %s\n",tree_fits_uhwi_p (offset) ? "true" : 
"false");

   TREE_CODE(offset) = INTEGER_CST  && fits true
   tree_fits_uhwi_p(offset) true

By the way, offset is:

_Literal (struct BASKET * *) 8

And it's an operand of:

MEM[(struct BASKET * *) + 8B]

Any clues on what's going on here?

Thanks,

Gary
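
One way to narrow down a failure like this is to test the "fits" predicate explicitly and report the offending value instead of relying on the assertion inside the conversion. The sketch below models that checked-conversion pattern in plain C++; it is not GCC code, and `toy_int_cst`, `toy_fits_uhwi_p` and `toy_to_uhwi` are hypothetical stand-ins.

```cpp
#include <cstdint>

// Toy wide constant: a sign flag plus a 128-bit magnitude (model only).
struct toy_int_cst {
  bool negative;
  std::uint64_t high;
  std::uint64_t low;
};

// Mirrors the idea of a "fits" predicate: non-null, non-negative, and no
// bits set above the low 64.
inline bool toy_fits_uhwi_p(const toy_int_cst *t) {
  return t != nullptr && !t->negative && t->high == 0;
}

// Checked conversion: returns false instead of asserting when the value
// does not fit, so the caller can report which constant was the problem.
inline bool toy_to_uhwi(const toy_int_cst *t, std::uint64_t *out) {
  if (!toy_fits_uhwi_p(t))
    return false;
  *out = t->low;
  return true;
}
```

A caller that gets `false` back can print the node and its context at the exact call site that fails, which is usually more informative than an assertion backtrace.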



-fchecking bug, what does it mean?

2021-11-18 Thread Gary Oblock via Gcc

Our test group added "-fchecking" to a script and my optimization
failed.

I can't find any explanation of this type of bug. I grepped the code
and flag_checking was all over the place so it's not like
I can use gdb to pin it down.

Can somebody help me make sense out of this?

lto1: error: type variant differs by TYPE_UNSIGNED
  constant 64>
unit-size  constant 8>
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
0x7f399333a738 precision:64 min  max 
pointer_to_this >
  constant 64>
unit-size  constant 8>
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
0x7f3992d1a3f0 precision:64 min  max 
pointer_to_this >
(null):0: confused by earlier errors, bailing out
lto-wrapper: fatal error: /home/gary/gcc_build_gcc11_v4/install/bin/gcc 
returned 1 exit status
compilation terminated.
[Leaving LTRANS ./exe.lto.o]
/usr/bin/ld: error: lto-wrapper failed
[Leaving exe.lto_wrapper_args]
collect2: error: ld returned 1 exit status

Thanks,

Gary



Re: Can gcc itself be tested with ubsan? If so, how?

2021-10-01 Thread Gary Oblock via Gcc

I suppose I should answer my own question.

Yes, the final compiler built has ubsan enabled.

Gary

PS. The faint-hearted should note that this is an overnight build. It would be
nice if this weren't tied to building a bootstrap compiler.


From: Gary Oblock 
Sent: Wednesday, September 29, 2021 11:55 AM
To: Toon Moene ; Erick Ochoa 
Cc: gcc@gcc.gnu.org 
Subject: Re: Can gcc itself be tested with ubsan? If so, how?

Toon,

I assume the final compiler built this way has ubsan? I ask because
I'm trying to spot a bug in a new optimization so I want to
run it on a specific test case with the new optimization
enabled.

Thanks,

Gary


From: Toon Moene 
Sent: Monday, September 27, 2021 11:47 PM
To: Erick Ochoa ; Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: Can gcc itself be tested with ubsan? If so, how?

[EXTERNAL EMAIL NOTICE: This email originated from an external sender. Please 
be mindful of safe email handling and proprietary information protection 
practices.]


On 9/28/21 8:35 AM, Erick Ochoa via Gcc wrote:

>> Can ubsan be used on the compiler itself?

I regularly build the compiler(s) natively with ubsan enabled, see for
instance:

https://gcc.gnu.org/pipermail/gcc-testresults/2021-September/719448.html

The configure line tells you how to do it (towards the end of the mail):

configure flags: --prefix=/home/toon/compilers/install/gcc --with-gnu-as
--with-gnu-ld --enable-languages=all,ada --disable-multilib
--disable-nls --with-build-config=bootstrap-ubsan --enable-checking=all

(the enable-checking part is not relevant, and can be omitted).

Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: Can gcc itself be tested with ubsan? If so, how?

2021-09-29 Thread Gary Oblock via Gcc

Toon,

I assume the final compiler built this way has ubsan? I ask because
I'm trying to spot a bug in a new optimization so I want to
run it on a specific test case with the new optimization
enabled.

Thanks,

Gary


From: Toon Moene 
Sent: Monday, September 27, 2021 11:47 PM
To: Erick Ochoa ; Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: Can gcc itself be tested with ubsan? If so, how?



On 9/28/21 8:35 AM, Erick Ochoa via Gcc wrote:

>> Can ubsan be used on the compiler itself?

I regularly build the compiler(s) natively with ubsan enabled, see for
instance:

https://gcc.gnu.org/pipermail/gcc-testresults/2021-September/719448.html

The configure line tells you how to do it (towards the end of the mail):

configure flags: --prefix=/home/toon/compilers/install/gcc --with-gnu-as
--with-gnu-ld --enable-languages=all,ada --disable-multilib
--disable-nls --with-build-config=bootstrap-ubsan --enable-checking=all

(the enable-checking part is not relevant, and can be omitted).

Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Can gcc itself be tested with ubsan? If so, how?

2021-09-27 Thread Gary Oblock via Gcc

I tried just adding "-fsanitize=undefined" to my CXX_FLAGS and got
a bunch of errors like this:

/usr/bin/ld: ../libcody/libcody.a(server.o): in function 
`std::__cxx11::basic_string, std::allocator 
>::_Alloc_hider::~_Alloc_hider()':
/usr/include/c++/9/bits/basic_string.h:150: undefined reference to 
`__ubsan_handle_type_mismatch_v1'

They all seemed library related.

Can ubsan be used on the compiler itself?
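
A note on the errors above: undefined references to __ubsan_handle_* symbols
usually mean that objects were compiled with -fsanitize=undefined while the
final link was not, so the UBSan runtime never got linked in. That is
presumably what happened here, since only the compile flags were changed.
Below is a minimal sketch for checking a sanitizer setup outside the GCC build.
The file name and the function add_one are made up for illustration.

```cpp
// ubsan_demo.cpp (hypothetical file name)
// The sanitizer flag has to be present on BOTH the compile and link steps:
//   g++ -fsanitize=undefined -c ubsan_demo.cpp -o ubsan_demo.o
//   g++ -fsanitize=undefined ubsan_demo.o main.o -o ubsan_demo
#include <climits>

// Adds one to x. Signed overflow is undefined behaviour, so calling this
// with INT_MAX is the kind of operation UBSan is meant to report at run time.
// Any argument smaller than INT_MAX keeps the call well defined.
int add_one(int x) {
  return x + 1;
}
```

Calling add_one (INT_MAX) from a main built this way should produce a UBSan
runtime report, while add_one (41) stays well defined and returns 42.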

Thanks,

Gary





Build gcc question

2021-09-06 Thread Gary Oblock via Gcc

I've got a really amazingly bizarre bug, when running my modified
gcc under gdb, I see some bewildering behavior. So, before I start
debugging at the assembly level, I'd like to see some .s files.
This led me to try adding "-save-temps" to the CFLAGS and
CXXFLAGS on the make command line. This in turn led to a plethora
of different assembly errors. Is this supposed to happen,
and is there another way to preserve .s files?

Thanks,

Gary



Re: What is this GIMPLE?

2021-08-26 Thread Gary Oblock via Gcc

Richard,

It sure looked like a label, but I had code that would bail out on
a label before it ever got to where I was seeing a problem.

I finally printed out the gimple code and it was a GIMPLE_DEBUG!
When I bailed out on debugs my pass worked again. Of course,
expanding debugging info failed in cfgexpand. It looks like "-g"
and my stuff really don't play nice together. Any thoughts on
what to do? I'm thinking about simply disabling my stuff if "-g"
is specified. This is not the macho thing to do, but anything else is
likely a very deep rathole.

Thanks,

Gary

From: Richard Biener 
Sent: Thursday, August 26, 2021 12:45 AM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: What is this GIMPLE?


On Wed, Aug 25, 2021 at 7:30 AM Gary Oblock via Gcc  wrote:
>
> I print out a bit of GIMPLE for a program and it looks like this:
>
>[local count: 13634385]:
>   # a_1 = PHI 
>   # n_11 = PHI 
> loop:
>   # DEBUG n => n_11
>   # DEBUG a => a_1
>   _2 = (long unsigned int) a_1;
>   _3 = _2 & 7;
>   _347 = _3 != 0;
>
> That bit that says "loop:" isn't a GIMPLE_LABEL and
> it has operands, the first of which is "size_t n = size_t;"
> and I find that a bit odd...

I'm sure it is a GIMPLE_LABEL.

> What is this thing and what does it do? Note, I'm trying to
> parse and transform some GIMPLE and this is confusing my
> code (and me.) I looked in the internals doc and grepped the
> code but I'm still in the dark.
>
> Note, I'd like to be able to detect and ignore this if it's just informational.
>
> Thanks,
>
> Gary

What is this GIMPLE?

2021-08-24 Thread Gary Oblock via Gcc

I print out a bit of GIMPLE for a program and it looks like this:

   [local count: 13634385]:
  # a_1 = PHI 
  # n_11 = PHI 
loop:
  # DEBUG n => n_11
  # DEBUG a => a_1
  _2 = (long unsigned int) a_1;
  _3 = _2 & 7;
  _347 = _3 != 0;

That bit that says "loop:" isn't a GIMPLE_LABEL and
it has operands, the first of which is "size_t n = size_t;"
and I find that a bit odd...

What is this thing and what does it do? Note, I'm trying to
parse and transform some GIMPLE and this is confusing my
code (and me.) I looked in the internals doc and grepped the
code but I'm still in the dark.

Note, I'd like to be able to detect and ignore this if it's just informational.

Thanks,

Gary


Re: A value number issue

2021-07-29 Thread Gary Oblock via Gcc

Richard,

I didn't find a use of _975 with that type...

A small test case is easier said than done. It's mcf
and I pulled a copy out of SPEC for my unit tests (we have
a SPEC license) and that passes. However, when I compile it
in the SPEC framework, it fails. Otherwise, I'd have used
creduce to make a really tiny test case a couple of weeks
ago... ;-(

In RTL, a trick I used to use in the bad old days when I did one-off VLIW
schedulers was to instrument the bit that assigns the RTL's id
number and put a trap there to catch the creation of an interesting
bit of RTL. Does that kind of trick work with SSA variable creation?
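
The general shape of that trick can be sketched as follows. This is a toy
model only: Node, make_node, watched_id and creation_trap are hypothetical
names, not GCC functions, and whether it maps cleanly onto SSA name creation
is exactly the open question above.

```cpp
#include <cstdio>

// Hypothetical stand-in for an IR object that gets a sequential id when
// it is created.
struct Node {
  unsigned id;
};

static unsigned next_id = 0;       // next id to hand out
static unsigned watched_id = ~0u;  // id to trap on; ~0u means "none"
static unsigned trap_hits = 0;     // how many times the trap fired

// Hook to put a debugger breakpoint on ("break creation_trap").
// It only counts and logs; the interesting part is the backtrace at this point.
void creation_trap(const Node &n) {
  ++trap_hits;
  std::fprintf(stderr, "trap: node %u created\n", n.id);
}

// The single place where ids are assigned, instrumented to call the hook
// when the id of interest is handed out.
Node make_node() {
  Node n{next_id++};
  if (n.id == watched_id)
    creation_trap(n);
  return n;
}
```

Setting watched_id to the number seen in the failing dump and breaking on
creation_trap gives a stack trace for the moment the object is created. A
conditional breakpoint on the id assignment does the same job without
rebuilding, at the cost of slower execution.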

Thanks, I really appreciate your help even though in this case
I'm still kind of stuck.

Gary


From: Richard Biener 
Sent: Thursday, July 29, 2021 12:12 AM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: A value number issue



On Wed, Jul 28, 2021 at 11:03 PM Gary Oblock  wrote:
>
> Richard,
>
> Here is more on the actual failure.
>
> From the pre pass dump:
>
> :
> Inserted _975 = (struct node.reorg.reorder *) dedangled_865;
> Replaced bea_43->tail with _975 in dedangled_864 = bea_43->tail;
>
>
>
> EMERGENCY DUMP:
>
> void master.constprop ()
> {
>   :
>   unsigned long _974;
>   struct node.reorg.reorder * _975;
>
>   :
>[local count: 4422246]:
>   _58 = MEM[(int64_t *) + 608B];
>   _59 = _58 + 1;
>   MEM[(int64_t *) + 608B] = _59;
>   dedangled_865 = bea_43->tail;
>   dedangled_863 = bea_43->head;
>   _975 = (struct node.reorg.reorder *) dedangled_865;
>   dedangled_864 = bea_43->tail;
>   dedangled_866 = bea_43->head;
>   if (red_cost_of_bea_42 > 0)
> goto ; [59.00%]
>   else
> goto ; [41.00%]
>   :
> --
> In tree-ssa-sccvn.c at about line 6000 or so there is
> this sequence:
>
>   if (!useless_type_conversion_p (TREE_TYPE (lhs),
>                                   TREE_TYPE (sprime)))
>     {
>       /* We preserve conversions to but not from function or method
>          types.  This asymmetry makes it necessary to re-instantiate
>          conversions here.  */
>       if (POINTER_TYPE_P (TREE_TYPE (lhs))
>           && FUNC_OR_METHOD_TYPE_P (TREE_TYPE (TREE_TYPE (lhs))))
>         sprime = fold_convert (TREE_TYPE (lhs), sprime);
>       else
>         gcc_unreachable ();
>     }
>
> We reach the gcc_unreachable because:
>
> lhs = dedangled_864, TREE_TYPE(lhs) = unsigned long
> sprime = _975, TREE_TYPE(sprime) = struct node.reorg.reorder *
>
> I've got to ask why does the _975 have that type?

Because the value was used with that type?

> The ssa var dedangled_865 is an unsigned long.
> If I knew why this happens then, hopefully, I can adjust
> the GIMPLE I create to avoid this situation...
>
> Question: would including all references to dedangled_865
> in the pre pass dump have helped you answer this? By the way, I couldn't
> find a spot where dedangled_865 or any of its phi related uses
> (a use of x where x <- phi<... dedangled_865..> and so on recursively)
> was converted to struct node.reorg.reorder *.

So what did you change in GCC?  If you did not change value-numbering
then you can reduce the testcase down and extract a GIMPLE frontend
testcase for VN (use -fdump-tree-all-gimple and massage the GIMPLE
dumped before the FRE/PRE pass that causes the issue)

> Gary
>
>
> 
> From: Richard Biener 
> Sent: Wednesday, July 28, 2021 3:40 AM
> To: Gary Oblock 
> Cc: gcc@gcc.gnu.org 
> Subject: Re: A value number issue
>
>
>
> On Fri, Jul 23, 2021 at 3:40 AM Gary Oblock  wrote:
> >
> > Richard,
> >
> > OK, all that I've gotten so far out of the dump file is
> > that the name of "_920" is just something sccvn concocted
> > and wasn't something I accidentally caused.
> >
> > That still leaves me with the question of what is going on.
> >
> > Here's all the interesting bits of the dumpfile concerning
> > dedangled_864:
> >
> > :
> > dedangled_865 = *bea_43 + 64
> > dedangled_863 = *bea_43 + 128
> > dedangled_864 = *bea_43 + 64
> > dedangled_866 = *bea_43 + 128
> > dedangled_867 = dedangled_863
> > dedangled_867 = dedangled_865
> > dedangled_868 = dedangled_864
> > dedangled_868 = dedangled_866
> > :
> > Equivalence classes for ind

Re: A value number issue

2021-07-28 Thread Gary Oblock via Gcc

Richard,

Here is more on the actual failure.

From the pre pass dump:

:
Inserted _975 = (struct node.reorg.reorder *) dedangled_865;
Replaced bea_43->tail with _975 in dedangled_864 = bea_43->tail;

EMERGENCY DUMP:

void master.constprop ()
{
  :
  unsigned long _974;
  struct node.reorg.reorder * _975;

  :
   [local count: 4422246]:
  _58 = MEM[(int64_t *) + 608B];
  _59 = _58 + 1;
  MEM[(int64_t *) + 608B] = _59;
  dedangled_865 = bea_43->tail;
  dedangled_863 = bea_43->head;
  _975 = (struct node.reorg.reorder *) dedangled_865;
  dedangled_864 = bea_43->tail;
  dedangled_866 = bea_43->head;
  if (red_cost_of_bea_42 > 0)
goto ; [59.00%]
  else
goto ; [41.00%]
  :
--
In tree-ssa-sccvn.c at about line 6000 or so there is
this sequence:

  if (!useless_type_conversion_p (TREE_TYPE (lhs),
                                  TREE_TYPE (sprime)))
    {
      /* We preserve conversions to but not from function or method
         types.  This asymmetry makes it necessary to re-instantiate
         conversions here.  */
      if (POINTER_TYPE_P (TREE_TYPE (lhs))
          && FUNC_OR_METHOD_TYPE_P (TREE_TYPE (TREE_TYPE (lhs))))
        sprime = fold_convert (TREE_TYPE (lhs), sprime);
      else
        gcc_unreachable ();
    }

We reach the gcc_unreachable because:

lhs = dedangled_864, TREE_TYPE(lhs) = unsigned long
sprime = _975, TREE_TYPE(sprime) = struct node.reorg.reorder *

I've got to ask why does the _975 have that type?
The ssa var dedangled_865 is an unsigned long.
If I knew why this happens then, hopefully, I can adjust
the GIMPLE I create to avoid this situation...

Question: would including all references to dedangled_865
in the pre pass dump have helped you answer this? By the way, I couldn't
find a spot where dedangled_865 or any of its phi related uses
(a use of x where x <- phi<... dedangled_865..> and so on recursively)
was converted to struct node.reorg.reorder *.

Gary

From: Richard Biener 
Sent: Wednesday, July 28, 2021 3:40 AM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: A value number issue


On Fri, Jul 23, 2021 at 3:40 AM Gary Oblock  wrote:
>
> Richard,
>
> OK, all that I've gotten so far out of the dump file is
> that the name of "_920" is just something sccvn concocted
> and wasn't something I accidentally caused.
>
> That still leaves me with the question of what is going on.
>
> Here's all the interesting bits of the dumpfile concerning
> dedangled_864:
>
> :
> dedangled_865 = *bea_43 + 64
> dedangled_863 = *bea_43 + 128
> dedangled_864 = *bea_43 + 64
> dedangled_866 = *bea_43 + 128
> dedangled_867 = dedangled_863
> dedangled_867 = dedangled_865
> dedangled_868 = dedangled_864
> dedangled_868 = dedangled_866
> :
> Equivalence classes for indirect node id 160 "dedangled_865": pointer 84, 
> location 0
> Equivalence classes for indirect node id 161 "dedangled_863": pointer 85, 
> location 0
> Equivalence classes for indirect node id 162 "dedangled_864": pointer 86, 
> location 0
> :
> dedangled_865 = { ESCAPED NONLOCAL }
> dedangled_863 = { ESCAPED NONLOCAL }
> dedangled_864 = { ESCAPED NONLOCAL }
> dedangled_866 = { ESCAPED NONLOCAL }
> dedangled_867 = { ESCAPED NONLOCAL }
> dedangled_868 = { ESCAPED NONLOCAL }
> :
> Value numbering store MEM[(int64_t *) + 608B] to _59
> Setting value number of .MEM_123 to .MEM_123 (changed)
> Value numbering stmt = dedangled_865 = bea_43->tail;
> Setting value number of dedangled_865 to dedangled_865 (changed)
> Making available beyond BB36 dedangled_865 for value dedangled_865
> Value numbering stmt = dedangled_863 = bea_43->head;
> Setting value number of dedangled_863 to dedangled_863 (changed)
> Making available beyond BB36 dedangled_863 for value dedangled_863
> Value numbering stmt = dedangled_864 = bea_43->tail;
> Inserting name _920 for expression (struct node.reorg.reorder *) dedangled_865

this means that the earlier dedangled_865 = bea_43->tail; somehow did not
produce a value that was considered OK but it was close enough so VN
remembers the expression (struct node.reorg.reorder *) dedangled_865
as known result, using _920 as value.  _920 should be then inserted during
elimination.

I'm not sure what your actual problem is - you can see in eliminate_stmt
how we deal with such values:

  /* If there is no existing usable leader but SCCVN thinks
 it has an expression it wants to use as replacement,
 insert that.  */
  tree val = VN_INFO (lhs)->valnum;
  vn_ssa_aux_t vn_info;
  if (val != VN_TOP

Re: A value number issue

2021-07-22 Thread Gary Oblock via Gcc

_val_temp_861;
  unsigned long dedangled_863;
  unsigned long dedangled_864;
  unsigned long dedangled_865;
  unsigned long dedangled_866;
  :
 [local count: 4422246]:
  _58 = MEM[(int64_t *) + 608B];
  _59 = _58 + 1;
  MEM[(int64_t *) + 608B] = _59;
  dedangled_865 = bea_43->tail;
  dedangled_863 = bea_43->head;
  _975 = (struct node.reorg.reorder *) dedangled_865;
  dedangled_864 = bea_43->tail;
  dedangled_866 = bea_43->head;
  if (red_cost_of_bea_42 > 0)
goto ; [59.00%]
  else
goto ; [41.00%]

   [local count: 2609125]:
  goto ; [100.00%]

   [local count: 1813121]:

   [local count: 4422246]:
  # dedangled_867 = PHI 
  # dedangled_868 = PHI 
  if (dedangled_867 != dedangled_868)
goto ; [89.00%]
  else
goto ; [11.00%]
=
I was pretty arbitrary here about what I extracted from the
dump file but it's 33MB in size.

I'm still thinking it's something dumb that I did
when I created "dedangled_864" but I can't spot it from
the dump. Does anyone have any ideas? Note, before I
looked at the dump I at least had a half-baked idea
of what to try but now this leaves me without a clue
as to what to do (I'm going to read up on the original
algorithm.)

Thanks,

Gary


From: Richard Biener 
Sent: Thursday, July 22, 2021 5:18 AM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: A value number issue



On Thu, Jul 22, 2021 at 8:06 AM Gary Oblock via Gcc  wrote:
>
> I seem to be having a problem with the pre pass.
>
> When eliminate_dom_walker::eliminate_stmt is called with
> the gsi for "dedangled_864 = bea_43->tail;", it in turn
> calls eliminate_dom_walker::eliminate_avail with an op of dedangled_864.
> This gives a VN_INFO (lhs)->valnum of _920. The _920 is not
> associated with any SSA variable in the function and I don't
> see how it got associated with dedangled_864. This is not
> a theoretical issue because it causes an error (the gcc_unreachable
> in eliminate_stmt is called.)

But you show below the definition of _920 so I don't quite understand
your question.  You can follow VNs reasoning in the -details dump.

>
> Here is how _920 (in function main) is used.
>
>   _920 = arcnew_916->head;
>   _921 = MEM[(struct node.reorg.reorder *)_920].firstin;
>   MEM[(struct node.reorg.reorder *)_920].firstin = arcnew_916;
>
> Here is how dedangled_864 is used:
>
>[local count: 2609125]:
>   dedangled_863 = bea_43->head;
>   dedangled_864 = bea_43->tail;
>   goto ; [100.00%]
>
>[local count: 1813121]:
>   dedangled_865 = bea_43->tail;
>   dedangled_866 = bea_43->head;
>
>[local count: 4422246]:
>   # dedangled_867 = PHI 
>   # dedangled_868 = PHI 
>   delta_461 = 1;
>   goto ; [100.00%]
>
> Note, dedangled_868 is used in an ever widening net of
> phis and operations. Also, the other similar statements
>
>   dedangled_863 = bea_43->head;
>   dedangled_865 = bea_43->tail;
>   dedangled_866 = bea_43->head;
>
> don't seem to be malformed.
>
> I tried using a watchpoint to see what was happening but that turned
> out to be not productive in that it was tripping on something
> unrelated even if I set it at the start of the pre pass.
>
> I'm assuming that some of my code is malformed in some
> subtle way and I was wondering if anybody had any ideas?
> I say subtle because this was all working on a slightly different
> version of gcc without the code of some other Ampere optimizations
> in the mix (I disabled those optimizations and things still failed.)
>
> Note, if you guys don't have any ideas the next approach is adding
> tons of brute force instrumentation and special purpose sanity
> checking to the value numbering routine... please help me avoid that.
>
> Thanks,
>
> Gary
>
>

A value number issue

2021-07-22 Thread Gary Oblock via Gcc

I seem to be having a problem with the pre pass.

When eliminate_dom_walker::eliminate_stmt is called with
the gsi for "dedangled_864 = bea_43->tail;", it in turn
calls eliminate_dom_walker::eliminate_avail with an op of dedangled_864.
This gives a VN_INFO (lhs)->valnum of _920. The _920 is not
associated with any SSA variable in the function and I don't
see how it got associated with dedangled_864. This is not
a theoretical issue because it causes an error (the gcc_unreachable
in eliminate_stmt is called.)

Here is how _920 (in function main) is used.

  _920 = arcnew_916->head;
  _921 = MEM[(struct node.reorg.reorder *)_920].firstin;
  MEM[(struct node.reorg.reorder *)_920].firstin = arcnew_916;

Here is how dedangled_864 is used:

   [local count: 2609125]:
  dedangled_863 = bea_43->head;
  dedangled_864 = bea_43->tail;
  goto ; [100.00%]

   [local count: 1813121]:
  dedangled_865 = bea_43->tail;
  dedangled_866 = bea_43->head;

   [local count: 4422246]:
  # dedangled_867 = PHI 
  # dedangled_868 = PHI 
  delta_461 = 1;
  goto ; [100.00%]

Note, dedangled_868 is used in an ever widening net of
phis and operations. Also, the other similar statements

  dedangled_863 = bea_43->head;
  dedangled_865 = bea_43->tail;
  dedangled_866 = bea_43->head;

don't seem to be malformed.

I tried using a watchpoint to see what was happening but that turned
out to be not productive in that it was tripping on something
unrelated even if I set it at the start of the pre pass.

I'm assuming that some of my code is malformed in some
subtle way and I was wondering if anybody had any ideas?
I say subtle because this was all working on a slightly different
version of gcc without the code of some other Ampere optimizations
in the mix (I disabled those optimizations and things still failed.)

Note, if you guys don't have any ideas the next approach is adding
tons of brute force instrumentation and special purpose sanity
checking to the value numbering routine... please help me avoid that.

Thanks,

Gary



Re: A simple debugging question

2021-07-14 Thread Gary Oblock via Gcc

Richard,

Oops! I see my problem. I changed directory into
x264 and not mcf!

I need to stop working so late.

Thank you for your input because it led me to question
my assumptions about what was going on.

Gary


From: Richard Biener 
Sent: Wednesday, July 14, 2021 12:23 AM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: A simple debugging question



On Wed, Jul 14, 2021 at 6:42 AM Gary Oblock via Gcc  wrote:
>
> OK, I haven't asked a dumb question for a while so here goes!
>
> I'm trying to debug my optimization in lto running 505.mcf_r
> (yes it's SPEC17.)
>
> Here's the bit that fails from the make.out:
>
> /home/gary/gcc_build_gcc11/install/libexec/gcc/x86_64-pc-linux-gnu/11.1.1/lto1
>  -quiet -dumpdir ./mcf_r.lto.o- -dumpbase mcf_r.lto.o -dumpbase-ext .o -m64 
> -mtune=generic -march=x86-64 -O3 -O3 -Wdfa -version -fno-openmp -fno-openacc 
> -fno-pie -fcf-protection=none -fcommon -fdump-ipa-type-escape-analysis 
> -fdump-ipa-field-reorder -funroll-loops -flto-partition=none 
> -fipa-instance-interleave -fdump-ipa-all-details -fno-ipa-icf 
> -fipa-type-escape-analysis -fipa-field-reorder -fresolution=mcf_r.res 
> -flinker-output=exec --param=early-inlining-insns=96 
> --param=max-inline-insns-auto=64 --param=inline-unit-growth=96 
> @./mcf_r.lto.o-args.0 -o ./mcf_r.lto.o-mcf_r.lto.s
>
> I bring up emacs and go into gdb.
>
> I set it to debug the lto1 path above.
>
> I cd to the build area:
> cd 
> gary@SCC-PC0TYGP5:~/spec/cpu2017/benchspec/CPU/525.x264_r/build/build_base_gcc11_rate_ampere-64.
>
> I set my breakpoint:
> break tree-ssa-sccvn.c:5975
>
> I give the breakpoint a condition (yes I instrumented the code just to do 
> this):
> cond 1 (count == 5326)
>
> Finally, I run lto1 with the long list of arguments above.
>
> It runs a tiny bit and gives me:
> lto1: fatal error: could not open symbol resolution file: No such file or 
> directory
>
> This is a new one for me guys and I've used this approach above
> many times (but not on a SPEC build.) Any hints at what I did
> wrong?

You have to preserve temporary files (the symbol resolution file) with
-save-temps and if you did
the lto1 invocation will have to happen from the directory that file resides in
(-fresolution=mcf_r.res), otherwise lto1 won't find it.
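
The underlying reason is ordinary relative-path lookup: a bare file name such
as mcf_r.res is resolved against the current working directory of the process.
A self-contained sketch of that behaviour follows. It needs C++17, the helper
names and temporary directory names are made up for illustration, and it does
not involve lto1 itself.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <utility>

namespace fs = std::filesystem;

// Returns true if the relative name can be opened for reading from the
// current working directory.
bool can_open(const std::string &name) {
  std::ifstream in(name);
  return in.good();
}

// Creates a scratch "build" directory containing mcf_r.res and an unrelated
// scratch directory, then tries to open the relative name from each.
// Returns {opened_from_other_dir, opened_from_build_dir}.
std::pair<bool, bool> resolution_demo() {
  const fs::path original = fs::current_path();
  const fs::path build_dir = fs::temp_directory_path() / "resolution_demo_build";
  const fs::path other_dir = fs::temp_directory_path() / "resolution_demo_other";

  // Start from a clean slate so stale files cannot change the result.
  fs::remove_all(build_dir);
  fs::remove_all(other_dir);
  fs::create_directories(build_dir);
  fs::create_directories(other_dir);
  {
    std::ofstream out(build_dir / "mcf_r.res");
    out << "placeholder\n";
  }

  fs::current_path(other_dir);  // wrong directory: the file is not here
  const bool from_other = can_open("mcf_r.res");
  fs::current_path(build_dir);  // directory that actually holds the file
  const bool from_build = can_open("mcf_r.res");

  // Restore the working directory and clean up the scratch directories.
  fs::current_path(original);
  fs::remove_all(build_dir);
  fs::remove_all(other_dir);
  return {from_other, from_build};
}
```

The first lookup fails and the second succeeds, which is the same pattern as
running lto1 from the wrong benchmark's build directory.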

> Note, during development I always build gcc with "-O0 -g".
>
> Thanks,
>
> Gary
>
>

A simple debugging question

2021-07-13 Thread Gary Oblock via Gcc

OK, I haven't asked a dumb question for a while so here goes!

I'm trying to debug my optimization in lto running 505.mcf_r
(yes it's SPEC17.)

Here's the bit that fails from the make.out:

/home/gary/gcc_build_gcc11/install/libexec/gcc/x86_64-pc-linux-gnu/11.1.1/lto1 
-quiet -dumpdir ./mcf_r.lto.o- -dumpbase mcf_r.lto.o -dumpbase-ext .o -m64 
-mtune=generic -march=x86-64 -O3 -O3 -Wdfa -version -fno-openmp -fno-openacc 
-fno-pie -fcf-protection=none -fcommon -fdump-ipa-type-escape-analysis 
-fdump-ipa-field-reorder -funroll-loops -flto-partition=none 
-fipa-instance-interleave -fdump-ipa-all-details -fno-ipa-icf 
-fipa-type-escape-analysis -fipa-field-reorder -fresolution=mcf_r.res 
-flinker-output=exec --param=early-inlining-insns=96 
--param=max-inline-insns-auto=64 --param=inline-unit-growth=96 
@./mcf_r.lto.o-args.0 -o ./mcf_r.lto.o-mcf_r.lto.s

I bring up emacs and go into gdb.

I set it to debug the lto1 path above.

I cd to the build area:
cd 
gary@SCC-PC0TYGP5:~/spec/cpu2017/benchspec/CPU/525.x264_r/build/build_base_gcc11_rate_ampere-64.

I set my breakpoint:
break tree-ssa-sccvn.c:5975

I give the breakpoint a condition (yes I instrumented the code just to do this):
cond 1 (count == 5326)

Finally, I run lto1 with the long list of arguments above.

It runs a tiny bit and gives me:
lto1: fatal error: could not open symbol resolution file: No such file or 
directory

This is a new one for me guys and I've used this approach above
many times (but not on a SPEC build.) Any hints at what I did
wrong?

Note, during development I always build gcc with "-O0 -g".

Thanks,

Gary



Re: Gcc Digest, Vol 15, Issue 5

2021-05-04 Thread Gary Oblock via Gcc

I've got to say appearances can be deceptive
in GCC and struct _modif_basket *[4061] is not
necessarily equal to struct _modif_basket *[4061]
even though the printed representation is
the same...

Gary
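
[Editorial sketch] The point above, that two types can print identically and still be different, can be illustrated with a toy model. TypeNode, same_node and same_printed_form are hypothetical names; this is not GCC's tree representation and it does not reproduce GCC's actual type-compatibility check. It only shows why comparing printed names and comparing node identity can disagree.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for a compiler type node; NOT GCC's real `tree`.
struct TypeNode {
  std::string printed_name;  // what a dump routine would print
};

// Identity comparison: true only if both pointers refer to the same node.
bool same_node(const TypeNode *a, const TypeNode *b) { return a == b; }

// Textual comparison: true if the two nodes merely print the same.
bool same_printed_form(const TypeNode *a, const TypeNode *b) {
  return a->printed_name == b->printed_name;
}
```

Two separately created nodes that both print as "struct _modif_basket *[4061]" satisfy same_printed_form but not same_node, which is the kind of situation a dump alone cannot reveal.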

From: Gcc  on behalf of gcc-requ...@gcc.gnu.org 

Sent: Tuesday, May 4, 2021 1:48 PM
To: gcc@gcc.gnu.org 
Subject: Gcc Digest, Vol 15, Issue 5


What is going on here with fixup_cfg?

2021-05-04 Thread Gary Oblock via Gcc

My jaws hit the floor when I saw this bug:

psimplex.c: In function ‘master.constprop’:
psimplex.c:124:6: error: non-trivial conversion in ‘constructor’
  124 | void master(network_t *net, int num_threads)
  |  ^
struct _modif_basket *[4061]
struct _modif_basket *[4061]
struct _modif_basket *[4061]
struct _modif_basket *[4061]
# .MEM_111 = VDEF <.MEM_103>
perm ={v} {CLOBBER};
during GIMPLE pass: fixup_cfg
psimplex.c:124:6: internal compiler error: verify_gimple failed
0x12da3a4 verify_gimple_in_cfg(function*, bool)
../../sources/gcc/tree-cfg.c:5482
0x10e69f8 execute_function_todo
../../sources/gcc/passes.c:1992
0x10e598b do_per_function
../../sources/gcc/passes.c:1640
0x10e6be8 execute_todo
../../sources/gcc/passes.c:2046

I've included a bit too much stuff but the bit that
confused the heck out of me was the

struct _modif_basket *[4061]
struct _modif_basket *[4061]

associated with the clobber.

I've been banging away for a few days
trying to make the types associated with
the left and right hand sides of this clobber
agree and now it's complaining about that
too?!

The type of perm has changed so what
should the right-hand type be now? When
dumping out other instances of clobber,
the instances all matched so why should
this need to be different?

Thanks,

Gary



Some really strange GIMPLE

2021-04-27 Thread Gary Oblock via Gcc

I'm chasing a bug and I used Creduce to produce a
reduced test case. However, that's really beside the
point.

This file:

typedef struct basket {
} a;
long b;
a *basket;
int d, c, e;
a *flake[2];
void primal_bea_mpp();
void primal_net_simplex() {
  flake[1] = &basket[1];
  primal_bea_mpp(d, d, d, b, flake, 0, e, c, c, d);
}

Produces this GIMPLE:
-
;; Function primal_net_simplex (primal_net_simplex, funcdef_no=3, 
decl_uid=4447, cgraph_uid=16, symbol_order=41) (executed once)

primal_net_simplex ()
{
  <bb 2> [local count: 1073741824]:
  _1 = basket;
  static struct a * flake[2];
struct a *[2]
  flake[1] = _1;
  _2 = d;
  _3 = c;
  _4 = e;
  _5 = b;
  primal_bea_mpp (_2, _2, _2, _5, &flake, 0, _4, _3, _3, _2);
  return;

}
--
These standard calls were used to dump this:

  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY ( node)
  {
    struct function *func = DECL_STRUCT_FUNCTION ( node->decl);
    dump_function_header ( file, func->decl, (dump_flags_t)0);
    dump_function_to_file ( func->decl, file, (dump_flags_t)0);
  }

The GIMPLE above looks malformed to me. Is that the case
or am I not grasping what's going on here?

Note, I wouldn't be asking this question if this wasn't at the start of
my pass and looking at stuff I hadn't modified.

Thanks,

Gary



Failing in generated file options.c

2021-03-15 Thread Gary Oblock via Gcc

Guys,

I checked out a fresh copy of the GCC sources today, applied somebody's
patch to it and voila!

options.c:13591:2: error: #error Report option property is dropped
 #error Report option property is dropped

I built this the same minimally convoluted way that I always do.

cd $1
BASE=`pwd`
echo BASE = $BASE
touch objdir install
rm -rf objdir install
mkdir objdir install
cd objdir
echo BUILDING IN `pwd`
../sources/configure --prefix=$BASE/install --disable-bootstrap 
-enable-language=c,c++,lto --disable-multilib --enable-valgrind-annotations
make CFLAGS='-O2 -g' CXXFLAGS='-O2 -g' -j 12
make install

The file options.c is generated in objdir/gcc by an awk script:

mawk -f ../../sources/gcc/opt-functions.awk -f ../../sources/gcc/opt-read.awk \
   -f ../../sources/gcc/optc-gen.awk \
   -v header_name="config.h system.h coretypes.h options.h tm.h" < 
optionlist > options.c

Does anyone have any idea what's going on here?



A weird bug

2021-03-04 Thread Gary Oblock via Gcc

Guys,

I've been trying to debug a linker error (which I thought was a bug in
my optimization.) Well it turns out it occurs in a brand new virgin
version of the compiler running with binutils 2.36 which is the latest
version. I'm posting this on both the binutils list and gcc list
because people on either list are just as likely to have an idea about
what's going on. Note, I tried to walk through the ld code in gdb but
it's nontrivial trying to figure out why an error occurred. I can't just
set a watchpoint on a flag because, at least in some places, it seems
like the default state of the flag is failure and the code then
proceeds to try various things to see if they'll work.

Regarding the compiler that I built:

 $ git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean

Now here is how I built it:

./my_dual_build_script gcc_virgin_build 2>build_log_virgin 1>&2

where my_dual_build_script is:

cd $1
BASE=`pwd`
echo BASE = $BASE
touch BU_objdir objdir install
rm -rf BU_objdir objdir install
mkdir BU_objdir objdir install

cd BU_objdir
echo BUILDING BINUTILS IN `pwd`
../binutils-2.36/configure --prefix=$BASE/install
make CFLAGS='-O2 -g' CXXFLAGS='-O2 -g' -j 12
##make CFLAGS='-O0 -g' CXXFLAGS='-O2 -g' -j 12
make install

cd ../objdir
echo BUILDING GCC IN `pwd`
../sources/configure --prefix=$BASE/install --disable-bootstrap 
-enable-language=c,c++,lto --disable-multilib --enable-valgrind-annotations
##make CFLAGS='-O0 -g' CXXFLAGS='-O0 -g' -j 12
make CFLAGS='-O2 -g' CXXFLAGS='-O2 -g' -j 12
make install

My installed gcc is:

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 
7.5.0-3ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs 
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr 
--with-gcc-major-version-only --program-suffix=-7 
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id 
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix 
--libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu 
--enable-libstdcxx-debug --enable-libstdcxx-time=yes 
--with-default-libstdcxx-abi=new --enable-gnu-unique-object 
--disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie 
--with-system-zlib --with-target-system-zlib --enable-objc-gc=auto 
--enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic 
--enable-offload-targets=nvptx-none --without-cuda-driver 
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu 
--target=x86_64-linux-gnu
Thread model: posix
gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)

My installed binutils are:

$ ld -v
GNU ld (GNU Binutils for Ubuntu) 2.30

Now the application that is failing during testing is 505.x264_5 which is
from SPEC17. Note, please don't try to duplicate this bug unless you
have access to SPEC17 and are running an identical system because
Martin Liska tried to do this on a different system but he
couldn't get it to duplicate.

Now, for any SPEC heads out there, here are the germane parts
from its config file.

   define  gcc_dir/home/gary/gcc_virgin_build/install

   preENV_LD_LIBRARY_PATH  = %{gcc_dir}/lib64/:%{gcc_dir}/lib/:/lib64
   SPECLANG= %{gcc_dir}/bin/
   CC  = $(SPECLANG)gcc --verbose  -std=c99   %{model}
   CXX = $(SPECLANG)g++ -std=c++03 %{model}
   FC  = $(SPECLANG)gfortran   %{model}

   OPTIMIZE   = -O2 -v -Wl,--verbose=1  -Wl,-debug
   COPTIMIZE  = -save-temps

Finally, here is the linker's part of the x264 build output:

/home/gary/gcc_virgin_build/install/lib/gcc/x86_64-pc-linux-gnu/11.0.1/../../../../x86_64-pc-linux-gnu/bin/ld
 -plugin 
/home/gary/gcc_virgin_build/install/libexec/gcc/x86_64-pc-linux-gnu/11.0.1/liblto_plugin.so
 
-plugin-opt=/home/gary/gcc_virgin_build/install/libexec/gcc/x86_64-pc-linux-gnu/11.0.1/lto-wrapper
 -plugin-opt=-fresolution=ldecod_r.res -plugin-opt=-pass-through=-lgcc 
-plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lc 
-plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s 
--eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o 
ldecod_r /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o 
/home/gary/gcc_virgin_build/install/lib/gcc/x86_64-pc-linux-gnu/11.0.1/crtbegin.o
 -L/home/gary/gcc_virgin_build/install/lib/gcc/x86_64-pc-linux-gnu/11.0.1 
-L/home/gary/gcc_virgin_build/install/lib/gcc/x86_64-pc-linux-gnu/11.0.1/../../../../lib64
 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu

What is pex_run

2021-02-25 Thread Gary Oblock via Gcc

I've got collect2 finding a linker error and I'm out of
other options so I'm poking around in the collect2
sources. I'm wondering what pex_run is (since it's
getting handed the arguments this might matter?)
I figure if I can get collect2 to spill its guts
about what arguments are fed to "ld" I'll have
at least a chance of figuring out what in the bleep
is going on... alternate debugging strategies
are welcome.

Thanks,

Gary





Re: What version of binutils is required

2021-02-23 Thread Gary Oblock via Gcc

Fair enough...

$ ld -V
GNU ld (GNU Binutils for Ubuntu) 2.30


From: Richard Biener 
Sent: Tuesday, February 23, 2021 12:41 AM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: What version of binutils is required


On Tue, Feb 23, 2021 at 1:12 AM Gary Oblock via Gcc  wrote:
>
> I'm having a "linker" error (according to Martin Liška) when
> compiling a SPEC test (x264_r) with a vendor branch under development  (my 
> optimization is done at LTO time.)
>
> The binutils on my development machine is the version
> that came with Ubuntu 18.04. Do I need to install a more
> current version of binutils?

Just try?  Or at least tell us which version ships with Ubuntu 18.04 ...

> Thanks,
>
> Gary Oblock
>
>

What version of binutils is required

2021-02-22 Thread Gary Oblock via Gcc

I'm having a "linker" error (according to Martin Liška) when
compiling a SPEC test (x264_r) with a vendor branch under development  (my 
optimization is done at LTO time.)

The binutils on my development machine is the version
that came with Ubuntu 18.04. Do I need to install a more
current version of binutils?

Thanks,

Gary Oblock



Collect2 issue

2021-02-11 Thread Gary Oblock via Gcc

I'm running my new optimization (LTO with one partition)
on a SPEC17 test.

I got the mysterious message
  "collect2: error: ld returned 1 exit status"

Now, first off, with my debugging on at full tilt, it's
clear my optimization bailed out after analyzing
the code without doing anything.

Second, this is a canned test not modified by
me or anybody else for that matter so, it's
not a user error.

Finally, reading various blogs it seems that
old object files hanging around can make
collect2 go bonkers. Therefore, I used
specmake to clean the build and it didn't
help, not that this makes any sense with one
partition.

By the way, for those of you that get upset
at the very notion of invoking LTO as one partition,
it's actually not that bad. I can compile gcc
that way on a laptop in about 50 minutes using
only 4.7% of my memory!

Any ideas?? What's happening, how to diagnose
what's happening, how to work around it, etc...

Thanks,

Gary




Re: A problem with field decl offsets in newly minted types

2021-01-01 Thread Gary Oblock via Gcc

The offsets seem to actually be created. However,
they are being deleted almost immediately.

Any ideas what's going on? Has some kind
of memory management gizmo gone awry?

Gary

PS For anybody who has been following my travails with the instance 
interleaving structure
reorganization optimization, this is occurring
with the Mcf sources from SPEC17.


From: Gary Oblock 
Sent: Wednesday, December 30, 2020 11:00 PM
To: gcc@gcc.gnu.org 
Subject: A problem with field decl offsets in newly minted types

I'm having some grief with creating/using some modified types.

The problem occurs in tree-ssa-sccvn.c when some code tries
to take a DECL_FIELD_OFFSET and unfortunately gets a null
that causes a crash.

So, I traced this back to the types I created. Note, the method I used
has seemed to be fairly robust (some other engineers in our China
group even successfully copied it for their own optimization.)
However, the DECL_FIELD_OFFSET is definitely null for
the fields which I created in the new versions of the types.

Here is what I'm doing.

tree type_prime = lang_hooks.types.make_type (RECORD_TYPE);

const char *base_type_name =
  identifier_to_locale ( IDENTIFIER_POINTER ( TYPE_NAME ( type)));
size_t len = strlen ( MY_PREFIX) + strlen ( base_type_name);
char *rec_name = ( char*)alloca ( len + 1);
strcpy ( rec_name, MY_PREFIX);
strcat ( rec_name, base_type_name);

TYPE_NAME ( type_prime) = get_identifier ( rec_name);

tree field;
tree new_fields = NULL;
for ( field = TYPE_FIELDS ( type); field; field = DECL_CHAIN ( field))
  {
    // Probably useful in creating the new field type.
    tree old_fld_type = TREE_TYPE ( field);

    // Whatever you want the new field type to be
    tree new_fld_type = ...;
    tree new_decl =
      build_decl ( DECL_SOURCE_LOCATION (field),
                   FIELD_DECL, DECL_NAME (field), new_fld_type);
    DECL_CONTEXT ( new_decl) = type_prime;
    layout_decl ( new_decl, 0);
    DECL_CHAIN ( new_decl) = new_fields;
    new_fields = new_decl;
  }

TYPE_FIELDS ( type_prime) = nreverse ( new_fields);
layout_type ( type_prime);

Note, when I change any declaration to have a modified type
I run relayout_decl on them.

If somebody could please tell what bit I'm missing or what I'm doing
wrong I'd really appreciate it. Looking at the code in layout_type,
it should be setting the decl field offsets with place_field and I
don't have a clue what's going on.

Thanks,

Gary



A problem with field decl offsets in newly minted types

2020-12-30 Thread Gary Oblock via Gcc

I'm having some grief with creating/using some modified types.

The problem occurs in tree-ssa-sccvn.c when some code tries
to take a DECL_FIELD_OFFSET and unfortunately gets a null
that causes a crash.

So, I traced this back to the types I created. Note, the method I used
has seemed to be fairly robust (some other engineers in our China
group even successfully copied it for their own optimization.)
However, the DECL_FIELD_OFFSET is definitely null for
the fields which I created in the new versions of the types.

Here is what I'm doing.

tree type_prime = lang_hooks.types.make_type (RECORD_TYPE);

const char *base_type_name =
  identifier_to_locale ( IDENTIFIER_POINTER ( TYPE_NAME ( type)));
size_t len = strlen ( MY_PREFIX) + strlen ( base_type_name);
char *rec_name = ( char*)alloca ( len + 1);
strcpy ( rec_name, MY_PREFIX);
strcat ( rec_name, base_type_name);

TYPE_NAME ( type_prime) = get_identifier ( rec_name);

tree field;
tree new_fields = NULL;
for ( field = TYPE_FIELDS ( type); field; field = DECL_CHAIN ( field))
  {
    // Probably useful in creating the new field type.
    tree old_fld_type = TREE_TYPE ( field);

    // Whatever you want the new field type to be
    tree new_fld_type = ...;
    tree new_decl =
      build_decl ( DECL_SOURCE_LOCATION (field),
                   FIELD_DECL, DECL_NAME (field), new_fld_type);
    DECL_CONTEXT ( new_decl) = type_prime;
    layout_decl ( new_decl, 0);
    DECL_CHAIN ( new_decl) = new_fields;
    new_fields = new_decl;
  }

TYPE_FIELDS ( type_prime) = nreverse ( new_fields);
layout_type ( type_prime);

Note, when I change any declaration to have a modified type
I run relayout_decl on them.

If somebody could please tell what bit I'm missing or what I'm doing
wrong I'd really appreciate it. Looking at the code in layout_type,
it should be setting the decl field offsets with place_field and I
don't have a clue what's going on.

Thanks,

Gary
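
[Editorial sketch] The post expects laying out the record to give every field an offset. As a toy model of that expectation only, the standalone code below assigns offsets to a list of fields by rounding up to each field's alignment. Field, round_up and layout_record are hypothetical names; this is not GCC's layout_type or place_field, and it says nothing about why DECL_FIELD_OFFSET ends up null in the real compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy field description; NOT a GCC FIELD_DECL.
struct Field {
  std::size_t size;        // size in bytes
  std::size_t align;       // required alignment in bytes, must be nonzero
  bool laid_out = false;   // set once an offset has been assigned
  std::size_t offset = 0;  // byte offset within the record
};

// Round `value` up to the next multiple of `align`.
inline std::size_t round_up(std::size_t value, std::size_t align) {
  return (value + align - 1) / align * align;
}

// Assign an offset to every field in declaration order and return the
// total record size, padded to the largest field alignment.
std::size_t layout_record(std::vector<Field> &fields) {
  std::size_t offset = 0;
  std::size_t max_align = 1;
  for (Field &f : fields) {
    assert(f.align != 0);
    offset = round_up(offset, f.align);
    f.offset = offset;
    f.laid_out = true;
    offset += f.size;
    if (f.align > max_align) max_align = f.align;
  }
  return round_up(offset, max_align);
}
```

In this model a field that is never passed through layout_record keeps laid_out == false, which is the toy analogue of a field whose offset was never set.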



Re: gsi_remove on a call

2020-10-28 Thread Gary Oblock via Gcc

Martin,

After some digging and a little luck, I found that this does what I wanted:

cgraph_update_edges_for_call_stmt ( stmt, gimple_call_fndecl ( stmt), NULL);

Thanks,

Gary
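
[Editorial sketch] The underlying issue in this thread is keeping call-graph edges consistent with the call statements that remain in a function body. The toy model below is not GCC's cgraph; ToyFunction, remove_stmt_only, remove_stmt_and_edge and verify are hypothetical names. It only shows why a consistency check fails when a call statement is removed while its edge is left behind.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

using StmtId = int;

// Toy function body plus its outgoing call edges; NOT GCC's cgraph_node.
struct ToyFunction {
  std::vector<StmtId> call_stmts;            // call statements in the body
  std::map<StmtId, std::string> call_edges;  // statement -> callee name
};

// Remove the statement from the body but leave its edge untouched.
void remove_stmt_only(ToyFunction &f, StmtId s) {
  f.call_stmts.erase(std::remove(f.call_stmts.begin(), f.call_stmts.end(), s),
                     f.call_stmts.end());
}

// Remove the statement and drop the edge that referred to it.
void remove_stmt_and_edge(ToyFunction &f, StmtId s) {
  remove_stmt_only(f, s);
  f.call_edges.erase(s);
}

// Consistency check: every edge must point at a statement that is still in
// the body, and every call statement must have an edge.
bool verify(const ToyFunction &f) {
  for (const auto &edge : f.call_edges)
    if (std::find(f.call_stmts.begin(), f.call_stmts.end(), edge.first) ==
        f.call_stmts.end())
      return false;
  for (StmtId s : f.call_stmts)
    if (f.call_edges.count(s) == 0)
      return false;
  return true;
}
```

In this model, remove_stmt_only leaves a dangling edge and verify returns false, while remove_stmt_and_edge keeps the two structures in step.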

From: Martin Jambor 
Sent: Tuesday, October 27, 2020 5:44 AM
To: Gary Oblock ; gcc@gcc.gnu.org 
Subject: Re: gsi_remove on a call


On Tue, Oct 27 2020, Gary Oblock via Gcc wrote:
> I'm running into grief in verify_node in cgraph.c
> when I use gsi_remove on a call statement.
>
> Specifically it's a free statement which I've replaced
> with other free statements as part of my structure
> reorg optimizations. Note, in other working code
> I do this with malloc and it doesn't seem to be a problem.
>
> Where it happens it's trying to look at the call graph edges.
> Is there a way to remove the edge in question or mark it
> to be ignored?

Have you tried using cgraph_edge::set_call_stmt to update the edge?

> I see that line below about built in unreachable
> and wonder if I'm supposed to set the decl to that but I
> don't see others doing it so...

The comment you quoted explains why __builtin_unreachable is special.

Martin

>
> Here's the code in cgraph (e->call_stmt is the free in question:)
>
>  if (gimple_has_body_p (e->caller->decl)
>  && !e->caller->inlined_to
>  && !e->speculative
>  /* Optimized out calls are redirected to __builtin_unreachable.  */
>  && (e->count.nonzero_p ()
>  || ! e->callee->decl
>  || !fndecl_built_in_p (e->callee->decl, BUILT_IN_UNREACHABLE))
>  && count == ENTRY_BLOCK_PTR_FOR_FN(DECL_STRUCT_FUNCTION  (decl))->count
> && (!e->count.ipa_p ()
>  && e->count.differs_from_p (gimple_bb (e->call_stmt)->count)))
>{
>   :
>
> Thanks,
>
> Gary
>
>
>
>
>
>

gsi_remove on a call

2020-10-27 Thread Gary Oblock via Gcc

I'm running into grief in verify_node in cgraph.c
when I use gsi_remove on a call statement.

Specifically it's a free statement which I've replaced
with other free statements as part of my structure
reorg optimizations. Note, in other working code
I do this with malloc and it doesn't seem to be a problem.

Where it happens it's trying to look at the call graph edges.
Is there a way to remove the edge in question or mark it
to be ignored? I see that line below about built in unreachable
and wonder if I'm supposed to set the decl to that but I
don't see others doing it so...

Here's the code in cgraph (e->call_stmt is the free in question:)

 if (gimple_has_body_p (e->caller->decl)
 && !e->caller->inlined_to
 && !e->speculative
 /* Optimized out calls are redirected to __builtin_unreachable.  */
 && (e->count.nonzero_p ()
 || ! e->callee->decl
 || !fndecl_built_in_p (e->callee->decl, BUILT_IN_UNREACHABLE))
 && count == ENTRY_BLOCK_PTR_FOR_FN(DECL_STRUCT_FUNCTION  (decl))->count
&& (!e->count.ipa_p ()
 && e->count.differs_from_p (gimple_bb (e->call_stmt)->count)))
   {
  :

Thanks,

Gary







Re: Missing functionality

2020-10-23 Thread Gary Oblock via Gcc

> gsi_insert_seq_before

Yes, I discovered this myself via Google and ironically it was because of a 
2008 comment by you that you'd renamed gsi_link_seq_before. Frankly, I find 
this a bit amusing. This doesn't seem to be a very heavily used function and 
I'm probably going to be its user.

Gary
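
[Editorial sketch] The pattern discussed in this thread is: build up a separate sequence of statements, then insert the whole sequence before a given position in a block. The standalone code below models that with std::list; it is not GCC's gimple_seq or gimple_stmt_iterator, and Stmt, Seq, seq_add_stmt and insert_seq_before are hypothetical stand-ins chosen to echo the names in the thread.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <utility>

using Stmt = std::string;    // toy statement, just its text
using Seq = std::list<Stmt>; // toy statement sequence

// Append one statement to a sequence under construction.
void seq_add_stmt(Seq &seq, Stmt stmt) { seq.push_back(std::move(stmt)); }

// Move every statement of `seq` into `block`, immediately before `pos`.
// `seq` is left empty and `pos` still refers to the same statement.
void insert_seq_before(Seq &block, Seq::iterator pos, Seq &seq) {
  block.splice(pos, seq);
}
```

Building the sequence first and inserting it in one step keeps the statement generator reusable: the same helper can produce a sequence that different callers insert at different positions.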

From: Richard Biener 
Sent: Thursday, October 22, 2020 11:01 PM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: Missing functionality


On Fri, Oct 23, 2020 at 5:10 AM Gary Oblock via Gcc  wrote:
>
> I'm finishing up coding my patterns for the structure reorganization
> optimization. They recognize certain instructions and replace them
> with other instructions. I've got some code that generates gimple which is
> inserted as it's created with gsi_insert_before.  This code is
> something I'd like to use at other points in my optimization so I'd
> like to create a function to do this.
>
> Now comes the interesting bit. I'd like to use gimple sequences and
> after reading the internals documentation I put together something
> using gimple_seq_add_stmt to add the generated gimple to a new
> sequence. After this I tried inserting the sequence into the basic
> block's sequence with gsi_link_seq_before. It turns out there is no
> gsi_link_seq_before! I could probably write one myself but that begs
> the question, why is there no gsi_link_seq_before and what should I
> use instead?

gsi_insert_seq_before

> Thanks
>
> Gary
>
>

Regarding last question

2020-10-22 Thread Gary Oblock via Gcc

Never mind... assume I'm grumbling about the documentation.

;-(

Gary



Missing functionality

2020-10-22 Thread Gary Oblock via Gcc

I'm finishing up coding my patterns for the structure reorganization
optimization. They recognize certain instructions and replace them
with other instructions. I've got some code that generates gimple which is
inserted as it's created with gsi_insert_before.  This code is
something I'd like to use at other points in my optimization so I'd
like to create a function to do this.

Now comes the interesting bit. I'd like to use gimple sequences and
after reading the internals documentation I put together something
using gimple_seq_add_stmt to add the generated gimple to a new
sequence. After this I tried inserting the sequence into the basic
block's sequence with gsi_link_seq_before. It turns out there is no
gsi_link_seq_before! I could probably write one myself but that begs
the question, why is there no gsi_link_seq_before and what should I
use instead?

Thanks

Gary



Re: Where did my function go?

2020-10-20 Thread Gary Oblock via Gcc

> IPA transforms happen when get_body is called.  With LTO this also
> triggers reading the body from disk.  So if you want to see all bodies
> and work on them, you can simply call get_body on everything but it will
> result in increased memory use since everything will be loaded from disk
> and expanded (by inlining) at once instead of doing it on a per-function
> basis.

Jan,

Doing

FOR_EACH_FUNCTION_WITH_GIMPLE_BODY ( node) node->get_body ();

instead of

FOR_EACH_FUNCTION_WITH_GIMPLE_BODY ( node) node->get_untransformed_body ();

instantaneously breaks everything...

Am I missing something?

Gary

From: Jan Hubicka 
Sent: Tuesday, October 20, 2020 4:34 AM
To: Richard Biener 
Cc: GCC Development ; Gary Oblock 
Subject: Re: Where did my function go?



> > On Tue, Oct 20, 2020 at 1:02 PM Martin Jambor  wrote:
> > >
> > > Hi,
> > >
> > > On Tue, Oct 20 2020, Richard Biener wrote:
> > > > On Mon, Oct 19, 2020 at 7:52 PM Gary Oblock  
> > > > wrote:
> > > >>
> > > >> Richard,
> > > >>
> > > >> I guess that will work for me. However, since it
> > > >> was decided to remove an identical function,
> > > >> why weren't the calls to it adjusted to reflect it?
> > > >> If the call wasn't transformed that means it will
> > > >> be mapped at some later time. Is that mapping
> > > >> available to look at? Because using that would
> > > >> also be a potential solution (assuming call
> > > >> graph information exists for the deleted function.)
> > > >
> > > > I'm not sure what the transitional cgraph looks like
> > > > during WPA analysis (which is what we're talking about?),
> > > > but definitely the IL is unmodified in that state.
> > > >
> > > > Maybe Martin has an idea.
> > > >
> > >
> > > Exactly, the cgraph_edge is where the correct call information is
> > > stored until the inlining transformation phase calls
> > > cgraph_edge::redirect_call_stmt_to_callee on it - inlining is
> > > a special pass in this regard that performs this IPA-infrastructure
> > > function in addition to actual inlining.
> > >
> > > Call information in the cgraph means the callee itself but also information in
> > > e->callee->clone.param_adjustments which might be interesting for any
> > > struct-reorg-like optimizations (...and in future possibly in other
> > > transformation summaries).
> > >
> > > The late IPA passes are in a very unfortunate spot here since they run
> > > before the real-IPA transformation phases but after unreachable node
> > > removals and after clone materializations and so can see some but not
> > > all of the changes performed by real IPA passes.  The reason for that is
> > > good cache locality when late IPA passes are either not run at all or
> > > only look at a small portion of the compilation unit.  In such a case IPA
> > > transformations of a function are followed by all the late passes
> > > working on the same function.
> > >
> > > Late IPA passes are unfortunately second class citizens and I would
> > > strongly recommend not to use them since they do not fit into our
> > > otherwise robust IPA framework very well.  We could probably provide a
> > > mechanism that would allow late IPA passes to run all normal IPA
> > > transformations on a function so they could clearly see what they are
> > > looking at, but extensive use would slow compilation down so its use
> > > would be frowned upon at the very least.
> >
> > So IPA PTA does get_body () on the nodes it wants to analyze and I
> > thought that triggers any pending IPA transforms?
>
> Yes, it does (and get_untransformed_body does not)

And to correct Martin's explanation a bit: the late IPA passes are
intended to work, though I was mostly planning them for prototyping true
IPA passes and also possibly for implementing passes that inspect only
a few functions.

IPA transforms happen when get_body is called.  With LTO this also
triggers reading the body from disk.  So if you want to see all bodies
and work on them, you can simply call get_body on everything but it will
result in increased memory use since everything will be loaded from disk
and expanded (by inlining) at once instead of doing it on a per-function
basis.

get_body is simply meant to arrange the body on demand.  The pass manager
uses it before late passes are executed and ipa-pta uses it before it
builds constraints (that is not good for reasons described above).
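
A rough standalone model of the distinction described above may help: one accessor only loads the body, the other loads it and then applies whatever transforms are still queued. This is a sketch under stated assumptions, not GCC's implementation; ToyNode, its members, and make_toy_node are hypothetical names, and the "transform" is just a string edit.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy model of a call-graph node whose body is materialized on demand.
struct ToyNode {
    std::string on_disk;                                     // serialized body
    std::vector<std::function<void(std::string&)>> pending;  // queued transforms
    std::string body;                                        // in-memory body
    bool loaded = false;

    // Load the body but leave queued transforms unapplied.
    const std::string& get_untransformed_body() {
        if (!loaded) {
            body = on_disk;
            loaded = true;
        }
        return body;
    }

    // Load the body, then apply and clear every queued transform.
    const std::string& get_body() {
        get_untransformed_body();
        for (auto& transform : pending) transform(body);
        pending.clear();
        return body;
    }
};

// Hypothetical example node with one queued "transform".
ToyNode make_toy_node() {
    ToyNode n;
    n.on_disk = "call setupB";
    n.pending.push_back([](std::string& b) { b += " [redirected]"; });
    return n;
}
```

Calling get_body on every node at once corresponds to the memory cost mentioned above: every body is loaded and expanded before any of them is released.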

Clone materialization is also triggered by get_body. The clone
materialization pass mostly happens to remove unreachable function
bodies. I plan to get rid of it since, now that we are better at doing
IPA transforms, it already brings in a lot of bodies. For cc1plus it is
well over 1GB of memory.

Honza
>
> Honza
> >
> > Richard.
> >
> > > Martin
> > >

Where did my function go?

2020-10-16 Thread Gary Oblock via Gcc

I have a tiny program composed of a few functions
and one of those functions (setupB) has gone missing.
Since I need to walk its GIMPLE, this is a problem.

The program:

-- aux.h -
#include "stdlib.h"
typedef struct A A_t;
typedef struct A B_t;
struct A {
  int i;
  double x;
};

#define MAX(x,y) ((x)>(y) ? (x) : (y))

extern int max1( A_t *, size_t);
extern double max2( B_t *, size_t);
extern A_t *setupA( size_t);
extern B_t *setupB( size_t);
-- aux.c 
#include "aux.h"
#include "stdlib.h"

A_t *
setupA( size_t size)
{
  A_t *data = (A_t *)malloc( size * sizeof(A_t));
  size_t i;
  for( i = 0; i < size; i++ ) {
    data[i].i = rand();
    data[i].x = drand48();
  }
  return data;
}

B_t *
setupB( size_t size)
{
  B_t *data = (B_t *)malloc( size * sizeof(B_t));
  size_t i;
  for( i = 0; i < size; i++ ) {
    data[i].i = rand();
    data[i].x = drand48();
  }
  return data;
}

int
max1( A_t *array, size_t len)
{
  size_t i;
  int result = array[0].i;
  for( i = 1; i < len; i++  ) {
    result = MAX( array[i].i, result);
  }
  return result;
}

double
max2( B_t *array, size_t len)
{
  size_t i;
  double result = array[0].x;
  for( i = 1; i < len; i++  ) {
    result = MAX( array[i].x, result);
  }
  return result;
}
-- main.c -
#include "aux.h"
#include "stdio.h"

A_t *data1;

int
main(void)
{
  B_t *data2 = setupB(200);
  data1 = setupA(100);

  printf("First %d\n" , max1(data1,100));
  printf("Second %e\n", max2(data2,200));
}


Here is its GIMPLE dump:
(for the sole purpose of letting you see
with your own eyes that setupB is indeed missing)

Program:
  static struct A_t * data1;
struct A_t *  (size_t)

;; Function setupA (setupA, funcdef_no=4, decl_uid=4398, cgraph_uid=6, 
symbol_order=48) (executed once)

setupA (size_t size)
{
  size_t i;
  struct A_t * data;

   [local count: 118111600]:
  _1 = size_8(D) * 16;
  data_11 = malloc (_1);
  goto ; [100.00%]

   [local count: 955630225]:
  _2 = i_6 * 16;
  _3 = data_11 + _2;
  _4 = rand ();
  _3->i = _4;
  _5 = drand48 ();
  _3->x = _5;
  i_16 = i_6 + 1;

   [local count: 1073741824]:
  # i_6 = PHI <0(2), i_16(3)>
  if (i_6 < size_8(D))
goto ; [89.00%]
  else
goto ; [11.00%]

   [local count: 118111600]:
  return data_11;

}


int  (struct A_t *)

;; Function max1.constprop (max1.constprop.0, funcdef_no=1, decl_uid=4397, 
cgraph_uid=5, symbol_order=58) (executed once)

max1.constprop (struct A_t * array)
{
  size_t i;
  int result;
  size_t len;

   [local count: 118111600]:

   [local count: 118111600]:
  result_2 = array_1(D)->i;
  goto ; [100.00%]

   [local count: 955630225]:
  _4 = i_3 * 16;
  _5 = array_1(D) + _4;
  _6 = _5->i;
  result_8 = MAX_EXPR <_6, result_7>;
  i_9 = i_3 + 1;

   [local count: 1073741824]:
  # i_3 = PHI <1(2), i_9(3)>
  # result_7 = PHI 
  if (i_3 <= 99)
goto ; [89.00%]
  else
goto ; [11.00%]

   [local count: 118111600]:
  # result_10 = PHI 
  return result_10;

}


double  (struct B_t *)

;; Function max2.constprop (max2.constprop.0, funcdef_no=3, decl_uid=4395, 
cgraph_uid=3, symbol_order=59) (executed once)

max2.constprop (struct B_t * array)
{
  size_t i;
  double result;
  size_t len;

   [local count: 118111600]:

   [local count: 118111600]:
  result_2 = array_1(D)->x;
  goto ; [100.00%]

   [local count: 955630225]:
  _4 = i_3 * 16;
  _5 = array_1(D) + _4;
  _6 = _5->x;
  if (_6 > result_7)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 477815112]:

   [local count: 955630225]:
  # _10 = PHI 
  i_8 = i_3 + 1;

   [local count: 1073741824]:
  # i_3 = PHI <1(2), i_8(5)>
  # result_7 = PHI 
  if (i_3 <= 199)
goto ; [89.00%]
  else
goto ; [11.00%]

   [local count: 118111600]:
  # result_9 = PHI 
  return result_9;

}


int  (void)

;; Function main (main, funcdef_no=5, decl_uid=4392, cgraph_uid=1, 
symbol_order=25) (executed once)

main ()
{
  struct B_t * data2;

   [local count: 1073741824]:
  data2_6 = setupB (200);
  _1 = setupA (100);
  data1 = _1;
  _2 = max1 (_1, 100);
  printf ("First %d\n", _2);
  _3 = max2 (data2_6, 200);
  printf ("Second %e\n", _3);
  return 0;

}

The pass is invoked at this location in passes.def

  /* Simple IPA passes executed after the regular passes.  In WHOPR mode the
     passes are executed after partitioning and thus see just parts of the
     compiled unit.  */
  INSERT_PASSES_AFTER (all_late_ipa_passes)
  NEXT_PASS (pass_materialize_all_clones);
  NEXT_PASS (pass_ipa_type_escape_analysis);
  NEXT_PASS (pass_ipa_structure_reorg); <== my pass!
  NEXT_PASS (pass_ipa_prototype);
  NEXT_PASS (pass_ipa_pta);
  NEXT_PASS (pass_omp_simd_clone);
  TERMINATE_PASS_LIST (all_late_ipa_passes)

--
The program was compiled with these

Re: How to check reachable between blocks

2020-10-10 Thread Gary Oblock via Gcc

Andrew,

Dominance and reachability are two different but related things. It's trivial 
to come up with a simple example to show this.
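
One such example, as a standalone sketch: in a diamond-shaped CFG the join block is reachable from either arm, yet neither arm dominates it. The graph encoding and the helper names (Graph, reachable, dominates, diamond) are hypothetical; dominance is checked naively by removing the candidate dominator and testing whether the target can still be reached from the entry.

```cpp
#include <cassert>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // successor lists per block

// True if `to` can be reached from `from` without passing through `banned`
// (use -1 for "no banned block").
bool reachable(const Graph& g, int from, int to, int banned = -1) {
    if (from == banned || to == banned) return false;
    std::vector<bool> seen(g.size(), false);
    std::vector<int> work = {from};
    seen[from] = true;
    while (!work.empty()) {
        int n = work.back();
        work.pop_back();
        if (n == to) return true;
        for (int s : g[n]) {
            if (s != banned && !seen[s]) {
                seen[s] = true;
                work.push_back(s);
            }
        }
    }
    return false;
}

// `a` dominates `b` if every path from the entry to `b` passes through `a`.
bool dominates(const Graph& g, int entry, int a, int b) {
    if (a == b) return true;
    return reachable(g, entry, b) && !reachable(g, entry, b, a);
}

// Diamond: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3.
Graph diamond() {
    return {{1, 2}, {3}, {3}, {}};
}
```

Here block 3 is reachable from block 1, but block 1 does not dominate block 3 because the path 0, 2, 3 avoids it.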

Gary

From: Andrew Pinski 
Sent: Friday, October 9, 2020 8:13 PM
To: Jojo R 
Cc: GCC Development 
Subject: Re: How to check reachable between blocks

On Fri, Oct 9, 2020 at 8:01 PM Jojo R  wrote:
>
> Hi,
>
> Is there any API or common code to check whether one block is reachable
> from another?

Yes, the API in dominance.h.
Depending on where you use it, you might need to compute the dominance
information first, using the calculate_dominance_info function.
The function to do the check is dominated_by_p.

Thanks,
Andrew Pinski

>
> Thanks.
>
> Jojo

Re: Dominance information problem

2020-09-14 Thread Gary Oblock via Gcc

Erick,

I assume that this needs to be done on all the functions since
you mention "cfun".

Gary

From: Erick Ochoa 
Sent: Monday, September 14, 2020 12:10 AM
To: Gary Oblock ; gcc@gcc.gnu.org 
Subject: Re: Dominance information problem



Hi Gary,

I'm not 100% sure this will fix the problem, but in the past I have had
to call the following function:

  /* If dominator info is not available, we need to calculate it.  */
  if (!dom_info_available_p (CDI_DOMINATORS))
    calculate_dominance_info (CDI_DOMINATORS);

Basically dominance information was not available for cfun.
You might also need to call:

  if (dom_info_available_p (CDI_DOMINATORS))
    free_dominance_info (CDI_DOMINATORS);

Just before your pass is done. These were some calls I needed to make in
a different pass when I was working with dominators.
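
The compute-on-entry, free-on-exit pairing can be modeled outside GCC with a small scope guard. This is only a sketch of the pattern: ToyFunction and DomInfoGuard are hypothetical names, and unlike the snippet above the guard frees the information only if it was the one that computed it, which is a deliberate variation.

```cpp
#include <cassert>

// Toy stand-in for per-function state; not a GCC type.
struct ToyFunction {
    bool dom_available = false;
    int computations = 0;  // how many times the info was (re)computed
};

// Computes dominance info on entry if it is missing and releases it on
// scope exit, but only when this guard did the computing.
class DomInfoGuard {
public:
    explicit DomInfoGuard(ToyFunction& fn)
        : fn_(fn), owned_(!fn.dom_available) {
        if (owned_) {
            fn_.dom_available = true;
            ++fn_.computations;
        }
    }
    ~DomInfoGuard() {
        if (owned_) fn_.dom_available = false;
    }
    DomInfoGuard(const DomInfoGuard&) = delete;
    DomInfoGuard& operator=(const DomInfoGuard&) = delete;

private:
    ToyFunction& fn_;
    bool owned_;
};
```

In a pass that walks every function, one guard per function keeps the state of each function's dominance information the same after the pass as before it.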

On 12/09/2020 09:26, Gary Oblock wrote:
> I'm trying to do performance qualification for my structure
> reorganization optimization.
>
> I'm doing pretty straightforward stuff and I haven't at this point in
> time (qualifying the optimization,) modified the program. So I'm a
> little surprised this is failing.  Here is the code that's failing on
> the first iteration of the for loops:
>
>struct cgraph_node *node;
>FOR_EACH_FUNCTION_WITH_GIMPLE_BODY ( node)  {
>  struct function *func = DECL_STRUCT_FUNCTION ( node->decl);
>  push_cfun ( func);
>
>  class loop *loop;
>  FOR_EACH_LOOP_FN ( func, loop, LI_ONLY_INNERMOST )
>{
>  size_t num_bbs = loop->num_nodes;
>  basic_block *bbs = get_loop_body ( loop); // FAILS HERE!!!
>  :
>  stuff never reached
>
> How it's failing (in code from dominance.c) I'm guessing tells me the
> dominance information is messed up (unlikely) or needs to be
> recomputed. If I'm not wrong, how do I go about doing the latter?
>
> /* Return TRUE in case BB1 is dominated by BB2.  */
> bool
> dominated_by_p (enum cdi_direction dir, const_basic_block bb1, 
> const_basic_block bb2)
> {
>unsigned int dir_index = dom_convert_dir_to_idx (dir);
>struct et_node *n1 = bb1->dom[dir_index], *n2 = bb2->dom[dir_index];
>
>gcc_checking_assert (dom_computed[dir_index]); // <=== BOOM!
>
>if (dom_computed[dir_index] == DOM_OK)
>  return (n1->dfs_num_in >= n2->dfs_num_in
>   && n1->dfs_num_out <= n2->dfs_num_out);
>
>return et_below (n1, n2);
> }
>
>
> Thanks,
>
> Gary Oblock
>

Dominance information problem

2020-09-12 Thread Gary Oblock via Gcc

I'm trying to do performance qualification for my structure
reorganization optimization.

I'm doing pretty straightforward stuff and I haven't at this point in
time (qualifying the optimization,) modified the program. So I'm a
little surprised this is failing.  Here is the code that's failing on
the first iteration of the for loops:

  struct cgraph_node *node;
  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY ( node)  {
    struct function *func = DECL_STRUCT_FUNCTION ( node->decl);
    push_cfun ( func);

    class loop *loop;
    FOR_EACH_LOOP_FN ( func, loop, LI_ONLY_INNERMOST )
      {
        size_t num_bbs = loop->num_nodes;
        basic_block *bbs = get_loop_body ( loop); // FAILS HERE!!!
        :
        stuff never reached

The way it's failing (in the code from dominance.c below) tells me, I'm
guessing, that the dominance information is either messed up (unlikely)
or needs to be recomputed. If I'm not wrong, how do I go about doing the
latter?

/* Return TRUE in case BB1 is dominated by BB2.  */
bool
dominated_by_p (enum cdi_direction dir, const_basic_block bb1, 
const_basic_block bb2)
{
  unsigned int dir_index = dom_convert_dir_to_idx (dir);
  struct et_node *n1 = bb1->dom[dir_index], *n2 = bb2->dom[dir_index];

  gcc_checking_assert (dom_computed[dir_index]); // <=== BOOM!

  if (dom_computed[dir_index] == DOM_OK)
    return (n1->dfs_num_in >= n2->dfs_num_in
            && n1->dfs_num_out <= n2->dfs_num_out);

  return et_below (n1, n2);
}
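
The interval comparison in the DOM_OK branch can be sketched standalone: number each node of a dominator tree on entry and on exit of a depth-first walk, and a node is dominated by another exactly when its interval nests inside the other's. ToyDomTree and diamond_dom_tree are hypothetical names, and the tree used is the dominator tree of a four-block diamond, assumed here purely for illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Toy dominator tree with DFS entry/exit numbers per node.
struct ToyDomTree {
    std::vector<std::vector<int>> children;  // dominator-tree children
    std::vector<int> dfs_in, dfs_out;
    int clock = 0;

    explicit ToyDomTree(std::vector<std::vector<int>> kids)
        : children(std::move(kids)),
          dfs_in(children.size(), 0),
          dfs_out(children.size(), 0) {}

    // Assign entry and exit numbers by walking the tree depth first.
    void number(int n) {
        dfs_in[n] = clock++;
        for (int c : children[n]) number(c);
        dfs_out[n] = clock++;
    }

    // bb1 is dominated by bb2 when bb1's interval nests inside bb2's.
    bool dominated_by(int bb1, int bb2) const {
        return dfs_in[bb1] >= dfs_in[bb2] && dfs_out[bb1] <= dfs_out[bb2];
    }
};

// Dominator tree of a diamond CFG (0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3):
// block 0 immediately dominates blocks 1, 2 and 3.
ToyDomTree diamond_dom_tree() {
    ToyDomTree t({{1, 2, 3}, {}, {}, {}});
    t.number(0);
    return t;
}
```

The numbers are only meaningful after the walk has run, which is the role the dom_computed assertion plays in the function above.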


Thanks,

Gary Oblock







Re: A couple GIMPLE questions

2020-09-06 Thread Gary Oblock via Gcc

>Could you please get rid of this when posting on public mailing lists?

No, I have no control over that, but I'll give you the email of our corporate
IT if you want to complain to them...


From: Marc Glisse 
Sent: Saturday, September 5, 2020 11:29 PM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: A couple GIMPLE questions



On Sat, 5 Sep 2020, Gary Oblock via Gcc wrote:

> First off one of the questions just me being curious but
> second is quite serious. Note, this is GIMPLE coming
> into my optimization and not something I've modified.
>
> Here's the C code:
>
> type_t *
> do_comp( type_t *data, size_t len)
> {
>  type_t *res;
>  type_t *x = min_of_x( data, len);
>  type_t *y = max_of_y( data, len);
>
>  res = y;
>  if ( x < y ) res = 0;
>  return res;
> }
>
> And here's the resulting GIMPLE:
>
> ;; Function do_comp.constprop (do_comp.constprop.0, funcdef_no=5, 
> decl_uid=4392, cgraph_uid=3, symbol_order=68) (executed once)
>
> do_comp.constprop (struct type_t * data)
> {
>  struct type_t * res;
>  struct type_t * x;
>  struct type_t * y;
>  size_t len;
>
>   [local count: 1073741824]:
>
>   [local count: 1073741824]:
>  x_2 = min_of_x (data_1(D), 1);
>  y_3 = max_of_y (data_1(D), 1);
>  if (x_2 < y_3)
>goto ; [29.00%]
>  else
>goto ; [71.00%]
>
>   [local count: 311385128]:
>
>   [local count: 1073741824]:
>  # res_4 = PHI 
>  return res_4;
>
> }
>
> The silly question first. In the "if" stmt how does GCC
> get those probabilities? Which it shows as 29.00% and
> 71.00%. I believe they should both be 50.00%.

See the profile_estimate pass dump. One branch makes the function return
NULL, which makes gcc guess that it may be a bit less likely than the
other. Those are heuristics, which are tuned to help on average, but of
course they are sometimes wrong.

> The serious question is what is going on with this phi?
>res_4 = PHI 
>
> This makes zero sense practicality wise to me and how is
> it supposed to be recognized and used? Note, I really do
> need to transform the "0B" into something else for my
> structure reorganization optimization.

That's not a question? Are you asking why PHIs exist at all? They are the
standard way to represent merging in SSA representations. You can iterate
on the PHIs of a basic block, etc.

> CONFIDENTIALITY NOTICE: This e-mail message, including any attachments, is 
> for the sole use of the intended recipient(s) and contains information that 
> is confidential and proprietary to Ampere Computing or its subsidiaries. It 
> is to be used solely for the purpose of furthering the parties' business 
> relationship. Any unauthorized review, copying, or distribution of this email 
> (or any attachments thereto) is strictly prohibited. If you are not the 
> intended recipient, please contact the sender immediately and permanently 
> delete the original and any copies of this email and any attachments thereto.

Could you please get rid of this when posting on public mailing lists?

--
Marc Glisse

Re: A couple GIMPLE questions

2020-09-06 Thread Gary Oblock via Gcc

>That's not a question? Are you asking why PHIs exist at all?
>They are the standard way to represent merging in SSA
>representations. You can iterate on the PHIs of a basic block, etc.

Marc,

I first worked with the SSA form twenty years ago so yes I am
aware of what a phi is... I've just never seen a compiler eliminate
an assignment of a variable to a constant and jam the constant into
the phi where the SSA variable should be. What a phi is all about
is representing data flow and a constant in the phi doesn't seem
to be related to that. I can deal with this, but having to
crawl the phis looking for constants seems baroque. I would hope
there is a control that can suppress this or a transformation
that I can invoke to reverse it...
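
For what it's worth, crawling the phis need not be elaborate. The standalone model below treats a phi as a list of (value, predecessor block) pairs in which a value is either an SSA name or an invariant such as 0B, then finds and rewrites the invariants. ToyPhiArg, ToyPhi, and both helpers are hypothetical names and the block numbers are made up; in GCC itself I believe the corresponding accessors are gimple_phi_num_args, gimple_phi_arg_def, and SET_PHI_ARG_DEF, but treat that mapping as an assumption to check against the sources.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One incoming value of a phi.
struct ToyPhiArg {
    bool is_ssa_name;  // false for an invariant such as 0B
    std::string text;  // printed form of the value
    int pred_bb;       // predecessor block the value arrives from
};

struct ToyPhi {
    std::string result;
    std::vector<ToyPhiArg> args;
};

// Indices of the arguments that are constants rather than SSA names.
std::vector<std::size_t> constant_arg_indices(const ToyPhi& phi) {
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < phi.args.size(); ++i)
        if (!phi.args[i].is_ssa_name) out.push_back(i);
    return out;
}

// Replace every constant argument printed as `from` with `to`;
// returns how many arguments were rewritten.
int replace_constant_args(ToyPhi& phi, const std::string& from,
                          const std::string& to) {
    int changed = 0;
    for (ToyPhiArg& arg : phi.args) {
        if (!arg.is_ssa_name && arg.text == from) {
            arg.text = to;
            ++changed;
        }
    }
    return changed;
}
```

The point of the model is that a constant argument is just another incoming value tied to an edge, so it can be rewritten in place without reintroducing an assignment.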

Thanks,

Gary

From: Marc Glisse 
Sent: Saturday, September 5, 2020 11:29 PM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: A couple GIMPLE questions


On Sat, 5 Sep 2020, Gary Oblock via Gcc wrote:

> First off one of the questions just me being curious but
> second is quite serious. Note, this is GIMPLE coming
> into my optimization and not something I've modified.
>
> Here's the C code:
>
> type_t *
> do_comp( type_t *data, size_t len)
> {
>  type_t *res;
>  type_t *x = min_of_x( data, len);
>  type_t *y = max_of_y( data, len);
>
>  res = y;
>  if ( x < y ) res = 0;
>  return res;
> }
>
> And here's the resulting GIMPLE:
>
> ;; Function do_comp.constprop (do_comp.constprop.0, funcdef_no=5, 
> decl_uid=4392, cgraph_uid=3, symbol_order=68) (executed once)
>
> do_comp.constprop (struct type_t * data)
> {
>  struct type_t * res;
>  struct type_t * x;
>  struct type_t * y;
>  size_t len;
>
>   [local count: 1073741824]:
>
>   [local count: 1073741824]:
>  x_2 = min_of_x (data_1(D), 1);
>  y_3 = max_of_y (data_1(D), 1);
>  if (x_2 < y_3)
>goto ; [29.00%]
>  else
>goto ; [71.00%]
>
>   [local count: 311385128]:
>
>   [local count: 1073741824]:
>  # res_4 = PHI 
>  return res_4;
>
> }
>
> The silly question first. In the "if" stmt how does GCC
> get those probabilities? Which it shows as 29.00% and
> 71.00%. I believe they should both be 50.00%.

See the profile_estimate pass dump. One branch makes the function return
NULL, which makes gcc guess that it may be a bit less likely than the
other. Those are heuristics, which are tuned to help on average, but of
course they are sometimes wrong.

> The serious question is what is going on with this phi?
>res_4 = PHI 
>
> This makes zero sense practicality wise to me and how is
> it supposed to be recognized and used? Note, I really do
> need to transform the "0B" into something else for my
> structure reorganization optimization.

That's not a question? Are you asking why PHIs exist at all? They are the
standard way to represent merging in SSA representations. You can iterate
on the PHIs of a basic block, etc.

> CONFIDENTIALITY NOTICE: This e-mail message, including any attachments, is 
> for the sole use of the intended recipient(s) and contains information that 
> is confidential and proprietary to Ampere Computing or its subsidiaries. It 
> is to be used solely for the purpose of furthering the parties' business 
> relationship. Any unauthorized review, copying, or distribution of this email 
> (or any attachments thereto) is strictly prohibited. If you are not the 
> intended recipient, please contact the sender immediately and permanently 
> delete the original and any copies of this email and any attachments thereto.

Could you please get rid of this when posting on public mailing lists?

--
Marc Glisse

A couple GIMPLE questions

2020-09-05 Thread Gary Oblock via Gcc

First off, one of the questions is just me being curious, but the
second is quite serious. Note, this is GIMPLE coming
into my optimization and not something I've modified.

Here's the C code:

type_t *
do_comp( type_t *data, size_t len)
{
  type_t *res;
  type_t *x = min_of_x( data, len);
  type_t *y = max_of_y( data, len);

  res = y;
  if ( x < y ) res = 0;
  return res;
}

And here's the resulting GIMPLE:

;; Function do_comp.constprop (do_comp.constprop.0, funcdef_no=5, 
decl_uid=4392, cgraph_uid=3, symbol_order=68) (executed once)

do_comp.constprop (struct type_t * data)
{
  struct type_t * res;
  struct type_t * x;
  struct type_t * y;
  size_t len;

   [local count: 1073741824]:

   [local count: 1073741824]:
  x_2 = min_of_x (data_1(D), 1);
  y_3 = max_of_y (data_1(D), 1);
  if (x_2 < y_3)
goto ; [29.00%]
  else
goto ; [71.00%]

   [local count: 311385128]:

   [local count: 1073741824]:
  # res_4 = PHI 
  return res_4;

}

The silly question first: in the "if" stmt, how does GCC
get those probabilities, which it shows as 29.00% and
71.00%? I believe they should both be 50.00%.

The serious question is what is going on with this phi?
res_4 = PHI 

Practically speaking, this makes zero sense to me, and how is
it supposed to be recognized and used? Note, I really do
need to transform the "0B" into something else for my
structure reorganization optimization.

Thanks,

Gary Oblock





A silly question regarding function types

2020-09-03 Thread Gary Oblock via Gcc

Note, this isn't a problem; rather, it's something that puzzles me.

On walking a function type's argument types this way

for ( arg = TYPE_ARG_TYPES ( func_type);
      arg != NULL;
      arg = TREE_CHAIN ( arg))
  {
    .
    .
  }

I noticed an extra void argument, one that doesn't exist in the
source, tacked on the end.

I then noticed other code doing this (which I copied):

for ( arg = TYPE_ARG_TYPES ( func_type);
      arg != NULL && arg != void_list_node;
      arg = TREE_CHAIN ( arg))
  {
    .
    .
  }

What is going on here???
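
As far as I understand the comments in tree.h, the trailing void entry is a shared sentinel: an argument list that ends in void_list_node belongs to a prototype with a fixed number of arguments, while a list that simply ends in NULL belongs to a variadic function. The standalone model below mimics that convention with a linked list; ToyTypeList, void_list_sentinel, count_real_args, and is_variadic are hypothetical names, not GCC's.

```cpp
#include <cassert>
#include <string>

// Toy TREE_LIST-like node: one argument type plus a chain pointer.
struct ToyTypeList {
    std::string type;
    const ToyTypeList* chain;
};

// Shared terminator, playing the role of void_list_node.
const ToyTypeList void_list_sentinel{"void", nullptr};

// Count declared arguments, skipping the sentinel if present.
int count_real_args(const ToyTypeList* args) {
    int n = 0;
    for (const ToyTypeList* a = args;
         a != nullptr && a != &void_list_sentinel;
         a = a->chain)
        ++n;
    return n;
}

// A non-empty list that does not end in the sentinel models "...".
bool is_variadic(const ToyTypeList* args) {
    if (args == nullptr) return false;  // no prototype information at all
    const ToyTypeList* last = args;
    while (last->chain != nullptr) last = last->chain;
    return last != &void_list_sentinel;
}
```

Under this model `int f (int, double)` is the chain int, double, sentinel, whereas `int printf (const char *, ...)` is the single node const char * followed by NULL, which is why the copied loop tests for both terminators.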

Thanks,

Gary




Re: Types are confused in inlining

2020-09-03 Thread Gary Oblock via Gcc

>This is absolutely not enough information to guess at the
>issue ;)

That's fair, I was hoping some mad genius out there would confess to a
fubar_adjustment phase that was probably at fault. 

>I suggest you break at the return stmt of make_ssa_name_fn
>looking for t->base.u.version == 101 to see where and with
>which type _101 is created, from there watch *>typed.type
>in case something adjusts the type.

I did the former but I used ssa_name_nodes_created
instead. Which though harder to get at, is unique.
Regarding the latter... I guess... But, at various times (on certain
OS versions of certain machines) watchpoints have been
a bit dubious. I assume on a recent Ubuntu release
on an Intel i7 core this wouldn't be the case???

Thanks,

Gary

From: Richard Biener 
Sent: Wednesday, September 2, 2020 11:31 PM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: Types are confused in inlining



On Wed, Sep 2, 2020 at 10:19 PM Gary Oblock via Gcc  wrote:
>
> I'm not accusing inlining of having problems but I really
> need to understand what's going on in this situation so I can
> fix my optimization.
>
> The error given is:
> main.c: In function ‘main’:
> main.c:5:1: error: non-trivial conversion in ‘ssa_name’
> 5 | main(void)
>   | ^
> struct type_t *
> unsigned long
> _101 = dedangled_97;
> during GIMPLE pass: fixup_cfg
> etc.
> etc.
>
> I put a conditional breakpoint in gdb where both
> _101 and dedangled_97 were created and lo
> and behold they were both set to "unsigned long".
> Does anybody have a clue as to how "_101" got
> changed from "unsigned long" to "struct type_t *"?
> (Note, the latter is a meaningful type in my program.
> I'm trying to replace all instances of the former as
> part of structure reorganization optimization.) I should
> mention that this GIMPLE stmt is the one that moves
> the value computed in an inlined function into the body
> of code where the inlining took place.

This is absolutely not enough information to guess at the
issue ;)

I suggest you break at the return stmt of make_ssa_name_fn
looking for t->base.u.version == 101 to see where and with
which type _101 is created, from there watch *>typed.type
in case something adjusts the type.

> Thanks,
>
> Gary Oblock
>

Types are confused in inlining

2020-09-02 Thread Gary Oblock via Gcc

I'm not accusing inlining of having problems but I really
need to understand what's going on in this situation so I can
fix my optimization.

The error given is:
main.c: In function ‘main’:
main.c:5:1: error: non-trivial conversion in ‘ssa_name’
5 | main(void)
  | ^
struct type_t *
unsigned long
_101 = dedangled_97;
during GIMPLE pass: fixup_cfg
etc.
etc.

I put a conditional breakpoint in gdb where both
_101 and dedangled_97 were created and lo
and behold they were both set to "unsigned long".
Does anybody have a clue as to how "_101" got
changed from "unsigned long" to "struct type_t *"?
(Note, the latter is a meaningful type in my program.
I'm trying to replace all instances of the former as
part of structure reorganization optimization.) I should
mention that this GIMPLE stmt is the one that moves
the value computed in an inlined function into the body
of code where the inlining took place.

Thanks,

Gary Oblock






Re: Questions regarding update_stmt and release_ssa_name_fn.

2020-08-27 Thread Gary Oblock via Gcc

> If x_2 is a default def then the IL isn't correct in the first place.  I doubt
> it is that way, btw. - we have verifiers that would blow up if it would.

Richard,

I'm just sharing this so you can tell me whether or not I'm going
crazy. ;-)

This little function is finding that arr_2 = PHI 
is problematic.

void
wolf_fence (
    Info *info // Pass level global info (might not use it)
  )
{
  struct cgraph_node *node;

  fprintf( stderr,
           "Wolf Fence: Find wolf for default defs with non nop defines\n");

  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY ( node)
    {
      struct function *func = DECL_STRUCT_FUNCTION ( node->decl);
      push_cfun ( func);

      unsigned int len = SSANAMES ( func)->length ();
      for ( unsigned int i = 0; i < len; i++)
        {
          tree ssa_name = (*SSANAMES ( func))[i];
          if ( ssa_name == NULL ) continue;
          if ( ssa_defined_default_def_p ( ssa_name) )
            {
              gimple *def_stmt =
                SSA_NAME_DEF_STMT ( ssa_name);
              if ( !gimple_nop_p ( def_stmt) )
                {
                  fprintf ( stderr, "Wolf fence caught :");
                  print_gimple_stmt ( stderr, def_stmt, 0);
                  gcc_assert (0);
                }
            }
        }
      pop_cfun ();
    }
  fprintf( stderr, "Wolf Fence: Didn't find wolf!\n");
}

This is run at the very start of the structure reorg pass
before any of my code did anything at all (except initiate
the structure info with a few flags and the like.)

Here's the C code:

- aux.h ---
#include "stdlib.h"
typedef struct type type_t;
struct type {
  double x;
  double y;
};

extern type_t *min_of_x( type_t *, size_t);
- aux.c ---
#include "aux.h"
#include "stdlib.h"

type_t *
min_of_x( type_t *arr, size_t len)
{
  type_t *end_of = arr + len;
  type_t *loc = arr;
  double result = arr->x;
  arr++;
  for( ; arr < end_of ; arr++  ) {
    double value = arr->x;
    if (  value < result ) {
      result = value;
      loc = arr;
    }
  }
  return loc;
}
- main.c --
#include "aux.h"
#include "stdio.h"

int
main(void)
{
  size_t len = 1;
  type_t *data = (type_t *)malloc( len * sizeof(type_t));
  int i;
  for( i = 0; i < len; i++ ) {
    data[i].x = drand48();
  }

  type_t *min_x;
  min_x = min_of_x( data, len);

  if ( min_x == 0 ) {
    printf("min_x error\n");
    exit(-1);
  }

  printf("min_x %e\n" , min_x->x);
}
---
Here's the GIMPLE coming into the structure reorganization pass:

Program:

;; Function min_of_x (min_of_x, funcdef_no=0, decl_uid=4391, cgraph_uid=2, 
symbol_order=23) (executed once)

min_of_x (struct type_t * arr, size_t len)
{
  double value;
  double result;
  struct type_t * loc;
  struct type_t * end_of;

   [local count: 118111600]:
  _1 = len_7(D) * 16;
  end_of_9 = arr_8(D) + _1;
  result_11 = arr_8(D)->x;
  arr_12 = arr_8(D) + 16;
  goto ; [100.00%]

   [local count: 955630225]:
  value_14 = arr_2->x;
  if (result_6 > value_14)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 477815112]:

   [local count: 955630225]:
  # loc_3 = PHI 
  # result_5 = PHI 
  arr_15 = arr_2 + 16;

   [local count: 1073741824]:
  # arr_2 = PHI 
  # loc_4 = PHI 
  # result_6 = PHI 
  if (arr_2 < end_of_9)
goto ; [89.00%]
  else
goto ; [11.00%]

   [local count: 118111600]:
  # loc_13 = PHI 
  return loc_13;

}



;; Function main (main, funcdef_no=1, decl_uid=4389, cgraph_uid=1, 
symbol_order=5) (executed once)

main ()
{
  struct type_t * min_x;
  int i;
  struct type_t * data;

   [local count: 10737416]:
  data_10 = malloc (16);
  goto ; [100.00%]

   [local count: 1063004409]:
  _1 = _4 * 16;
  _2 = data_10 + _1;
  _3 = drand48 ();
  _2->x = _3;
  i_18 = i_6 + 1;

   [local count: 1073741824]:
  # i_6 = PHI <0(2), i_18(3)>
  _4 = (long unsigned int) i_6;
  if (_4 != 1)
goto ; [99.00%]
  else
goto ; [1.00%]

   [local count: 10737416]:
  min_x_12 = min_of_x (data_10, 1);
  if (min_x_12 == 0B)
goto ; [0.04%]
  else
goto ; [99.96%]

   [local count: 4295]:
  __builtin_puts (&"min_x error"[0]);
  exit (-1);

   [local count: 10733121]:
  _5 = min_x_12->x;
  printf ("min_x %e\n", _5);
  return 0;

}

Am I crazy?

Thanks,

Gary









From: Richard Biener 
Sent: Thursday, August 27, 2020 2:04 AM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: Questions regarding update_stmt and release_ssa_name_fn.


On Wed, Aug 26, 2020 at 11:32 PM Gary Oblock via Gcc  wrote:
>
> I'm having some major grief with a few related things that I'm try to
> do. The mostly revolve around

Re: Questions regarding update_stmt and release_ssa_name_fn.

2020-08-27 Thread Gary Oblock via Gcc

Richard,

>You need to call update_stmt () if you change SSA operands to
>sth else.

I'm having trouble parsing the "sth else" above. Could you
please rephrase it if it's important to your point? I take it
that you mean: if you change any SSA operand of a statement,
then update that statement.

Thanks,

Gary

From: Richard Biener 
Sent: Thursday, August 27, 2020 2:04 AM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: Questions regarding update_stmt and release_ssa_name_fn.


On Wed, Aug 26, 2020 at 11:32 PM Gary Oblock via Gcc  wrote:
>
> I'm having some major grief with a few related things that I'm trying
> to do. They mostly revolve around trying to change the type of an SSA
> name (which I've given up on in favor of creating new SSA names and
> replacing the ones I wanted to change). However, this too seems to
> have its own issues.
>
> In one problematic case in particular, I'm seeing a sequence like:
>
> foo_3 = mumble_1 op mumble_2
>
> bar_5 = foo_3 op baz_4
>
> when I replace foo_3 with foo_4 (which has the needed new type).
>
> I'm seeing a later verification phase think
>
> bar_5 = foo_4 op baz_4
>
> is still associated with the foo_3.
>
> Should the transformation above be associated with update_stmt and/or
> release_ssa_name_fn? And if they are both needed, is there a required
> order? Note, when I try using them, I'm seeing some malformed
> tree operands that die in horrible ways.
>
> By the way, I realize I can probably simply create a new GIMPLE stmt
> from scratch to replace the ones I'm modifying but this will cause
> some significant code bloat and I want to avoid that if at all
> possible.

You need to call update_stmt () if you change SSA operands to
sth else.

> There is an additional wrinkle to this problem with C code like this
>
> void
> whatever ( int x, .. )
> {
>   :
>   x++;
>   :
> }
>
> I'm seeing x_2 being thought of as a default definition in the following
> GIMPLE stmt when it's clearly not, since it's defined by the statement.
>
>   x_2 = X_1 + 4
>
> My approach has been to simply make the SSA name that replaces x_2 a
> normal SSA name and not a default def. Is this not reasonable and
> correct?

If x_2 is a default def then the IL isn't correct in the first place.  I doubt
it is that way, btw. - we have verifiers that would blow up if it would.

Richard.

>
> Thanks,
>
> Gary Oblock
>
> Gary

Questions regarding update_stmt and release_ssa_name_fn.

2020-08-26 Thread Gary Oblock via Gcc

I'm having some major grief with a few related things that I'm trying
to do. They mostly revolve around trying to change the type of an SSA
name (which I've given up on in favor of creating new SSA names and
replacing the ones I wanted to change). However, this too seems to
have its own issues.

In one problematic case in particular, I'm seeing a sequence like:

foo_3 = mumble_1 op mumble_2

bar_5 = foo_3 op baz_4

when I replace foo_3 with foo_4 (which has the needed new type).

I'm seeing a later verification phase think

bar_5 = foo_4 op baz_4

is still associated with the foo_3.

Should the transformation above be associated with update_stmt and/or
release_ssa_name_fn? And if they are both needed, is there a required
order? Note, when I try using them, I'm seeing some malformed
tree operands that die in horrible ways.

By the way, I realize I can probably simply create a new GIMPLE stmt
from scratch to replace the ones I'm modifying but this will cause
some significant code bloat and I want to avoid that if at all
possible.

There is an additional wrinkle to this problem with C code like this

void
whatever ( int x, .. )
{
  :
  x++;
  :
}

I'm seeing x_2 being thought of as a default definition in the following
GIMPLE stmt when it's clearly not, since it's defined by the statement.

  x_2 = X_1 + 4

My approach has been to simply make the SSA name that replaces x_2 a
normal SSA name and not a default def. Is this not reasonable and
correct?

Thanks,

Gary Oblock

Gary






Re: Silly question about pass numbers

2020-08-12 Thread Gary Oblock via Gcc

Segher,

If this was on the mainline and not in the middle of a
nontrivial optimization effort I would have filed a bug report
and not asked a silly question. 

I'm at a total loss as to how I could have caused the pass
numbers to be backward... but at least I have confirmed that's
what seems to be happening. It's not doing any harm to
anything except the sanity of anybody looking at the pass
dumps...

Thanks,

Gary

From: Segher Boessenkool 
Sent: Wednesday, August 12, 2020 5:45 PM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: Silly question about pass numbers


Hi!

On Wed, Aug 12, 2020 at 08:26:34PM +, Gary Oblock wrote:
> The files are from the same run:
> -rw-rw-r-- 1 gary gary  3855 Aug 12 12:49 exe.ltrans0.ltrans.074i.cp
> -rw-rw-r-- 1 gary gary 16747 Aug 12 12:49 
> exe.ltrans0.ltrans.087i.structure-reorg
>
> By the time .cp was created inlining results in only main existing.
> In the .structure-reorg file there are three functions.

It does not matter what time the dump files were last opened (or created
or written to).

> Not only am I seeing things in .cp (beyond a shadow of a doubt)
> that were created in structure reorganization, inlining has also
> been done and its pass number is 79!
>
> Note, this is not hurting me in any way other than violating my
> beliefs about pass numbering.

I cannot check on any of that because this is not in mainline GCC?
It is a lot easier if you ask us about problems we may be able to
reproduce ;-)  Like maybe something with only cp and inline?

Segher

Re: Silly question about pass numbers

2020-08-12 Thread Gary Oblock via Gcc

Segher,

First, thanks for replying.

The files are from the same run:
-rw-rw-r-- 1 gary gary  3855 Aug 12 12:49 exe.ltrans0.ltrans.074i.cp
-rw-rw-r-- 1 gary gary 16747 Aug 12 12:49 
exe.ltrans0.ltrans.087i.structure-reorg

By the time .cp was created inlining results in only main existing.
In the .structure-reorg file there are three functions.

Not only am I seeing things in .cp (beyond a shadow of a doubt)
that were created in structure reorganization, inlining has also
been done and its pass number is 79!

Note, this is not hurting me in any way other than violating my
beliefs about pass numbering.

Thanks again,

Gary

From: Segher Boessenkool 
Sent: Wednesday, August 12, 2020 1:09 PM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: Silly question about pass numbers

On Tue, Aug 11, 2020 at 08:27:29PM +, Gary Oblock via Gcc wrote:
> For these two dump files:
>
> exe.ltrans0.ltrans.074i.cp
>
> and
>
> exe.ltrans0.ltrans.087i.structure-reorg
>
> doesn't the ".074i." mean that this dump was created
> before the ".087i." dump?

It means that the 074 pass is earlier than the 087 pass in the pass
pipeline.
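
To make the naming convention concrete, here is a minimal standalone sketch (not GCC code) that pulls the pass number and dump kind out of a dump file name. It assumes the usual three-digit number followed by a kind letter and ".passname"; the helper name dump_pass_number is made up for illustration. The number only says where the pass sits in the pipeline, not when the file was written.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the pass number embedded in a dump file
   name such as "exe.ltrans0.ltrans.074i.cp", or -1 if none is found.
   If kind is non-NULL it receives the dump kind letter
   ('t' = tree, 'i' = IPA, 'r' = RTL).  */
static int
dump_pass_number (const char *name, char *kind)
{
  const char *p = name;

  while ((p = strchr (p, '.')) != NULL)
    {
      p++;  /* step past the dot */
      if (isdigit ((unsigned char) p[0])
          && isdigit ((unsigned char) p[1])
          && isdigit ((unsigned char) p[2])
          && (p[3] == 't' || p[3] == 'i' || p[3] == 'r')
          && p[4] == '.')
        {
          if (kind != NULL)
            *kind = p[3];
          return (p[0] - '0') * 100 + (p[1] - '0') * 10 + (p[2] - '0');
        }
    }
  return -1;
}
```

For the two files in question this yields 74 and 87, both with kind 'i'.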

> If so then why does the ".074i." show GIMPLE that was
> created in the structure-reorg pass?

Were they created at the same time, or is one of the dump files old?

Are you looking at the same functions?

Etc.

Segher

Why am I seeing free.2 instead of free in exe.ltrans0.ltrans.s??

2020-08-11 Thread Gary Oblock via Gcc

Note, I'm getting close to getting my part of the structure reorganization
optimization minimally functional (my question about value range propagation 
remains open since I re-enabled a couple of optimizations to bypass it.) 
Therefore this is actually important for me to resolve.

I obviously generated calls to the standard library function "free."
Also, similar calls to malloc didn't have that ".2" appended on them.
I'd normally handle a problem like this with a Google search but
"free.2" gets turned into "free to" and I get an insane number of
junk search results.

This is obviously an easy question to answer for those that
have seen something similar in the past.

Thanks,

Gary Oblock





Silly question about pass numbers

2020-08-11 Thread Gary Oblock via Gcc

For these two dump files:

exe.ltrans0.ltrans.074i.cp

and

exe.ltrans0.ltrans.087i.structure-reorg

doesn't the ".074i." mean that this dump was created
before the ".087i." dump?

If so then why does the ".074i." show GIMPLE that was
created in the structure-reorg pass?

Thanks,

Gary



Problem cropping up in Value Range Propogation

2020-08-10 Thread Gary Oblock via Gcc

I'm trying to debug a problem cropping up in value range propagation.
Ironically, I probably own an original 1995 copy of the paper it's
based on, but that's not going to be much help since I'm lost in the
weeds.  VRP is running on GIMPLE statements generated by my structure
reorg optimization.

Here's the GIMPLE dump:

Function max_of_y (max_of_y, funcdef_no=1, decl_uid=4391, cgraph_uid=2, 
symbol_order=20) (executed once)

max_of_y (unsigned long data, size_t len)
{
  double value;
  double result;
  size_t i;

   [local count: 118111600]:
  field_arry_addr_14 = _reorg_base_var_type_t.y;
  index_15 = (sizetype) data_27(D);
  offset_16 = index_15 * 8;
  field_addr_17 = field_arry_addr_14 + offset_16;
  field_val_temp_13 = MEM  [(void *)field_addr_17];
  result_8 = field_val_temp_13;
  goto ; [100.00%]

   [local count: 955630225]:
  _1 = i_3 * 16;
  PPI_rhs1_cast_18 = (unsigned long) data_27(D);
  PPI_rhs2_cast_19 = (unsigned long) _1;
  PtrPlusInt_Adj_20 = PPI_rhs2_cast_19 / 16;
  PtrPlusInt_21 = PPI_rhs1_cast_18 + PtrPlusInt_Adj_20;
  dedangled_27 = (unsigned long) PtrPlusInt_21;
  field_arry_addr_23 = _reorg_base_var_type_t.y;
  index_24 = (sizetype) dedangled_27;
  offset_25 = index_24 * 8;
  field_addr_26 = field_arry_addr_23 + offset_25;
  field_val_temp_22 = MEM  [(void *)field_addr_26];
  value_11 = field_val_temp_22;
  if (result_5 < value_11)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 477815112]:

   [local count: 955630225]:
  # result_4 = PHI 
  i_12 = i_3 + 1;

   [local count: 1073741824]:
  # i_3 = PHI <1(2), i_12(5)>
  # result_5 = PHI 
  if (i_3 < len_9(D))
goto ; [89.00%]
  else
goto ; [11.00%]

   [local count: 118111600]:
  # result_10 = PHI 
  return result_10;
}

The failure in VRP is occurring on

offset_16 = data_27(D) * 8;

which is the result of two adjacent statements above

  index_15 = (sizetype) data_27(D);
  offset_16 = index_15 * 8;

being merged together.

Note, the types of index_15 and offset_16 are sizetype and data_27 is
unsigned long.
The error message is:

internal compiler error: tree check: expected class ‘type’, have ‘exceptional’ 
(error_mark) in to_wide,

Things only start to look broken in value_range::lower_bound in
value-range.cc when

return wi::to_wide (t);

is passed error_mark_node in t. It's getting it from m_min just above.
My observation is that m_min is not always error_mark_node. In fact, I
seem to think you need to use set_varying to get this to even happen.

Note, the ssa_propagation_engine processed the statement "offset_16 =
data..."  multiple times before failing on it. What oh what is
happening and how in the heck did I cause it???

Please, somebody throw me a life preserver on this.

Thanks,

Gary



A problem with DECL_FIELD_OFFSET in something I declared

2020-08-06 Thread Gary Oblock via Gcc

This problem is from my structure reorganization optimization
code (simplified and cleaned up to illustrate the problem).

Here's what's happening at a high level.

From the user program:

typedef struct type type_t;
struct type {
  double x;
  double y;
};

I'll be creating:

typedef struct type_prime type_prime_t;
struct type_prime {
  double *x;
  double *y;
};

and

type_prime_t base_for_type_prime;

Later, when running partial redundancy elimination,
PRE attempts to access the DECL_FIELD_OFFSET of field
x and finds that the DECL_FIELD_OFFSET is NULL.

Note, I suppose I could run gdb to track down in the storage
layout code what caused it to bypass place_field (where the
offset probably should be initialized) but I'd still not know
what I'm doing wrong below.

Please, somebody have a look and let me know.

Thanks,

Gary Oblock



// An alternate method of creating a record type
// I tried both ways
#define LANG_HOOKS 1

static void
create_a_new_type_and_base_var ( Info_t *info, tree type)
{
  // For the sake of this code "ri" is just a place with
  // interesting stuff about "type"
  ReorgType_t *ri = get_reorgtype_info ( type, info);
  if ( ri != NULL ) {

#if FROM_HOOKS
    tree reorg_type_prime =
      lang_hooks.types.make_type (RECORD_TYPE);
#else
    tree reorg_type_prime =
      build_variant_type_copy ( type MEM_STAT_DECL);
#endif

    ri->reorg_ver_type = reorg_type_prime;

    // Code to create name of reorg_type_prime ... irrelevant
    //   :
    TYPE_NAME ( reorg_type_prime) = get_identifier ( rec_name);

    // Build the new pointer type fields

    tree field;
    tree new_fields = NULL;
    for (
         #if LANG_HOOKS
         field = TYPE_FIELDS ( type);
         #else
         field = TYPE_FIELDS ( reorg_type_prime);
         #endif
         field;
         field = DECL_CHAIN ( field))
      {
        tree tree_type = TREE_TYPE ( field);
        tree new_fld_type = build_pointer_type ( tree_type);
        // I use the same name as the field of type
        tree new_decl =
          build_decl ( DECL_SOURCE_LOCATION (field),
                       FIELD_DECL, DECL_NAME (field), new_fld_type);
        DECL_CONTEXT ( new_decl) = reorg_type_prime;
        layout_decl ( new_decl, 0);

        // I might be missing a bunch of attributes (see tree-nested.c:899)

        DECL_CHAIN ( new_decl) = new_fields;
        new_fields = new_decl;
      }

    // store reversed fields into reorg_type_prime (having them in the same
    // order in as in type makes sense.)
    TYPE_FIELDS ( reorg_type_prime) = NULL;
    tree next_fld;
    for ( field = new_fields; field; field = next_fld)
      {
        next_fld = DECL_CHAIN ( field);
        DECL_CHAIN ( field) = TYPE_FIELDS ( reorg_type_prime);
        TYPE_FIELDS ( reorg_type_prime) = field;
      }
    // Fix-up the layout
    layout_type ( reorg_type_prime);

    // Create the base element for the transformed type.
    tree base_var =
      build_decl ( UNKNOWN_LOCATION, VAR_DECL, NULL_TREE, reorg_type_prime);

    // More name creation code here... irrelevant
    //   :
    DECL_NAME ( base_var) = get_identifier ( base_name);

    // Some attributes I really don't understand...
    TREE_STATIC ( base_var) = 1;
    TREE_ADDRESSABLE  ( base_var) = 1;
    DECL_NONALIASED ( base_var) = 1;
    SET_DECL_ALIGN ( base_var, TYPE_ALIGN ( ri->reorg_ver_type));

    // Is this necessary guys???
    varpool_node::finalize_decl ( base_var);

    relayout_decl ( base_var);

    ri->base = base_var;
  }
}





Re: Gcc Digest, Vol 5, Issue 52

2020-07-29 Thread Gary Oblock via Gcc

Richard,

Thanks, I had no idea about the immediate uses mechanism and
using it will speed things up a bit and make them more reliable.
However, I'll still have to scan the LHS of each assignment unless
there's a mechanism to traverse all the SSAs for a function.
Note, I assume there is also a mechanism to add and remove
immediate use instances. If I can find it I'll post a question to the list.

I do the patching on a per function basis immediately after
applying the transforms. It was going to be a scan of all the
GIMPLE. What you've told me might make it a bit of a misnomer
to call what I intend to do now a scan. The default defs problem
happened when the original scan tried to simply modify the type
of a default def. There didn't seem to be a way of doing this, and I've
since learned that a default def is in fact associated not with a
type but with a declaration. Note, just modifying the type of normal
ssa names seemed to work, but I can't in fact know it actually would have.

I'm not sure I can do justice to the other transformations but
here is one larger example. Note, since I'm currently only
dealing with dynamically allocated arrays I'll only see "a->f" and
not "a[i].f" so you are getting the former.

 _2 = _1->f

turns into

get_field_arry_addr: new_3 = array_base.f_array_field
get_index          : new_4 = (sizetype)_1
get_offset         : new_5 = new_4 * size_of_f_element
get_field_addr     : new_6 = new_3 + new_5   // uses pointer arith
temp_set           : new_7 = * new_6
final_set          : _2    = new_7
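
The six steps above can be mimicked in plain C. This is only a sketch under assumed names: type_prime_t, array_base, and load_y are illustrative and are not the identifiers the pass actually generates, and the field is taken to be a double as in the earlier test programs.

```c
#include <stddef.h>

/* Assumed transformed layout: one array per original field.  */
typedef struct type_prime
{
  double *x;
  double *y;
} type_prime_t;

/* Stands in for array_base in the listing above.  */
static type_prime_t array_base;

/* Index form of "_2 = _1->y", where _1 is now an index, not a pointer.  */
static double
load_y (size_t index)
{
  /* get_field_arry_addr: address of the per-field array */
  char *field_array_addr = (char *) array_base.y;
  /* get_index, get_offset: scale the index by the element size */
  size_t offset = index * sizeof (double);
  /* get_field_addr: pointer arithmetic in bytes */
  double *field_addr = (double *) (field_array_addr + offset);
  /* temp_set, final_set */
  double temp = *field_addr;
  return temp;
}
```

Once array_base.y points at an array of doubles, load_y (i) reads element i of that array, which is what "a->y" meant for the i-th structure before the transformation.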

I hope that's sufficient to satisfy your curiosity because the only other
large transformation currently coded is that for the malloc which would
take me quite a while to put together an example of. Note, these are
shown in the HL design doc which I sent you. Though like battle plans,
no design no matter how good survives coding intact.

Thanks again,

Gary

From: Richard Biener 
Sent: Wednesday, July 29, 2020 5:42 AM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: Gcc Digest, Vol 5, Issue 52

On Tue, Jul 28, 2020 at 11:02 PM Gary Oblock  wrote:
>
> Richard,
>
> I wasn't aware of release_defs so I'll add that for certain.
>
> When I do a single transformation as part of the transformation pass
> each transformation uses the correct types internally but on the edges
> emits glue code that will be transformed via a dangling type fixup pass.
>
> For example when adding something to a pointer:
>
> _2 = _1 + k
>
> Where _1 and _2 have the old pointer types, I'll emit
>
> new_3 = (type_convert)_1
> new_4 = (type_convert)k
> new_5 = new_4 / struct_size // truncating divide
> new_6 = new_3 + new_5
> _2    = (type_convert)new_6
>
> Note, the casting is done with CONVERT_EXPR
> which is harmless when I create new ssa names
> and set the appropriate operands in

OK, so you're funneling the new "index" values through
the original pointer variable _1?  But then I don't see
where the patching up of SSA names and the default
def issue happens.

> new_3 = (type_convert)_1
> _2 = (type_convert)new_6
>
> to
>
> new_3 = new_7
> new_8 = new_6
>
> Now I might actually find via a look up that
> _1 and/or _2 were already mapped to
> new_7 and/or new_8 but that's irrelevant.
>
> To intermix the applications of the transformations and
> the patching of these dangling types seems like I'd
> need to do an insanely ugly recursive walk of each functions
> body.
>
> I'm curious when you mention def-use I'm not aware of
> GCC using def-use chains except at the RTL level.
> Is there a def-use mechanism in GIMPLE because
> in SSA form it's trivial to find the definition of
> a temp variable but non trivial to find the use of
> it. Which I think is a valid reason for fixing up the
> dangling types of temps in a scan.

In GIMPLE SSA we maintain a list of uses for each SSA
def, available via the so called immediate-uses.  You
can grep for uses of FOR_EACH_IMM_USE[_FAST]

>
> Note, I'll maintain a mapping like you suggest but not use
> it at transformation application time. Furthermore,
> I'll initialize the mapping with the default defs from
> the DECLs so I won't have to mess with them on the fly.
> Now at the time in the scan when I find uses and defs of
> a dangling type I'd like to simply modify the associated operands
> of the statement. What is the real advantage creating a new
> statement with the correct types? I'll be using SSA_NAME_DEF_STMT
> if the newly created ssa name is on the left hand side. Also, the
> ssa_name it replaces will no longer be referenced

Re: Gcc Digest, Vol 5, Issue 52

2020-07-28 Thread Gary Oblock via Gcc

Richard,

I wasn't aware of release_defs so I'll add that for certain.

When I do a single transformation as part of the transformation pass
each transformation uses the correct types internally but on the edges
emits glue code that will be transformed via a dangling type fixup pass.

For example when adding something to a pointer:

_2 = _1 + k

Where _1 and _2 have the old pointer types, I'll emit

new_3 = (type_convert)_1
new_4 = (type_convert)k
new_5 = new_4 / struct_size // truncating divide
new_6 = new_3 + new_5
_2    = (type_convert)new_6
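
As a sanity check on the arithmetic, the same adjustment can be written in standalone C. This is a sketch, not the pass's output: it assumes k is a byte offset (as in GIMPLE pointer addition) that is a whole multiple of the structure size, and index_plus_bytes is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

typedef struct type
{
  double x;
  double y;
} type_t;

/* Index form of "_2 = _1 + k": the byte offset k becomes an element count.  */
static size_t
index_plus_bytes (size_t index, size_t k)
{
  assert (k % sizeof (type_t) == 0);  /* assumed: k is a whole number of elements */
  size_t adj = k / sizeof (type_t);   /* new_5: truncating divide */
  return index + adj;                 /* new_6 */
}
```

Advancing index 3 by the byte size of two structures gives index 5, which matches "arr + 2" on the original pointer.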

Note, the casting is done with CONVERT_EXPR
which is harmless when I create new ssa names
and set the appropriate operands in

new_3 = (type_convert)_1
_2 = (type_convert)new_6

to

new_3 = new_7
new_8 = new_6

Now I might actually find via a look up that
_1 and/or _2 were already mapped to
new_7 and/or new_8 but that's irrelevant.

To intermix the applications of the transformations and
the patching of these dangling types seems like I'd
need to do an insanely ugly recursive walk of each function's
body.

I'm curious about your mention of def-use: I'm not aware of
GCC using def-use chains except at the RTL level.
Is there a def-use mechanism in GIMPLE? In SSA form
it's trivial to find the definition of a temp variable
but nontrivial to find its uses, which I think is a
valid reason for fixing up the dangling types of temps
in a scan.

Note, I'll maintain a mapping like you suggest but not use
it at transformation application time. Furthermore,
I'll initialize the mapping with the default defs from
the DECLs so I won't have to mess with them on the fly.
Now at the time in the scan when I find uses and defs of
a dangling type I'd like to simply modify the associated operands
of the statement. What is the real advantage creating a new
statement with the correct types? I'll be using SSA_NAME_DEF_STMT
if the newly created ssa name is on the left hand side. Also, the
ssa_name it replaces will no longer be referenced by the end of the
scan pass.

Note, I do have an escape mechanism in a qualification
pre-pass to the transformations. It's not intended as a
catch-all for things I don't understand; rather, it's an
aid to find possible new cases. However, at this point in
the development of this optimization there are legitimate
things that I need to spot this way. Later, when points-to
analysis is integrated, falling through to the default case
will likely cause an internal error.

Thanks,

Gary

From: Richard Biener 
Sent: Tuesday, July 28, 2020 12:07 AM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: Gcc Digest, Vol 5, Issue 52

On Tue, Jul 28, 2020 at 4:36 AM Gary Oblock via Gcc  wrote:
>
> Almost all of that makes sense to me.
>
> I'm not sure what a conditionally initialized pointer is.

{
  void *p;
  if (condition)
    p = ...;
  if (other condition)
    ... use p;

will end up with a PHI node after the conditional init with
one PHI argument being the default definition SSA name
for 'p'.
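
A compilable version of this pattern, as a sketch only. The names value and read_if are invented, and the second test is written so that the read is always well-defined; the SSA names in the comment are illustrative, not taken from a real dump.

```c
static int value = 42;

/* p is assigned on only one path, so at the join after the first "if"
   SSA form needs something like  p_3 = PHI <p_1(D), p_2>,
   where p_1(D) is the default definition of p.  */
static int
read_if (int condition, int other_condition)
{
  int *p;                        /* deliberately no initializer */

  if (condition)
    p = &value;
  if (condition && other_condition)
    return *p;                   /* only reached when p was assigned */
  return -1;
}
```

The PHI argument coming from the path that skips the assignment is the default def, even though p is a local variable rather than a parameter.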

> You mention VAR_DECL but I assume this is for
> completeness and not something I'll run across
> associated with a default def (but then again I don't
> understand notion of a conditionally initialized
> pointer.)
>
> I'm at the moment only dealing with a single malloced
> array of structures of the given type (though multiple types could have this 
> property.) I intend to extend this to cover multiple array and static 
> allocations but I need to get the easiest case working first. This means no 
> side pointers are needed and if and when I need them pointer will get 
> transformed into a base and index pair.
>
> I intend to do the creation of new ssa names as a separate pass from the 
> gimple transformations. So I will technically be creating for the duration of 
> the pass possibly two defs associated with a single gimple statement. Do I 
> need to delete the old ssa names
> via some mechanism?

When you remove the old definition do

   gsi_remove (&gsi, true); // gsi points at stmt
   release_defs (stmt);

Note that, as far as I understand, you need to modify the stmts using
the former pointer (since it's now an index), and I would not recommend
to make creation of new SSA names a separate pass, instead create
them when you alter the original definition and maintain a map
between old and new SSA name.

I haven't dug deep enough into your design to figure out how you identify things
to modify (well, I fear you're just scanning for "uses" of the changed
type ...), but in the scheme I think should be implemented you'd
follow the SSA def->use links for both tracking an objects life
as well as for modifying the accesses.

With just scanning for types I am quite sure you'll run into
cases where you

Re: Gcc Digest, Vol 5, Issue 52

2020-07-27 Thread Gary Oblock via Gcc

Almost all of that makes sense to me.

I'm not sure what a conditionally initialized pointer is.

You mention VAR_DECL but I assume this is for
completeness and not something I'll run across
associated with a default def (but then again I don't
understand notion of a conditionally initialized
pointer.)

I'm at the moment only dealing with a single malloced
array of structures of the given type (though multiple types could have this 
property.) I intend to extend this to cover multiple array and static 
allocations but I need to get the easiest case working first. This means no 
side pointers are needed, and if and when I need them, pointers will get 
transformed into a base and index pair.

I intend to do the creation of new ssa names as a separate pass from the gimple 
transformations. So I will technically be creating for the duration of the pass 
possibly two defs associated with a single gimple statement. Do I need to 
delete the old ssa names
via some mechanism?

By the way this is really helpful information. The only
other person on the project, Erick, is a continent away
and has about as much experience with gimple as
me but a whole heck lot less compiler experience.

Thanks,

Gary


From: Gcc  on behalf of gcc-requ...@gcc.gnu.org 

Sent: Monday, July 27, 2020 1:33 AM
To: gcc@gcc.gnu.org 
Subject: Gcc Digest, Vol 5, Issue 52

[EXTERNAL EMAIL NOTICE: This email originated from an external sender. Please 
be mindful of safe email handling and proprietary information protection 
practices.]


Re: Problems with changing the type of an ssa name

2020-07-26 Thread Gary Oblock via Gcc

Richard,

As you know I'm working on a structure reorganization optimization.
The particular one I'm working on is called instance interleaving.
For the particular case I'm working on now, there is a single array
of structures being transformed, a pointer to an element of the
array is transformed into an index into what is now a structure
of arrays. Note, I did share my HL design document with you so
there are more details in there if you need them. So what all this
means is for this example

typedef struct fu fu_t;
struct fu {
  char x;
  int y;
  double z;
};
  :
  :
  fu_t *fubar = (fu_t*)malloc(...);
  fu_t *baz;

fubar and baz are no longer pointer types and need to be
transformed into some integer type (say _index_fu_t.) Thus if
I encounter an ssa_name of type "fu_t *", I'll need to modify its
type to be _index_fu_t. This is of course equivalent to replacing
that ssa name with a new one of type _index_fu_t.
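
To make the pointer-to-index change concrete, here is a small hand-written sketch of what the transformed program is meant to behave like. It is only an illustration: fu_soa, field_y and advance are hypothetical names, and index_fu_t stands in for the _index_fu_t type mentioned above; none of this is code produced by the optimization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original layout: an array of structures, addressed through fu_t *.
struct fu_t {
  char x;
  int y;
  double z;
};

// Hypothetical transformed layout: one array per field, addressed by index.
using index_fu_t = std::ptrdiff_t;  // stands in for _index_fu_t

struct fu_soa {
  std::vector<char> x;
  std::vector<int> y;
  std::vector<double> z;
  explicit fu_soa(std::size_t n) : x(n), y(n), z(n) {}
};

// Before: p->y with p of type fu_t *.  After: soa.y[i] with i an index.
inline int &field_y(fu_soa &soa, index_fu_t i) {
  return soa.y[static_cast<std::size_t>(i)];
}

// Pointer arithmetic p + k becomes plain integer arithmetic i + k.
inline index_fu_t advance(index_fu_t i, std::ptrdiff_t k) { return i + k; }
```

With this picture, fubar corresponds to index 0 of the allocation and baz is just another index_fu_t value.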

Now, how do I actually do either of these? My attempts at
the former all failed and the latter seems equally difficult for
the default defs. Note, I prefer modifying them to replacing
them because it seems more reasonable and it also seems
to work except for the default defs.

I really need some help with this Richard.

Thanks,

Gary

From: Richard Biener 
Sent: Saturday, July 25, 2020 10:48 PM
To: Gary Oblock ; gcc@gcc.gnu.org 
Subject: Re: Problems with changing the type of an ssa name


On July 25, 2020 10:47:59 PM GMT+02:00, Gary Oblock  
wrote:
>Richard,
>
>I suppose that might be doable but aren't there any ramifications
>from the fact that the problematic ssa_names are the default defs?
>I can imagine easily replacing all the ssa names except those that
>are default defs.

Well, just changing the SSA names doesn't have any fewer ramifications. You have
to know what you are doing.

So - what's the reason you need to change those SSA name types?

Richard.

>Gary
>
>From: Richard Biener 
>Sent: Friday, July 24, 2020 11:16 PM
>To: Gary Oblock ; Gary Oblock via Gcc
>; gcc@gcc.gnu.org 
>Subject: Re: Problems with changing the type of an ssa name
>
>
>On July 25, 2020 7:30:48 AM GMT+02:00, Gary Oblock via Gcc
> wrote:
>>If you've followed what I've been up to via my questions
>>on the mailing list, I finally traced my latest big problem
>>back to to my own code. In a nut shell here is what
>>I'm doing.
>>
>>I'm creating a new type exaactly like this:
>>
>>tree pointer_rep =
>>  make_signed_type ( TYPE_PRECISION ( pointer_sized_int_node));
>>TYPE_MAIN_VARIANT ( pointer_rep) =
>>  TYPE_MAIN_VARIANT ( pointer_sized_int_node);
>>const char *gcc_name =
>>identifier_to_locale ( IDENTIFIER_POINTER ( TYPE_NAME (
>>ri->gcc_type)));
>>size_t len =
>>  strlen ( REORG_SP_PTR_PREFIX) + strlen ( gcc_name);
>>char *name = ( char *)alloca(len + 1);
>>strcpy ( name, REORG_SP_PTR_PREFIX);
>>strcat ( name, gcc_name);
>>TYPE_NAME ( pointer_rep) = get_identifier ( name);
>>
>>I detect an ssa_name that I want to change to have this type
>>and change it thusly. Note, this particular ssa_name is a
>>default def which I seems to be very pertinent (since it's
>>the only case that fails.)
>>
>>modify_ssa_name_type ( an_ssa_name, pointer_rep);
>>
>>void
>>modify_ssa_name_type ( tree ssa_name, tree type)
>>{
>>  // This rips off the code in make_ssa_name_fn with a
>>  // modification or two.
>>
>>  if ( TYPE_P ( type) )
>>{
>>   TREE_TYPE ( ssa_name) = TYPE_MAIN_VARIANT ( type);
>>   if ( ssa_defined_default_def_p ( ssa_name) )
>>  {
>> // I guessing which I know is a terrible thing to do...
>> SET_SSA_NAME_VAR_OR_IDENTIFIER ( ssa_name, TYPE_MAIN_VARIANT (
>type));
>>   }
>> else
>>   {
>>   // The following breaks defaults defs hence the check
>above.
>> SET_SSA_NAME_VAR_OR_IDENTIFIER ( ssa_name, NULL_TREE);
>>   }
>>}
>> else
>>{
>>  TREE_TYPE ( ssa_name) = TREE_TYPE ( type);
>>  SET_SSA_NAME_VAR_OR_IDENTIFIER ( ssa_name, type);
>>}
>>}
>>
>>After this it dies when trying to call print_generic_expr with the ssa
>>name.
>>

Re: Problems with changing the type of an ssa name

2020-07-25 Thread Gary Oblock via Gcc

Richard,

I suppose that might be doable but aren't there any ramifications
from the fact that the problematic ssa_names are the default defs?
I can imagine easily replacing all the ssa names except those that
are default defs.

Gary

From: Richard Biener 
Sent: Friday, July 24, 2020 11:16 PM
To: Gary Oblock ; Gary Oblock via Gcc 
; gcc@gcc.gnu.org 
Subject: Re: Problems with changing the type of an ssa name



On July 25, 2020 7:30:48 AM GMT+02:00, Gary Oblock via Gcc  
wrote:
>If you've followed what I've been up to via my questions
>on the mailing list, I finally traced my latest big problem
>back to to my own code. In a nut shell here is what
>I'm doing.
>
>I'm creating a new type exaactly like this:
>
>tree pointer_rep =
>  make_signed_type ( TYPE_PRECISION ( pointer_sized_int_node));
>TYPE_MAIN_VARIANT ( pointer_rep) =
>  TYPE_MAIN_VARIANT ( pointer_sized_int_node);
>const char *gcc_name =
>identifier_to_locale ( IDENTIFIER_POINTER ( TYPE_NAME (
>ri->gcc_type)));
>size_t len =
>  strlen ( REORG_SP_PTR_PREFIX) + strlen ( gcc_name);
>char *name = ( char *)alloca(len + 1);
>strcpy ( name, REORG_SP_PTR_PREFIX);
>strcat ( name, gcc_name);
>TYPE_NAME ( pointer_rep) = get_identifier ( name);
>
>I detect an ssa_name that I want to change to have this type
>and change it thusly. Note, this particular ssa_name is a
>default def which I seems to be very pertinent (since it's
>the only case that fails.)
>
>modify_ssa_name_type ( an_ssa_name, pointer_rep);
>
>void
>modify_ssa_name_type ( tree ssa_name, tree type)
>{
>  // This rips off the code in make_ssa_name_fn with a
>  // modification or two.
>
>  if ( TYPE_P ( type) )
>{
>   TREE_TYPE ( ssa_name) = TYPE_MAIN_VARIANT ( type);
>   if ( ssa_defined_default_def_p ( ssa_name) )
>  {
> // I guessing which I know is a terrible thing to do...
> SET_SSA_NAME_VAR_OR_IDENTIFIER ( ssa_name, TYPE_MAIN_VARIANT ( type));
>   }
> else
>   {
>   // The following breaks defaults defs hence the check above.
> SET_SSA_NAME_VAR_OR_IDENTIFIER ( ssa_name, NULL_TREE);
>   }
>}
> else
>{
>  TREE_TYPE ( ssa_name) = TREE_TYPE ( type);
>  SET_SSA_NAME_VAR_OR_IDENTIFIER ( ssa_name, type);
>}
>}
>
>After this it dies when trying to call print_generic_expr with the ssa
>name.
>
>Here's the bottom most complaint from the internal error:
>
>tree check: expected tree that contains ‘decl minimal’ structure, have
>‘integer_type’ in dump_generic_node, at tree-pretty-print.c:3154
>
>Can anybody tell what I'm doing wrong?

Do not modify existing SSA names, instead create a new one and replace uses of 
the old.

Richard.
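
One possible reading of the tree check failure quoted above, offered as an assumption and not something stated in the thread: the name's var slot is expected to hold nothing, a decl, or an identifier, and the printer treats anything that is not an identifier as a decl. Storing a type node there would then trip the check. The toy model below uses hypothetical names (Kind, ToyNode, ToySsaName, decl_name, print_name) and is not GCC code.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Toy node kinds, loosely mirroring the kinds of tree named in the thread.
enum class Kind { Decl, Identifier, IntegerType };

struct ToyNode {
  Kind kind;
  std::string name;
};

// Toy SSA name: var is expected to be null, a Decl, or an Identifier.
struct ToySsaName {
  const ToyNode *var = nullptr;
  int version = 0;
};

// Checked accessor in the spirit of a "tree check": asking for the decl
// name of something that is not a decl is reported as an error.
inline const std::string &decl_name(const ToyNode &n) {
  if (n.kind != Kind::Decl)
    throw std::logic_error("expected decl, have other node kind");
  return n.name;
}

// Printer: an identifier prints its own name, anything else non-null is
// assumed to be a decl and goes through the checked accessor.
inline std::string print_name(const ToySsaName &s) {
  std::string base;
  if (s.var != nullptr)
    base = (s.var->kind == Kind::Identifier) ? s.var->name : decl_name(*s.var);
  return base + "_" + std::to_string(s.version);
}
```

In this model a name whose var is a decl or is absent prints fine, while a name whose var was overwritten with an integer type throws from the checked accessor.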

>Thank,
>
>Gary
>
>
>
>
>CONFIDENTIALITY NOTICE: This e-mail message, including any attachments,
>is for the sole use of the intended recipient(s) and contains
>information that is confidential and proprietary to Ampere Computing or
>its subsidiaries. It is to be used solely for the purpose of furthering
>the parties' business relationship. Any review, copying, or
>distribution of this email (or any attachments thereto) is strictly
>prohibited. If you are not the intended recipient, please contact the
>sender immediately and permanently delete the original and any copies
>of this email and any attachments thereto.

Problems with changing the type of an ssa name

2020-07-24 Thread Gary Oblock via Gcc

If you've followed what I've been up to via my questions
on the mailing list, I finally traced my latest big problem
back to my own code. In a nutshell, here is what
I'm doing.

I'm creating a new type exactly like this:

tree pointer_rep =
  make_signed_type ( TYPE_PRECISION ( pointer_sized_int_node));
TYPE_MAIN_VARIANT ( pointer_rep) =
  TYPE_MAIN_VARIANT ( pointer_sized_int_node);
const char *gcc_name =
  identifier_to_locale ( IDENTIFIER_POINTER ( TYPE_NAME ( ri->gcc_type)));
size_t len =
  strlen ( REORG_SP_PTR_PREFIX) + strlen ( gcc_name);
char *name = ( char *)alloca(len + 1);
strcpy ( name, REORG_SP_PTR_PREFIX);
strcat ( name, gcc_name);
TYPE_NAME ( pointer_rep) = get_identifier ( name);

I detect an ssa_name that I want to change to have this type
and change it thusly. Note, this particular ssa_name is a
default def, which seems to be very pertinent (since it's
the only case that fails.)

modify_ssa_name_type ( an_ssa_name, pointer_rep);

void
modify_ssa_name_type ( tree ssa_name, tree type)
{
  // This rips off the code in make_ssa_name_fn with a
  // modification or two.

  if ( TYPE_P ( type) )
    {
      TREE_TYPE ( ssa_name) = TYPE_MAIN_VARIANT ( type);
      if ( ssa_defined_default_def_p ( ssa_name) )
        {
          // I'm guessing, which I know is a terrible thing to do...
          SET_SSA_NAME_VAR_OR_IDENTIFIER ( ssa_name, TYPE_MAIN_VARIANT ( type));
        }
      else
        {
          // The following breaks default defs, hence the check above.
          SET_SSA_NAME_VAR_OR_IDENTIFIER ( ssa_name, NULL_TREE);
        }
    }
  else
    {
      TREE_TYPE ( ssa_name) = TREE_TYPE ( type);
      SET_SSA_NAME_VAR_OR_IDENTIFIER ( ssa_name, type);
    }
}

After this it dies when trying to call print_generic_expr with the ssa name.

Here's the bottom most complaint from the internal error:

tree check: expected tree that contains ‘decl minimal’ structure, have 
‘integer_type’ in dump_generic_node, at tree-pretty-print.c:3154

Can anybody tell what I'm doing wrong?

Thanks,

Gary





Re: Three issues

2020-07-22 Thread Gary Oblock via Gcc

Richard,

My wolf fence failed to detect an issue at the end of my pass
so I'm now hunting for a problem I caused in a following pass.

Your thoughts?

Gary

- Wolf Fence Follows -
int
wf_func ( tree *slot, tree *dummy)
{
  tree t_val = *slot;
  gcc_assert( t_val->ssa_name.var);
  return 0;
}

void
wolf_fence (
    Info *info // Pass level global info (might not use it)
  )
{
  struct cgraph_node *node;
  fprintf( stderr,
           "Wolf Fence: Find wolf via gcc_assert(t_val->ssa_name.var)\n");
  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY ( node)
    {
      struct function *func = DECL_STRUCT_FUNCTION ( node->decl);
      push_cfun ( func);
      DEFAULT_DEFS ( func)->traverse_noresize < tree *, wf_func> ( NULL);
      pop_cfun ();
    }
  fprintf( stderr, "Wolf Fence: Didn't find wolf!\n");
}

From: Richard Biener 
Sent: Wednesday, July 22, 2020 2:32 AM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: Three issues



On Wed, Jul 22, 2020 at 12:51 AM Gary Oblock via Gcc  wrote:
>
> Some background:
>
> This is in the dreaded structure reorganization optimization that I'm
> working on. It's running at LTRANS time with '-flto-partition=one'.
>
> My issues in order of importance are:
>
> 1) In gimple-ssa.h, the equal method for ssa_name_hasher
> has a segfault because the "var" field of "a" is (nil).
>
> struct ssa_name_hasher : ggc_ptr_hash
> {
>   /* Hash a tree in a uid_decl_map.  */
>
>   static hashval_t
>   hash (tree item)
>   {
> return item->ssa_name.var->decl_minimal.uid;
>   }
>
>   /* Return true if the DECL_UID in both trees are equal.  */
>
>   static bool
>   equal (tree a, tree b)
>   {
>   return (a->ssa_name.var->decl_minimal.uid == 
> b->ssa_name.var->decl_minimal.uid);
>   }
> };
>
> The parameter "a" is associated with "*entry" on the 2nd to last
> line shown (it's trimmed off after that.) This from hash-table.h:
>
> template template class Allocator>
> typename hash_table::value_type &
> hash_table
> ::find_with_hash (const compare_type , hashval_t hash)
> {
>   m_searches++;
>   size_t size = m_size;
>   hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
>
>   if (Lazy && m_entries == NULL)
> m_entries = alloc_entries (size);
>
> #if CHECKING_P
>   if (m_sanitize_eq_and_hash)
> verify (comparable, hash);
> #endif
>
>   value_type *entry = _entries[index];
>   if (is_empty (*entry)
>   || (!is_deleted (*entry) && Descriptor::equal (*entry, comparable)))
> return *entry;
>   .
>   .
>
> Is there any way this could happen other than by a memory corruption
> of some kind? This is a show stopper for me and I really need some help on
> this issue.
>
> 2) I tried to dump out all the gimple in the following way at the very
> beginning of my program:
>
> void
> print_program ( FILE *file, int leading_space )
> {
>   struct cgraph_node *node;
>   fprintf ( file, "%*sProgram:\n", leading_space, "");
>
>   // Print Global Decls
>   //
>   varpool_node *var;
>   FOR_EACH_VARIABLE ( var)
>   {
> tree decl = var->decl;
> fprintf ( file, "%*s", leading_space, "");
> print_generic_decl ( file, decl, (dump_flags_t)0);
> fprintf ( file, "\n");
>   }
>
>   FOR_EACH_FUNCTION_WITH_GIMPLE_BODY ( node)
>   {
> struct function *func = DECL_STRUCT_FUNCTION ( node->decl);
> dump_function_header ( file, func->decl, (dump_flags_t)0);
> dump_function_to_file ( func->decl, file, (dump_flags_t)0);
>   }
> }
>
> When I run this the first two (out of three) functions print
> just fine. However, for the third, func->decl is (nil) and
> it segfaults.
>
> Now the really odd thing is that this works perfectly at the
> end or middle of my optimization.
>
> What gives?
>
> 3) For my bug in (1) I got so distraught that I ran valgrind which
> in my experience is an act of desperation for compilers.
>
> None of the errors it spotted are associated with my optimization
> (although it oh so cleverly pointed out the segfault) however it
> showed the following:
>
> ==18572== Invalid read of size 8
> ==18572==at 0x1079DC1: execute_one_pass(opt_pass*) (passes.c:2550)
> ==18572==by 0x107ABD3: execute_ipa_pass_list(opt_pass*) (passes.c:2929)
> ==18572==by 0xAC0E52: symbol_table::compile() (cgraphunit.c:2786)
> ==18572==by 0x9915A9: lto_main() (lto.c:653)
> ==18572=

Re: Three issues

2020-07-22 Thread Gary Oblock via Gcc

Richard,

I was really hopeful about your suggestions but I went over my code and
anything that modified anything had a push_cfun and pop_cfun associated with it.

Also, enabling the extra annotations didn't make a difference.

I'm thinking a wolf fence test that scans for malformed default_def hash table
entries is my only recourse at this point.
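
For anyone unfamiliar with the term, the wolf fence idea can be sketched in a few lines: check one invariant after each step and report the first step after which it no longer holds. This is a toy model with hypothetical names (ToyEntry, ToyPass, table_well_formed, first_offender), not the pass manager or the real default_def table.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy state: each entry models a table entry whose var must be non-null.
struct ToyEntry {
  const void *var;
};
using ToyTable = std::vector<ToyEntry>;

// A pass is just a name plus something that may modify the table.
struct ToyPass {
  std::string name;
  std::function<void(ToyTable &)> run;
};

// The invariant being fenced: no entry has a null var.
inline bool table_well_formed(const ToyTable &t) {
  for (const ToyEntry &e : t)
    if (e.var == nullptr) return false;
  return true;
}

// Run the passes in order with a fence after each one. Returns the name
// of the first pass after which the invariant fails, or "" if none does.
inline std::string first_offender(ToyTable &t,
                                  const std::vector<ToyPass> &passes) {
  for (const ToyPass &p : passes) {
    p.run(t);
    if (!table_well_formed(t)) return p.name;
  }
  return "";
}
```

If the fence stays quiet at the end of one's own pass, the same check can be moved later in the pipeline to narrow down where the malformed entry appears.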

Thanks,

Gary

From: Richard Biener 
Sent: Wednesday, July 22, 2020 2:32 AM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: Three issues



On Wed, Jul 22, 2020 at 12:51 AM Gary Oblock via Gcc  wrote:
>
> Some background:
>
> This is in the dreaded structure reorganization optimization that I'm
> working on. It's running at LTRANS time with '-flto-partition=one'.
>
> My issues in order of importance are:
>
> 1) In gimple-ssa.h, the equal method for ssa_name_hasher
> has a segfault because the "var" field of "a" is (nil).
>
> struct ssa_name_hasher : ggc_ptr_hash
> {
>   /* Hash a tree in a uid_decl_map.  */
>
>   static hashval_t
>   hash (tree item)
>   {
> return item->ssa_name.var->decl_minimal.uid;
>   }
>
>   /* Return true if the DECL_UID in both trees are equal.  */
>
>   static bool
>   equal (tree a, tree b)
>   {
>   return (a->ssa_name.var->decl_minimal.uid == 
> b->ssa_name.var->decl_minimal.uid);
>   }
> };
>
> The parameter "a" is associated with "*entry" on the 2nd to last
> line shown (it's trimmed off after that.) This from hash-table.h:
>
> template template class Allocator>
> typename hash_table::value_type &
> hash_table
> ::find_with_hash (const compare_type , hashval_t hash)
> {
>   m_searches++;
>   size_t size = m_size;
>   hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
>
>   if (Lazy && m_entries == NULL)
> m_entries = alloc_entries (size);
>
> #if CHECKING_P
>   if (m_sanitize_eq_and_hash)
> verify (comparable, hash);
> #endif
>
>   value_type *entry = _entries[index];
>   if (is_empty (*entry)
>   || (!is_deleted (*entry) && Descriptor::equal (*entry, comparable)))
> return *entry;
>   .
>   .
>
> Is there any way this could happen other than by a memory corruption
> of some kind? This is a show stopper for me and I really need some help on
> this issue.
>
> 2) I tried to dump out all the gimple in the following way at the very
> beginning of my program:
>
> void
> print_program ( FILE *file, int leading_space )
> {
>   struct cgraph_node *node;
>   fprintf ( file, "%*sProgram:\n", leading_space, "");
>
>   // Print Global Decls
>   //
>   varpool_node *var;
>   FOR_EACH_VARIABLE ( var)
>   {
> tree decl = var->decl;
> fprintf ( file, "%*s", leading_space, "");
> print_generic_decl ( file, decl, (dump_flags_t)0);
> fprintf ( file, "\n");
>   }
>
>   FOR_EACH_FUNCTION_WITH_GIMPLE_BODY ( node)
>   {
> struct function *func = DECL_STRUCT_FUNCTION ( node->decl);
> dump_function_header ( file, func->decl, (dump_flags_t)0);
> dump_function_to_file ( func->decl, file, (dump_flags_t)0);
>   }
> }
>
> When I run this the first two (out of three) functions print
> just fine. However, for the third, func->decl is (nil) and
> it segfaults.
>
> Now the really odd thing is that this works perfectly at the
> end or middle of my optimization.
>
> What gives?
>
> 3) For my bug in (1) I got so distraught that I ran valgrind which
> in my experience is an act of desperation for compilers.
>
> None of the errors it spotted are associated with my optimization
> (although it oh so cleverly pointed out the segfault) however it
> showed the following:
>
> ==18572== Invalid read of size 8
> ==18572==at 0x1079DC1: execute_one_pass(opt_pass*) (passes.c:2550)
> ==18572==by 0x107ABD3: execute_ipa_pass_list(opt_pass*) (passes.c:2929)
> ==18572==by 0xAC0E52: symbol_table::compile() (cgraphunit.c:2786)
> ==18572==by 0x9915A9: lto_main() (lto.c:653)
> ==18572==by 0x11EE4A0: compile_file() (toplev.c:458)
> ==18572==by 0x11F1888: do_compile() (toplev.c:2302)
> ==18572==by 0x11F1BA3: toplev::main(int, char**) (toplev.c:2441)
> ==18572==by 0x23C021E: main (main.c:39)
> ==18572==  Address 0x5842880 is 16 bytes before a block of size 88 alloc'd
> ==18572==at 0x4C3017F: operator new(unsigned long) (in 
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==18572==by 0x21E00B7:

Re: Three issues

2020-07-22 Thread Gary Oblock via Gcc

David,

Note, for the first explanation, this is the hash table for the default defs
and not some private pass-specific table, so I'm not directly touching it in
any way. However, that doesn't rule out some other common or not so common
function I'm invoking having the side effect of doing this in some pathological
way (you point this out by asking if they are my temporaries.) I do create
temporaries but I certainly make no attempt to add them to the default defs.
In fact, I went so far as to instrument the code that adds them to see if I
was doing this but I found nothing. This likely means I'm causing a subtle
memory corruption or something I'm doing has bad side effects.

The second option (not an explanation) has me diddling some fairly important
stuff that I don't know all that much about, therefore I prefer to find and fix
the root cause.

Thanks,

Gary

From: David Malcolm 
Sent: Wednesday, July 22, 2020 12:31 AM
To: Gary Oblock ; gcc@gcc.gnu.org 
Subject: Re: Three issues


On Tue, 2020-07-21 at 22:49 +, Gary Oblock via Gcc wrote:
> Some background:
>
> This is in the dreaded structure reorganization optimization that I'm
> working on. It's running at LTRANS time with '-flto-partition=one'.
>
> My issues in order of importance are:
>
> 1) In gimple-ssa.h, the equal method for ssa_name_hasher
> has a segfault because the "var" field of "a" is (nil).
>
> struct ssa_name_hasher : ggc_ptr_hash
> {
>   /* Hash a tree in a uid_decl_map.  */
>
>   static hashval_t
>   hash (tree item)
>   {
> return item->ssa_name.var->decl_minimal.uid;
>   }
>
>   /* Return true if the DECL_UID in both trees are equal.  */
>
>   static bool
>   equal (tree a, tree b)
>   {
>   return (a->ssa_name.var->decl_minimal.uid == b->ssa_name.var-
> >decl_minimal.uid);
>   }
> };

I notice that tree.h has:

/* Returns the variable being referenced.  This can be NULL_TREE for
   temporaries not associated with any user variable.
   Once released, this is the only field that can be relied upon.  */
#define SSA_NAME_VAR(NODE)  \
  (SSA_NAME_CHECK (NODE)->ssa_name.var == NULL_TREE \
   || TREE_CODE ((NODE)->ssa_name.var) == IDENTIFIER_NODE   \
   ? NULL_TREE : (NODE)->ssa_name.var)

So presumably that ssa_name_hasher is making an implicit assumption
that such temporaries aren't present in the hash_table; maybe they are
for yours?

Is this a hash_table that you're populating yourself?

With the caveat that I'm sleep-deprived, another way this could happen
is if "a" is not an SSA_NAME but is in fact some other kind of tree;
you could try replacing
  a->ssa_name.var
with
  SSA_NAME_CHECK (a)->ssa_name.var
(and similarly for b)

But the first explanation seems more likely.
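
A toy illustration of the first explanation, under the assumption that an entry with a null var has somehow ended up in the table. ToyDecl, ToySsaName, equal_unchecked and equal_checked are hypothetical names and not the GCC definitions; the unchecked function only mirrors the shape of the quoted equal().

```cpp
#include <cassert>

// Toy stand-ins for the fields named in the thread; NOT the GCC structs.
struct ToyDecl {
  unsigned uid;
};
struct ToySsaName {
  const ToyDecl *var;
};

// Same shape as the quoted equal(): var is dereferenced unconditionally,
// so a null var on either side is undefined behaviour (a segfault in
// practice). Only call this when both vars are known to be non-null.
inline bool equal_unchecked(const ToySsaName &a, const ToySsaName &b) {
  return a.var->uid == b.var->uid;
}

// Diagnostic variant: says whether the comparison is meaningful at all.
enum class Cmp { Equal, Different, MalformedEntry };

inline Cmp equal_checked(const ToySsaName &a, const ToySsaName &b) {
  if (a.var == nullptr || b.var == nullptr) return Cmp::MalformedEntry;
  return a.var->uid == b.var->uid ? Cmp::Equal : Cmp::Different;
}
```

A checked comparison like this is only a debugging aid for locating the bad entry; it does not explain how the entry got into the table.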

>
[...snip qn 2...]

> 3) For my bug in (1) I got so distraught that I ran valgrind which
> in my experience is an act of desperation for compilers.
>
> None of the errors it spotted are associated with my optimization
> (although it oh so cleverly pointed out the segfault) however it
> showed the following:
>
> ==18572== Invalid read of size 8
> ==18572==at 0x1079DC1: execute_one_pass(opt_pass*)
> (passes.c:2550)

What is line 2550 of passes.c in your working copy?

==18572==by 0x107ABD3: execute_ipa_pass_list(opt_pass*)
> (passes.c:2929)
> ==18572==by 0xAC0E52: symbol_table::compile() (cgraphunit.c:2786)
> ==18572==by 0x9915A9: lto_main() (lto.c:653)
> ==18572==by 0x11EE4A0: compile_file() (toplev.c:458)
> ==18572==by 0x11F1888: do_compile() (toplev.c:2302)
> ==18572==by 0x11F1BA3: toplev::main(int, char**) (toplev.c:2441)
> ==18572==by 0x23C021E: main (main.c:39)
> ==18572==  Address 0x5842880 is 16 bytes before a block of size 88
> alloc'd
> ==18572==at 0x4C3017F: operator new(unsigned long) (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==18572==by 0x21E00B7: make_pass_ipa_prototype(gcc::context*)
> (ipa-prototype.c:329)

You say above that none of the errors are associated with your
optimization, but presumably this is your new pass, right?  Can you
post the code somewhere?

> ==18572==by 0x106E987:
> gcc::pass_manager::pass_manager(gcc::context*) (pass-
> instances.def:178)
> ==18572==by 0x11EFCE8: general_init(char const*, bool)
> (toplev.c:1250)
> ==18572==by 0x11F1A86: toplev::main(int, char**) (toplev.c:2391)
> ==18572==by 0x23C021E: main (main.c:39)
> ==18572==
>
> Are these known issues with lto or is this a valgrind issue?

Hope this is helpful
Dave

Three issues

2020-07-21 Thread Gary Oblock via Gcc

Some background:

This is in the dreaded structure reorganization optimization that I'm
working on. It's running at LTRANS time with '-flto-partition=one'.

My issues in order of importance are:

1) In gimple-ssa.h, the equal method for ssa_name_hasher
has a segfault because the "var" field of "a" is (nil).

struct ssa_name_hasher : ggc_ptr_hash<tree_node>
{
  /* Hash a tree in a uid_decl_map.  */

  static hashval_t
  hash (tree item)
  {
    return item->ssa_name.var->decl_minimal.uid;
  }

  /* Return true if the DECL_UID in both trees are equal.  */

  static bool
  equal (tree a, tree b)
  {
    return (a->ssa_name.var->decl_minimal.uid
            == b->ssa_name.var->decl_minimal.uid);
  }
};

The parameter "a" is associated with "*entry" on the 2nd to last
line shown (it's trimmed off after that.) This from hash-table.h:

template<typename Descriptor, bool Lazy,
         template<typename Type> class Allocator>
typename hash_table<Descriptor, Lazy, Allocator>::value_type &
hash_table<Descriptor, Lazy, Allocator>
::find_with_hash (const compare_type &comparable, hashval_t hash)
{
  m_searches++;
  size_t size = m_size;
  hashval_t index = hash_table_mod1 (hash, m_size_prime_index);

  if (Lazy && m_entries == NULL)
    m_entries = alloc_entries (size);

#if CHECKING_P
  if (m_sanitize_eq_and_hash)
    verify (comparable, hash);
#endif

  value_type *entry = &m_entries[index];
  if (is_empty (*entry)
      || (!is_deleted (*entry) && Descriptor::equal (*entry, comparable)))
    return *entry;
  .
  .

Is there any way this could happen other than by a memory corruption
of some kind? This is a show stopper for me and I really need some help on
this issue.

2) I tried to dump out all the gimple in the following way at the very
beginning of my program:

void
print_program ( FILE *file, int leading_space )
{
  struct cgraph_node *node;
  fprintf ( file, "%*sProgram:\n", leading_space, "");

  // Print Global Decls
  //
  varpool_node *var;
  FOR_EACH_VARIABLE ( var)
  {
    tree decl = var->decl;
    fprintf ( file, "%*s", leading_space, "");
    print_generic_decl ( file, decl, (dump_flags_t)0);
    fprintf ( file, "\n");
  }

  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY ( node)
  {
    struct function *func = DECL_STRUCT_FUNCTION ( node->decl);
    dump_function_header ( file, func->decl, (dump_flags_t)0);
    dump_function_to_file ( func->decl, file, (dump_flags_t)0);
  }
}

When I run this the first two (out of three) functions print
just fine. However, for the third, func->decl is (nil) and
it segfaults.

Now the really odd thing is that this works perfectly at the
end or middle of my optimization.

What gives?

3) For my bug in (1) I got so distraught that I ran valgrind which
in my experience is an act of desperation for compilers.

None of the errors it spotted are associated with my optimization
(although it oh so cleverly pointed out the segfault) however it
showed the following:

==18572== Invalid read of size 8
==18572==    at 0x1079DC1: execute_one_pass(opt_pass*) (passes.c:2550)
==18572==    by 0x107ABD3: execute_ipa_pass_list(opt_pass*) (passes.c:2929)
==18572==    by 0xAC0E52: symbol_table::compile() (cgraphunit.c:2786)
==18572==    by 0x9915A9: lto_main() (lto.c:653)
==18572==    by 0x11EE4A0: compile_file() (toplev.c:458)
==18572==    by 0x11F1888: do_compile() (toplev.c:2302)
==18572==    by 0x11F1BA3: toplev::main(int, char**) (toplev.c:2441)
==18572==    by 0x23C021E: main (main.c:39)
==18572==  Address 0x5842880 is 16 bytes before a block of size 88 alloc'd
==18572==    at 0x4C3017F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==18572==    by 0x21E00B7: make_pass_ipa_prototype(gcc::context*) (ipa-prototype.c:329)
==18572==    by 0x106E987: gcc::pass_manager::pass_manager(gcc::context*) (pass-instances.def:178)
==18572==    by 0x11EFCE8: general_init(char const*, bool) (toplev.c:1250)
==18572==    by 0x11F1A86: toplev::main(int, char**) (toplev.c:2391)
==18572==    by 0x23C021E: main (main.c:39)
==18572==

Are these known issues with lto or is this a valgrind issue?

Thanks,

Gary



Default defs question

2020-07-15 Thread Gary Oblock via Gcc

Regarding the other question I asked today, could somebody explain to
me what the default_defs are all about? I suspect I'm doing something
wrong with regard to them. Note, I've isolated the failure in the last email
down to this bit (in red):

  if (is_empty (*entry)
      || (!is_deleted (*entry) && Descriptor::equal (*entry, comparable)))

Which doesn't make much sense to me.
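
For what it's worth, the quoted condition has the usual shape of an open-addressing lookup: stop at an empty slot, skip slots marked deleted, and only then compare the entry already stored in the table against the key being looked up. The sketch below is a toy integer table written for illustration, with hypothetical names (ToyTable, kEmpty, kDeleted, find_slot) and simple linear probing; it is not GCC's hash_table.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy open-addressing table of non-negative ints, NOT GCC's hash_table.
// Two sentinel values mark the special slots.
constexpr int kEmpty = -1;
constexpr int kDeleted = -2;

struct ToyTable {
  std::vector<int> slots;
  explicit ToyTable(std::size_t n) : slots(n, kEmpty) {}

  std::size_t home(int key) const {
    return static_cast<std::size_t>(key) % slots.size();
  }

  // Return the slot holding key, or the first empty slot met while
  // probing. Assumes key >= 0 and that at least one slot is empty.
  std::size_t find_slot(int key) const {
    std::size_t i = home(key);
    for (;;) {
      int entry = slots[i];
      // Same three-way test as the quoted condition: stop on empty, skip
      // deleted, and only then compare the STORED entry with the key.
      if (entry == kEmpty || (entry != kDeleted && entry == key)) return i;
      i = (i + 1) % slots.size();
    }
  }

  void insert(int key) { slots[find_slot(key)] = key; }

  void remove(int key) {
    std::size_t i = find_slot(key);
    if (slots[i] == key) slots[i] = kDeleted;
  }

  bool contains(int key) const { return slots[find_slot(key)] == key; }
};
```

The relevant detail is that the comparison is applied to whatever is stored in the probed slot, so a malformed stored entry can break a lookup for a perfectly good key.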

Thanks,

Gary



Help on a bug showing up in a template

2020-07-15 Thread Gary Oblock via Gcc

I'm encountering a really painful error. The stack trace is below.

The code in hash-table.h is a template and it is really hyper-allergic
to instrumentation (a couple of fprintfs caused malloc to have an
internal error!)  Last time I checked, gdb didn't exactly play nice
with templates either. Note, I tried adding --enable-checking=all to
my configure but all that did was cause a library installation failure.

If anybody has any clues about how to handle this kind of a bug or,
even better yet, if you have an idea of what I did wrong then please
let me know.

Note, the particular optimization I'm working on is done at IPA time
and involves creating a bunch of gimple stmts, new types, new ssa temps
and changing the types of some existing declarations and ssa temps.

Thanks,

Gary

-


during IPA pass: inline
dump file: ./exe.ltrans0.ltrans.079i.inline
main.c: In function ‘main’:
main.c:18:11: internal compiler error: Segmentation fault
   18 |   max_y = max_of_y( data, len);
  |   ^
0xcbb4af crash_signal
../../source/gcc/toplev.c:328
0xd24d66 hash_table::find_with_hash(tree_node* const&, unsigned int)
../../source/gcc/hash-table.h:925
0xd21d23 ssa_default_def(function*, tree_node*)
../../source/gcc/tree-dfa.c:315
0xd56988 setup_one_parameter
../../source/gcc/tree-inline.c:3429
0xd5cb35 initialize_inlined_parameters
../../source/gcc/tree-inline.c:3585
0xd5cb35 expand_call_inline
../../source/gcc/tree-inline.c:4936
0xd5f8e9 gimple_expand_calls_inline
../../source/gcc/tree-inline.c:5266
0xd5f8e9 optimize_inline_calls(tree_node*)
../../source/gcc/tree-inline.c:5439
0xa02023 inline_transform(cgraph_node*)
../../source/gcc/ipa-inline-transform.c:736
0xb8d979 execute_one_ipa_transform_pass
../../source/gcc/passes.c:2233
0xb8d979 execute_all_ipa_transforms(bool)
../../source/gcc/passes.c:2272
0x75c15b cgraph_node::expand()
../../source/gcc/cgraphunit.c:2294
0x75d858 expand_all_functions
../../source/gcc/cgraphunit.c:2472
0x75d858 symbol_table::compile()
../../source/gcc/cgraphunit.c:2823
0x6963d1 lto_main()
../../source/gcc/lto/lto.c:653
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
lto-wrapper: fatal error: /home/gary/gcc_expt_build/install/bin/gcc returned 1 
exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
./script: line 10: ./exe: No such file or directory




Re: An problematic interaction between a call created by gimple_build_call and inlining

2020-07-03 Thread Gary Oblock via Gcc

Martin,

Actually it's basic blocks that dominate one another, though I suppose you could 
generalize the notion to CFG edges, but to what point?

I'm seeing in some of the code that I read, immediate dominators being manually 
computed and added to the BBs at their point of creation. At the moment I'm 
punting on creating them hoping I don't create an untenable state which results 
in hard to diagnose failures. I was just trying to avoid this.
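
For readers who want to see what "immediate dominator" means on a concrete graph, here is a minimal standalone sketch in C. It is not GCC code: the toy_cfg type and the toy_* functions are invented for illustration, it assumes every block is reachable from block 0, and it uses the simple iterate-to-a-fixed-point formulation. GCC has its own dominance machinery (calculate_dominance_info and friends), which is what a real pass would normally rely on.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BB 32

/* Toy CFG: pred[b] is a bit mask of the predecessors of block b.
   Block 0 is the entry block.  All names here are hypothetical.  */
struct toy_cfg {
  int n;
  uint32_t pred[MAX_BB];
};

void toy_add_edge(struct toy_cfg *g, int src, int dst)
{
  assert(src >= 0 && src < g->n && dst >= 0 && dst < g->n);
  g->pred[dst] |= UINT32_C(1) << src;
}

/* dom[b] becomes the set of blocks that dominate b (including b itself),
   computed by intersecting predecessor sets until nothing changes.  */
static void toy_dominators(const struct toy_cfg *g, uint32_t dom[MAX_BB])
{
  uint32_t all = g->n == 32 ? UINT32_MAX : (UINT32_C(1) << g->n) - 1;
  dom[0] = 1;
  for (int b = 1; b < g->n; b++)
    dom[b] = all;

  int changed = 1;
  while (changed) {
    changed = 0;
    for (int b = 1; b < g->n; b++) {
      uint32_t meet = all;
      for (int p = 0; p < g->n; p++)
        if (g->pred[b] & (UINT32_C(1) << p))
          meet &= dom[p];
      uint32_t next = meet | (UINT32_C(1) << b);
      if (next != dom[b]) {
        dom[b] = next;
        changed = 1;
      }
    }
  }
}

/* The immediate dominator of b is the strict dominator d whose own
   dominator set equals the set of strict dominators of b.
   Returns -1 for the entry block.  */
int toy_idom(const struct toy_cfg *g, int b)
{
  uint32_t dom[MAX_BB];
  toy_dominators(g, dom);
  uint32_t strict = dom[b] & ~(UINT32_C(1) << b);
  for (int d = 0; d < g->n; d++)
    if ((strict & (UINT32_C(1) << d)) && dom[d] == strict)
      return d;
  return -1;
}
```

On a diamond (0->1, 0->2, 1->3, 2->3) the join block 3 is immediately dominated by block 0, not by either arm, which is the kind of fact that goes stale when new BBs are spliced in without updating the dominance information.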

Thanks,

Gary Oblock


From: Martin Jambor 
Sent: Friday, July 3, 2020 1:59 AM
To: Gary Oblock ; Richard Biener 

Cc: gcc@gcc.gnu.org 
Subject: Re: An problematic interaction between a call created by 
gimple_build_call and inlining

Hi,

On Thu, Jul 02 2020, Gary Oblock wrote:
> Martin,
>
> What about immediate dominators?

I'm afraid I don't understand your question, what about them?

Dominators are re-computed after inlining and after clones are
materialized (when they get their own body)... I believe.

We do not store information which call graph edges dominate other call
graph edges in the callers body.  Having that information might be
useful at IPA stage.

But yeah, please be more specific what your question is.

Martin

>
> 
> From: Martin Jambor 
> Sent: Wednesday, July 1, 2020 3:40 PM
> To: Gary Oblock ; Richard Biener 
> 
> Cc: gcc@gcc.gnu.org 
> Subject: Re: An problematic interaction between a call created by 
> gimple_build_call and inlining
>
> Hi,
>
> On Wed, Jul 01 2020, Gary Oblock via Gcc wrote:
>> Thank you Richard.
>>
>> I feel a bit dumb because I'm well aware of the GCC philosophy to have
>> any new code produced update the state.  Of course I didn't know the
>> commands to do this for the call graph (which I really appreciate you 
>> giving.)
>>
>> However, the real reason I'm sending a reply is this. Are there any 
>> surprising
>> cases in IPA where GCC violates its philosophy and actually regenerates the
>> information?
>
> if by "the information" you specifically mean call graph edges then no.
> (Regular) IPA optimizations are designed to also work when doing link
> time optimization (LTO) which means that unless they specifically load
> the (gimple) bodies of some selected functions, the bodies are not
> available to them.  They only operate on the call graph and information
> they collected when generating summaries.  Because gimple body is not
> available, call graph edges cannot be regenerated from it.
>
> In fact, when a call is redirected to a different/specialized function,
> at IPA time it is only recorded in the call graph by redirecting the
> corresponding edge and the call statement is modified only at the end of
> IPA phase (in LTRANS in LTO-speak).  This is necessary even when not
> using LTO because until specialized nodes get their own body, they share
> it with the node from which they were cloned.  That means that several
> call graph edges, which do not share caller and may not even share the
> callee, refer to the same gimple statement - so the decl in the
> statement is actually meaningless and the edge encodes the important
> information.
>
> Martin

Questions regarding control flow during IPA passes

2020-07-02 Thread Gary Oblock via Gcc

At IPA time I'm creating GIMPLE statements. I've noticed during dumps
that gotos and labels don't seem to exist. In fact, my attempt to
introduce them failed, at least for the gotos.  I assume that at this
point in compilation GCC relies on the control flow graph (which I'm
updating as I create new BBs) so I actually shouldn't create them?
Furthermore, I assume I should be setting the "gotos" in the condition
statement to NULL?
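
To make the question concrete, here is a small standalone model in C of control flow that lives only in the graph. It is a sketch, not GCC code: toy_bb, toy_edge and the TOY_EDGE_* flags are invented stand-ins. My understanding is that once the CFG is built, GCC works much like this, with the condition statement carrying no labels and the outgoing edges of the block flagged as the true and false arms, but treat that as an assumption to verify against the sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical edge flags, standing in for "true arm", "false arm"
   and "plain fallthrough".  */
enum toy_edge_flag { TOY_EDGE_FALLTHRU, TOY_EDGE_TRUE, TOY_EDGE_FALSE };

struct toy_bb;

struct toy_edge {
  struct toy_bb *dest;
  enum toy_edge_flag flag;
};

/* A block ends either in a condition (two flagged successors) or in
   nothing special (at most one fallthrough successor).  There are no
   labels and no gotos anywhere in this model.  */
struct toy_bb {
  int index;
  int ends_in_cond;
  struct toy_edge succ[2];
  int n_succ;
};

void toy_make_edge(struct toy_bb *src, struct toy_bb *dest,
                   enum toy_edge_flag flag)
{
  assert(src->n_succ < 2);
  src->succ[src->n_succ].dest = dest;
  src->succ[src->n_succ].flag = flag;
  src->n_succ++;
}

/* Pick the next block purely from the edge flags.  */
struct toy_bb *toy_successor(const struct toy_bb *bb, int cond_value)
{
  enum toy_edge_flag wanted = TOY_EDGE_FALLTHRU;
  if (bb->ends_in_cond)
    wanted = cond_value ? TOY_EDGE_TRUE : TOY_EDGE_FALSE;

  for (int i = 0; i < bb->n_succ; i++)
    if (bb->succ[i].flag == wanted)
      return bb->succ[i].dest;
  return NULL;
}
```

In this model, adding a new conditional means creating the blocks and the two flagged edges; nothing resembling a label or goto is ever materialized.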

Thanks,

Gary Oblock
Ampere Computing
Santa Clara, California



Re: An problematic interaction between a call created by gimple_build_call and inlining

2020-07-02 Thread Gary Oblock via Gcc

Martin,

What about immediate dominators?

Thanks,

Gary

From: Martin Jambor 
Sent: Wednesday, July 1, 2020 3:40 PM
To: Gary Oblock ; Richard Biener 

Cc: gcc@gcc.gnu.org 
Subject: Re: An problematic interaction between a call created by 
gimple_build_call and inlining

Hi,

On Wed, Jul 01 2020, Gary Oblock via Gcc wrote:
> Thank you Richard.
>
> I feel a bit dumb because I'm well aware of the GCC philosophy to have
> any new code produced update the state.  Of course I didn't know the
> commands to do this for the call graph (which I really appreciate you giving.)
>
> However, the real reason I'm sending a reply is this. Are there any surprising
> cases in IPA where GCC violates its philosophy and actually regenerates the
> information?

if by "the information" you specifically mean call graph edges then no.
(Regular) IPA optimizations are designed to also work when doing link
time optimization (LTO) which means that unless they specifically load
the (gimple) bodies of some selected functions, the bodies are not
available to them.  They only operate on the call graph and information
they collected when generating summaries.  Because gimple body is not
available, call graph edges cannot be regenerated from it.

In fact, when a call is redirected to a different/specialized function,
at IPA time it is only recorded in the call graph by redirecting the
corresponding edge and the call statement is modified only at the end of
IPA phase (in LTRANS in LTO-speak).  This is necessary even when not
using LTO because until specialized nodes get their own body, they share
it with the node from which they were cloned.  That means that several
call graph edges, which do not share caller and may not even share the
callee, refer to the same gimple statement - so the decl in the
statement is actually meaningless and the edge encodes the important
information.
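
The point about shared bodies can be modelled in a few lines of standalone C. This is only an illustration of the idea described above, not GCC's data structures: toy_call_stmt, toy_cg_edge and the function and node names are all made up. One statement object is shared by two callers, and the callee is taken from the edge rather than from the name written in the statement.

```c
#include <stddef.h>
#include <string.h>

/* The shared statement: decl_name is whatever the call statement
   textually refers to.  Hypothetical type.  */
struct toy_call_stmt {
  const char *decl_name;
};

/* A call graph edge: (caller, statement) identifies the call site,
   callee is where the call really goes.  Hypothetical type.  */
struct toy_cg_edge {
  const char *caller;
  const char *callee;
  const struct toy_call_stmt *stmt;
};

/* Look up the real target of a call site.  The decl stored in the
   statement is deliberately ignored.  Returns NULL if no edge exists.  */
const char *toy_effective_callee(const struct toy_cg_edge *edges, size_t n,
                                 const char *caller,
                                 const struct toy_call_stmt *stmt)
{
  for (size_t i = 0; i < n; i++)
    if (edges[i].stmt == stmt && strcmp(edges[i].caller, caller) == 0)
      return edges[i].callee;
  return NULL;
}
```

With one statement naming "max1" and two edges through it, a caller that was redirected resolves to the specialized function even though the statement text never changed.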

Martin

Re: An problematic interaction between a call created by gimple_build_call and inlining

2020-07-01 Thread Gary Oblock via Gcc

Thank you Richard.

I feel a bit dumb because I'm well aware of the GCC philosophy to have
any new code produced update the state.  Of course I didn't know the
commands to do this for the call graph (which I really appreciate you giving.)

However, the real reason I'm sending a reply is this. Are there any surprising
cases in IPA where GCC violates its philosophy and actually regenerates the
information?

Thanks again,

Gary


From: Richard Biener 
Sent: Wednesday, July 1, 2020 12:27 AM
To: Gary Oblock 
Cc: gcc@gcc.gnu.org 
Subject: Re: An problematic interaction between a call created by 
gimple_build_call and inlining

On Wed, Jul 1, 2020 at 7:49 AM Gary Oblock via Gcc  wrote:
>
> I'm trying to generate calls to "free" on the fly at ipa time.
>
> I've tried several things (given below) but they both fail
> in expand_call_inline in tree-inline.c on this gcc_checking_assert:
>
>   cg_edge = id->dst_node->get_edge (stmt);
>   gcc_checking_assert (cg_edge);

It simply means you are operating at a point where we expect
callgraph edges to be present but you fail to update the callgraph
for your added function call.  it might be as easy as calling

cgraph_node::get (cfun->decl)->create_edge (cgraph_node::get_create
(fndecl_free), free_call, gimple_bb (free_call)->count);
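
The failure mode is easy to reproduce in a toy model. The standalone C below is a sketch under invented names (toy_node, toy_get_edge, toy_create_edge), not the cgraph API: a lookup keyed by the call statement returns NULL until somebody registers an edge for that statement, which is the situation the gcc_checking_assert is complaining about.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_EDGES 8

/* Hypothetical stand-ins for a call statement, a call graph edge and a
   call graph node.  */
struct toy_stmt {
  const char *text;
};

struct toy_edge {
  const struct toy_stmt *call_stmt;
  const char *callee;
};

struct toy_node {
  const char *name;
  struct toy_edge edges[TOY_MAX_EDGES];
  size_t n_edges;
};

/* Find the edge for a call statement, or NULL if none was recorded.  */
const struct toy_edge *toy_get_edge(const struct toy_node *caller,
                                    const struct toy_stmt *stmt)
{
  for (size_t i = 0; i < caller->n_edges; i++)
    if (caller->edges[i].call_stmt == stmt)
      return &caller->edges[i];
  return NULL;
}

/* Record that STMT in CALLER calls CALLEE.  */
const struct toy_edge *toy_create_edge(struct toy_node *caller,
                                       const char *callee,
                                       const struct toy_stmt *stmt)
{
  assert(caller->n_edges < TOY_MAX_EDGES);
  struct toy_edge *e = &caller->edges[caller->n_edges++];
  e->call_stmt = stmt;
  e->callee = callee;
  return e;
}
```

Building the statement and registering the edge are two separate steps here, just as building the gcall and calling create_edge are two separate steps in the advice above.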

> Now, I've tried using the built in free via:
>
>   tree fndecl_free = builtin_decl_explicit( BUILT_IN_FREE);
>   // Note to_free is set between here and the call by an assign
>   tree to_free =
> make_temp_ssa_name( reorg_pointer_type, NULL, "malloc_to_free");
>   .
>   .
>   gcall *free_call = gimple_build_call( fndecl_free, 1, to_free);
>
> or building the fndecl from scratch:
>
>   tree fntype = build_function_type ( free_return_type, param_type_list);
>   tree fnname = get_identifier ( "free");
>   tree fndecl_free =
> build_decl ( input_location, FUNCTION_DECL, fnname, fntype);
>   gcall *free_call = gimple_build_call( fndecl_free, 1, to_free);
>
> Note, I was able to get something similar to work for "malloc" by
> using the fndecl I extracted from an existing malloc call.
>
> Your advice on how to build a fndecl that doesn't have this
> problem is appreciated.
>
> Thanks,
>
> Gary Oblock

An problematic interaction between a call created by gimple_build_call and inlining

2020-06-30 Thread Gary Oblock via Gcc

I'm trying to generate calls to "free" on the fly at ipa time.

I've tried several things (given below) but they both fail
in expand_call_inline in tree-inline.c on this gcc_checking_assert:

  cg_edge = id->dst_node->get_edge (stmt);
  gcc_checking_assert (cg_edge);

Now, I've tried using the built in free via:

  tree fndecl_free = builtin_decl_explicit( BUILT_IN_FREE);
  // Note to_free is set between here and the call by an assign
  tree to_free =
make_temp_ssa_name( reorg_pointer_type, NULL, "malloc_to_free");
  .
  .
  gcall *free_call = gimple_build_call( fndecl_free, 1, to_free);

or building the fndecl from scratch:

  tree fntype = build_function_type ( free_return_type, param_type_list);
  tree fnname = get_identifier ( "free");
  tree fndecl_free =
build_decl ( input_location, FUNCTION_DECL, fnname, fntype);
  gcall *free_call = gimple_build_call( fndecl_free, 1, to_free);

Note, I was able to get something similar to work for "malloc" by
using the fndecl I extracted from an existing malloc call.

Your advice on how to build a fndecl that doesn't have this
problem is appreciated.

Thanks,

Gary Oblock



Re: GIMPLE problem

2020-06-24 Thread Gary Oblock via Gcc

Richard,

First off I did suspect INDIRECT_REF wasn't supported, thanks for
confirming that.

I tried what you said in the original code before I posted
but I suspect how I went at it is the problem. I'm probably
doing something(s) in a glaringly stupid way.

Can you spot it, because everything I'm doing makes total sense
to me?

Thanks Gary

--

Snippet from the code with MEM_REF:

  tree lhs_ref = build1 ( MEM_REF, field_type, field_addr);

  final_set = gimple_build_assign( lhs_ref, field_val_temp);

field_type is a double *

field_addr is an address within an malloced array of doubles.

--

Snippet from the code with ARRAY_REF:

  tree rhs_ref = build4 ( ARRAY_REF, field_type, field_arry_addr, index,
  NULL_TREE, NULL_TREE);

  temp_set = gimple_build_assign( field_val_temp, rhs_ref);

field type is double

field_arry_addr is the starting address of an array of malloced doubles.

index is a pointer_rep (an integer)
  details:
tree pointer_rep = make_node ( INTEGER_TYPE);
TYPE_PRECISION (pointer_rep) = TYPE_PRECISION (pointer_sized_int_node);

GIMPLE problem

2020-06-23 Thread Gary Oblock via Gcc

I'm somehow misusing GIMPLE (probably in multiple ways) and I need
some help in straightening out this little mess I've made.

I'm trying to do the following:

In an attempt at structure reorganization (instance interleaving) an
array of structures is being transformed into a structure of arrays.

for the simple example I'm using
typedef struct type type_t;
struct type {
  double x;
  double y;
};
.
.
type_t *data = (type_t *)malloc( len * sizeof(type_t));
.
.
result = data[i].y;

Is transformed into this or something close to it

typedef long _reorg_SP_ptr_type_type_t;
typedef struct _reorg_base_type_type_t _reorg_base_type_type_t;

struct _reorg_base_type_type_t {
 double *x;
 double *y;
};

_reorg_SP_ptr_type_type_t data;

_reorg_base_type_type_t _reorg_base_var_type_t;

// Note I'm ignoring a bunch of stuff that needs to happen
// when a malloc fails..
_reorg_base_var_type_t.x = (double*)malloc( len*sizeof(double));
_reorg_base_var_type_t.y = (double*)malloc( len*sizeof(double));

data = 0;
.
.
double *temp = _reorg_base_var_type_t.y;
result = temp[i];

Now, believe it or not, the whole bit above, except for "result = data[i].y",
seems to work just fine.

I attempted to do this (result = data[i].y) via basically two different
ways. One is using ARRAY_REF and in the other faking an array access with
INDIRECT_REF. The first approach chokes on the fact that temp is a pointer
and the second dies in ssa operand scanning because it doesn't have a case
for INDIRECT_REF.
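
For reference, here is what the transformed access amounts to in plain C, written as the same three steps the pass is trying to emit (fetch the field array, scale the index, dereference). This is a standalone sketch with invented names (reorg_base, reorg_load_y), not the GIMPLE the pass should build. As far as I know, the GIMPLE counterpart of the pointer arithmetic is POINTER_PLUS_EXPR rather than PLUS_EXPR, and the dereference is a MEM_REF with a pointer operand and a constant offset operand, but take that as an assumption to check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Structure-of-arrays stand-in for the transformed type.  The names are
   illustrative, not the ones the pass generates.  */
struct reorg_base {
  double *x;
  double *y;
};

/* Allocate both field arrays.  Returns 0 on success, -1 on failure.  */
int reorg_alloc(struct reorg_base *b, size_t len)
{
  b->x = malloc(len * sizeof(double));
  b->y = malloc(len * sizeof(double));
  if (b->x == NULL || b->y == NULL) {
    free(b->x);
    free(b->y);
    b->x = b->y = NULL;
    return -1;
  }
  return 0;
}

/* "result = data[i].y" after the transformation, one step per line.  */
double reorg_load_y(const struct reorg_base *b, size_t i)
{
  assert(b->y != NULL);
  const char *field_array = (const char *)b->y;  /* field_array = base.y   */
  size_t offset = i * sizeof(double);            /* offset = index * size  */
  const double *field_addr =
    (const double *)(field_array + offset);      /* field_array + offset   */
  return *field_addr;                            /* load through pointer   */
}

void reorg_free(struct reorg_base *b)
{
  free(b->x);
  free(b->y);
  b->x = b->y = NULL;
}
```

Note that the index is scaled by the size of the element (a double), not by the size of the field in the base structure (a pointer to double); the two happen to be the same size on many 64-bit targets, which can hide a mistake there.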

The code below shows both ways. What have I done wrong here and what do
I need to do differently to get it to work?

Thanks,

Gary

PS Please ignore the then case below.


  gimple_stmt_iterator gsi = gsi_for_stmt( stmt);

  // Dump for debugging
  print_gimple_stmt ( stderr, stmt, 0);

  tree lhs = gimple_assign_lhs( stmt);
  tree rhs = gimple_assign_rhs1( stmt);

  bool ro_on_left = tree_contains_a_reorgtype_p ( lhs, info);

  tree ro_side = ro_on_left ? lhs : rhs;
  tree nonro_side = ro_on_left ? rhs : lhs;

  switch ( recognize_op ( ro_side, info) )  // "a->f"
    {
    case ReorgOpT_Indirect:
      {
        tree orig_field = TREE_OPERAND( ro_side, 1);
        tree field_type = TREE_TYPE( orig_field);
        tree base = ri->instance_interleave.base;

        tree base_field =
          find_coresponding_field ( base, orig_field);

        tree base_field_type = TREE_TYPE( base_field);

        tree field_val_temp =
          make_temp_ssa_name( field_type, NULL, "field_val_temp");

        tree inner_op = TREE_OPERAND( ro_side, 0);

        // For either case generate common code:

        // field_array = _base.f
        tree field_arry_addr =
          make_temp_ssa_name( base_field_type, NULL, "field_arry_addr");

        tree rhs_faa = build3 ( COMPONENT_REF,
                                //base_field_type, // This doesn't work
                                ptr_type_node, // This seems bogus
                                base,
                                //base_field, // This doesn't work
                                orig_field, // This seems bogus
                                NULL_TREE);

        // Use this to access the array of element.
        gimple *get_field_arry_addr =
          gimple_build_assign( field_arry_addr, rhs_faa);

        // index = a
        tree index =
          make_temp_ssa_name( ri->pointer_rep, NULL, "index");
        gimple *get_index =
          gimple_build_assign( index, inner_op);

        gimple *temp_set;
        gimple *final_set;

        #if WITH_INDIRECT
        // offset = index * size_of_field
        tree size_of_field = TYPE_SIZE_UNIT ( base_field_type);
        tree offset = make_temp_ssa_name( sizetype, NULL, "offset");

        gimple *get_offset =
          gimple_build_assign ( offset, MULT_EXPR, index, size_of_field);

        // field_addr = field_array + offset
        // bug fix here (TBD) type must be *double not double
        tree field_addr =
          make_temp_ssa_name( base_field_type, NULL, "field_addr");

        gimple *get_field_addr =
          gimple_build_assign ( field_addr, PLUS_EXPR, field_arry_addr,
                                offset);
        #endif

        if ( ro_on_left )
          {
            // With:    a->f = rhs
            // Generate:

            //   temp = rhs
            temp_set = gimple_build_assign( field_val_temp, rhs);

            #if WITH_INDIRECT
            // NOTE, THIS (MEM_REF) SHOULD NOT WORK (IGNORE THIS PLEASE!)
            // not tested yet! I know THIS bit won't work.
            // *field_addr = temp
            tree lhs_ref = build1 ( MEM_REF, field_type, field_addr);
            #else
            // field_arry_addr[index]
            tree lhs_ref =

Question about comparing function function decls

2020-06-04 Thread Gary Oblock via Gcc





I'm trying to determine during LTO optimization (with one partition)
whether or not a function call is to a function in the partition.

Here is the routine I've written. Note, I'm willing to admit up front
that the comparison below ( node->decl == fndecl ) is probably dicey.

---
static bool
is_user_function ( gimple *call_stmt)
{
  tree fndecl = gimple_call_fndecl ( call_stmt);

  DEBUG_L("is_user_function: decl in: %p,", fndecl);
  DEBUG_F( print_generic_decl, stderr, fndecl, (dump_flags_t)-1);
  DEBUG("\n");
  INDENT(2);

  cgraph_node* node;
  bool ret_val = false;
  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY ( node)
  {
    DEBUG_L("decl %p,", node->decl);
    DEBUG_F( print_generic_decl, stderr, node->decl, (dump_flags_t)-1);
    DEBUG("\n");

    if ( node->decl == fndecl )
      {
        ret_val = true;
        break;
      }
  }

  INDENT(-2);
  return ret_val;
}
---

Here's the test program I was compiling.

-- aux.h --
#include "stdlib.h"
typedef struct type type_t;
struct type {
  int i;
  double x;
};

#define MAX(x,y) ((x)>(y) ? (x) : (y))

extern int max1( type_t *, size_t);
extern double max2( type_t *, size_t);
extern type_t *setup( size_t);
-- aux.c --
#include "aux.h"
#include "stdlib.h"

type_t *
setup( size_t size)
{
  type_t *data = (type_t *)malloc( size * sizeof(type_t));
  size_t i;
  for( i = 0; i < size; i++ ) {
    data[i].i = rand();
    data[i].x = drand48();
  }
  return data;
}

int
max1( type_t *array, size_t len)
{
  size_t i;
  int result = array[0].i;
  for( i = 1; i < len; i++  ) {
    result = MAX( array[i].i, result);
  }
  return result;
}

double
max2( type_t *array, size_t len)
{
  size_t i;
  double result = array[0].x;
  for( i = 1; i < len; i++  ) {
    result = MAX( array[i].x, result);
  }
  return result;
}
-- main.c -
#include "aux.h"
#include "stdio.h"

type_t *data1;

int
main(void)
{
  type_t *data2 = setup(200);
  data1 = setup(100);

  printf("First %d\n" , max1(data1,100));
  printf("Second %e\n", max2(data2,200));
}
---

The output follows:

---
L# 1211: is_user_function: decl in: 0x7f078461be00,  static intD. max1D. (struct type_t *, size_t);
L# 1222:   decl 0x7f078462,  static struct type_t * setupD. (size_t);
L# 1222:   decl 0x7f078461bf00,  static intD. max1.constprop.0D. (struct type_t *);
L# 1222:   decl 0x7f078461bd00,  static doubleD. max2.constprop.0D. (struct type_t *);
L# 1222:   decl 0x7f078461bb00,  static intD. mainD. (void);
---

Now it's pretty obvious that constant propagation decided the size_t
len arguments to max1 and max2 were no longer needed. However, the
function declaration information on the calls to them weren't updated
so they'll never match. Now if there is another way to see if the
function is in the partition or if there is some other way to compare
the functions in a partition, please let me know.

Thanks,

Gary Oblock
Ampere Computing

PS. The body of the message is attached in a file because my email program
(Outlook) mangled the above.






Silly GIT related question

2020-01-14 Thread Gary Oblock

If you just do a clone and don't checkout a branch, is this equivalent
to the top of the trunk in the old scheme? If not then how do I get the
top of trunk?

Thanks for your patience,

Gary Oblock

Official GIT based scripts????

2020-01-13 Thread Gary Oblock

I'm using the recommended old git mirror based scripts...
Where are the new scripts documented? Note, I looked on
gcc.gnu.org and could find nothing.

Note, if somebody posted them here already, I apologize, but
I'm forced to use our official Microsoft Outlook webmail
email reader which turns half my email into gibberish
so please take pity on me and post them here again.

Thanks,

Gary Oblock

Re: [EXT] Re: Option processing question

2020-01-13 Thread Gary Oblock

If I have an optimization set up to run in LTO (call it my_opt) and
the options -flto, -flto-partition=one and -fmy-opt are set,
then the optimization might or might not be run depending on whether
it can all fit in one partition.

What I'm thinking is that as long as specifying -fmy-opt without
-flto-partition=one is a fatal error that can be detected somewhere
upstream in the compilation, all will be well. So how do I detect it,
and where would I put the check?

Gary

From: Richard Biener 
Sent: Monday, January 13, 2020 2:30 AM
To: Gary Oblock ; Jan Hubicka 
Cc: gcc@gcc.gnu.org 
Subject: [EXT] Re: Option processing question

On Sat, Jan 11, 2020 at 4:47 AM Gary Oblock  wrote:
>
> I'm writing an LTO optimization that requires "-flto-partition=one". How can I
> make sure that this is the case? I've spent hours grepping the code and the
> Internals Doc is worth less than nothing for something like this.

That's of course because you shouldn't be doing this ;)

> If you have an answer or even
> a good idea of where to look please let me know. Note, for a normal "-fblah"
> there's a flag_blah that I could look at but for this there seems to be zip,
> nil, diddly squat, etc.

At LTRANS time you indeed don't know.

But there's -flto-partition=none (none, not one), that you can detect somehow
(I think in_lto_p && !flag_ltrans && !flag_wpa).
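
Spelled out as a predicate, the suggested check looks like the standalone C below. It is only a sketch of the boolean logic: toy_lto_flags is an invented struct whose members borrow the names of the GCC variables mentioned above, and, as the "I think" indicates, the condition itself is a suggestion rather than a documented guarantee.

```c
#include <stdbool.h>

/* Invented container for the three states named in the suggestion.
   In GCC these would be global variables, not struct members.  */
struct toy_lto_flags {
  bool in_lto_p;     /* running inside the LTO front end            */
  bool flag_ltrans;  /* this invocation is an LTRANS unit           */
  bool flag_wpa;     /* this invocation is the whole-program stage  */
};

/* True when LTO is running but is neither WPA nor LTRANS, which is the
   suggested way to recognize -flto-partition=none.  */
bool toy_lto_partition_none_p(const struct toy_lto_flags *f)
{
  return f->in_lto_p && !f->flag_ltrans && !f->flag_wpa;
}
```

Note that this recognizes partition "none", not partition "one"; per the reply above, the latter cannot be told apart at LTRANS time.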

Richard.

> Many thanks,
>
> Gary

Option processing question

2020-01-10 Thread Gary Oblock

I'm writing an LTO optimization that requires "-flto-partition=one". How can I
make sure that this is the case? I've spent hours grepping the code and the
Internals Doc is worth less than nothing for something like this. If you have
an answer or even a good idea of where to look please let me know. Note, for a
normal "-fblah" there's a flag_blah that I could look at but for this there
seems to be zip, nil, diddly squat, etc.

Many thanks,

Gary

Re: [EXT] Re: Comparing types at LTO time

2020-01-10 Thread Gary Oblock

Richard,

Let me see if I've got this straight. Are you saying it's the
shape of objects combined with the variables that point at these
objects (or some subpart of them) that should drive struct reorg?
It seems to be hard to understand how to proceed from that notion
without something like types to fall back on.

If the struct reorg lumps all the types with the same shape together
into sets, aren't each of those sets type like and couldn't the sets be used
to drive the optimization?
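
To pin down what "same shape" could mean, here is a standalone C sketch of a structural comparison that ignores names. It is purely illustrative: toy_record, toy_field and toy_same_shape are invented, real GCC types carry far more than a kind and an offset per field, and nothing here addresses the concern that such a comparison must be neither too strict nor too loose.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified field and record descriptions.  */
enum toy_kind { TOY_INT, TOY_DOUBLE, TOY_POINTER };

struct toy_field {
  enum toy_kind kind;
  size_t offset;
};

struct toy_record {
  const char *name;
  size_t n_fields;
  const struct toy_field *fields;
};

/* Two records have the same shape when they have the same number of
   fields and each pair of fields agrees in kind and offset.  Pointer
   identity (the analogue of comparing main variants) is a fast path;
   the record name is never consulted.  */
bool toy_same_shape(const struct toy_record *a, const struct toy_record *b)
{
  if (a == b)
    return true;
  if (a->n_fields != b->n_fields)
    return false;
  for (size_t i = 0; i < a->n_fields; i++)
    if (a->fields[i].kind != b->fields[i].kind
        || a->fields[i].offset != b->fields[i].offset)
      return false;
  return true;
}
```

Grouping by such a predicate lumps together every record with an int at offset 0 and a double at offset 8, whether or not the program ever treats them as the same thing, which is where the sets stop behaving like types.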

Gary

From: Richard Biener 
Sent: Friday, January 10, 2020 2:29 AM
To: Gary Oblock 
Cc: Jan Hubicka ; gcc@gcc.gnu.org 
Subject: Re: [EXT] Re: Comparing types at LTO time

On Thu, Jan 9, 2020 at 9:36 PM Gary Oblock  wrote:
>
> Richard,
>
> Alas, when doing structure reorg I have to be able to know some
> arbitrary use of variable X in some GIMPLE expression is of a
> type that needs to be transformed in that given expression. I see no
> way around this.

Sure, if you view it as transforming a type.  I see it as transforming
the layout of a specific object so all you need to know is whether an
arbitrary memory access accesses the very object - which you could,
if you face accesses you can't analyze, even check at runtime to some
extent (worst case by providing a copy in/out to a temporary with the
old layout).

Richard.

> 
> From: Richard Biener 
> Sent: Thursday, January 9, 2020 3:51 AM
> To: Jan Hubicka 
> Cc: Gary Oblock ; gcc@gcc.gnu.org 
> Subject: [EXT] Re: Comparing types at LTO time
>
> On Thu, Jan 9, 2020 at 9:53 AM Jan Hubicka  wrote:
> >
> > > There doesn't seem to be a way to compare types at LTO time. The functions
> > > same_type_p and comptypes are front end only if I'm not totally confused
> > > (which is quite possible) and type_hash_eq doesn't seem to apply for
> > > structure types. Please, any advice would be welcome.
> >
> > At LTO time it is a bit hard to say what really is the "same type".  We
> > have multiple notions of equivalence:
> >  1) TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2)
> > means that both types were equal in tree merging at stream in, this
> > means that their IL representation is identical.
> >
> > This will lead to "false" for types that are same but did not get
> > merged for various reasons. One such valid reason, for example, is
> > visibility of associated virtual tables
> >  2) types_same_for_odr returns true if types are considered same
> > by C++ one definition rule.  This is reliable but works only for C++
> > types with names (structures and unions)
> >  3) same_type_for_tbaa returns true if types are equivalent for type
> > based alias analysis.  It returns true in much more cases than 1
> > but is too coarse if you want to do datastructure changes.
> >
> > So in general this is quite hard problem (and in fact I started to play
> > with ODR types originally to understand it better).  I would suggest
> > starting with 1 if you want to rewrite types and eventually add a new
> > comparison once pass does something useful.
> >
> > Richard may have some extra insights.
>
> My advice would be to not go down the route that requires comparing types
> since I'm not sure you can do that conservatively since you at the same
> time may not say two types are equal when they are not nor miss two
> equal types.  For example if you have a C TU and a Fortran TU there's
> defined interoperability but the actual type representations are distinct
> enough so that Honza's equality according to 1) doesn't trigger (nor does 2),
> but 3) does, but that will identify too many types as equal.
>
> Richard.
>
> > Honza
> > >
> > > Thanks,
> > >
> > > Gary Oblock
> > >

Anybody have any idea about why local_decls would go missing?

2020-01-09 Thread Gary Oblock

This is at LTO time and the function in question is this:

#include "stdlib.h"
typedef struct bogus type_ta;

struct bogus {
  int i;
  double x;
  int j;
};

void
helper( void *x)
{
  type_ta *y = (type_ta*)x;
  y->i =  rand();
}

and I'm checking the local_decls like this:

  cgraph_node* node;
  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY ( node)
  {
    tree decl;
    unsigned i;

    node->get_untransformed_body ();

    struct function *fn = DECL_STRUCT_FUNCTION ( node->decl);
    DEBUG( "L# %d, function name = %s\n", __LINE__,
           lang_hooks.decl_printable_name ( node->decl, 2));
    if( fn == NULL )
      {
        DEBUG( "  EMPTY\n");
        continue;
      }

    FOR_EACH_LOCAL_DECL ( fn, i, decl)
      {
        tree base = base_type_of ( decl);
        DEBUG( "L# %d Consider local var decl\n", __LINE__);
        .
        .

Needless to say I get the function name "helper" but no variables.
I suspected the LTO streaming process but that seems bullet proof
and if local_decls are there when streaming out they should be restored when
streaming in. I also suspected inlining but it still happened with
"-fno-inline" present.

If you'll note the get_untransformed_body call above (which David Malcolm
suggested to cure a NULL fn) I suspect I'm lacking some other call
which will make all things right.

Thanks,

Gary Oblock

Re: [EXT] Re: Comparing types at LTO time

2020-01-09 Thread Gary Oblock

Richard,

Alas, when doing structure reorg I have to be able to know some
arbitrary use of variable X in some GIMPLE expression is of a
type that needs to be transformed in that given expression. I see no
way around this.

From: Richard Biener 
Sent: Thursday, January 9, 2020 3:51 AM
To: Jan Hubicka 
Cc: Gary Oblock ; gcc@gcc.gnu.org 
Subject: [EXT] Re: Comparing types at LTO time

On Thu, Jan 9, 2020 at 9:53 AM Jan Hubicka  wrote:
>
> > There doesn't seem to be a way to compare types at LTO time. The functions
> > same_type_p and comptypes are front end only if I'm not totally confused
> > (which is quite possible) and type_hash_eq doesn't seem to apply for
> > structure types. Please, any advice would be welcome.
>
> At LTO time it is a bit hard to say what really is the "same type".  We
> have multiple notions of equivalence:
>  1) TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2)
> means that both types were equal in tree merging at stream in, this
> means that their IL representation is identical.
>
> This will lead to "false" for types that are same but did not get
> merged for various reasons. One such valid reason, for example, is
> visibility of associated virtual tables
>  2) types_same_for_odr returns true if types are considered same
> by C++ one definition rule.  This is reliable but works only for C++
> types with names (structures and unions)
>  3) same_type_for_tbaa returns true if types are equivalent for type
> based alias analysis.  It returns true in much more cases than 1
> but is too coarse if you want to do datastructure changes.
>
> So in general this is quite hard problem (and in fact I started to play
> with ODR types originally to understand it better).  I would suggest
> starting with 1 if you want to rewrite types and eventually add a new
> comparsion once pass does something useful.
>
> Richard may have some extra insights.

My advice would be to not go down the route that requires comparing types,
since I'm not sure you can do that conservatively: you may neither say two
types are equal when they are not, nor miss two equal types.  For example,
if you have a C TU and a Fortran TU there's defined interoperability, but the
actual type representations are distinct enough that Honza's equality
according to 1) doesn't trigger (nor does 2); 3) does, but it will identify
too many types as equal.

Richard.

> Honza
> >
> > Thanks,
> >
> > Gary Oblock
> >
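
The gap between notion 1) and a coarser comparison can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyType`, `same_by_identity`, and `same_by_layout` are hypothetical names invented for illustration, and none of this is GCC's tree representation or API. It only shows why an identity-based test reports "false" for two types that were never merged, while a layout-only test accepts them (and would accept unrelated types with the same layout as well).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a type node; this is not GCC's tree representation.
struct ToyType {
    std::string name;
    std::vector<std::string> field_types;  // e.g. {"int", "double"}
    const ToyType* main_variant;           // loose analogue of TYPE_MAIN_VARIANT
};

// Analogue of notion 1: two types are the same only if they share
// one main variant node, i.e. they were merged into a single node.
bool same_by_identity(const ToyType& a, const ToyType& b) {
    assert(a.main_variant != nullptr && b.main_variant != nullptr);
    return a.main_variant == b.main_variant;
}

// A much coarser test: compares field layout only and ignores names,
// so it also accepts unrelated types that happen to look alike.
bool same_by_layout(const ToyType& a, const ToyType& b) {
    return a.field_types == b.field_types;
}
```

In this model, two copies of `type_t` that arrive from different translation units and are not merged compare unequal by identity but equal by layout, which mirrors the false negatives described for 1).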

Comparing types at LTO time

2020-01-08 Thread Gary Oblock

There doesn't seem to be a way to compare types at LTO time. The functions
same_type_p and comptypes are front end only if I'm not totally confused
(which is quite possible) and type_hash_eq doesn't seem to apply for
structure types. Please, any advice would be welcome.

Thanks,

Gary Oblock

Re: [EXT] Re: Mechanism to get at function information seems not to work

2020-01-03 Thread Gary Oblock

Thanks David,

I'll give it a try. By the way, I'm trying to force one partition
with "-flto-partition=one"; I'm not sure if that makes a difference.

Gary

From: David Malcolm 
Sent: Friday, January 3, 2020 3:52 PM
To: Gary Oblock ; gcc@gcc.gnu.org 
Subject: [EXT] Re: Mechanism to get at function information seems not to work

On Fri, 2020-01-03 at 23:02 +0000, Gary Oblock wrote:
> I'm having some grief attempting to get at the local definitions
> in LTO (more about the options used later.)
>
> Here's the sequence of code in my optimization (part of an attempt
> at structure reorganization optimizations.)
>
>   cgraph_node* node;
>   FOR_EACH_FUNCTION_WITH_GIMPLE_BODY ( node)
>   {
>     tree decl;
>     unsigned i;
>     struct function *fn = DECL_STRUCT_FUNCTION ( node->decl);
>
>     // I'm assuming it's obvious what my debugging macros do...
>     DEBUG( "fn %p\n", fn);
>     DEBUG_F( print_generic_decl, stderr, node->decl, (dump_flags_t)-1);
>     DEBUG( "\n");
>     // I don't know why this is coming up null but I'll
>     // skip it for now because it causes a crash.
>     if( fn == NULL )
>     {
>       continue;
>     }
>
>     FOR_EACH_LOCAL_DECL ( fn, i, decl)
>     {
>       :
>
> What it returns is:
>
> fn 0xb0fc9210
>   static intD. max1.constprop.0D. (struct type_t *);
> fn 0xb0fc9370
>   static doubleD. max2.constprop.0D. (struct type_t *);
> fn (nil)
>   static intD. mainD. (void);
> fn (nil)
>   static struct type_t * setupD. (size_t);
>

[...snip...]

> Here is how I compile them:
>
> GCC=/home/goblock/str-reorg-gcc-build-dir/install/bin/gcc
> OPTIONS="-O2 -flto -flto-partition=one -fipa-structure-reorg"
>
> $GCC $OPTIONS -c main.c
> $GCC $OPTIONS -c aux.c
> $GCC $OPTIONS -o exe main.o aux.o
>
> ./exe
>
> I'm wondering if this is a fundamental issue, if there's a bug
> or perhaps I'm doing something dumb. Any advice is appreciated
> here because my only real alternative here is insanely ugly.

This looks like the same issue as one I ran into when trying to add LTO
support to the static analyzer [1].

AIUI the LTO infrastructure partitions the functions in a kind of
sharding operation.  There's no guarantee for any given partition's
invocation of lto1 that it has a particular function body.

I fixed this in the analyzer by putting this loop at the top of the
pass:

  /* If using LTO, ensure that the cgraph nodes have function bodies.  */
  cgraph_node *node;
  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
    node->get_untransformed_body ();

which seems to fix it for me (my pass then goes on to do a non-trivial
interprocedural traversal, so it needs to have all the function bodies
present first).

I'm not sure if this is the correct fix for your issue, but it worked
for my pass (I'm just using "-flto", FWIW).

Hope this is helpful
Dave

[1] https://gcc.gnu.org/wiki/DavidMalcolm/StaticAnalyzer

Mechanism to get at function information seems not to work

2020-01-03 Thread Gary Oblock

I'm having some grief attempting to get at the local definitions
in LTO (more about the options used later.)

Here's the sequence of code in my optimization (part of an attempt
at structure reorganization optimizations.)

  cgraph_node* node;
  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY ( node)
  {
    tree decl;
    unsigned i;
    struct function *fn = DECL_STRUCT_FUNCTION ( node->decl);

    // I'm assuming it's obvious what my debugging macros do...
    DEBUG( "fn %p\n", fn);
    DEBUG_F( print_generic_decl, stderr, node->decl, (dump_flags_t)-1);
    DEBUG( "\n");
    // I don't know why this is coming up null but I'll
    // skip it for now because it causes a crash.
    if( fn == NULL )
    {
      continue;
    }

    FOR_EACH_LOCAL_DECL ( fn, i, decl)
    {
      :

What it returns is:

fn 0xb0fc9210
  static intD. max1.constprop.0D. (struct type_t *);
fn 0xb0fc9370
  static doubleD. max2.constprop.0D. (struct type_t *);
fn (nil)
  static intD. mainD. (void);
fn (nil)
  static struct type_t * setupD. (size_t);

Here's the test case:

aux.h
-
#include "stdlib.h"
typedef struct type type_t;
struct type {
  int i;
  double x;
};

#define MAX(x,y) ((x)>(y) ? (x) : (y))

extern int max1( type_t *, size_t);
extern double max2( type_t *, size_t);
extern type_t *setup( size_t);

main.c
--
#include "aux.h"
#include "stdio.h"

int
main(void)
{
  type_t *data1 = setup(100);
  type_t *data2 = setup(200);

  printf("First %d\n" , max1(data1,100));
  printf("Second %e\n", max2(data2,200));
}

aux.c
-
#include "aux.h"
#include "stdlib.h"

type_t *
setup( size_t size)
{
  type_t *data = (type_t *)malloc( size * sizeof(type_t));
  size_t i;
  for( i = 0; i < size; i++ ) {
    data[i].i = rand();
    data[i].x = drand48();
  }
  return data;
}

int
max1( type_t *array, size_t len)
{
  size_t i;
  int result = array[0].i;
  for( i = 1; i < len; i++  ) {
    result = MAX( array[i].i, result);
  }
  return result;
}

double
max2( type_t *array, size_t len)
{
  size_t i;
  double result = array[0].x;
  for( i = 1; i < len; i++  ) {
    result = MAX( array[i].x, result);
  }
  return result;
}

Here is how I compile them:

GCC=/home/goblock/str-reorg-gcc-build-dir/install/bin/gcc
OPTIONS="-O2 -flto -flto-partition=one -fipa-structure-reorg"

$GCC $OPTIONS -c main.c
$GCC $OPTIONS -c aux.c
$GCC $OPTIONS -o exe main.o aux.o

./exe

I'm wondering if this is a fundamental issue, if there's a bug
or perhaps I'm doing something dumb. Any advice is appreciated
here because my only real alternative here is insanely ugly.

Thanks,

Gary Oblock

Information on Loop Blocking

2020-01-02 Thread Gary Oblock

One of the engineers here at Marvell was experimenting, at the user level,
with GCC in a failed attempt to get it to do loop blocking. Here's basically
his question.

  Exactly how does loop blocking work in GCC?

I know this must involve the polyhedral optimization code, so an explanation
might get a little ugly, but he's a very bright fellow and can cope with it.

Thanks,

Gary Oblock
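
For readers unfamiliar with the term, loop blocking (also called tiling) restructures a loop nest so that it works on small sub-blocks that are more likely to stay in cache. The sketch below applies the transformation by hand to a matrix transpose. It is not taken from GCC and says nothing about how GCC's polyhedral code performs or decides on the transformation; the names `transpose_naive` and `transpose_blocked` and the tile size `BLOCK` are assumptions made purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile edge; a sensible value depends on the target's caches.
constexpr std::size_t BLOCK = 32;

// Straightforward transpose of an n x n row-major matrix.
std::vector<double> transpose_naive(const std::vector<double>& a, std::size_t n) {
    assert(a.size() == n * n);
    std::vector<double> out(n * n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            out[j * n + i] = a[i * n + j];
    return out;
}

// Same computation, blocked: the two outer loops step from tile to tile,
// the two inner loops stay inside one BLOCK x BLOCK tile. std::min clamps
// the last tile when n is not a multiple of BLOCK.
std::vector<double> transpose_blocked(const std::vector<double>& a, std::size_t n) {
    assert(a.size() == n * n);
    std::vector<double> out(n * n);
    for (std::size_t ii = 0; ii < n; ii += BLOCK)
        for (std::size_t jj = 0; jj < n; jj += BLOCK)
            for (std::size_t i = ii; i < std::min(ii + BLOCK, n); ++i)
                for (std::size_t j = jj; j < std::min(jj + BLOCK, n); ++j)
                    out[j * n + i] = a[i * n + j];
    return out;
}
```

Both functions compute the same result; only the order in which elements are visited differs, which is exactly what a blocking transformation changes.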

Questions about points-to analysis in gcc

2019-10-24 Thread Gary Oblock

I'm wondering if the code in tree-ssa-structalias.c can be invoked in
a whole program mode? There are some comments in there about
it not playing well with WHOPR and WPA (not that I intend to use
it that way.) Ironically, in the literature on points-to analysis this
algorithm was originally intended to be run only on whole programs.

By the way, how are the users of the points-to sets actually supposed
to access and use them?

Thanks,

Gary
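
As background for the question, here is a minimal sketch of a whole-program, inclusion-based (Andersen-style) points-to solver of the kind the literature describes. It is a toy under stated assumptions: `Kind`, `Constraint`, `PointsTo`, and `solve` are hypothetical names, variables are plain integer indices, and the naive fixpoint loop is chosen for brevity. It does not reproduce the data structures or interfaces of tree-ssa-structalias.c, and it does not show how GCC exposes its results.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical constraint kinds for a toy inclusion-based solver.
enum class Kind { AddressOf, Copy, Load, Store };

struct Constraint {
    Kind kind;
    int lhs;  // variable index on the left-hand side
    int rhs;  // variable index on the right-hand side
};

using PointsTo = std::vector<std::set<int>>;

// Adds every element of src to dst; returns true if dst grew.
static bool merge_into(std::set<int>& dst, const std::set<int>& src) {
    if (&dst == &src) return false;  // self-merge adds nothing
    std::size_t before = dst.size();
    dst.insert(src.begin(), src.end());
    return dst.size() != before;
}

// Naive fixpoint iteration: reapply all constraints until nothing changes.
PointsTo solve(int num_vars, const std::vector<Constraint>& cs) {
    PointsTo pts(num_vars);
    bool changed = true;
    while (changed) {
        changed = false;
        for (const Constraint& c : cs) {
            assert(c.lhs >= 0 && c.lhs < num_vars);
            assert(c.rhs >= 0 && c.rhs < num_vars);
            switch (c.kind) {
            case Kind::AddressOf:  // lhs = &rhs
                changed |= pts[c.lhs].insert(c.rhs).second;
                break;
            case Kind::Copy:       // lhs = rhs
                changed |= merge_into(pts[c.lhs], pts[c.rhs]);
                break;
            case Kind::Load: {     // lhs = *rhs
                std::set<int> targets = pts[c.rhs];  // copy: sets may grow
                for (int v : targets)
                    changed |= merge_into(pts[c.lhs], pts[v]);
                break;
            }
            case Kind::Store: {    // *lhs = rhs
                std::set<int> targets = pts[c.lhs];  // copy: sets may grow
                for (int v : targets)
                    changed |= merge_into(pts[v], pts[c.rhs]);
                break;
            }
            }
        }
    }
    return pts;
}
```

A caller in this toy model reads the answer for variable `v` directly from `pts[v]`. Because every constraint in the program has to be visible before the fixpoint is meaningful, the formulation is inherently whole-program.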

Re: [EXT] Re: Modifying types during optimization

2019-10-03 Thread Gary Oblock

On 10/2/19 3:15 AM, Richard Biener wrote:
> On Wed, Oct 2, 2019 at 1:43 AM Gary Oblock  wrote:
>> I'm working on structure reorganization optimizations and one of the
>> things that needs to happen is that pointers to arrays of structures
>> need to be modified into either an integer or a structure, depending
>> on which optimization is required.
>>
>> I'm seeing something similar happening in omp-low.c where the code in
>> install_var_field and fixup_child_record_type both seem to rebuild the
>> entire type from scratch if a field is either added or modified. Wouldn't
>> it be possible to simply modify the field(s) in question and rerun layout_type?
>>
>> I suspect the answer will be no but reasons as to why that wouldn't work
>> will probably be equally valuable to me.
> I think it's undesirable at least.  When last discussing "structure reorg"
> I was always arguing that trying to change the "type" is the wrong angle
> to look at (likewise computing something like "type escape").  It's
> really individual objects you are transforming and that you need to track
> so there may be very well instances of the original type T plus the
> modified type T' in the program after the transform.
>
> Richard.
>
>> Thanks,
>>
>> Gary Oblock
>>
I answered Richard privately yesterday, but I was wondering if anybody else
had any ideas about modifying type fields. Note, I agreed with Richard and
assured him I was already planning to do things that way, but I still don't
see any reason why rebuilding a type (it's a clone of the other type) is
better than modifying it. I take my inspiration from relayout_decl and just
want to create what in essence would be a relayout_type.

Gary
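
Richard's point that the original type T and the modified type T' may both exist after the transform can be shown with a toy model. This is a sketch under stated assumptions, not GCC code: `ToyRecord`, `ToyObject`, and `clone_with_field` are hypothetical names, and type nodes are modelled as shared pointers so that several objects can refer to one node. The `_reorg_` prefix follows the naming used elsewhere in these threads.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy record type; not GCC's RECORD_TYPE representation.
struct ToyRecord {
    std::string name;
    std::vector<std::string> field_types;
};

// Toy object that refers to its type through a shared node.
struct ToyObject {
    std::string name;
    std::shared_ptr<ToyRecord> type;
};

// Clone-and-modify: returns a new type T' and leaves the original T alone,
// so objects that still use T are unaffected.
std::shared_ptr<ToyRecord> clone_with_field(const ToyRecord& original,
                                            std::size_t index,
                                            const std::string& new_field_type) {
    assert(index < original.field_types.size());
    auto copy = std::make_shared<ToyRecord>(original);
    copy->name = "_reorg_" + original.name;
    copy->field_types[index] = new_field_type;
    return copy;
}
```

Had the field been changed in place on the shared node instead, every object pointing at that node would have seen the change, including objects that were never meant to be transformed.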

Modifying types during optimization

2019-10-01 Thread Gary Oblock

I'm working on structure reorganization optimizations and one of the
things that needs to happen is that pointers to arrays of structures
need to be modified into either an integer or a structure, depending
on which optimization is required.

I'm seeing something similar happening in omp-low.c where the code in
install_var_field and fixup_child_record_type both seem to rebuild the
entire type from scratch if a field is either added or modified. Wouldn't
it be possible to simply modify the field(s) in question and rerun layout_type?

I suspect the answer will be no but reasons as to why that wouldn't work
will probably be equally valuable to me.

Thanks,

Gary Oblock

How can I build new functions on the fly during optimization?

2019-09-18 Thread Gary Oblock

I'm trying to build new functions on the fly during optimization.
For those of you that have not been following my previous questions,
this is structure reorganization optimization related. For example when
somebody frees an array of type fu, I'd like to build a new
function _reorg_free_fu which does the correct things for a
transformed array of type _reorg_fu.

I've run across uses of these:
  build_fn_decl
  gimple_build_call
However, I don't see any code going any further than that.

Anybody have any ideas about how I can accomplish the rest of
what I need to do? Note, I'll be doing this during LTRANS.

Thanks,

Gary Oblock

Re: [EXT] Re: Questions about initialization data during LTO

2019-09-16 Thread Gary Oblock

On 9/14/19 8:39 AM, Martin Liška wrote:

> On 9/13/19 3:01 PM, Gary Oblock wrote:
>> So, back to my questions, any ideas about how to get initialization
>> information? This is going to be a very powerful optimization for code
>> with structures of arrays and I just need a little help getting around
>> a few obstacles in my path.
>
> Sure. So I would point you to the IPA ICF pass, which merges
> variables in the WPA phase of LTO. Let's take a look at:
>
> ipa-icf.c:1839 (sem_variable::equals) where
> we do:
>   if (DECL_INITIAL (decl) == error_mark_node && in_lto_p)
>     dyn_cast <varpool_node *> (node)->get_constructor ();
>
> That's how to load DECL_INITIAL of variables.
> Does it help you?
>
> Martin

So Martin, let me get this straight, all of the initialization information
can be fetched here? I ask this because I was under the impression that
some of it was deleted and could not be recovered. The worst case scenario
for my optimization (making it illegal) is if the user specified an
initialization and the initialization disappeared without leaving a trace in
the IR that it ever existed.

Many thanks,

Gary
