> At that particular time I think Kenner was mostly focused on the alpha
> and ppc ports, but I think he was also still poking around with romp and
> a29k. I think romp is an unlikely target for this because it didn't
> promote modes and it wasn't even building for several months
>
> I don't see why we should have to "comply with the GNU style" if we're
> truly an independent project run by the GCC developers and aided by
> the steering committee.
I think it critical that any code have *some* style guidelines. If you
don't like the GNU coding convention, which do you
> Correct. It is truncated for integer shift, but not simd shift
> instructions. We generate a pattern in the split that only generates
> the integer shift instructions.
That's unfortunate, because it would be nice to do this in simplify_rtx,
since it's machine-independent, but that has to be
> Because for integer shift instructions the shift count is
> truncated. We ensure that we only use integer shift instructions by
> emitting a shift with a mask. This only matches integer shift
> instructions in the md file.
That's why I asked about SHIFT_COUNT_TRUNCATED. So it's truncated for
> This case is covered by Wilco's previous reply:
>
> https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00575.html
Which I don't understand:
> No it's perfectly safe - it becomes an integer-only shift after the
> split since it keeps the masking as part of the pattern.
Let's say we have your first
> The pattern will only be matched if the value is positive. More
> specifically if the constant value is 32 (SImode) or 64 (DImode).
I don't mean the constant, but the value subtracted from it.
If that's negative, then we have a shift count larger than the
wordsize.
> On AArch64 SHIFT_COUNT_TRUNCATED is only true if SIMD code generation
> is disabled. This is because the simd instructions can be used for
> shifting but they do not truncate the shift count.
In that case, the change isn't safe! Consider if the value was
negative, for example. Yes, it's
> That is simplify:
> (SHIFT A (32 - B)) -> (SHIFT A (AND (NEG B) 31))
> etc.
I think you need SHIFT_COUNT_TRUNCATED to be true for this to be
valid, but this is exactly what I was getting at in my last message.
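
The identity behind this simplification, and its dependence on a truncated shift count, can be checked with a small C sketch. The helper names are hypothetical and the explicit `& 31` only models an integer shift instruction that truncates its count; none of this is GCC code.

```c
#include <stdint.h>

/* Models a 32-bit integer shift instruction that truncates its count
   to the low 5 bits (hypothetical helper, not a GCC function).  */
static uint32_t
shl_truncating (uint32_t a, uint32_t count)
{
  return a << (count & 31u);
}

/* Original form: shift by (32 - b).  */
static uint32_t
shift_by_sub (uint32_t a, uint32_t b)
{
  return shl_truncating (a, 32u - b);
}

/* Simplified form: shift by ((-b) & 31).  */
static uint32_t
shift_by_neg_and (uint32_t a, uint32_t b)
{
  return shl_truncating (a, (0u - b) & 31u);
}
```

The two forms agree for every b only because the count is reduced modulo 32. On an instruction that does not truncate, b = 40 would give a count of 2^32 - 8 in the first form but 24 in the second.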
> This patch improves code generation for shifts with subtract
> instructions where the first operand to the subtract is equal to the
> bit-size of the operation.
I would suspect that this will work on lots of targets. Is doing it
in combine an option?
> > Out of curiosity, does the old Alpha/VMS stack-checking API meet the
> > requirements? From what I recall, I think it does.
> Unsure. Is this documented somewhere?
It seems to be in
http://h20565.www2.hpe.com/hpsc/doc/public/display?docId=emr_na-c04621389
starting at page 3-54.
Out of curiosity, does the old Alpha/VMS stack-checking API meet the
requirements? From what I recall, I think it does.
> > The latter references other documents, which advocate for more use of
> > contractions even in formal writing.
>
> These are legal guides, not obviously relevant in the context
> of technical writing.
Yes and no. The argument for them is that legal writing is the most formal
of all and has
> First, I agree that the less formal language is becoming more
> acceptable. Some style guides explicitly allow contractions,
> but others advise against them. The technical specifications
> that significant parts of GCC aim to conform to, and those I
> happen to work with the most closely
> The GCC manual uses "cannot" in most places (280 lines) but there
> are a few instances of "can't" (33 lines).
>
> The attached patch replaces the informal "can't" with the former
> for consistency.
In my opinion, this is the wrong direction. Contractions are becoming
more acceptable in even
> Sorry, I don't understand. Surely anything released under the LGPL by
> the FSF can be upgraded to the current GPLv3? First upgrade to the
> latest LGPL, then switch over to the GPLv3?
That seems correct to me.
> "returns always true" -> "always returns true" ?
>
> (The former is how we'd say it in German, and hence might be common in
> Dutch as well? In English, both probably are fine, the latter feeling
> more natural to me. But then, I'm not a native speaker. ;-)
The former is unusual in English
any symbols it references. This may result in those symbols getting
discarded by GCC as unreferenced.
We can omit "by GCC" here.
We can, but we should not. We should avoid the passive voice like the
plague in technical documentation, even if doing so leads to some
slight redundancy.
I
The thing about written policy is that it sets the tone for a project.
A restrictive policy tends toward authoritarian rule by maintainers, it
seems to me.
And a too little restrictive policy runs the risk of creating a
feeling that the rules aren't necessarily to be taken too seriously.
Neither
That doesn't sound like the right place to me. We want the same code to
be generated whether users write && and || directly or write corresponding
sequences of if conditions. In general we want to move away from
optimizing complicated expressions in fold-const, towards having GIMPLE
register int a;
extern volatile int b;
a = b;
a |= 0x54;
b = a;
The ISO spec seems to allow gcc to perform those operations in a
single physical insn, as long as the operations on 'b' are all
performed, and in the correct sequence.
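
As a compilable sketch of the snippet above: the wrapper function name is hypothetical, and `b` is defined locally (rather than declared `extern`) only so the example links on its own. It shows the abstract sequence of one volatile read followed by one volatile write; whether a compiler may merge them into a single instruction is the question under discussion and is not settled by this code.

```c
/* Defined here so the sketch links on its own; the original snippet
   declares it extern.  */
volatile int b;

/* Hypothetical wrapper around the three statements quoted above.  */
void
set_flag_bits (void)
{
  register int a;

  a = b;      /* one volatile read of b */
  a |= 0x54;  /* modify the local copy only */
  b = a;      /* one volatile write of b */
}
```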
And I believe it's also
One of the nice things about gcc is that gcc usually still works,
long after a vendor has abandoned a machine. I rather like that gcc
will just work, unlike vendor software, which often says, please buy
a new machine. One doesn't have to remove support in gcc for
something, just because a
I don't see how a VAR_DECL can ever get a DECL_RTL equal to one of
the mentioned regs.
Doesn't that happen when you have a local variable that's a
variable-sized object? What would have changed that would cause it to
no longer happen? This is tree-level stuff, not RTL.
The patch is ok if a
VLA VAR_DECLs have just DECL_HAS_VALUE_EXPR_P set and DECL_VALUE_EXPR being
INDIRECT_REF (or MEM_REF now?) dereferencing some DECL_ARTIFICIAL VAR_DECL
that is initialized from alloca builtin. So the VLA VAR_DECLs don't have
any DECL_RTL at all (kept for debug info purposes only), and the
My personal opinion is that it is better if open source software is not
encumbered by multiple copyright
holders. A copyright holder probably has the right to change the work's
permission notice.
Off-topic, but that works both ways: if you want to ensure that a
work's license's terms will
I found a weird piece of code that was added by kenner in a really early
revision. It checks for VAR_DECLs with frame or stack pointers as
DECL_RTL, and the comment in front of it mentions strength reduction.
Presumably this was for the old loop optimizer? I can't think of
anything that would
I tried to implement that suggestion, but interestingly enough I cannot
really test it since I was unable to find any single case where that
SUBREG case in apply_distributive_law actually causes any difference
whatsoever in generated code.
Do you have any further suggestion of how to find a
Maybe the best solution would be to remove the SUBREG case from the generic
apply_distributive_law subroutine, and instead add a special check for the
distributed subreg case right at the above place in simplify_set; i.e. to
perform the inverse distribution only if it is already guaranteed
patterns.)
Right.
No immediate suggestions, sorry. It looks like the combine case was
added by this pre-egcs patch:
Wed Mar 18 05:54:25 1998 Richard Kenner ken...@vlsi1.ultra.nyu.edu
* combine.c (gen_binary): Don't make AND that does nothing.
(simplify_comparison, case
+@item -mjump-to-noreturn
+@opindex mjump-to-noreturn
+Use a jump instruction instead of a call instruction when calling a
+no-return functions. This option is active if optimization is turned
+on and just affects the way a call instruction is printed out.
Would emit be better here than
Same here, I don't like it but I hardly see any alternative. The only
possibility could be to prevent calling expand_compound_operation
completely for addresses. Richard, what do you think? Don't worry,
combine hasn't changed much since your days. :)
The problem wasn't potential changes
Or being fooled by the 0xfffc masking, perhaps.
No, I'm pretty sure that's NOT the case. The *whole point* of the
routine is to deal with that masking.
at the end. make_compound_operation doesn't know how to
restore ZERO_EXTEND.
It does in general. See make_extraction, which it calls. The question is
why it doesn't in this case. That's the bug.
An and:DI is cheaper than a zero_extend:DI of an and:SI.
That depends strongly on the constants and whether the machine is 32-bit
or 64-bit.
But that's irrelevant in this case since the and:SI will be removed (it
reflects what has already been done).
Does it look OK?
No.
If I understand your code correctly, there's essentially the same code
as you have a bit above that:
/* If the constant is one less than a power of two, this might be
representable by an extraction even if no shift is present.
If it doesn't end
But the current code converts (and X 3) into a bit extraction
since ((i = exact_log2 (UINTVAL (XEXP (x, 1)) + 1)) >= 0) is true
when UINTVAL (XEXP (x, 1)) == 3. Should we do it or not?
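
The arithmetic in that test can be reproduced with a stand-alone sketch. `exact_log2_sketch` below is a hypothetical model written for this example (it returns n when its argument is 2**n and -1 otherwise), not the GCC function itself, and `low_mask_width` is likewise an illustrative name.

```c
#include <stdint.h>

/* Stand-alone model of an exact log2: returns n when x == 2**n,
   otherwise -1.  */
static int
exact_log2_sketch (uint64_t x)
{
  int n = 0;

  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* Width of the low-bit field selected by MASK when MASK is one less
   than a power of two, else -1.  */
static int
low_mask_width (uint64_t mask)
{
  return exact_log2_sketch (mask + 1);
}
```

For a mask of 3 the result is 2, so a non-negative check succeeds and (and X 3) looks like a 2-bit extraction; a mask such as 5 yields -1 and is left alone.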
By adding the test for nonzero bits, you'd potentially be doing the
conversion more often (which is the point
I am testing this patch. The difference is it checks nonzero
bits of the first operand.
I would suggest moving (and expanding) the comments from the existing block
into your new block.
Like this?
Yes, that's what I meant. Thanks.
Again, I'd suggest doing some performance testing on this just to verify
that it doesn't pessimize things.
The x86 backend doesn't accept the new expression as a valid address, while
(zero_extend:DI) works just fine. This patch keeps ZERO_EXTEND
when zero-extending an address to Pmode. It reduces the number of lea
instructions from 24173 to 21428 in x32 libgfortran.so. Does it make any sense?
I'd be inclined to have
1. The placement of subreg in
(plus:DI (subreg:DI (mult:SI (reg/v:SI 85 [ i ])
(const_int 4 [0x4])) 0)
(subreg:DI (reg:SI 106) 0))
isn't supported by x86 backend.
That's easy to fix.
2. The biggest problem is optimizing mask 0x to
Let me jump in on this a little bit, since much of the code in this area
was originally written by me.
Are all sizetype (sub-)expressions always of value in that range?
What do we do about the fact that sizetype is unsigned, so -x always
overflows for x != 0? Thus, do we need to disable all a
So what's your opinion on the bug that triggered the patch in question?
Namely extract_muldiv_1 folding
(((10240 - (sizetype) first) + 1) * 8) /[cl] 8
to
((sizetype) first * 0x0fff8 + 81928) /[cl] 8
to
((sizetype) first * 2305843009213693951 + 10241)
thus, folding A
So what's your opinion on the bug that triggered the patch in question?
Namely extract_muldiv_1 folding
(((10240 - (sizetype) first) + 1) * 8) /[cl] 8
to
((sizetype) first * 0x0fff8 + 81928) /[cl] 8
to
((sizetype) first * 2305843009213693951 + 10241)
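
A rough way to see why the last step is a problem is to evaluate the three forms in wrapping unsigned arithmetic. The sketch assumes a 64-bit unsigned sizetype modelled as `uint64_t`, assumes the abbreviated hex constant is the 64-bit wrap-around of -8, and uses hypothetical function names; plain division stands in for /[cl] because the dividend is always a multiple of 8 here.

```c
#include <stdint.h>

/* Original expression, with sizetype modelled as uint64_t (an
   assumption).  The product is always a multiple of 8, so plain
   division matches the ceiling division written /[cl].  */
static uint64_t
size_original (uint64_t first)
{
  return (((10240u - first) + 1u) * 8u) / 8u;
}

/* After distributing the multiplication; -8 wraps to a large
   unsigned constant.  */
static uint64_t
size_distributed (uint64_t first)
{
  return (first * (uint64_t) -8 + 81928u) / 8u;
}

/* After dividing each term by 8 separately.  */
static uint64_t
size_folded (uint64_t first)
{
  return first * UINT64_C (2305843009213693951) + 10241u;
}
```

With first = 1 the first two forms both give 10240, while the last gives 2^61 + 10240, so dividing the wrapped terms individually does not preserve the value.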
I think this
FWIW, elsewhere in gcc we use continue; for empty loop bodies.
I think I've never run into this idiom in about a decade of work on GCC. :-)
Nor have I. I'm used to seeing just a ; on its own line.
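
For comparison, the two spellings of an empty loop body mentioned here look like this in C. The function names are hypothetical and the loops are only an illustration; both compile to the same thing.

```c
#include <stddef.h>

/* Empty body written as "continue;".  */
static size_t
length_with_continue (const char *s)
{
  size_t n;

  for (n = 0; s[n] != '\0'; n++)
    continue;
  return n;
}

/* Empty body written as a lone ";" on its own line.  */
static size_t
length_with_semicolon (const char *s)
{
  size_t n;

  for (n = 0; s[n] != '\0'; n++)
    ;
  return n;
}
```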
Agreed. Things would have been different twenty years ago, but these
days using "linker" is a lot more natural and common (as a grep in gcc/doc
confirms, too).
Even 20 years ago, I think "linker" would have been the more natural
word. I remember linker from my IBM days in the early 80's.
That's what we should phase out. The eventual aim should be for (a)
folding on GIMPLE (gimple-fold etc. - working with SSA not combined trees)
as an optimization and (b) folding done by front ends only when required
for language semantics (e.g. constant expressions).
Why? Isn't it
There are pros and cons about early optimization, actually.
Generating extremely optimized IL very early can actually tie up
subsequent passes. For instance, loop unrolling and vectorization.
There are others in the literature.
Sure, in the sorts of examples you mention where there's a level
I think we may be talking at different levels. It's my impression
that Richard K. was referring to local transformations like a - a ->
0 once we are in the middle end. I agree that doing that
transformation close to the FE is undesirable, but once we are in the
middle end that should be