from:"marc.glisse at normalesup dot org"

[Bug target/72827] [7 Regression] gnat bootstrap broken on powerpc64le-linux-gnu

2016-08-30 Thread marc.glisse at normalesup dot org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72827

Marc Glisse  changed:

   What|Removed |Added

 CC||marc.glisse at normalesup dot org

--- Comment #9 from Marc Glisse  ---
(In reply to Eric Botcazou from comment #8)
> Unfortunately I don't seem to be able to connect to gcc112 in the
> CompileFarm:
> 
> eric@arcturus:~> ssh -l ebotcazou gcc112.osuosl.org
> ssh: Could not resolve hostname gcc112.osuosl.org: Name or service not known

gcc112.fsffrance.org (aka gcc2-power8.osuosl.org)

[Bug target/63789] g++ -m32 on solaris has trouble finding abs with int64_t

2016-08-29 Thread marc.glisse at normalesup dot org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63789

--- Comment #6 from Marc Glisse  ---
Sorry, by recent I meant at least 6.1, I should have been more specific.

[Bug target/63789] g++ -m32 on solaris has trouble finding abs with int64_t

2016-08-29 Thread marc.glisse at normalesup dot org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63789

Marc Glisse  changed:

   What|Removed |Added

 CC||marc.glisse at normalesup dot org

--- Comment #4 from Marc Glisse  ---
I would expect that this is fixed (or at least the behavior changed) in recent
versions of gcc, that ship a C++ stdlib.h wrapper. Can someone confirm?

[Bug tree-optimization/73714] [Regression 7] Incorrect unsigned long long arithmetic optimization

2016-08-19 Thread marc.glisse at normalesup dot org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=73714

Marc Glisse  changed:

   What|Removed |Added

 CC||marc.glisse at normalesup dot org

--- Comment #2 from Marc Glisse  ---
Ah, my verifier considers that some_32bit_int << 57 is 0, while it actually is
undefined or unspecified, and in particular may yield some_32bit_int << 25 on
some platforms. I hope I didn't introduce too many similar bugs...
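
To make the difference concrete, here is a minimal sketch (not GCC code; the function names are made up for this example) of the two models of a 32-bit left shift by an out-of-range count. In C and C++ the shift itself is undefined once the count reaches the width of the type, so both models are spelled out with well-defined operations. The masked variant assumes a target that keeps only the low 5 bits of the count, which the language does not promise.

```cpp
#include <cstdint>

// Model assumed by the verifier: all bits are shifted out, so the result is 0.
// (Hypothetical helper, for illustration only.)
std::uint32_t shl_model_zero(std::uint32_t x, unsigned count) {
  return count >= 32 ? 0u : x << count;
}

// Model of a target that masks the shift count to 5 bits, so that a shift
// by 57 behaves like a shift by 57 & 31 == 25. (Assumption about the target.)
std::uint32_t shl_model_masked(std::uint32_t x, unsigned count) {
  return x << (count & 31u);
}
```

The two models agree for counts below 32 and disagree above, which is why a transformation validated against the first model can be wrong on hardware that follows the second.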

I'll look at it more closely when I get time, but anyone should feel free to
revert the first hunk of my patch if they need a quicker resolution.

[Bug tree-optimization/51938] missed optimization: 2 comparisons

2012-06-07 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51938

Marc Glisse  changed:

   What|Removed |Added

  Component|rtl-optimization|tree-optimization

--- Comment #4 from Marc Glisse  2012-06-07 
14:54:02 UTC ---
Changing to tree-optimization (doing the optimization at RTL level would
require finite-math-only).

There is plenty of code that corresponds to A&&B and A||B, but (almost) nothing
for A&&!B. Quite a big missing piece...

:
  if (x_2(D) > 0.0)
goto ;
  else
goto ;

:
  if (x_2(D) < 0.0)
goto ;
  else
goto ;

The 2 conditions don't share the same then branch or the same else branch (it
is a mix), so ifcombine doesn't even try to turn it into

  if (x_2(D) > 0.0 || !(x_2(D) < 0.0))
goto ;
  else
goto ;

Besides, it doesn't look like the logic is in place to fold that condition into
just its second half (but I may have missed it).
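
The folding in question can be checked by hand: x > 0.0 implies !(x < 0.0), so the disjunction equals its second half, including for NaN, where x > 0.0 is false and !(x < 0.0) is true. The sketch below uses plain C++ rather than GIMPLE, with hypothetical function names. It also shows that rewriting !(x < 0.0) as x >= 0.0 changes the result for NaN, which is presumably why an RTL-level version would need finite-math-only.

```cpp
#include <cmath>

// The combined condition, as in the if statement above.
bool combined(double x) { return x > 0.0 || !(x < 0.0); }

// The same condition folded to its second half.
bool folded(double x) { return !(x < 0.0); }

// Not equivalent for NaN: every ordered comparison with NaN is false,
// so this returns false where folded() returns true.
bool naive_ge(double x) { return x >= 0.0; }
```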

[Bug c++/53360] Problems with -std=gnu++0x

2012-05-15 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53360

--- Comment #1 from Marc Glisse  2012-05-15 
15:00:58 UTC ---
clang and gcc reject it, but intel and oracle accept it.

[Bug c++/53350] Internal compiler error when compiling boost/smart_ptr/intrusive_ptr.hpp 1.49

2012-05-15 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53350

--- Comment #5 from Marc Glisse  2012-05-15 
14:50:42 UTC ---
You may first want to check whether you still get the bug with a more recent
gcc version.

[Bug c/53216] fmaf() alters rounding mode of sse2 FPU

2012-05-03 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53216

Marc Glisse  changed:

   What|Removed |Added

 CC||marc.glisse at normalesup
   ||dot org

--- Comment #1 from Marc Glisse  2012-05-03 
19:47:45 UTC ---
This looks like a glibc issue, doesn't it? Or do you see something wrong with
the code gcc produces for this example?

[Bug target/53101] Recognize casts to sub-vectors

2012-05-03 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53101

--- Comment #4 from Marc Glisse  2012-05-03 
19:19:00 UTC ---
(define_peephole2
  [(set (mem:VI8F_256 (match_operand 2))
(match_operand:VI8F_256 1 "register_operand"))
   (set (match_operand: 0 "register_operand")
(mem: (match_dup 2)))]
  "TARGET_AVX"
  [(set (match_dup 0)
(vec_select: (match_dup 1)
 (parallel [(const_int 0) (const_int
1)])))]
)

(and similar for VI4F_256) is much less hackish than the XEXP stuff. I was
quite sure I'd tested exactly this and it didn't work, but now it looks like it
does :-/

Except that following http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00197.html ,
this is not the right place to try and add such logic. That's a good thing
because it is way too fragile, another instruction can easily squeeze between
the two sets and disable the peephole.

[Bug tree-optimization/30318] VRP does not create ANTI_RANGEs on overflow

2012-05-02 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30318

--- Comment #7 from Marc Glisse  2012-05-02 
14:33:42 UTC ---
(In reply to comment #6)
> On Sat, 28 Apr 2012, marc.glisse at normalesup dot org wrote:
> > I find it easier to use bignum and wrap at the end, instead of checking for
> > each operation if it overflows.
> I think using GMP is way too expensive for this (simple) task.

As long as you only try to handle operations on types no larger than
HOST_WIDE_INT, using double_int should be possible. But if you want to handle
wrapping multiplication of __int128, that's going to be hard without a widening
multiplication to __int256. I guess I could implement a mulhi on double_int...
Or at least make sure the slow path is only used for __int128 and not for small
types. Or even fall back to VR_VARYING when __int128 overflows, but that's sad.
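
As a rough idea of what such a mulhi could look like, here is the usual schoolbook construction of the high half of a product from half-width pieces. It is only a sketch on uint64_t with 32-bit halves, not on double_int, and mulhi_u64 is a name chosen for this example; the same scheme would apply one level up.

```cpp
#include <cstdint>

// High 64 bits of the 128-bit product a * b, using only 64-bit arithmetic.
std::uint64_t mulhi_u64(std::uint64_t a, std::uint64_t b) {
  const std::uint64_t mask = 0xffffffffu;
  const std::uint64_t a_lo = a & mask, a_hi = a >> 32;
  const std::uint64_t b_lo = b & mask, b_hi = b >> 32;

  const std::uint64_t lo_lo = a_lo * b_lo;
  const std::uint64_t hi_lo = a_hi * b_lo;
  const std::uint64_t lo_hi = a_lo * b_hi;
  const std::uint64_t hi_hi = a_hi * b_hi;

  // Sum of everything that lands in bits 32..95; this cannot overflow
  // because lo_hi <= (2^32 - 1)^2 and the other two terms are < 2^32.
  const std::uint64_t cross = (lo_lo >> 32) + (hi_lo & mask) + lo_hi;

  return hi_hi + (hi_lo >> 32) + (cross >> 32);
}
```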

(as a side note, it is strange that double_int is signed, it seems it should
break with strict overflow)

> Well, my original idea was to simultaneously do range propagation for
> wrapping and undefined overflow, and in the case that both results
> result in different final transforms warn (to avoid the fact that
> we do not fully take advantage of undefined overflow during propagation
> and to avoid false positives on the warnings for undefined overflow).

Good idea.

I guess one of my problems is that there are several possible notions of
overflow and I don't really know which gcc wants.

- wrap (unsigned and -fwrapv)
- saturating (not currently)
- trap (has to detect overflows and do something about them)
- unspecified (don't know anything about the value produced by an overflow, but
it is legal)
- illegal (we are allowed to crash the computer if such a path is ever taken,
but also to just keep going with a random value, that may not even be
consistent between uses, I guess that's -fstrict-overflow)
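
The first three notions can be written down directly; the last two cannot, since they describe what the compiler may assume rather than a value. A minimal sketch for 32-bit addition follows, computing the exact sum in a 64-bit type and applying the overflow policy at the end. All function names are hypothetical, and reporting failure through a bool stands in for a real trap.

```cpp
#include <cstdint>
#include <limits>

// Exact sum in a wider type: cannot overflow for 32-bit operands.
std::int64_t exact_add(std::int32_t a, std::int32_t b) {
  return static_cast<std::int64_t>(a) + b;
}

// wrap: reduce the exact result modulo 2^32 (unsigned types, or -fwrapv).
std::int32_t wrap_add(std::int32_t a, std::int32_t b) {
  const std::int64_t two32 = std::int64_t{1} << 32;
  std::int64_t r = exact_add(a, b);
  if (r > std::numeric_limits<std::int32_t>::max()) r -= two32;
  if (r < std::numeric_limits<std::int32_t>::min()) r += two32;
  return static_cast<std::int32_t>(r);
}

// saturating: clamp the exact result to the representable range.
std::int32_t saturating_add(std::int32_t a, std::int32_t b) {
  const std::int64_t r = exact_add(a, b);
  if (r > std::numeric_limits<std::int32_t>::max())
    return std::numeric_limits<std::int32_t>::max();
  if (r < std::numeric_limits<std::int32_t>::min())
    return std::numeric_limits<std::int32_t>::min();
  return static_cast<std::int32_t>(r);
}

// trap: detect the overflow and report it; the caller decides what to do.
bool checked_add(std::int32_t a, std::int32_t b, std::int32_t* out) {
  const std::int64_t r = exact_add(a, b);
  if (r > std::numeric_limits<std::int32_t>::max() ||
      r < std::numeric_limits<std::int32_t>::min())
    return false;
  *out = static_cast<std::int32_t>(r);
  return true;
}
```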

The comments at the definition of TYPE_OVERFLOW_UNDEFINED seem to indicate that
it means "illegal", but tree-vrp tends to use: non-wrapping => unspecified. And
I don't think value_range_d has a notion of an empty range (VR_UNDEFINED or
VR_RANGE with max

[Bug c++/53177] 20_util/function/cons/callable.cc failed with -m32 -march=corei7

2012-05-01 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53177

Marc Glisse  changed:

   What|Removed |Added

 CC||marc.glisse at normalesup
   ||dot org

--- Comment #4 from Marc Glisse  2012-05-02 
00:54:38 UTC ---
(In reply to comment #2)
> I was seeing an ICE in the same place with an earlier version of the changes
> which caused this testcase regression.  I have only managed to reduce it to
> 10k lines so far - that delta-reduced file is attached, I haven't had time to
> try manually reducing it.

If you only want a small example causing the ICE, here is one (-std=c++0x is
enough, no need for -m32 or -march). If you want something that looks vaguely
like a valid C++ program, it's going to be bigger...

extern "C++" namespace __attribute__ ) template < } template < typename >
struct add_rvalue_reference ; template < _Tp > typename declval ( ) noexcept ;
struct { typedef long __type } struct { } template < typename _Res , typename
... _ArgTypes > class function < _Res ( _ArgTypes ) { _Signature_type (
_ArgTypes ) template < typename _Functor > using _Invoke decltype ( ( declval <
_Functor > ) ) template < typename , typename > struct _CheckResult { }
template < ; template < typename _Functor > using _Callable _CheckResult <
_Invoke < _Functor > , _Res > template < typename , typename > using _Requires
template < typename _Functor , typename = _Requires < _Callable < _Functor > ,
void > > function ( _Functor ; } ; f ( function < void > ) { f ( [ ] )

[Bug target/53101] Recognize casts to sub-vectors

2012-05-01 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53101

--- Comment #3 from Marc Glisse  2012-05-01 
17:17:42 UTC ---
(In reply to comment #2)
> but operands[2] and operands[3] don't compare equal with rtx_equal_p, and
> trying a match_dup refuses to compile because of the mode mismatch, so I don't
> know how to constrain 2 and 3 to be "the same".

rtx_equal_p (XEXP (operands[2], 0), XEXP (operands[3], 0))

seems to give the right answer in the 3 manual tests I did. Currently checking
if the testsuite finds something. It is very likely not the right way to do it,
but I didn't find any inspiring pattern in the .md files.

Then I'll see if I understand how the fancy macros make it possible to have a
single piece of code for all modes, and if instead of calling
gen_vec_extract_lo_v8sf I shouldn't give a replacement pattern like (set
(match_dup 0) (vec_select (match_dup 1) (const_int 0))).

[Bug target/53101] Recognize casts to sub-vectors

2012-05-01 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53101

--- Comment #2 from Marc Glisse  2012-05-01 
15:10:26 UTC ---
(In reply to comment #1)
> We get MEM[(T * {ref-all})&x] for the casting (not a BIT_FIELD_REF for
> example).
> This gets expanded to
> 
> (insn 6 5 7 (set (reg:OI 63)
> (subreg:OI (reg/v:V4DF 61 [ x ]) 0)) t.c:8 -1
>  (nil))
> 
> (insn 7 6 8 (set (reg:V2DF 60 [  ])
> (subreg:V2DF (reg:OI 63) 0)) t.c:8 -1
>  (nil))
> 
> but that should be perfectly optimizable.

A bit hard for me (never touched those md files before)... This obviously
incorrect code does the transformation:

(define_peephole2
[
(set
 (match_operand:V8SF 2 "memory_operand")
 (match_operand:V8SF 1 "register_operand")
)
(set
 (match_operand:V4SF 0 "register_operand")
 (match_operand:V4SF 3 "memory_operand")
)
]
  "TARGET_AVX"
[(const_int 0)]
{
  emit_insn (gen_vec_extract_lo_v8sf (operands[0], operands[1]));
  DONE;
})

(the code in this experiment uses __v4sf and __v8sf instead of __m128d/__m256d
in the description above)

but operands[2] and operands[3] don't compare equal with rtx_equal_p, and
trying a match_dup refuses to compile because of the mode mismatch, so I don't
know how to constrain 2 and 3 to be "the same". I tried adding some (subreg:
...) in there, but it didn't match, and looking at the rtl peephole dump, there
isn't any subreg there.

Then maybe peephole isn't the right place, but that's the only one where I
managed to get something that compiles and is executed by the compiler on this
testcase.

[Bug middle-end/53100] Optimize __int128 with range information

2012-05-01 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53100

--- Comment #3 from Marc Glisse  2012-05-01 
12:47:03 UTC ---
(In reply to comment #2)
> and not to introduce them just before an optimization that removes them.

Usually, doing (long)num1*(__int128)(long)num2 does the right thing. I tried in
the example here replacing the plain __int128 multiplications with:

inline bool g1(__int128 x){
  //return(x<=LONG_MAX)&&(x>=LONG_MIN);
  //on 2 lines because of PR30318, unless you apply the patch I posted there
  bool b1 = x<=LONG_MAX;
  bool b2 = x>=LONG_MIN;
  return b1&&b2;
}
inline __int128 mul(__int128 a,__int128 b){
  bool B=g1(a)&&g1(b);
  if(__builtin_constant_p(B)&&B)
    return (long)a*(__int128)(long)b;
  return a*b;
}

__builtin_constant_p does detect we are in the right case, however, because of
bad timing between the various optimizations, the double cast
(__int128)(long)(u-x) is simplified to just (u-x) before it gets a chance to
help. I need to replace the subtraction instead of (or in addition to) the
multiplication:

inline __int128 sub(__int128 a,__int128 b){
  bool B=g1(a)&&g1(b)&& g1(a-b);
  if(__builtin_constant_p(B)&&B)
    return (long)a-(long)b;
  return a-b;
}

But it would fit better inside the compiler than as a fragile use of
__builtin_constant_p.

[Bug middle-end/27139] Optimize double INT->FP->INT conversions with -ffast-math

2012-05-01 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27139

--- Comment #4 from Marc Glisse  2012-05-01 
09:32:25 UTC ---
Hello Uros,
is there any other case you think should be handled, or should we close the
bug?

[Bug c++/53173] PROD02

2012-04-30 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53173

--- Comment #1 from Marc Glisse  2012-04-30 
20:02:59 UTC ---
Uh, where are you reporting a bug in gcc?

(In reply to comment #0)
> I am trying to upgrade (GCC) 4.4.0 to (GCC) 4.6.2.  I see bunch of
> incompatible error from code which works with (GCC) 4.4.0 but NOT with (GCC)
> 4.6.2.

Yes, g++ becomes better at detecting illegal code.

> 1. error: ‘constexpr’ needed for in-class initialization of static data member

Are you using -std=c++0x? Why?

> 2. error: no matching function for call to ‘std::pair boost::shared_ptr
> 3. /usr/include/sigc++-2.0/sigc++/signal.h:38:11: error: 'ptrdiff_t' does not
> name a type
> Fix: #include

actually stddef.h if you want ptrdiff_t and not just std::ptrdiff_t (unless
there is a using namespace std, as 6. makes me fear)

> 4. error: no matching function for call to ‘make_pair(std::string&,
> std::string&)’

#include 

> 5. error: declaration of ‘~typename

Missing most of the message again

> 6. error: call of overloaded ‘isnan(double&)’ is ambiguous

PR48891 maybe?

> I  do refer https://wiki.edubuntu.org/GCC4.6 to fix some of the issue. I
> rebuilt boost_1_47_0,  SQLAPI-3.7.35, etc. with (GCC) 4.6.2 as well to remove
> incompatibility between these.

The GCC release notes often contain relevant information, too.

> I am suspicious if some of the issue is already fixed in (GCC) 4.6.3 (already
> released).

What do you mean, fixed? The bugs are in your code.

> Please let me know if we can use (GCC) 4.6.3 instead of (GCC) 4.6.2.

Sure, more bugs fixed.

[Bug c++/51312] [C++0x] Wrong interpretation of converted constant expressions (for enumerator initializers)

2012-04-29 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51312

--- Comment #5 from Marc Glisse  2012-04-29 
14:12:12 UTC ---
Created attachment 27261
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27261
build_enumerator patch

Changes the behavior on g++.dg/cpp0x/enum_base.C from an error to a warning in
the first 2 cases. And reusing the narrowing warning message may look a bit
strange.

enum E4 : char { 
  val = 500 // { dg-error "too large" }
};

enum_base.C:9:9: warning: narrowing conversion of '500' from 'int' to 'char'
inside { } [-Wnarrowing]
   val = 500 // { dg-error "too large" }
 ^
enum_base.C:9:9: warning: overflow in implicit constant conversion [-Woverflow]
   val = 500 // { dg-error "too large" }
 ^

[Bug c++/53159] New: Missing narrowing check

2012-04-29 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53159

 Bug #: 53159
   Summary: Missing narrowing check
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: marc.gli...@normalesup.org


In this example, I get narrowing warnings for a and b but not c.

struct X
{
  constexpr operator int() { return __INT_MAX__; }
};
int f(){ return __INT_MAX__; }

signed char a { __INT_MAX__ };
signed char b { f() };
signed char c { X{} };

[Bug libstdc++/48891] std functions conflicts with C functions when building with c++0x support (and using namespace std)

2012-04-29 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48891

--- Comment #6 from Marc Glisse  2012-04-29 
13:15:40 UTC ---
I don't think it matters that much whether the return type is int or bool,
compared to the inconvenience of having 2 functions that conflict.

The constexpr qualifier is nice, but not required by the standard, and not even
by gcc which recognizes that extern "C" int isnan(double) is a builtin (note
that it doesn't recognize it anymore if you change the return type to bool,
that should be fixed).

For the same reason (recognized as a builtin), there is no performance
advantage to having it inline.

So I think:
* glibc could change the return type of isnan to bool in C++ (there would be a
regression in that ::isnan wouldn't be constexpr and inline until g++ is taught
the right prototype)
* libstdc++ could import ::isnan in std::, assuming isnan exists. Maybe that
requires a configure test. Maybe that test would be rather fragile (depends on
feature macros). Maybe that's where this stops being a good idea :-(

[Bug middle-end/53100] Optimize __int128 with range information

2012-04-29 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53100

--- Comment #2 from Marc Glisse  2012-04-29 
08:42:36 UTC ---
(In reply to comment #1)
> On the other hand, tree-vrp does have the information that the
> differences are in [-4294967295, 4294967295], which comfortably fits in a type
> half the size of __int128. It seems a possible strategy would be to have
> tree-vrp mark variables that fit in a type half their size (only for TImode?),
> try and preserve that information along the way, and finally use it in
> expand_doubleword_mult.

Another possibility would be, when the range analysis detects this situation,
to have it introduce a double-cast: (__int128)(long)var. In the example here,
it would give:

((__int128)(long)((__int128)c-(__int128)a))*((__int128)(long)((__int128)f-(__int128)b))

and existing optimizations already handle:

(long)((__int128)c-(__int128)a) as (long)c-(long)a

and

(__int128)mylong1*(__int128)mylong2 as a widening multiplication.

But then we'd have to be careful not to introduce too many such casts, not to
introduce them too late, and not to introduce them just before an optimization
that removes them. And find the appropriate half-sized type to cast to. And
possibly do this only for modes not handled natively.
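
A minimal sketch of why the double cast is safe when the range is known: if both operands fit in 64 bits, narrowing and re-widening them does not change the product, and the multiplication has the shape of a widening multiply. This assumes GCC's __int128 extension and a 64-bit long long; the function names are hypothetical.

```cpp
#include <climits>

using int128 = __int128;  // GCC/Clang extension

// True when v is representable in a 64-bit long long.
bool fits_in_i64(int128 v) { return v >= LLONG_MIN && v <= LLONG_MAX; }

// Plain 128-bit multiplication, as written by the user.
int128 mul_plain(int128 a, int128 b) { return a * b; }

// The double-cast form. Only equivalent to mul_plain when both
// operands satisfy fits_in_i64; the caller must guarantee that.
int128 mul_narrowed(int128 a, int128 b) {
  return static_cast<int128>(static_cast<long long>(a)) *
         static_cast<int128>(static_cast<long long>(b));
}
```

In the compiler, the guarantee would come from the range information rather than from a runtime check.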

[Bug middle-end/53100] Optimize __int128 with range information

2012-04-29 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53100

--- Comment #1 from Marc Glisse  2012-04-29 
08:05:59 UTC ---
(In reply to comment #0)
> It would be convenient if I
> could just write the whole code with __int128 and let the compiler do the
> optimization by tracking the range of numbers.

The transformation from an __int128 to a pair of long happens extremely late
(optabs.c), so we can't count on tree-vrp to notice that one of them is always
zero (and actually it is either 0 or -1, as a sign extension, which would make
this hard). On the other hand, tree-vrp does have the information that the
differences are in [-4294967295, 4294967295], which comfortably fits in a type
half the size of __int128. It seems a possible strategy would be to have
tree-vrp mark variables that fit in a type half their size (only for TImode?),
try and preserve that information along the way, and finally use it in
expand_doubleword_mult. But that seems to imply storing the information in an
rtx, and rtx seems a bit too densely packed to add this.

Better ideas?

[Bug c/43772] Errant -Wlogical-op warning when testing limits

2012-04-28 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43772

--- Comment #19 from Marc Glisse  2012-04-28 
22:16:55 UTC ---
(In reply to comment #18)
> I'm afraid that false positives would still be likely.
> For example, suppose we're on a platform where
> INT_MAX = LONG_MAX < INTMAX_MAX.  Then:
> 
>   intmax_t i = (whatever);
>   if (INT_MAX < i && i <= LONG_MAX)
>  print ("i is in 'long' but not 'int' range");

Have you actually seen that? I would imagine the following to be more common:
if(i<=INT_MAX)
  print("i is in 'int'");
else if(i<=LONG_MAX)
  ...

> This sort of thing is fairly common in portable code,
> and GCC shouldn't warn about it merely because
> we're on a platform where the two tests cannot both
> be true when INT_MAX == LONG_MAX.

Well, can you define a set of circumstances where gcc could / should warn?
a

[Bug testsuite/53155] Not parallel: test for -j fails with new make

2012-04-28 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53155

--- Comment #2 from Marc Glisse  2012-04-28 
21:49:43 UTC ---
laptop-mg /tmp/m $ cat Makefile 
all:
	$(MAKE) plouf

plouf:
	echo $(MFLAGS) "$(filter -j, $(MFLAGS))"
laptop-mg /tmp/m $ make -j
make plouf
make[1]: Entering directory `/tmp/m'
echo -wj ""
-wj 
make[1]: Leaving directory `/tmp/m'

version 3.81-8.2

[Bug testsuite/53155] New: Not parallel: test for -j fails

2012-04-28 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53155

 Bug #: 53155
   Summary: Not parallel: test for -j fails
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: marc.gli...@normalesup.org


Hello,

in order to decide whether they should run the testsuite in parallel, the
makefiles in gcc/ and libstdc++-v3/testsuite/ use the following test:

[ "$(filter -j, $(MFLAGS))" = "-j" ]

However, at least with the gnu make 3.81 shipped by debian, MFLAGS merges all
options, so it would normally be something like -wkj, which doesn't match the
filter.
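
The mismatch can be modelled outside of make. The sketch below is an illustration in C++, not a proposed makefile change: the first function mimics what $(filter -j, ...) does (exact match on whole words), the second looks for a j inside a cluster of short options such as -wkj. Both names are invented here, and the second check is deliberately crude: it would also fire on a short option whose attached argument contains a j.

```cpp
#include <sstream>
#include <string>

// Mimics $(filter -j, $(MFLAGS)): true only if some
// whitespace-separated word is exactly "-j".
bool has_exact_j_word(const std::string& mflags) {
  std::istringstream words(mflags);
  std::string word;
  while (words >> word)
    if (word == "-j") return true;
  return false;
}

// Accepts a 'j' anywhere in a cluster of short options ("-wkj"),
// while ignoring long options such as "--jobserver-fds=3,4".
bool has_j_in_short_options(const std::string& mflags) {
  std::istringstream words(mflags);
  std::string word;
  while (words >> word) {
    const bool short_cluster =
        word.size() >= 2 && word[0] == '-' && word[1] != '-';
    if (short_cluster && word.find('j') != std::string::npos) return true;
  }
  return false;
}
```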

[Bug c/43772] Errant -Wlogical-op warning when testing limits

2012-04-28 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43772

--- Comment #17 from Marc Glisse  2012-04-28 
18:49:49 UTC ---
(In reply to comment #16)
> I understand now, and I think you are right. We don't have a warning for
> "((int)x) < INT_MIN" or ((int)x) > INT_MAX but I think it should go to
> Wtype-limits.

Interestingly, for an int x, we don't warn for x<=INT_MAX, but we do warn for
x<=(long)INT_MAX (adapt if your platform has int and long of the same size).

> Do you think we could test this situation just before the Wlogical-op warning?

It is easy to re-check inside warn_logical_operator if one of the tests is
always true. I have no idea how to pass the information from Wtype-limits that
warn_logical_operator shouldn't be called.

> I can see that some macros may generate x >= INT_MIN but the x < INT_MIN case
> seems less likely to be intended and we should warn (and then return and avoid
> warning with Wlogical-op).

I think < INT_MIN and >= INT_MIN should either both warn or both be quiet. It
is a matter of style whether people write:
if (x in range) do the work;
or
if (x out of range) abort;
do the work;

(In reply to comment #12)
> Do you mean:
> 
> if (or_op && integer_onep(tem)) { warn();}
> else if (!or_op && integer_zerop(tem)) { warn();}

Even smaller would be to replace the current (TREE_CODE (tem) != INTEGER_CST)
with integer_zerop(tem) and pass build_range_check in_p^or_op (or in_p==or_op,
don't know which) instead of just in_p. It would already be an improvement over
the current situation, and I expect the remaining false positives to be very
rare. i>=INT_MIN&&isomething are common, but
isomething seems less likely.

[Bug tree-optimization/30318] VRP does not create ANTI_RANGEs on overflow

2012-04-28 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30318

--- Comment #5 from Marc Glisse  2012-04-28 
13:18:25 UTC ---
Created attachment 27260
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27260
Wrap using gmp

I find it easier to use bignum and wrap at the end, instead of checking for
each operation if it overflows.

There is something wrong about having better range propagation for the wrapping
case than for the case where overflow is undefined behavior. There are cases
where a range is set to varying whereas it could be set to empty, and the
branch marked as unreachable (haven't seen how that's done). But that's not the
subject of this bug.
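
To illustrate the "compute exactly, wrap at the end" idea without GMP, here is a small sketch for the sum of two ranges of an unsigned 8-bit quantity, with a 64-bit integer playing the role of the bignum. It is not the attached patch; the types and names are invented for this example. When only the upper bound crosses 255, the result is naturally an anti-range.

```cpp
#include <cstdint>

enum class Kind { Range, AntiRange, Varying };

struct ValueRange {
  Kind kind;
  std::int64_t lo, hi;  // inclusive; for AntiRange, the excluded interval
};

// Sum of [a_lo, a_hi] and [b_lo, b_hi] for a wrapping unsigned 8-bit type.
// All inputs are expected to lie in [0, 255] with lo <= hi.
ValueRange add_wrapping_u8(std::int64_t a_lo, std::int64_t a_hi,
                           std::int64_t b_lo, std::int64_t b_hi) {
  const std::int64_t modulus = 256;
  const std::int64_t lo = a_lo + b_lo;  // exact, no overflow possible here
  const std::int64_t hi = a_hi + b_hi;

  if (hi - lo >= modulus - 1)            // every value is reachable
    return {Kind::Varying, 0, modulus - 1};
  if (hi < modulus)                      // nothing wrapped
    return {Kind::Range, lo, hi};
  if (lo >= modulus)                     // both bounds wrapped once
    return {Kind::Range, lo - modulus, hi - modulus};
  // Only the upper bound wrapped: the values are [lo, 255] and
  // [0, hi - 256], i.e. everything except [hi - 255, lo - 1].
  return {Kind::AntiRange, hi - modulus + 1, lo - 1};
}
```

For example, [200, 250] + [10, 20] gives the exact interval [210, 270], which wraps to everything except [15, 209].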

[Bug c/43772] Errant -Wlogical-op warning when testing limits

2012-04-28 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43772

--- Comment #15 from Marc Glisse  2012-04-28 
12:55:28 UTC ---
(In reply to comment #14)
> (In reply to comment #13)
> > 
> > Except that this version would warn for xINT_MAX, whereas this
> > belongs to other warnings. So testing the triviality of the first ranges 
> > seems
> > best.
>
> I don't understand. This warning (whatever its name) should precisely warn for
> that with "logical 'and' of mutually exclusive tests is always false".

No, there could be a warning that the first test is always false, another one
that the second one is always false, but adding a third warning that the
conjunction of the 2 is always false seems bogus. This warning is meant for:
x<5&&x>10, where each test independently could be true, just not both at the
same time.

At least that is my understanding...

[Bug c/53131] -Wlogical-op: ready for prime time in -Wall ?

2012-04-28 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53131

--- Comment #6 from Marc Glisse  2012-04-28 
12:45:19 UTC ---
(In reply to comment #5)
> It seems a pretty small warning, but I guess #1 and #2 could
> be split up, if that helps get #2 in.

I think it is the opposite actually, #2 is more controversial than #1 (at least
until PR43772 is fixed).

[Bug c/43772] Errant -Wlogical-op warning when testing limits

2012-04-28 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43772

--- Comment #13 from Marc Glisse  2012-04-28 
12:40:14 UTC ---
(In reply to comment #10)
> But there is something strange, because it is warning "it is always false",
> which is obviously not true. So I think at some moment it is doing some
> transformation we don't want to do.

It notices that it should warn, and unless one of the first ranges is trivial
(a case it forgot), with an operator &&, the only warning that makes sense is
that it is always false. It never shows that it is false, it is just a bit
hasty in deciding which warning to pick. And indeed the "logical and...always
true" sentence does not exist, because it doesn't make sense.

(In reply to comment #11)
> (In reply to comment #9)
> > It forgets to check first whether the first 2 ranges are trivial.
> Or easier, instead of checking:
>   if (TREE_CODE (tem) != INTEGER_CST)
> it could check integer_onep(tem) or integer_zerop(tem) depending on or_op. Or
> build a tree integer constant from or_op and tree_int_cst_equal it to tem.

Except that this version would warn for xINT_MAX, whereas this
belongs to other warnings. So testing the triviality of the first ranges seems
best.

[Bug c/43772] Errant -Wlogical-op warning when testing limits

2012-04-28 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43772

--- Comment #11 from Marc Glisse  2012-04-28 
12:33:26 UTC ---
(In reply to comment #9)
> It forgets to check first whether the first 2 ranges are trivial.

Or easier, instead of checking:
  if (TREE_CODE (tem) != INTEGER_CST)
it could check integer_onep(tem) or integer_zerop(tem) depending on or_op. Or
build a tree integer constant from or_op and tree_int_cst_equal it to tem.

[Bug c/43772] Errant -Wlogical-op warning when testing limits

2012-04-28 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43772

Marc Glisse  changed:

   What|Removed |Added

 CC||marc.glisse at normalesup
   ||dot org

--- Comment #9 from Marc Glisse  2012-04-28 
12:19:54 UTC ---
For : x>=INT_MIN && x<=INT_MAX
the code creates a range for x>=INT_MIN, another range for x<=INT_MAX, merges
them into a single range, checks that that range is trivial (empty or full),
and then warns according to the operator && or ||. It forgets to check first
whether the first 2 ranges are trivial.
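
The logic described above, restricted to the && case, can be sketched as follows. This is a simplified model with invented names, not the code of warn_logical_operator: ranges are inclusive intervals over int, merging is an intersection, and a range is trivial when it is empty or covers all of int.

```cpp
#include <algorithm>
#include <climits>

struct Range {
  long long lo, hi;  // inclusive; empty when lo > hi
};

bool is_empty(Range r) { return r.lo > r.hi; }
bool is_full(Range r) { return r.lo <= INT_MIN && r.hi >= INT_MAX; }
bool is_trivial(Range r) { return is_empty(r) || is_full(r); }

// Merging two tests joined by && is an intersection of their ranges.
Range intersect(Range a, Range b) {
  return {std::max(a.lo, b.lo), std::min(a.hi, b.hi)};
}

// Current behaviour as described: warn whenever the merged range is
// trivial, without looking at the two input ranges.
bool naive_warns_for_and(Range a, Range b) {
  return is_trivial(intersect(a, b));
}

// With the missing check: a trivial input range belongs to another
// warning, so only an empty intersection of two non-trivial ranges counts.
bool checked_warns_for_and(Range a, Range b) {
  if (is_trivial(a) || is_trivial(b)) return false;
  return is_empty(intersect(a, b));
}
```

With x>=INT_MIN && x<=INT_MAX both inputs are the full range, so the naive version warns and the checked one stays quiet; with x<5 && x>10 both versions warn.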

[Bug c/53131] -Wlogical-op: ready for prime time in -Wall ?

2012-04-28 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53131

Marc Glisse  changed:

   What|Removed |Added

 CC||marc.glisse at normalesup
   ||dot org

--- Comment #4 from Marc Glisse  2012-04-28 
12:05:20 UTC ---
(In reply to comment #2)
> > Do the warnings indicate bugs or not?
> Yes. I checked the first ten.

Could you give a sample? -Wlogical-op merges 2 unrelated warnings:
*) x && 2 (you would expect a boolean, not 2, so maybe x&2 was meant)
*) x<0 && x>0 (not so likely to happen) or x>=-5 || x<2 (always true)

and it is not clear which one you are most interested in.

[Bug c/53153] ice in tree_low_cst, at tree.c:6569

2012-04-28 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53153

Marc Glisse  changed:

   What|Removed |Added

 CC||marc.glisse at normalesup
   ||dot org

--- Comment #1 from Marc Glisse  2012-04-28 
09:11:45 UTC ---
Reduced:

void f (char *BufPtr) {
  int Char = *BufPtr;
  switch (Char) {
case 'a':
case 181:
case ~(0xff & (~180)):
  PrintError();
  }
}

$ gcc a.c -c -O2
a.c: In function 'f':
a.c:3:3: internal compiler error: in tree_low_cst, at tree.c:6569
   switch (Char) {
   ^

The regression is recent.

[Bug c++/53139] internal compiler error: expected a type, got '#'tree_vec' not supported by dump_expr#'

2012-04-27 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53139

Marc Glisse  changed:

   What|Removed |Added

 CC||marc.glisse at normalesup
   ||dot org

--- Comment #1 from Marc Glisse  2012-04-27 
10:26:35 UTC ---
Works fine on trunk (since very recently). The 4.6 message looks fine (it
indeed wasn't implemented in 4.6). Can you check whether it works with a 4.7
snapshot?

[Bug c++/29131] [DR 225] Bad name lookup for templates due to fundamental types namespace for ADL.

2012-04-26 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29131

Marc Glisse  changed:

   What|Removed |Added

 CC||marc.glisse at normalesup
   ||dot org

--- Comment #24 from Marc Glisse  2012-04-26 
19:14:45 UTC ---
(In reply to comment #22)
> I am sorry if my knowledge on this issue is limited, but if I put t() and f()
> in namespace glm (re. the code in comment #20), should this compile? (That is
> what you comment #19 implies). Actually it does not.

So you are talking about this? Notice how vec3 isn't actually in glm.
Interactions between namespaces and name lookup can be difficult.

namespace glm {
  namespace detail {
    struct vec3{};
  }
  using detail::vec3;
}

template <typename T>
int t(T i)
{
  return f (i);
}

namespace glm {
  int
  f (glm::vec3 i)
  {
    return 0;
  }
}

int main()
{
  glm::vec3 b;
  return t(b);
}
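
For comparison, here is a variant that should compile: f is declared in glm::detail, the namespace vec3 really belongs to, so argument-dependent lookup finds it when t is instantiated. The using-declaration makes the name vec3 visible in glm but does not make glm an associated namespace of the type. This is a sketch of one possible arrangement, not a statement about how the reporter's library ought to be organised.

```cpp
namespace glm {
  namespace detail {
    struct vec3 {};

    // Declared next to vec3, in its associated namespace.
    inline int f(vec3) { return 0; }
  }
  using detail::vec3;
}

template <typename T>
int t(T i) {
  // Dependent call: f is looked up by ADL at the point of instantiation,
  // in the namespaces associated with T (here glm::detail only).
  return f(i);
}
```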

[Bug c++/53121] New: Allow static_cast from pointer-to-vector to pointer-to-object

2012-04-25 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53121

 Bug #: 53121
   Summary: Allow static_cast from pointer-to-vector to
pointer-to-object
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: marc.gli...@normalesup.org


Hello,

casting a __m128d* to a double* currently requires an ugly C cast. However,
this pointer cast is the official way to access the elements of the vector, so
I believe it should be allowed in a static_cast. Whether that should extend to
casts with references and arrays is a harder question, but the answer is
probably yes. Casts to sub-vectors (__m256d* to __m128d*) would be a bonus.

#include <x86intrin.h>
double* f(__m128d* x){
  return static_cast<double*>(x);
  // return (double*)(x);
}

v.cc: In function ‘double* f(__m128d*)’:
v.cc:3:32: error: invalid static_cast from type ‘__m128d* {aka __vector(2)
double*}’ to type ‘double*’
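
Until such a cast is accepted, the C-style cast can at least be avoided by going through void*. A minimal sketch, assuming GCC or Clang: v2d is a stand-in for __m128d built with the vector_size extension so that no x86 header is needed, and first_element and sum_elements are names invented here. It relies on the compiler allowing element access through a pointer to the element type, as described above.

```cpp
// Stand-in for __m128d (assumption: GCC/Clang vector extension available).
typedef double v2d __attribute__((vector_size(16)));

// Two static_casts through void* are accepted today; the request above is
// for the direct static_cast<double*>(x) to be accepted as well.
double* first_element(v2d* x)
{
  return static_cast<double*>(static_cast<void*>(x));
}

// Reads both elements through the element pointer.
double sum_elements(v2d v)
{
  double* p = first_element(&v);
  return p[0] + p[1];
}
```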

[Bug c++/53000] Conditional operator does not behave as standardized

2012-04-24 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53000

--- Comment #7 from Marc Glisse  2012-04-24 
23:23:09 UTC ---
(In reply to comment #6)
> which way is the standards committee leaning?

The DR is young, there hasn't been a meeting since. There weren't many
objections to the proposed resolution, although it did seem strange to some
that common_type::type would be int and not int&. I am too new to
the process to say more...

(I guess the proposed resolution should make the one-argument version of
common_type equivalent to decay, to be consistent)
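
A small sketch of the behaviour under discussion, assuming a standard library that follows C++14 or later, where common_type applies decay to its result. It is an illustration only and says nothing about what libstdc++ did at the time of this comment.

```cpp
#include <type_traits>
#include <utility>

// The trait strips the reference: two int& give int, not int&.
static_assert(std::is_same<std::common_type<int&, int&>::type, int>::value,
              "common_type of two int& is int");

// The conditional operator itself still yields an lvalue for two int lvalues.
static_assert(std::is_same<decltype(true ? std::declval<int&>()
                                         : std::declval<int&>()),
                           int&>::value,
              "?: on two int lvalues is an int lvalue");

// The one-argument form agrees with decay, as suggested above.
static_assert(std::is_same<std::common_type<int&>::type,
                           std::decay<int&>::type>::value,
              "single-argument common_type matches decay");
```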

[Bug c++/53000] Conditional operator does not behave as standardized

2012-04-24 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53000

--- Comment #5 from Marc Glisse  2012-04-24 
22:35:31 UTC ---
(In reply to comment #4)
> it's not obvious to me what the right fix is
> either so I'm not in a rush to change anything.

Actually, I now believe it is a good idea to rush (well, maybe not quite) the
change:
- it is needed by clang,
- it gives users an opportunity to complain against the proposed resolution (if
they don't, it is an argument in favor of it),
- it removes an excuse not to fix ?: with xvalues.

I think I've canceled my comment #3 enough that we are back to your comment #2
where you were proposing to make the change ;-)

[Bug c++/53000] Conditional operator does not behave as standardized

2012-04-24 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53000

Marc Glisse  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marc.glisse at normalesup
                   |                            |dot org

--- Comment #3 from Marc Glisse  2012-04-24 
19:31:52 UTC ---
(In reply to comment #2)
> Confirmed.
> 
> I suppose we could make the libstdc++ change now rather than waiting for the
> FE fix, as it shouldn't change the current behaviour of the library.

It doesn't seem completely obvious to me that this is the right library fix.
What happens if instead of the standard declval you use the trivial version?

  template<typename _Tp> _Tp __declval2() noexcept;

(except for the obvious problem with indestructible types, but then the decay
version may give you an answer that isn't constructible from the input for
references to a non-copyable type, so that's fair)

Rereading the DR, it appears that some people actually want to decay
independently from this rvalue issue, which is quite a strong change. And after
all, people can use decay, but if decay is included in
common_type, it can't be undone.

Although now that I think as a library writer who has to specialize common_type
for some of his types, I don't really want to specialize it for all cv-ref
variants of my types, so I'd actually like the default common_type to decay not
only the result, but also its arguments! And while we are at it, it could even
try canonicalizing them, like operator auto().

Hmm, I guess you can forget this rant and go ahead (I am still posting it
because there may be real arguments somewhere).

[Bug c++/51033] generic vector subscript and shuffle support was not added to C++

2012-04-24 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033

--- Comment #23 from Marc Glisse  2012-04-24 
11:57:22 UTC ---
(In reply to comment #21)
> What does it mean "exercise the backend a lot"? Do you mean it takes a lot of
> time?

I think so.

> I haven't looked at the tests, but I think it is not a problem to run
> compile-only tests with both gcc and g++. 

Compile-only tests are not always sufficient.

The __builtin_shuffle tests are spread in:
gcc.dg{,/torture}
gcc.target/{i386,powerpc}
gcc.c-torture/{compile,execute}

I assume the tests in gcc.dg can move to c-c++-common. The target tests should
stay in target. Not sure about gcc.c-torture.

But one interesting thing to test is if the front-end passes the arguments as
constants and thus the backend can use specialized code instead of the slow
generic one. And this kind of test seems necessarily target-specific. Bah, I
guess I shouldn't ask for too much and moving the gcc.dg tests would be enough.

[Bug target/53101] New: Recognize casts to sub-vectors

2012-04-24 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53101

 Bug #: 53101
   Summary: Recognize casts to sub-vectors
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: marc.gli...@normalesup.org
Target: x86_64-linux-gnu


Hello,

starting from an AVX __m256d vector x, getting its first element is best done
with *(double*)&x, which is what x[0] internally does, and which generates no
instruction (well, the following has vzeroupper, but let's forget that).
However, *(__m128d*)&x generates 2 movs and I have to explicitly use
_mm256_extractf128_pd to get the proper nop. Could the compiler be taught to
recognize the casts between pointers to vectors of the same object type the
same way it recognizes casts to pointers to that object type?

#include <x86intrin.h>
#if 0
typedef double T;
#else
typedef __m128d T;
#endif
T f(__m256d x){
  return *(T*)&x;
}

The closest report I found is PR 44551, which is quite different. PR 29881
shows that using a union is not an interesting alternative. I marked this one
as target, but it may very well be that the recognition should be in the
middle-end, or even that the front-end should mark the cast somehow.

[Bug middle-end/53100] New: Optimize __int128 with range information

2012-04-24 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53100

 Bug #: 53100
   Summary: Optimize __int128 with range information
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: middle-end
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: marc.gli...@normalesup.org


(not sure about the "component" field)

In the following program, on x86_64, the first version generates two imulq,
while the second generates 4 imulq and 2 mulq. It would be convenient if I
could just write the whole code with __int128 and let the compiler do the
optimization by tracking the range of numbers.

int f(int a,int b,int c,int d,int e,int f){
#if 0
  long x=a;
  long y=b;
  long z=c;
  long t=d;
  long u=e;
  long v=f;
  return (z-x)*(__int128)(v-y) < (u-x)*(__int128)(t-y);
#else
  __int128 x=a;
  __int128 y=b;
  __int128 z=c;
  __int128 t=d;
  __int128 u=e;
  __int128 v=f;
  return (z-x)*(v-y) < (u-x)*(t-y);
#endif
}
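
The two variants compute the same predicate, which is what a range-based optimization would have to establish. A sketch that checks this on a few extreme inputs, assuming a target where GCC or Clang provides __int128; long long replaces long so that the 64-bit width does not depend on the ABI, and the function names are invented for the illustration.

```cpp
#include <climits>

// First variant: 64-bit differences, one widening multiply per side.
bool less_narrow(int a, int b, int c, int d, int e, int f)
{
  long long x = a, y = b, z = c, t = d, u = e, v = f;
  return (z - x) * (__int128)(v - y) < (u - x) * (__int128)(t - y);
}

// Second variant: everything carried in __int128.
bool less_wide(int a, int b, int c, int d, int e, int f)
{
  __int128 x = a, y = b, z = c, t = d, u = e, v = f;
  return (z - x) * (v - y) < (u - x) * (t - y);
}

// Hypothetical helper: compares both variants on all combinations of a few
// boundary values (5^6 cases).
bool variants_agree()
{
  const int s[] = {INT_MIN, -1, 0, 1, INT_MAX};
  for (int a : s) for (int b : s) for (int c : s)
    for (int d : s) for (int e : s) for (int f : s)
      if (less_narrow(a, b, c, d, e, f) != less_wide(a, b, c, d, e, f))
        return false;
  return true;
}
```

Each difference of two int values fits in 33 bits, so only the final product needs more than 64 bits.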

[Bug middle-end/27139] Optimize double INT->FP->INT conversions with -ffast-math

2012-04-23 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27139

Marc Glisse  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marc.glisse at normalesup
                   |                            |dot org

--- Comment #2 from Marc Glisse  2012-04-24 
06:35:43 UTC ---
(In reply to comment #0)
> int test (int a)
> {
> return (double) a;
> }

I just wrote the very same testcase today, extracted from my code...

> Produces:
> 
> cvtsi2sd  %edi, %xmm0
> cvttsd2si %xmm0, %eax
> ret

Still does. Did you have any idea how to handle it?

> However, following code does the same (at least for -ffast-math):
> movl %edi, %eax
> ret

I don't think -ffast-math is relevant here: on x86 the int->double conversion
is exact, hence the reverse has to be as well.
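
A quick sketch of that claim, assuming a double whose mantissa has at least as many bits as int has value bits (true for IEEE double and 32-bit int). The function names are made up here.

```cpp
#include <climits>
#include <limits>

// If this holds, every int is exactly representable as a double.
static_assert(std::numeric_limits<double>::digits >=
              std::numeric_limits<int>::digits,
              "int -> double must be exact for the argument to apply");

// The testcase from comment #0 with both conversions spelled out.
int int_via_double(int a)
{
  return static_cast<int>(static_cast<double>(a));
}

// Checks the round trip on boundary and ordinary values.
bool roundtrip_is_identity()
{
  const int samples[] = {INT_MIN, INT_MIN + 1, -1, 0, 1, 123456789,
                         INT_MAX - 1, INT_MAX};
  for (int s : samples)
    if (int_via_double(s) != s)
      return false;
  return true;
}
```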

(In reply to comment #1)
> Confirmed, I doubt this shows up that much anyways.

Just posting to mention that it does show up...

[Bug c++/53094] New: vector literal

2012-04-23 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53094

 Bug #: 53094
   Summary: vector literal
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: marc.gli...@normalesup.org


Hello,

VECTOR_TYPE should be a literal type in C++11, so we can have for instance:
constexpr __m128i v = { 1, 0 };
constexpr __m128i s = v + v;

Once PR c++/51033 is fixed, ideally, the following would also work:
constexpr long long i = v[1];
constexpr __m128i w = __builtin_shuffle (v, v);

but I guess this can be made in several steps as long as the compiler doesn't
ICE on those.

[Bug middle-end/53082] local malloc/free optimization

2012-04-23 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53082

Marc Glisse  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marc.glisse at normalesup
                   |                            |dot org

--- Comment #2 from Marc Glisse  2012-04-23 
07:11:14 UTC ---
(In reply to comment #1)
> Dup of an older bug 19831.

The second part (coalescing mallocs and/or replacing them with alloca) doesn't
look like a dup of 19831.

[Bug c++/51033] generic vector subscript and shuffle support was not added to C++

2012-04-22 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033

--- Comment #22 from Marc Glisse  2012-04-22 
15:09:23 UTC ---
(In reply to comment #20)
> And then I still need to write a cxx_eval_vec_perm function so the result of
> __builtin_shuffle can be constexpr. I haven't seen how the C front-end handles
> shuffles of constants. Maybe a "sorry" would do for now.

Making vectors literal types is too much for now; the following seems
sufficient as long as they are not.

--- cp/semantics.c (revision 186667)
+++ cp/semantics.c (working copy)
@@ -8262,10 +8262,11 @@ potential_constant_expression_1 (tree t,
 case TRANSACTION_EXPR:
 case IF_STMT:
 case DO_STMT:
 case FOR_STMT:
 case WHILE_STMT:
+case VEC_PERM_EXPR:
   if (flags & tf_error)
 error ("expression %qE is not a constant-expression", t);
   return false;

 case TYPEID_EXPR:

[Bug c++/51033] generic vector subscript and shuffle support was not added to C++

2012-04-22 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033

--- Comment #20 from Marc Glisse  2012-04-22 
13:21:14 UTC ---
(In reply to comment #19)
> Created attachment 27217 [details]
> shuffle

Doesn't work with -std=c++11, which requires:

--- semantics.c (revision 186667)
+++ semantics.c (working copy)
@@ -5603,11 +5603,12 @@ float_const_decimal64_p (void)

 bool
 literal_type_p (tree t)
 {
   if (SCALAR_TYPE_P (t)
-  || TREE_CODE (t) == REFERENCE_TYPE)
+  || TREE_CODE (t) == REFERENCE_TYPE
+  || TREE_CODE (t) == VECTOR_TYPE)
 return true;
   if (CLASS_TYPE_P (t))
 {
   t = complete_type (t);
   gcc_assert (COMPLETE_TYPE_P (t) || errorcount);
@@ -8487,10 +8488,11 @@ potential_constant_expression_1 (tree t,
   want_rval, flags))
   return false;
   return true;

 case FMA_EXPR:
+case VEC_PERM_EXPR:
  for (i = 0; i < 3; ++i)
   if (!potential_constant_expression_1 (TREE_OPERAND (t, i),
 true, flags))
 return false;
  return true;


And then I still need to write a cxx_eval_vec_perm function so the result of
__builtin_shuffle can be constexpr. I haven't seen how the C front-end handles
shuffles of constants. Maybe a "sorry" would do for now.

[Bug c++/51033] generic vector subscript and shuffle support was not added to C++

2012-04-22 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033

--- Comment #19 from Marc Glisse  2012-04-22 
10:31:33 UTC ---
Created attachment 27217
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27217
shuffle

With this patch, g++ passes the few __builtin_shuffle tests I tried, and
generates generic code for non-constant indexes and special code for constant
indexes. I don't really know what to do about the testsuite. The tests exercise
the backend a lot, and it probably doesn't make sense to run everything with
both gcc and g++. But we still want to test that g++ accepts the syntax, and
maybe even that it handles constants well.

Content of the patch:
- move c_build_vec_perm_expr to c-common and make the maybe_const stuff
conditional on the dialect
- adapt the C RID_BUILTIN_SHUFFLE parser code to the C++ FE (the 2 are
different enough that it isn't easy to share)
- remove the C_ONLY tag from __builtin_shuffle

As usual, my limited knowledge of the compiler means I may have missed
fundamental things.

[Bug c/53060] Typo in build_binary_op for scalar-vector ops

2012-04-21 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53060

--- Comment #2 from Marc Glisse  2012-04-21 
14:59:54 UTC ---
(In reply to comment #1)
> * gcc.dg/scal-to-vec2.c: New test.

This one runs the problematic code, but since this is a compile-only test, it
can't detect a problem. A variant that does fail:

extern void abort (void);

int f(void) { return 2; }
unsigned int g(void) { return 5; }
unsigned int h = 1;

typedef unsigned int vec __attribute__((vector_size(16)));

vec i = { 1, 2, 3, 4};

vec fv1(void) { return i + (h ? f() : g()); }
vec fv2(void) { return (h ? f() : g()) + i; }

int main(){
  vec j = fv1();
  if (j[0] != 3) abort();
}

(it works ok with fv2)

[Bug c/53060] New: Typo in build_binary_op for scalar-vector ops

2012-04-21 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53060

 Bug #: 53060
   Summary: Typo in build_binary_op for scalar-vector ops
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: marc.gli...@normalesup.org


Hello,

in file c-typeck.c, function build_binary_op, for mixed scalar-vector
operations, there are 2 cases: stv_firstarg and stv_secondarg. The first one
has:
op0 = c_wrap_maybe_const (op0, true);
while the second has:
op0 = c_wrap_maybe_const (op1, true);

I think the second one should read "op1 = ...", for symmetry.

I haven't managed to come up with a testcase that runs this line of code :-(

[Bug c++/53057] [c++0x] ICE on construction off of initializer list with overloads for constructor

2012-04-21 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53057

Marc Glisse  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marc.glisse at normalesup
                   |                            |dot org

--- Comment #1 from Marc Glisse  2012-04-21 
07:57:23 UTC ---
This seems to have been fixed recently on trunk. Maybe related to PR c++/52905?

[Bug c++/53025] [C++11] noexcept operator depends on copy-elision

2012-04-21 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53025

--- Comment #2 from Marc Glisse  2012-04-21 
07:45:57 UTC ---
Created attachment 27210
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27210
patch

Bootstrapped and regression tested.

Not posting it to gcc-patches yet, for several reasons:
- I have other patches (at least 3) waiting for a review,
- I am not 100% certain that this can't cause legitimate elisions to be missed
(say if something is first instantiated inside the noexcept),
- people may not like using globals that way,
- I might prefer the old behavior...

but if anyone wants to submit it, feel free.

[Bug c++/53055] ICE in cp_build_indirect_ref, at cp/typeck.c:2836

2012-04-20 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53055

Marc Glisse  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marc.glisse at normalesup
                   |                            |dot org

--- Comment #2 from Marc Glisse  2012-04-20 
14:44:29 UTC ---
A brutal application of delta gives this short but nonsensical code:

void f () ;
struct A A :: * p ;
int i = p ->* f ;

[Bug c++/51314] [C++0x] sizeof... and parentheses

2012-04-19 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51314

--- Comment #2 from Marc Glisse  2012-04-19 
21:19:23 UTC ---
Created attachment 27200
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27200
patch

s.cc: In function 'void f(U ...)':
s.cc:3:18: error: 'sizeof...' argument must be surrounded by parentheses
   A x; // template argument 1 is invalid
  ^
s.cc:3:19: error: template argument 1 is invalid
   A x; // template argument 1 is invalid
   ^
s.cc:3:22: error: invalid type in declaration before ';' token
   A x; // template argument 1 is invalid
  ^
s.cc: At global scope:
s.cc:10:37: error: 'sizeof...' argument must be surrounded by parentheses
   typedef Indices type; // OK
 ^

Error recovery is not that great in the first case, but fine in the second.

[Bug c++/53036] [c++11] trivial class fails std::is_trivial test

2012-04-19 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53036

--- Comment #2 from Marc Glisse  2012-04-19 
12:14:04 UTC ---
Created attachment 27189
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27189
basic patch

The patch detects D as trivial.

Sadly, in this case:
struct A {
  A()=default;
  A(int=2);
};
it says A is trivial whereas I guess the ambiguity makes it non-trivial. That
could be solved for the traits by combining it with is_default_constructible,
but it may be problematic to let g++ internally believe that the class is
trivially default constructible. For some strange reason, in the case of an
ellipsis:
struct A {
  A()=default;
  A(...);
};
it does say: non-trivial.

Maybe the whole dance should only be done if the constructor argument is a
parameter pack (one that belongs to the function? or several packs?).

[Bug c++/53036] [c++11] trivial class fails std::is_trivial test

2012-04-18 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53036

Marc Glisse  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marc.glisse at normalesup
                   |                            |dot org

--- Comment #1 from Marc Glisse  2012-04-19 
06:28:57 UTC ---
(In reply to comment #0)
> In my understanding of the new C++ standard, the following code should
> compile. It does not.
> 
>   struct D
>   {
>   D() = default;
>   D(D const &) = default;
>   template<typename ...U>
>   constexpr D(U ...u)
>   {}
>   };
>   static_assert(std::is_trivial<D>::value, "here");

With the declarations in this order, it seems easy to fix, in
grok_special_member_properties, only set TYPE_HAS_COMPLEX_DFLT to 1 if we
didn't already have TYPE_HAS_DEFAULT_CONSTRUCTOR (might have hidden issues, but
they are not obvious to me).

Now if you put the defaulted constructor after the user-provided variadic one,
it becomes much harder, and it looks like we'd have to remember one extra bit
of information: the reason why we set TYPE_HAS_COMPLEX_DFLT.

[Bug c++/53025] [C++11] noexcept operator depends on copy-elision

2012-04-17 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53025

Marc Glisse  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marc.glisse at normalesup
                   |                            |dot org

--- Comment #1 from Marc Glisse  2012-04-18 
06:18:57 UTC ---
(In reply to comment #0)
> b) An even more convincing argument is that when adding the compiler argument
> 
> --no-elide-constructors 
> 
> the original code becomes accepted as well, thus the outcome indeed depends on
> copy-elision taking place or not. The semantics of the noexcept operator
> (5.3.7) are described by "potentially evaluated functions calls" and 3.2 p3
> says in a note that "A constructor selected to copy or move an object of class
> type is odr-used even if the call is actually elided by the implementation",
> so this observable behaviour seems to be non-conforming.

It seems you are right, because the standard gives an unusual definition of
"potentially evaluated". In English an elided function call is not potentially
evaluated as the code for it isn't even generated.

It looks like the standard may require noexcept to be computed as if there were
no elisions, but that is a code pessimization that may not be necessary (or it
may be, so we can better rely on noexcept not subtly changing when the
circumstances are just different enough that the compiler won't elide a copy).

I wonder if saving flag_elide_constructors and setting it to false in
cp_parser_unary_expression before calling cp_parser_expression, and restoring
it afterwards (like many other flags get saved, set and restored) would be
enough, or if the elision is sometimes done later.

[Bug c/53024] New: Power of 2 requirement on vector_size not documented

2012-04-17 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53024

 Bug #: 53024
   Summary: Power of 2 requirement on vector_size not documented
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: marc.gli...@normalesup.org


Hello,

typedef float VEC __attribute__ ((__vector_size__ (12)));

fails to compile with the message:
error: number of components of the vector not a power of two

This is quite clear, and I guess it makes sense. However,
http://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html says:

"Specifying a combination that is not valid for the current architecture will
cause GCC to synthesize the instructions using a narrower mode."

so I was expecting gcc to handle it somehow. Could we add a sentence, anywhere
in that page, that makes the requirement that the size is a power of 2
explicit? Or if the requirement can be lifted... (I don't care so much about 3
float, I can just store 4 and ignore the last, but I do care about 12 double
and don't want to store 16 until we get 512bit vectors)

[Bug c++/51033] generic vector subscript and shuffle support was not added to C++

2012-04-17 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033

--- Comment #18 from Marc Glisse  2012-04-17 
16:41:58 UTC ---
(In reply to comment #17)
> > And now I should actually bootstrap and run the testsuite ;-)
> Good luck!

It worked fine, same failures as I got the other day for another patch.

> BTW, it may be handy to get an account in the GCC compile farm:
> http://gcc.gnu.org/wiki/CompileFarm

Thanks for the advice. I looked into it once, but don't currently need it: make
-j check leaves my 3 year old desktop at least 60% idle, and the few
architectures that could tempt me are not available in the farm (an x64 with
AVX? a sparc with VIS3?).

[Bug c++/51033] generic vector subscript and shuffle support was not added to C++

2012-04-17 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033

--- Comment #16 from Marc Glisse  2012-04-17 
13:57:05 UTC ---
(In reply to comment #15)
> Are you planning to send it to gcc-patches for approval or are you not happy
> with it yet?

There is the problem of moving the testcases. What svn diff prints is nonsense,
so I guess I should just write the Changelog and let whoever commits do the
moves?

The following can move to c-c++-common:
   gcc.dg/vector-2.c
   gcc.dg/vector-subscript-2.c
   gcc.dg/vector-3.c
   gcc.dg/vector-subscript-3.c
   gcc.dg/vector-init-1.c
   gcc.dg/vector-4.c
   gcc.dg/vector-init-2.c
   gcc.dg/vector-1.c
   gcc.dg/vector-subscript-1.c

with these minor modifications:

Index: c-c++-common/vector-subscript-1.c
===
--- c-c++-common/vector-subscript-1.c (revision 186523)
+++ c-c++-common/vector-subscript-1.c (working copy)
@@ -6,7 +6,7 @@

 float vf(vector float a)
 {
-  return 0[a]; /* { dg-error "subscripted value is neither array nor pointer
nor vector" } */
+  return 0[a]; /* { dg-error "subscripted value is neither array nor pointer
nor vector|invalid types .* for array subscript" } */
 }


Index: c-c++-common/vector-3.c
===
--- c-c++-common/vector-3.c (revision 186523)
+++ c-c++-common/vector-3.c (working copy)
@@ -2,4 +2,7 @@

 /* Check that we error out when using vector_size on the bool type. */

+#ifdef __cplusplus
+#define _Bool bool
+#endif
 __attribute__((vector_size(16) )) _Bool a; /* { dg-error "" } */


And now I should actually bootstrap and run the testsuite ;-)

[Bug c++/51033] generic vector subscript and shuffle support was not added to C++

2012-04-17 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033

--- Comment #14 from Marc Glisse  2012-04-17 
13:06:40 UTC ---
Created attachment 27178
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27178
subscript 2 (Manuel-compliant)

[Bug c++/51033] generic vector subscript and shuffle support was not added to C++

2012-04-17 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033

--- Comment #12 from Marc Glisse  2012-04-17 
11:59:12 UTC ---
(In reply to comment #11)
> If it is indeed a copy, you should move the code to c-common.c and share it. The
> C-family FEs should share as much code as possible.

I agree on the principle. If more code was shared, C++ would already support
this feature ;-)

On the other hand, here I am copying a small block of code in the middle of a
function. Making just that paragraph common wouldn't make much sense imho.
Factoring most of (cp_)build_array_ref might make sense, but requires someone
with a better understanding of the FEs, because there are slight differences
that may or may not be relevant.

[Bug c++/53017] New: Integer constant not constant enough for vector_size

2012-04-17 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53017

 Bug #: 53017
   Summary: Integer constant not constant enough for vector_size
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: marc.gli...@normalesup.org


In the following code, s is apparently not an acceptable parameter for the
vector_size attribute, but s+0 is.

constexpr int s=32;
typedef double VEC __attribute__ ((__vector_size__ (s
#ifndef BUG
+ 0
#endif
)));

VEC a={2.,3.,4.};



$ g++ -std=c++0x v.cc -Wall -W -c -O3 
$ g++ -std=c++0x v.cc -Wall -W -c -O3 -DBUG
v.cc:6:4: warning: '__vector_size__' attribute ignored [-Wattributes]
v.cc:8:16: error: scalar object 'a' requires one element in initializer

[Bug c++/51033] generic vector subscript and shuffle support was not added to C++

2012-04-17 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033

--- Comment #10 from Marc Glisse  2012-04-17 
10:22:07 UTC ---
Created attachment 27176
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27176
subscript

This patch (a simple copy of a paragraph from the C front-end) seems sufficient
to add vector subscript support to the C++ front-end. At least, on the related
testcases I could find in the testsuite (vector-init-2.c,
vector-subscript-[123].c), g++ produces the same results as gcc (some error
messages have different content, but the same meaning, and the carets point to
'[' in C and ']' in C++).

I don't know if any of the functions called have more idiomatic counterparts in
the C++ front-end.

__builtin_shuffle seems a bit harder to move for someone not familiar with the
code.

Note that in C++ operator[] can only be a member function, which means we don't
need to worry about overloading or anything like that.

[Bug c++/50025] [DR 1288] C++0x initialization syntax doesn't work for class members of reference type

2012-04-14 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50025

Marc Glisse  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marc.glisse at normalesup
                   |                            |dot org

--- Comment #7 from Marc Glisse  2012-04-14 
07:07:05 UTC ---
Link changed now that it has been voted into the working paper:
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1288

should it be un-suspended?

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-04-11 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

--- Comment #29 from Marc Glisse  2012-04-11 
20:35:00 UTC ---
Created attachment 27136
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27136
V4DF generic shuffle

A patch (independent from the others) implementing what is explained in the
last 2 comments. It is simple and works really well, all V4DF shuffles (even
with 2 vectors) take only 3 insn (and often just 2). It only requires AVX, but
also improves a lot on the current AVX2 code which casts to vectors of integers
and uses up to 9 insn (although my "default case" patch also goes down to 3
insn on AVX2).

The drawback is that it is limited to V4DF. vshufps is a different enough beast
from vshufpd that it would require different code, which wouldn't even apply
that often. For V8SF, my "default case" patch seems more interesting. Integer
vectors have different instructions again...

By the way, I tested all V4DF permutations (there are only 2^12 of them) in the
simulator. I also have a file (400K) with the code for each permutation, that
looks like the following:
0,0,0,0
vpermilpd   $0, %ymm0, %ymm0
vperm2f128  $0, %ymm0, %ymm0, %ymm0
[...]
1,7,6,3
vperm2f128  $48, %ymm1, %ymm0, %ymm2
vperm2f128  $19, %ymm1, %ymm0, %ymm0
vshufpd $11, %ymm0, %ymm2, %ymm0
1,7,6,4
vperm2f128  $48, %ymm1, %ymm0, %ymm0
vperm2f128  $33, %ymm1, %ymm1, %ymm1
vshufpd $3, %ymm1, %ymm0, %ymm0
[...]
If anyone wants to take a look, tell me and I'll attach it.

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-04-11 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

--- Comment #28 from Marc Glisse  2012-04-11 
16:48:47 UTC ---
A difficulty I hadn't foreseen is that the code that canonicalizes permutations
(and in particular checks if one of the operands is unused) is in
ix86_expand_vec_perm_const. So if I ask expand_vec_perm_1 to generate the
2-operand 0,1,2,3 permutation, it will happily generate vperm2f128 with
immediate 16 without noticing that it is the identity on the first operand. I
should probably move that code into its own function so I can call it before
expand_vec_perm_1.

[Bug libstdc++/52931] New: std::hash shouldn't be defined for unknown types

2012-04-11 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52931

 Bug #: 52931
   Summary: std::hash shouldn't be defined for unknown types
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: marc.gli...@normalesup.org


As explained by Daniel Krügler in c++std-lib-32420 and nearby messages, the
default definition of std::hash is non-standard: it is supposed to be undefined
so we can test with SFINAE whether hash was specialized for some type.

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-04-09 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

--- Comment #27 from Marc Glisse  2012-04-09 
16:50:47 UTC ---
Notes to self (or others):
- Intel's SDE makes it possible to test without appropriate hardware;
- for V4DF shuffles, there seems to be a very simple generic solution that
performs two vperm2f128 and then one vshufpd.

permutation (a,b,c,d), input (x,y):
t1 = vperm2f128(x,y,(a/2)+16*(c/2));
t2 = vperm2f128(x,y,(b/2)+16*(d/2));
return vshufpd(t1,t2,(a%2)+2*(b%2)+4*(c%2)+8*(d%2));

(when t1 or t2 is equal to x or y, it generates only 2 insn in cases that the
current code doesn't detect, like {3,1,2,2})
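
The recipe can be checked exhaustively with a scalar model. This is a sketch, not tested on hardware: `V4`, `shuffle2` and the modeled instructions are illustrative names, elements are represented by their index in the concatenation (x, y), and the zeroing bits of vperm2f128 are not modeled.

```cpp
#include <array>
#include <cassert>

using V4 = std::array<int, 4>;

// vperm2f128 model: selector 0/1 = low/high lane of x, 2/3 = low/high lane of y.
V4 vperm2f128(const V4& x, const V4& y, int imm) {
    assert((imm & 0x88) == 0);  // zeroing bits are not modeled here
    auto pick = [&](int sel, int k) {
        return ((sel & 2) ? y : x)[2 * (sel & 1) + k];
    };
    const int lo = imm & 3, hi = (imm >> 4) & 3;
    return {pick(lo, 0), pick(lo, 1), pick(hi, 0), pick(hi, 1)};
}

// vshufpd model (256-bit form): even result slots read from a, odd ones from
// b, and each immediate bit chooses the low or high element of that lane.
V4 vshufpd(const V4& a, const V4& b, int imm) {
    return {a[(imm >> 0) & 1], b[(imm >> 1) & 1],
            a[2 + ((imm >> 2) & 1)], b[2 + ((imm >> 3) & 1)]};
}

// The three-instruction recipe from the comment, for selectors a..d in 0..7.
V4 shuffle2(const V4& x, const V4& y, int a, int b, int c, int d) {
    const V4 t1 = vperm2f128(x, y, (a / 2) + 16 * (c / 2));
    const V4 t2 = vperm2f128(x, y, (b / 2) + 16 * (d / 2));
    return vshufpd(t1, t2, (a % 2) + 2 * (b % 2) + 4 * (c % 2) + 8 * (d % 2));
}

// Tries all 8^4 selectors against the expected result.
bool recipe_holds_for_all_selectors() {
    const V4 x = {0, 1, 2, 3}, y = {4, 5, 6, 7};
    for (int a = 0; a < 8; ++a)
        for (int b = 0; b < 8; ++b)
            for (int c = 0; c < 8; ++c)
                for (int d = 0; d < 8; ++d)
                    if (shuffle2(x, y, a, b, c, d) != V4{a, b, c, d})
                        return false;
    return true;
}
```

For {3,1,2,2} the second vperm2f128 gets immediate 16 and therefore reproduces x, which is the 2-insn case mentioned above.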

[Bug c++/52901] invalid rvalue reference

2012-04-08 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52901

Marc Glisse  changed:

   What|Removed |Added

 CC||marc.glisse at normalesup
   ||dot org

--- Comment #3 from Marc Glisse  2012-04-08 
15:32:45 UTC ---
(In reply to comment #2)
> > X&& f() {
> > X x;
> > return std::move(x);
> > }
> 
> This function is unsafe, it returns a reference to a local variable. You
> probably meant it to return X not X&&
> 
> It is effectively the same as:
> 
> X& f() {
>X x;
>return x;
> }
> 
> (except G++ warns about that, because it's simpler)

Maybe this could be taken as a RFE for a warning with std::move? Many people
learning C++11 are bound to try similar things. g++ warns for

return X();
return static_cast<X&&>(x);

but not

return std::move(x);

I expect the case of std::move to be important enough that if doing a generic
warning is too hard, special-casing std::move could be worth the trouble
(assuming it is easier).

[Bug c++/49152] Unhelpful diagnostic for iterator dereference

2012-04-01 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49152

Marc Glisse  changed:

   What|Removed |Added

 CC||marc.glisse at normalesup
   ||dot org

--- Comment #29 from Marc Glisse  2012-04-01 
20:28:14 UTC ---
(In reply to comment #24)
> Personally, I don't believe Gaby is open to other solutions outside the
> full-fledged "caret diagnostics" context,

He didn't seem opposed to _adding_ the type information (without removing the
current information).

[Bug c++/52654] [C++11] Warn on overflow in user-defined literals

2012-03-31 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52654

--- Comment #7 from Marc Glisse  2012-03-31 
17:18:37 UTC ---
(In reply to comment #6)
> Also, what about this:
> 
>  -3_w;

What about it? IIUC, it is just -(3_w), I don't think it requires a particular
treatment.

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-31 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

Marc Glisse  changed:

   What|Removed |Added

  Attachment #26979|0   |1
is obsolete||

--- Comment #26 from Marc Glisse  2012-03-31 
14:02:54 UTC ---
Created attachment 27052
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27052
default case

Updated with your comments, still can't properly test.

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-31 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

--- Comment #25 from Marc Glisse  2012-03-31 
09:37:51 UTC ---
The test for AVX2 in expand_vec_perm_interleave2 might be too strict. For the
V4DF shuffle 4,0,2,6, removing that check lets the compiler generate a nice
vunpcklpd+vpermilpd (as opposed to 3 insn with my patch and 5+ without). The
expansion of dfinal is already protected (so the function returns false for
4,2,0,6), I haven't checked whether something else (dremap?) needs protecting,
but it doesn't look like it.

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-29 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

--- Comment #24 from Marc Glisse  2012-03-29 
14:19:11 UTC ---
(In reply to comment #23)
> (In reply to comment #18)
> 
> +  if (!d->testing_p)
> +dsecond.target = gen_reg_rtx (dsecond.vmode);
> +  dfirst.op1 = dsecond.target;
> 
> This bit has a problem with testing_p in that we'll have op0==op1
> while testing and not when expanding.  Which means that testing_p
> will be checking something else.

Unless d->target==d->op0 (is that the case? I was kind of assuming it wasn't),
it looks ok, but I agree that it should be improved. From other code, it looks
like using gen_reg_rtx in testing is fine and avoiding it is just an
optimization.

On the other hand, if I remember correctly, the function could just return true
early when testing (like the other function does) and assert during expansion,
since it is not supposed to fail (except for the initial mode/target check),
that would document the intent better.

> I've been meaning to convert i386 from op0==op1 to one_operand_p,
> like I used in targets I converted later, like ia64.  I'll see about
> making this change this afternoon, and then you can update your
> patch to match.

ok (no promise timewise).

> +  ok = expand_vec_perm_1 (&dsecond);
> +  ok &= ix86_expand_vec_perm_const_1 (&dfirst);
> +
> +  if (!ok)
> +return false;
> +
> +  return true;
> 
> Better with a short-circuit to avoid extra work:
> 
>   return (expand_vec_perm_1 (&dsecond)
>   && ix86_expand_vec_perm_const_1 (&dfirst));

Indeed!

Thanks for the comments.

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-27 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

--- Comment #22 from Marc Glisse  2012-03-27 
20:57:16 UTC ---
(In reply to comment #20)
> Lastly for each routine it is desirable to think whether it might be useful
> for other vector modes (likely 32-byte only) for TARGET_AVX2.

I am not very familiar with the integer versions, so I tried:
#include 
__v32qi f(__v32qi x){
  __v32qi
m={0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31};
  return __builtin_shuffle(x,m);
}

$ gcc r.c -S -O1 -mavx && cat r.s
r.c: In function 'f':
r.c:2:9: error: invalid position or size operand to BIT_FIELD_REF
BIT_FIELD_REF 
r.c:2:9: note: in statement
D.5992_24 = BIT_FIELD_REF ;

r.c:2:9: error: invalid position or size operand to BIT_FIELD_REF
BIT_FIELD_REF 
r.c:2:9: note: in statement
D.5993_25 = BIT_FIELD_REF ;

[...]

r.c:2:9: internal compiler error: verify_gimple failed

(with -mavx2 it works fine)

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-27 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

--- Comment #21 from Marc Glisse  2012-03-27 
18:21:39 UTC ---
(In reply to comment #20)
> I don't like much the calls to ix86_expand_vec_perm_const_1, if you are
> looking for exactly two insn permutations,

Actually, it isn't just 2 insn. The call in expand_vec_perm_vperm2f128_merge
can take 3, and the calls in expand_vec_perm_perm_blend(...,true) up to 4 (this
is how I get a maximum of 9 insn, 1+2*4). But some more splits of
ix86_expand_vec_perm_const_1 to avoid recursive calls should be doable, if you
don't like the recursion.

> then really the two insn permutation
> functions should be grouped together into expand_vec_perm_2 and you should
> call that instead, or if it is 1 or 2, then expand_vec_perm_1 ||
> expand_vec_perm_2.

Yes, this grouping by size makes sense, whether it ends up being used or not.
Although there are expanders in the "3" category that occasionally get lucky
and generate only 2 :-)

> expand_vec_perm_vperm2f128_merge has probably swapped the meaning of dfirst
> and dsecond permutations when it first performs the dsecond permutation.

If you are just talking of the naming of the variables, yes, I completely agree
they should be swapped (or given more explicit names, like swap_lanes and
dintra).

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-25 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

Marc Glisse  changed:

   What|Removed |Added

  Attachment #26938|0   |1
is obsolete||

--- Comment #18 from Marc Glisse  2012-03-25 
13:52:09 UTC ---
Created attachment 26979
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26979
default case

An updated version of this simple, generic-case shuffle (do note that I didn't
run the generated code, just checked that it compiled and the instructions
generated looked roughly ok). With the patch, we have (concerning v4df and
v8sf):

- no single-vector shuffle takes more than 4 insn,
- no 2-vector shuffle takes more than 9 insn (or 3 (+ 2 movs for constants...)
with AVX2).

I think the current code already guarantees that anything that can be done in a
single instruction is.

Some possible goals (making everything optimal may be a bit hard) would be:

- everything that can be done in 2 insn is,
- no single-vector v4df takes more than 3 insn,
- one or two extra optimizations, if they are generic enough.

I do wonder occasionally about allowing wild indexes (jokers, places where you
can put anything) in shuffles, whether it is exposed to users or just an
internal tool.

[Bug c++/52521] [C++11] user defined literals and order of declaration

2012-03-22 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52521

--- Comment #12 from Marc Glisse  2012-03-22 
09:42:43 UTC ---
(In reply to comment #11)
> GCC 4.7.0 is being released, adjusting target milestone.

I think it is already fixed, actually.
(not closing with this message to leave someone a chance to contradict me)

[Bug c++/52654] New: [C++11] Warn on overflow in user-defined literals

2012-03-21 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52654

 Bug #: 52654
   Summary: [C++11] Warn on overflow in user-defined literals
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: marc.gli...@normalesup.org


Hello,

should there be a warning for this kind of overflow? (-Wall -Wextra is
currently silent)

int operator"" _w(unsigned long long){return 0;}
int main(){
  return 12345678901234567890123456789012345678901234567890_w;
}

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-20 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

Marc Glisse  changed:

   What|Removed |Added

  Attachment #26912|0   |1
is obsolete||

--- Comment #17 from Marc Glisse  2012-03-20 
21:50:40 UTC ---
Created attachment 26938
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26938
intra-lane shuffle in 3 insn

This (mostly untested) patch is a reformulation of the generic v8sf single
vector shuffle in 4 insn as a generic intra-lane 2 vector shuffle in at most 3
insn. Reformulating __builtin_shuffle(x,m) as
__builtin_shuffle(x,vperm2f128(x,1),mm) would then guarantee a maximum size of
4.

Note that the strategy of doing a 2-vector shuffle by shuffling (not restricted
to one vpermilp*) each vector and blending the results gives a maximum of 9
insn, whereas the current code often generates twice that number.


By the way, I have trouble understanding this comment:
  /* For d->op0 == d->op1 the only useful vperm2f128 permutation
 is 0x10.  */
Is it really 0x10, or is there a stray 0 at the end and it is really just 1?

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-20 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

--- Comment #16 from Marc Glisse  2012-03-20 
19:05:22 UTC ---
(In reply to comment #15)
> If I am not mistaken, the V8SF shuffle 22022246 is doable by a vperm2f128 that
> takes 01234567 to 01230123, followed by a vshufps (mask 138 maybe). Was your
> patch supposed to handle it?

Uh, no it isn't supposed to handle it (there would be redundancy and it
wouldn't know where to take elements from), sorry, forget that comment.

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-20 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

--- Comment #15 from Marc Glisse  2012-03-20 
19:00:32 UTC ---
If I am not mistaken, the V8SF shuffle 22022246 is doable by a vperm2f128 that
takes 01234567 to 01230123, followed by a vshufps (mask 138 maybe). Was your
patch supposed to handle it?

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-19 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

--- Comment #9 from Marc Glisse  2012-03-19 
18:29:50 UTC ---
(In reply to comment #8)
> I'm not very keen on having too many different routines, the more generic they
> are, the better.

Agreed, that was one of my concerns from the first message in this bug, but to
experiment it was easier to have separate functions.

> So IMHO e.g. the two insn sequence, vperm2[if]128 + some one
> insn shuffle could look like:
> 
> /* A subroutine of ix86_expand_vec_perm_builtin_1.  Try to expand
>a vector permutation using two instructions, vperm2f128 resp.
>vperm2i128 followed by any single in-lane permutation.  */

I haven't yet looked at it closely enough to understand what it does (those
functions are surprisingly confusing when you don't write them yourself), but
that looks interesting.

My first idea in order to make things more generic was to tentatively turn
__builtin_shuffle(x,m) into __builtin_shuffle(x,vperm2f128(x,x,33),mm) where mm
avoids any cross-lane. The 2-vector no-cross-lane shuffle should take at most 3
instructions in v4df or v8sf (I haven't checked if it works now) and that's
where most of the work would happen (instead of having many routines for
single-vector shuffles that almost all start with vperm2f128). Then you would
probably want to check how many instructions it used, since it could be more or
less than one of the few instruction sequences that don't start with
vperm2f128.
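
A scalar sketch of that reformulation for v4df, assuming the selector m only uses indices 0..3. The names `swap_lanes`, `shuffle2`, `lane_local_selector` and `stays_in_lane` are illustrative, and `shuffle2` stands for a generic 2-vector shuffle with indices 0..7, not for any particular instruction sequence.

```cpp
#include <array>
#include <cassert>

using V4 = std::array<int, 4>;

// Models vperm2f128(x,x,33): low lane <- high lane of x, high lane <- low lane.
V4 swap_lanes(const V4& x) { return {x[2], x[3], x[0], x[1]}; }

// Generic 2-vector shuffle: indices 0..3 read x, indices 4..7 read y.
V4 shuffle2(const V4& x, const V4& y, const V4& mm) {
    V4 r{};
    for (int i = 0; i < 4; ++i) {
        assert(0 <= mm[i] && mm[i] < 8);
        r[i] = mm[i] < 4 ? x[mm[i]] : y[mm[i] - 4];
    }
    return r;
}

// Rewrites a single-vector selector m into a 2-vector selector mm over
// (x, swap_lanes(x)) in which no element has to cross a 128-bit lane.
V4 lane_local_selector(const V4& m) {
    V4 mm{};
    for (int i = 0; i < 4; ++i)
        mm[i] = (m[i] / 2 == i / 2) ? m[i] : 4 + (m[i] ^ 2);
    return mm;
}

// True when every result slot reads from its own lane of x or of y.
bool stays_in_lane(const V4& mm) {
    for (int i = 0; i < 4; ++i)
        if ((mm[i] % 4) / 2 != i / 2) return false;
    return true;
}

// Checks all 4^4 single-vector selectors.
bool reformulation_holds() {
    const V4 x = {10, 11, 12, 13};
    for (int a = 0; a < 4; ++a)
        for (int b = 0; b < 4; ++b)
            for (int c = 0; c < 4; ++c)
                for (int d = 0; d < 4; ++d) {
                    const V4 mm = lane_local_selector({a, b, c, d});
                    if (!stays_in_lane(mm)) return false;
                    if (shuffle2(x, swap_lanes(x), mm) != V4{x[a], x[b], x[c], x[d]})
                        return false;
                }
    return true;
}
```

The model only shows that such an mm always exists; how many instructions the in-lane 2-vector shuffle then costs is a separate question.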

From a quick look, it looks like you may be doing something even more
generic...

> This will handle e.g. vperm2f128 + {vshufpd,vblendpd,vunpcklpd,vunpckhpd} etc.

Cool!

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-18 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

--- Comment #6 from Marc Glisse  2012-03-18 
18:58:44 UTC ---
Created attachment 26912
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26912
generic shuffle of a single v8sf

An additional function (I should find better names...) to handle generic
shuffles of a single v8sf in 4 instructions. Only tested on {6,2,3,3,5,2,3,7}.

By the way, expand_vec_perm_vperm2f128_vblend2 does vpermilpd+vperm2f128 in
this order, but it would be better to do it in the reverse order (adapting the
mask), because it is common to need several __builtin_shuffle(x,*) and the
vperm2f128 can then be shared.

I also noticed while experimenting that -mavx2 generates vpermd instead of
vpermps (the vpermq->vpermpd change didn't affect that).

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-18 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

Marc Glisse  changed:

   What|Removed |Added

  Attachment #26909|0   |1
is obsolete||

--- Comment #5 from Marc Glisse  2012-03-18 
12:53:13 UTC ---
Created attachment 26911
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26911
patch

With this one, all V4DF shuffles on one vector are done in at most 3
instructions (and are correct). Doing V8SF at the same time was getting
confusing so I dropped it for the last 2 functions, which end up looking almost
like: "if the pattern is 0112 do this, if it is 0130 do that, etc".

I didn't check if all the functions are still used by at least one pattern...

Note: my access to an avx machine is not sufficient to submit a patch, so feel
free to take pieces of this and modify/test/submit them (I have a copyright
assignment).

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-17 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

Marc Glisse  changed:

   What|Removed |Added

  Attachment #26908|0   |1
is obsolete||

--- Comment #4 from Marc Glisse  2012-03-17 
22:03:08 UTC ---
Created attachment 26909
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26909
patch

Here is a try. Again, I just looked at the generated code on a couple examples,
which isn't very reliable...

expand_vec_perm_vperm2f128_vblend0 is already covered by
expand_vec_perm_vperm2f128_vblend1, but it is confusing to have a 3-instruction
function generate only 2.

I didn't do generic permutations with 4 instructions.

There is probably more that can be done with vshufp[sd].

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-17 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

--- Comment #3 from Marc Glisse  2012-03-17 
19:55:18 UTC ---
Uh. I feel silly, but it looks like vshufpd could replace vpermilpd+vblendpd in
many cases, including the original 1230 from PR52568...

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-17 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

--- Comment #2 from Marc Glisse  2012-03-17 
19:20:36 UTC ---
Created attachment 26908
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26908
copy-paste patch for 0213 and 1302

This seems to handle 0213 and 1302 (I only vaguely looked at the generated
code, can't do proper testing).

It is really a copy-paste of the function that handles 1230. I didn't try to
understand everything, so there may be things that made sense in the original
function but don't anymore here. It should be possible to merge the 2 new
functions, but merging them with the previous one looks harder.

[Bug target/52607] v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-16 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

--- Comment #1 from Marc Glisse  2012-03-17 
01:05:57 UTC ---
Note that {1,2,0,3} seems harder, I need one extra vpermilpd. Actually, it
looks like every v4df shuffle can be realized as a vblendpd of a vpermilpd and
a vpermilpd+vperm2f128. For v8sf, it also seems true but may require the
version of vpermilps that takes its controls from a register/memory.
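
For v4df the claim can be checked on a scalar model. The sketch below is one way to pick the three immediates; `shuffle1` and the modeled instructions are illustrative names, and the lane swap stands for vperm2f128 with both operands equal.

```cpp
#include <array>
#include <cassert>

using V4 = std::array<int, 4>;

// vpermilpd model (immediate form): bit i picks the low or high element of
// the lane that slot i lives in.
V4 vpermilpd(const V4& x, int imm) {
    return {x[(imm >> 0) & 1], x[(imm >> 1) & 1],
            x[2 + ((imm >> 2) & 1)], x[2 + ((imm >> 3) & 1)]};
}

// Lane swap, as done by vperm2f128 with both operands equal to x.
V4 swap_lanes(const V4& x) { return {x[2], x[3], x[0], x[1]}; }

// vblendpd model: bit i set takes slot i from b, otherwise from a.
V4 vblendpd(const V4& a, const V4& b, int imm) {
    V4 r{};
    for (int i = 0; i < 4; ++i) r[i] = ((imm >> i) & 1) ? b[i] : a[i];
    return r;
}

// result[i] = x[m[i]] as a vblendpd of a vpermilpd and a vpermilpd + lane swap.
V4 shuffle1(const V4& x, const V4& m) {
    int same_imm = 0, cross_imm = 0, blend_imm = 0;
    for (int i = 0; i < 4; ++i) {
        assert(0 <= m[i] && m[i] < 4);
        if (m[i] / 2 == i / 2) {
            same_imm |= (m[i] & 1) << i;         // element is already in lane
        } else {
            cross_imm |= (m[i] & 1) << (i ^ 2);  // slot i^2 lands on i after the swap
            blend_imm |= 1 << i;
        }
    }
    const V4 same = vpermilpd(x, same_imm);
    const V4 cross = swap_lanes(vpermilpd(x, cross_imm));
    return vblendpd(same, cross, blend_imm);
}

// Checks all 4^4 single-vector selectors.
bool decomposition_holds() {
    const V4 x = {10, 11, 12, 13};
    for (int a = 0; a < 4; ++a)
        for (int b = 0; b < 4; ++b)
            for (int c = 0; c < 4; ++c)
                for (int d = 0; d < 4; ++d)
                    if (shuffle1(x, {a, b, c, d}) != V4{x[a], x[b], x[c], x[d]})
                        return false;
    return true;
}
```

The model says nothing about v8sf, where the register-controlled vpermilps may be needed as noted above.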

[Bug target/52607] New: v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}

2012-03-16 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

 Bug #: 52607
   Summary: v4df __builtin_shuffle with {0,2,1,3} or {1,3,0,2}
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: marc.gli...@normalesup.org


Hello,

this is really just a follow-up to PR52568. The permutations {0,2,1,3} and
{1,3,0,2} can be realized with a very similar technique.

Starting from 0123:
vpermilpd+vperm2f128->3210
vblendpd(0123,3210)->0213

or:
vpermilpd->1032
vperm2f128->2301
vblendpd(1032,2301)->1302

I am not sure if there is a nice way to generalize this or if the function
expand_vec_perm_vperm2f128_vblend should be cloned a few times and slightly
modified.

(these permutations are less important to me than 1230 was)
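
The two sequences can be replayed on a scalar model. The immediates below (5 for vpermilpd, 6 for vblendpd) are not from the report; they are the values that make the model reproduce the stated results. Function names are illustrative.

```cpp
#include <array>

using V4 = std::array<int, 4>;

// vpermilpd model (immediate form): bit i picks low/high within slot i's lane.
V4 vpermilpd(const V4& x, int imm) {
    return {x[(imm >> 0) & 1], x[(imm >> 1) & 1],
            x[2 + ((imm >> 2) & 1)], x[2 + ((imm >> 3) & 1)]};
}

// Lane swap, as done by vperm2f128 with both operands equal to x.
V4 swap_lanes(const V4& x) { return {x[2], x[3], x[0], x[1]}; }

// vblendpd model: bit i set takes slot i from b, otherwise from a.
V4 vblendpd(const V4& a, const V4& b, int imm) {
    V4 r{};
    for (int i = 0; i < 4; ++i) r[i] = ((imm >> i) & 1) ? b[i] : a[i];
    return r;
}

// 0123 -> 0213: blend the input with its full reversal.
V4 perm_0213(const V4& x) {
    const V4 reversed = swap_lanes(vpermilpd(x, 5));  // 0123 -> 1032 -> 3210
    return vblendpd(x, reversed, 6);                  // middle two slots from 3210
}

// 0123 -> 1302: blend the in-lane swap with the lane swap.
V4 perm_1302(const V4& x) {
    const V4 in_lane = vpermilpd(x, 5);  // 0123 -> 1032
    const V4 lanes = swap_lanes(x);      // 0123 -> 2301
    return vblendpd(in_lane, lanes, 6);  // middle two slots from 2301
}
```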

[Bug c++/52521] [C++11] user defined literals and order of declaration

2012-03-16 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52521

--- Comment #7 from Marc Glisse  2012-03-16 
19:39:14 UTC ---
(In reply to comment #6)
> constexpr long double operator"" _degrees(long double d)
> {
>return d * 0.0175;
> }
> 
> int main()
> {
>long double pi = 180_degrees;
>std::cout << pi << std::endl;
> }

There is no dot in 180, so it is looking for an unsigned long long overload
(which you could provide). 180._degrees works.
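
A sketch of providing both overloads, reusing the `_degrees` suffix and the 0.0175 factor from the quoted code. Forwarding from the integer overload to the floating-point one is just one possible choice.

```cpp
// Floating-point literals such as 180. or 180.0 select this overload.
constexpr long double operator""_degrees(long double d) {
    return d * 0.0175;
}

// Integer literals such as 180 select this overload; it forwards to the one
// above so that both spellings produce the same value.
constexpr long double operator""_degrees(unsigned long long n) {
    return operator""_degrees(static_cast<long double>(n));
}

static_assert(180_degrees == 180.0_degrees, "both spellings agree");
```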

[Bug target/52572] suboptimal assignment to avx element

2012-03-13 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52572

--- Comment #3 from Marc Glisse  2012-03-13 
17:57:58 UTC ---
Or for this variant:
__m256d f(__m256d *y){
  __m256d x=*y;
  x[0]=0; // or x[3]
  return x;
}
it looks like vmaskmovpd could replace:
vmovapd (%rdi), %ymm0
vmovapd %xmm0, %xmm1
vmovlpd .LC0(%rip), %xmm1, %xmm1
vinsertf128 $0x0, %xmm1, %ymm0, %ymm0
(I tried a version with __builtin_shuffle but it wouldn't generate vmaskmovpd
either)

(sorry for the naive suggestions, there are too many possibilities to optimize
them all...)

[Bug target/52572] suboptimal assignment to avx element

2012-03-13 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52572

--- Comment #2 from Marc Glisse  2012-03-13 
08:16:58 UTC ---
(In reply to comment #1)
> Have you actually tried that?

Ah, no, sorry, I only have occasional access to such a machine to benchmark the
code. From a -Os perspective it is still shorter (but indeed that matters less
to me than -O3 performance).

>  Mixing VEX encoded insns with legacy encoded
> SSE* insns is very costly, for good performance there needs to be a vzeroupper
> in between (but then you lose the upper bits).  See e.g. 2.8 in the AVX
> Programming Reference.

Thanks, I'd missed that.

The vblendpd solution should still apply (from the initial 'v' it sounds safe),
no?

[Bug target/52572] New: suboptimal assignment to avx element

2012-03-12 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52572

 Bug #: 52572
   Summary: suboptimal assignment to avx element
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: marc.gli...@normalesup.org


For the following program:
#include 
__m256d f(__m256d x){
  x[0]=0;
  return x;
}

gcc -O3 generates:
vmovlpd .LC0(%rip), %xmm0, %xmm1
vinsertf128 $0x0, %xmm1, %ymm0, %ymm0
or with -Os:
vxorps %xmm2, %xmm2, %xmm2
vmovsd %xmm2, %xmm0, %xmm1
vinsertf128 $0x0, %xmm1, %ymm0, %ymm0

If I understand correctly, it first constructs {0,x[1],0,0} and then merges it
with the upper part of x. However, using the legacy movlpd instruction would
avoid zeroing the upper 128 bits and thus the vinsertf128 wouldn't be needed.

Is there a policy not to generate the non-VEX instructions anymore, or is this
a missed optimization?

Setting x[1] is similar. For x[2] or x[3], we get extract+mov+insert, but it
might be better to do something with vblendpd.

[Bug target/52568] New: suboptimal __builtin_shuffle on cycles with AVX

2012-03-12 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52568

 Bug #: 52568
   Summary: suboptimal __builtin_shuffle on cycles with AVX
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: marc.gli...@normalesup.org


Hello,
I compiled the following with -O3 (or -Os) and -mavx

#include 
__m256d left(__m256d x){
  __m256i mask={1,2,3,0};
  return __builtin_shuffle(x,mask);
}

(by the way, for some reason, gcc insists that 'mask' is set but not used with
-Wall)

and got:
vunpckhpd %xmm0, %xmm0, %xmm3
vmovapd %xmm0, %xmm1
vextractf128 $0x1, %ymm0, %xmm0
vmovaps %xmm0, %xmm2
vunpckhpd %xmm0, %xmm0, %xmm0
vunpcklpd %xmm1, %xmm0, %xmm1
vunpcklpd %xmm2, %xmm3, %xmm0
vinsertf128 $0x1, %xmm1, %ymm0, %ymm0
ret

That doesn't really match the code I currently use to do this:
#ifdef __AVX2__
__m256d d=_mm256_permute4x64_pd(x,1+2*4+3*16+0*64);
#else
__m256d b=_mm256_shuffle_pd(x,x,5);
__m256d c=_mm256_permute2f128_pd(b,b,1);
__m256d d=_mm256_blend_pd(b,c,10);
#endif

Could something recognizing this permutation pattern (and the right cyclic
shift) be added? I know there are too many shuffles to hand-code them all, but
cycles seem like they shouldn't be too uncommon.

With -mavx2, I get a single vpermq, which is close enough to the expected
vpermpd.
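
For reference, a scalar model of the hand-written code above, with the same immediates. It only checks that both branches produce the 1230 cycle; it says nothing about performance. The model function names are illustrative stand-ins for the intrinsics.

```cpp
#include <array>

using V4 = std::array<int, 4>;

// Model of _mm256_shuffle_pd: even slots read from a, odd slots from b, and
// each immediate bit picks the low or high element of that lane.
V4 shuffle_pd(const V4& a, const V4& b, int imm) {
    return {a[(imm >> 0) & 1], b[(imm >> 1) & 1],
            a[2 + ((imm >> 2) & 1)], b[2 + ((imm >> 3) & 1)]};
}

// Model of _mm256_permute2f128_pd(b, b, 1): swaps the two 128-bit lanes.
V4 swap_lanes(const V4& x) { return {x[2], x[3], x[0], x[1]}; }

// Model of _mm256_blend_pd: bit i set takes slot i from b, otherwise from a.
V4 blend_pd(const V4& a, const V4& b, int imm) {
    V4 r{};
    for (int i = 0; i < 4; ++i) r[i] = ((imm >> i) & 1) ? b[i] : a[i];
    return r;
}

// Model of _mm256_permute4x64_pd: two immediate bits per result slot.
V4 permute4x64_pd(const V4& x, int imm) {
    return {x[(imm >> 0) & 3], x[(imm >> 2) & 3], x[(imm >> 4) & 3], x[(imm >> 6) & 3]};
}

// The AVX branch: shuffle, lane swap, blend.
V4 left_avx(const V4& x) {
    const V4 b = shuffle_pd(x, x, 5);  // 0123 -> 1032
    const V4 c = swap_lanes(b);        // 1032 -> 3210
    return blend_pd(b, c, 10);         // slots 1 and 3 from c -> 1230
}

// The AVX2 branch: a single cross-lane permute.
V4 left_avx2(const V4& x) {
    return permute4x64_pd(x, 1 + 2 * 4 + 3 * 16 + 0 * 64);
}
```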

[Bug c++/52567] constant expression not recognized as being constant

2012-03-12 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52567

--- Comment #1 from Marc Glisse  2012-03-12 
18:10:16 UTC ---
1<<31 overflows and is thus not a constant. Try maybe 1LL<<31 ?
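
A small illustration of the suggested workaround, assuming the usual 32-bit int; the constant name `bit31` is arbitrary. With a long long operand the shifted value is representable, so the expression is a constant expression.

```cpp
// 1LL has at least 64 bits, so shifting by 31 stays well within range.
constexpr long long bit31 = 1LL << 31;

static_assert(bit31 == 2147483648LL, "1LL << 31 is a constant expression");
```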

[Bug c++/52521] New: [C++11] user defined literals and order of declaration

2012-03-07 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52521

 Bug #: 52521
   Summary: [C++11] user defined literals and order of declaration
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: marc.gli...@normalesup.org


#include 
int operator "" _w(const char*);
int operator "" _w(const char*, std::size_t);
int main() {
  123_w;
}

a.cc: In function 'int main()':
a.cc:5:3: error: unable to find numeric literal operator 'operator"" _w'

The problem disappears if I switch the 2 declarations...

Btw, mangling these operators like functions called li_w taking the same
arguments is strange, I could have such a function in my code.

[Bug libstdc++/22200] numeric_limits::is_modulo is inconsistent with gcc

2012-02-29 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22200

Marc Glisse  changed:

   What|Removed |Added

 CC||marc.glisse at normalesup
   ||dot org

--- Comment #40 from Marc Glisse  2012-02-29 
12:32:10 UTC ---
I haven't seen it mentioned in the discussion here, but in C++11, the
definition of is_modulo was clarified as:

"True if the type is modulo. A type is modulo if, for any operation involving
+, -, or * on values of that type whose result would fall outside the range
[min(),max()], the value returned differs from the true value by an integer
multiple of max() - min() + 1."

Do people have objections to switching numeric_limits::is_modulo to
false (setting it to true when -fwrapv is used can still be discussed
afterwards)?
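
For comparison, unsigned types do satisfy the quoted definition. The sketch below only covers unsigned int and makes no claim about what the signed value is or should be; the constant name `wrapped` is arbitrary.

```cpp
#include <limits>

// max() + 1 falls outside [min(), max()]; the value returned is 0, which
// differs from the true value by exactly max() - min() + 1.
constexpr unsigned wrapped = std::numeric_limits<unsigned>::max() + 1u;

static_assert(std::numeric_limits<unsigned>::is_modulo, "unsigned is modulo");
static_assert(wrapped == 0u, "unsigned arithmetic wraps around");
```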

[Bug libstdc++/51785] gets not anymore declared

2012-02-28 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51785

--- Comment #12 from Marc Glisse  2012-02-28 
15:47:04 UTC ---
(In reply to comment #10)
> If the libstdc++ people are going to do something for 4.7, it really needs
> to be done very soon.

The question is: what do the glibc people want? By removing the gets prototype,
they are explicitly going against the C++ standard. Seems to me that libstdc++
should respect that choice (add a test in configure to see if gets is provided,
and protect "using ::gets;" with #ifdef) and not provide gets. The alternative
is to disagree with the glibc developers and fixinclude stdio.h.
