illegal insn created in ira

2011-05-08 Thread roy rosen
Hi, In my port I have an error: Before ira I have the following insn: (insn 3859 4277 4366 57 (set (reg:BI 2038) (subreg:BI (reg/v:SI 181 [ realsz ]) 3)) 76 {movbi} (expr_list:REG_EQUAL (const_int 1 [0x1]) (nil))) During ira this insn is transformed (I guess because reg

Re: inline assembly vs. intrinsic functions

2011-03-28 Thread roy rosen
2011/3/24 Ian Lance Taylor i...@google.com: roy rosen roy.1ro...@gmail.com writes: You build a RECORD_TYPE holding the fields you want to return.  You define the appropriate builtin functions to return that record type. How is that done? using define_insn? How do I tell it to return a struct

Re: inline assembly vs. intrinsic functions

2011-03-24 Thread roy rosen
2011/3/22 Ian Lance Taylor i...@google.com: roy rosen roy.1ro...@gmail.com writes: 2010/10/26 Ian Lance Taylor i...@google.com: roy rosen roy.1ro...@gmail.com writes: I am trying to demonstrate my port capabilities. I am writing an application which needs to use instructions like max a,b,c

Re: inline assembly vs. intrinsic functions

2011-03-17 Thread roy rosen
2010/10/26 Ian Lance Taylor i...@google.com: roy rosen roy.1ro...@gmail.com writes: I am trying to demonstrate my port capabilities. I am writing an application which needs to use instructions like max a,b,c,d,e,f where a,b,c are inputs and d,e,f are outputs. Is it possible to write

Re: register allocation

2011-01-05 Thread roy rosen
2011/1/3 Jeff Law l...@redhat.com: On 12/27/10 08:43, roy rosen wrote: I'd recommend to try ira-improv branch.  I think that part of the problem is in usage of cover classes.  The branch removes the cover classes and permits IRA to use intersected register classes and that helps to assign

Re: register allocation

2010-12-27 Thread roy rosen
2010/12/23 Vladimir Makarov vmaka...@redhat.com: On 12/23/2010 03:13 AM, roy rosen wrote: Hi All, I am looking at the code generated by my port and it seems that I have a problem that too many copies between registers are generated. I looked a bit at the register allocation and wanted

register allocation

2010-12-23 Thread roy rosen
Hi All, I am looking at the code generated by my port and it seems that I have a problem that too many copies between registers are generated. I looked a bit at the register allocation and wanted to verify that I understand its behavior. Is it true that it first chooses a register class for

Re: software pipelining

2010-12-08 Thread roy rosen
, On 10.11.2010 12:32, roy rosen wrote: Hi, I was wondering if gcc has software pipelining. I saw options -fsel-sched-pipelining -fselective-scheduling -fselective-scheduling2 but I don't see any pipelining happening (tried with ia64). Is there a gcc VLIW port in which I can see it working? You

combine two load insns

2010-12-04 Thread roy rosen
Hi, If I have two load SI insns, is there any way to combine them into one load DI insn? Not using a peephole, which can catch only the limited case of sequential insns. I have seen something done in ARM (*arith_adjacentmem) but it is very awkward and would not be realistic if the DI is being

Re: inline assembly vs. intrinsic functions

2010-11-15 Thread roy rosen
Is there any other way to give attributes to inline assembly insns? 2010/10/26 Ian Lance Taylor i...@google.com: roy rosen roy.1ro...@gmail.com writes: If I want the compiler to understand the inline assembly, is it possible to write a define_insn which would match the pattern that GCC

Re: inline assembly vs. intrinsic functions

2010-11-15 Thread roy rosen
no matter what I do? 2010/11/15 Joern Rennecke amyl...@spamcop.net: Quoting roy rosen roy.1ro...@gmail.com: Is there any other way to give attributes to inline assembly insns? See define_asm_attributes.

Re: pipeline description

2010-11-11 Thread roy rosen
: roy rosen roy.1ro...@gmail.com writes: I am writing now the pipeline description in order to get a parallel code. My machine has many restrictions regarding which instruction can be parallelized with another. I am under the assumption that for each insn only one define_insn_reservation

software pipelining

2010-11-10 Thread roy rosen
Hi, I was wondering if gcc has software pipelining. I saw options -fsel-sched-pipelining -fselective-scheduling -fselective-scheduling2 but I don't see any pipelining happening (tried with ia64). Is there a gcc VLIW port in which I can see it working? For an example function like int nor(char*

Re: UNITS_PER_SIMD_WORD

2010-11-08 Thread roy rosen
This is what I did. It works well. Thanks to everybody. 2010/11/8 Michael Meissner meiss...@linux.vnet.ibm.com: On Mon, Nov 01, 2010 at 04:52:28PM +0200, roy rosen wrote: Hi All, Is it possible to define UNITS_PER_SIMD_WORD as a global variable and to set this variable using a pragma (even

Re: define_split

2010-11-08 Thread roy rosen
2010/11/8 Michael Meissner meiss...@linux.vnet.ibm.com: On Thu, Oct 28, 2010 at 09:11:44AM +0200, roy rosen wrote: Hi all, I am trying to use define_split, but it seems to me that I don't understand how it is used. It says in the gccint.pdf (which I use as my tutorial (is there anything

UNITS_PER_SIMD_WORD

2010-11-01 Thread roy rosen
Hi All, Is it possible to define UNITS_PER_SIMD_WORD as a global variable and to set this variable using a pragma (even once for a compilation), and in that way be able to compile one file with UNITS_PER_SIMD_WORD = 8 and another file with UNITS_PER_SIMD_WORD = 16? Thanks, Roy.

define_split

2010-10-28 Thread roy rosen
Hi all, I am trying to use define_split, but it seems to me that I don't understand how it is used. It says in the gccint.pdf (which I use as my tutorial (is there anything better or more up to date?)) that the combiner only uses the define_split if it doesn't find any define_insn to match. This

Re: define_split

2010-10-28 Thread roy rosen
2010/10/29 Ian Lance Taylor i...@google.com: roy rosen roy.1ro...@gmail.com writes: I am trying to use define_split, but it seems to me that I don't understand how it is used. It says in the gccint.pdf (which I use as my tutorial (is there anything better or more up to date?)) Assuming you

Re: inline assembly vs. intrinsic functions

2010-10-26 Thread roy rosen
I didn't give the full details of the instruction but for example a max instruction which gets an array and returns both the max value and its index in the array will need to return more than one argument. 2010/10/26 Ian Lance Taylor i...@google.com: roy rosen roy.1ro...@gmail.com writes: I am
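
A plain C sketch of the interface this message describes: a max operation that yields both the maximum value and its index, returned together in a struct by value. The names `max_result` and `max_with_index` are hypothetical and are not taken from the thread. The portable loop only models the semantics; it is not the port's instruction and does not show how GCC builtins are defined.

```c
#include <stddef.h>

/* Hypothetical result type: both outputs of the "max" operation. */
struct max_result {
    int value;     /* largest element found */
    size_t index;  /* position of its first occurrence */
};

/* Hypothetical stand-in for an intrinsic; assumes n >= 1. */
static struct max_result max_with_index(const int *data, size_t n)
{
    struct max_result r = { data[0], 0 };
    for (size_t i = 1; i < n; i++) {
        if (data[i] > r.value) {  /* strict '>' keeps the first maximum */
            r.value = data[i];
            r.index = i;
        }
    }
    return r;
}
```

Returning the struct by value avoids passing the outputs by pointer, which is the concern raised in the original question.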

Re: inline assembly vs. intrinsic functions

2010-10-26 Thread roy rosen
If I want the compiler to understand the inline assembly, is it possible to write a define_insn which would match the pattern that GCC creates for the inline assembly, and then GCC would be able to 'know' some attributes about this insn and would be able to parallelize it? 2010/10/26 roy rosen roy

combiner

2010-10-25 Thread roy rosen
In my port I get to such a situation: (insn 60 59 61 4 a.c:65 (set (subreg:SI (reg:HI 129 [ __prephitmp_4 ]) 0) (zero_extract:SI (subreg:SI (reg/v:DI 138 [ v4hi1 ]) 4) (const_int 16 [0x10]) (const_int 16 [0x10]))) 53 {extzv} (nil)) (insn 61 60 62 4 a.c:65 (set

inline assembly vs. intrinsic functions

2010-10-25 Thread roy rosen
Hi, I am trying to demonstrate my port capabilities. I am writing an application which needs to use instructions like max a,b,c,d,e,f where a,b,c are inputs and d,e,f are outputs. Is it possible to write an intrinsic function for that? I think not because that means that I need to pass d,e,f by

complex numbers in gcc

2010-08-17 Thread roy rosen
Hi all, In my port the architecture has some specific instructions that can handle complex arithmetic. I tried to use them but I see that pass_lower_complex decomposes the complex numbers. I tried to remove this pass from the passes' list but I saw that the subsequent passes require that this pass

Re: constraints and predicates

2010-08-05 Thread roy rosen
I haven't mentioned that I am using gcc 4.6 latest version. To generalize the question. If I use an operand like lc_operand (below) and leave the constraint open, is it guaranteed that the register that would be chosen would be of class lc? 2010/8/3 roy rosen roy.1ro...@gmail.com: Hi All, If I

constraints and predicates

2010-08-03 Thread roy rosen
Hi All, If I don't use a constraint, is it possible that during ira I get a register which is not acceptable by the predicate? In my port I have the following to support HW loops: (define_predicate lc_operand (match_operand 0 register_operand) { unsigned int regno; if

vectorization

2010-07-18 Thread roy rosen
Hi, In my architecture I have SIMD instructions with several SIMD levels. I have load and store which operate on 8 half words. I have add and sub for 4 half words, and mul which operates on 2 half words. How can I utilize all of them? Is it enough just to describe each one of these

Re: invalid insn generated

2010-06-30 Thread roy rosen
Taylor i...@google.com: roy rosen roy.1ro...@gmail.com writes: In my port I get to gen_reload to the lines /* If IN is a simple operand, use gen_move_insn. */ else if (OBJECT_P (in) || GET_CODE (in) == SUBREG) { static int xxx; xxx = OBJECT_P (in); tem

invalid insn generated

2010-06-23 Thread roy rosen
Hi, In my port I get to gen_reload to the lines /* If IN is a simple operand, use gen_move_insn. */ else if (OBJECT_P (in) || GET_CODE (in) == SUBREG) { static int xxx; xxx = OBJECT_P (in); tem = emit_insn (gen_move_insn (out, in)); /* IN may contain a

complex arithmetics

2010-06-10 Thread roy rosen
Hi All, I was wondering if there is any architecture which implemented complex arithmetic in GCC i.e. used modes like CHI or HC. I would really like to look at an example for that. Thanks, Roy.

vectorization issue

2010-05-26 Thread roy rosen
Hi, I have tried vectorization and encountered a problem which I can see is common to some ports (I tried ia64 and bfin). For this function: #define ts unsigned short void f(ts* __restrict__ a, ts* __restrict__ b, ts* __restrict__ x) { int i; for (i=0;i<1024;i++) x[i] = a[i] +

scheduling on VLIW architecture

2010-05-06 Thread roy rosen
Hi all. I work on a VLIW architecture. The sched2 pass adds a TImode to insns which should start a new issue group. But, after this pass, other passes change the insns, so the sched2 work that was done is not correct anymore (the groups of insns might be invalid). In particular I see that the

Re: peephole optimizations

2010-05-04 Thread roy rosen
Hi, 2010/5/3, Ian Lance Taylor i...@google.com: roy rosen roy.1ro...@gmail.com writes: 1. Is it true that if I try to match in the pattern two insns and in my code between these insns there is another insn which does not have any dependency connection to the other two, is it true

peephole optimizations

2010-05-03 Thread roy rosen
Hi All, I have tried to write some peephole patterns and I now have some questions regarding the way they work. 1. Is it true that if I try to match in the pattern two insns and in my code between these insns there is another insn which does not have any dependency connection to the other

Re: vectorization, scheduling and aliasing

2010-04-27 Thread roy rosen
Hi, I have looked a bit more and tried also ia-64 and bfin and actually I can't find a single example where vectorized code using __restrict__ variables would break the dependency between stores and loads. for this simple program: unsigned short xxx(unsigned short* __restrict__ a, unsigned

Re: vectorization, scheduling and aliasing

2010-04-26 Thread roy rosen
Hi Richard, 2010/4/23, Richard Guenther richard.guent...@gmail.com: On Thu, Apr 22, 2010 at 6:04 PM, roy rosen roy.1ro...@gmail.com wrote: Hi Richard, 2010/4/14, Richard Guenther richard.guent...@gmail.com: On Wed, Apr 14, 2010 at 8:48 AM, roy rosen roy.1ro...@gmail.com wrote: Hi All

Re: vectorization, scheduling and aliasing

2010-04-26 Thread roy rosen
Hi Richard, Here is the relevant block from the dump: bb 3: __vect_var__26_6 = *__vect_p_14_19; *__vect_p_18_25 = __vect_var__26_6; # PT = nonlocal { __PARM_RESTRICT_2 } (restr) __vect_p_22_11 = __vect_p_14_19 + 8; # PT = nonlocal { __PARM_RESTRICT_1 } (restr) __vect_p_27_12 =

Re: vectorization, scheduling and aliasing

2010-04-22 Thread roy rosen
Hi Richard, 2010/4/14, Richard Guenther richard.guent...@gmail.com: On Wed, Apr 14, 2010 at 8:48 AM, roy rosen roy.1ro...@gmail.com wrote: Hi All, I have implemented some vectorization features in my gcc port. In the generated code for this function I can see a scheduling problem

vectorization, scheduling and aliasing

2010-04-14 Thread roy rosen
Hi All, I have implemented some vectorization features in my gcc port. In the generated code for this function I can see a scheduling problem: int xxx(int* __restrict__ a, int* __restrict__ b) { int __restrict__ i; for (i = 0; i < 8; i++) { a[i] = b[i]; } return 0; }
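
For reference, a compilable sketch of the loop from this message, laid out on multiple lines. The name `copy8` is hypothetical. One assumption differs from the quoted text: the loop counter is declared as a plain `int`, because C allows the restrict qualifier only on pointer types. `__restrict__` is the GNU spelling, so this assumes GCC or a compatible compiler.

```c
/* Sketch of the quoted function; "copy8" is a hypothetical name.
   The __restrict__ qualifiers promise that the objects accessed
   through a and b do not overlap, so loads from b need not wait
   for earlier stores to a. */
static int copy8(int *__restrict__ a, int *__restrict__ b)
{
    int i;  /* plain int: restrict applies only to pointer types */
    for (i = 0; i < 8; i++) {
        a[i] = b[i];
    }
    return 0;
}
```

Whether a given port's scheduler actually exploits the no-alias promise is the question the thread raises; the sketch only fixes the source form.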

compiler operations research

2010-04-07 Thread roy rosen
Hi, Are there any known methodologies/tools/flows that enable operations research on the compiler generated assembly? The reasoning behind the question is that the complexity of compiler heuristics is restricted by compilation time, while a test environment can run for a long time taking into account

Re: compiler operations research

2010-04-07 Thread roy rosen
Thanks Dave, I'll have a look at these. Roy. 2010/4/7, Dave Korn dave.korn.cyg...@googlemail.com: On 07/04/2010 12:29, roy rosen wrote: Hi, Are there any known methodologies/tools/flows that enable operations research on the compiler generated assembly? Something like MILEPOST+ICI

Re: lower subreg optimization

2010-04-07 Thread roy rosen
2010/4/6, Jim Wilson wil...@codesourcery.com: On 04/06/2010 02:24 AM, roy rosen wrote: (insn 33 32 34 7 a.c:25 (set (subreg:V2HI (reg:V4HI 114) 0) (plus:V2HI (subreg:V2HI (reg:V4HI 112) 0) (subreg:V2HI (reg:V4HI 113) 0))) 118 {addv2hi3} (nil)) Only subregs

lower subreg optimization

2010-04-06 Thread roy rosen
Hi, I have encountered several problems with lower subreg optimization in my port. In some cases I noticed that insns are decomposed in the subreg1 pass and do not get recomposed later, which in the end results in two insns being used instead of one. For example I have the following dump before subreg1

implementing load 8 byte instruction

2010-03-18 Thread roy rosen
Hi, I am trying to implement a simple load 8 bytes instruction. I tried to use movdi so that it would allocate two sequential registers for the load. It starts well but in pass subreg1 the insns are decomposed and all DI operands are replaced with SI. I understand that this is a desirable