Hi all, Below is the proposal of this gSoc project. I'd really like you review and comment on this and then I can plan this project better.
Thanks, Li Feng ---------------------------------------------------------------------------------------- #title Automatic parallelization in Graphite * Synopsis With the integration of Graphite to GCC4.4, a strong loop nest analysis and transformation engine was introduced. But now it does only non-parallel loop generation. My work is to detect synchronization free parallel loops and generate parallel code for them, which will mainly enables programs to run faster. * The Project In GCC there already exists an auto-parallelization pass, but it can't works on some not-simple loops. e.g. The following code can't be handled by autopar, which yields a scev_not_known dependency. <src lang="c"> int Z[100][100]; int main(void) { int i,j; for (i = 0; i <= 10; i++) for (j = 0; j <=10; j++) Z[i][j] = Z[j+10][i+11]; return 0; } </src> I made some experimental work on "What kind of loops can autopar handle and what it can't" ,you can find it here: http://gcc.gnu.org/wiki/AutoparRelated. Graphite can analyze every loop, condition that can be analyzed by polyhedral analysis, and even complicated loop nests. Under Graphite, we could detect synchronization free parallel loops and generate parallel code for them. During Google Summer of Code, I want to solve the two issues. First of all, I will write test cases for different kind of loops that can be parallelized under Graphite. This work will be done by some discussions about data dependency analysis under polyhedral model with Graphite developers. The first issue will be parallel loop detection. This means that Graphite will recognize simple parallel loops using SCoP detection and data dependency analysis. SCoP detection is well supported by Graphite and Konrad Trifunic is now working on data dependency analysis based on polyhedral model, which will be done about 3-4 weeks. The place for parallel loop detection will be after CLOOG generate the new loops, either CLAST(cloog AST after call cloog on polyhedra) or after clast_to_gimple. At this time as we have the polyhedral information (poly_bb_p) still available during code generation, we can try to update the dependency information using the restrictions CLOOG added and the polyhedral dependency analysis to check if there is any dependency in the generated loops. So the basic step of parallel loop detection will be: 1. Get the poly_bb_p after CLOOG 2. Check that if there are dependencies. 3. If dependency doesn't exist, mark the loop as parallel. We may add annotations of more detailed information here. e.g. shared/private objects. The second issue will be parallel code generation. At this part, I will try to integrate autopar's code generation to Graphite. This work will be done precisely after cloog_to_gimple. I will make sure that the loops Graphite creates can be handled by the autopar's code generation. Currently the autopar pass in GCC lower the OMP_FOR and OMP_PARALLEL to libgomp functions. Still, maybe we could call autopar's reduction analysis, if the scalar analysis determines that the loop is still parallelizable, then the code generation will be called. * Success Criteria 1. Graphite can recognize and mark loops that can be parallelized 2. Graphite will generate parallelized code for them 3. Pass test cases for Graphite's auto-parallelization * Road-map 1. Since data dependency analysis is not available, I will firstly integrate autopar's code generation to Graphite. This work will be done step by step.(Mid June) - Introduce a flag -fgraphite-parallelize that forces auto-parallelization for all loops. - Make sure the loops Graphite creates can be handled by the autopar's code generation. - Call autopar for every loop.(The loops that can not be paralleled will just fail/crash.) 2. Write test cases for the loops that can be parallelized. This will take a few discussions with Graphite developers to see which kind of loops we will should detect and can be auto-parallelized.(End June) 3. Write code for parallel code detection. This part of job will based on SCoP detection and data dependency, and at this time, data dependency analysis should have been done. This part of work will take most of the time.(First week of August) 4. Code cleaning and write documents.(Second week of August) * Profit for GCC - When the auto-parallelization has been done in Graphite, developer can mostly take their effort to loop transformations. Graphite will in charge of optimizations(generate parallelism) and the autopar code in Graphite will just detect and generate code for them.