Hi all,

Here is my second evaluation report, together with a simple program that
I was able to compile with my parallel version of GCC. Keep in mind that
I still have lots of concurrent issues inside the compiler and therefore
my branch will fail to compile pretty much anything else.

To reproduce my current branch, use the following steps:

1-) Clone https://gitlab.com/flusp/gcc/tree/giulianob_parallel

2-) Edit gcc/graphunit.c's variable `num_threads` to 1.

3-) Compile with --disable-bootstrap --enable-languages=c

4-) make

5-) Edit gcc/graphunit.c's variable `num_threads` to 2, for instance.

6-) make install DESTDIR="somewhere_else_that_doesnt_break_your_gcc"

7-) compile the program using -O2

I a attaching my report in markdown format, which you can convert to pdf
using `pandoc` if you find it difficult to read in the current format.

I am also open to suggestions. Please do not hesitate to comment :)

Thank you,
Giuliano.
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <time.h>
#include <stdint.h>

#include <time.h>
#include <sys/time.h>

#define MAX 1000

double
//__attribute__ ((noinline, optimize("-O0")))
sinatan2(double x, double y)
{
    return sin(atan2(x, y));
}

#define EPS 1e-160
#define HUGE 1e150
#define INF 1/0.


double
//__attribute__ ((noinline))
sinatan2_(double x, double y)
{
   static double hue; 
    //if (!(*((uint64_t*) &x) & 0x7FFFFFFFFFFFFFFFULL))
    if (fpclassify(x) == FP_ZERO)
        return x;
    else
    {
        if (y == 0 || x > INF || x < -INF)
            return copysign(1., x);
        else
        {
            double x_abs = fabs(x);
            double y_abs = fabs(y);
            if (x_abs < EPS || y_abs < EPS)
            {
                double alpha = 1/EPS; 
                x = alpha*x;
                y = alpha*y;
            }
            if (x_abs > HUGE || y_abs > HUGE)
            {
                double alpha = 1/HUGE;
                x = alpha*x;
                y = alpha*y;
            }
        }
        printf("x = %le, sqrt = %le\n", x, sqrt(x*x + y*y));
        return x / sqrt(x*x + y*y);
    }
}

static void generate_random_array(int n, double a[])
{
    int i;

    for (i = 0; i < n; ++i)
        a[i] = (double) rand() / (double) RAND_MAX;
}

int
main(int argc, char* argv[])
{
    //struct timespec start, end;
    unsigned long long delta_t;

   
    volatile double a;
    int i, j;

    double x, y, z1, z2;

    if (argc != 3)
        return 1;

    x = atof(argv[1]);
    y = atof(argv[2]);

    z1 = sinatan2(x, y);
    z2 = sinatan2_(x, y);

    printf("sin(atan2(x, y))  = %le\n", z1);
    printf("x / sqrt(x² + y²) = %le\n", z2);
/*
    clock_gettime(CLOCK_MONOTONIC_RAW, &start);
    clock_gettime(CLOCK_MONOTONIC_RAW, &end);
    
    clock_gettime(CLOCK_MONOTONIC_RAW, &start);
    for (i = 0; i < MAX; ++i)
    {
        for (j = 0; j < MAX; ++j)
        {
            a = sinatan2_(A[i], A[j]);
        }
    }
    clock_gettime(CLOCK_MONOTONIC_RAW, &end);
    delta_t = end.tv_nsec - start.tv_nsec;
    printf("time x/sqrt(x*x + y*y) = %llu\n", delta_t);

    clock_gettime(CLOCK_MONOTONIC_RAW, &start);
    for (i = 0; i < MAX; ++i)
    {
        for (j = 0; j < MAX; ++j)
        {
            a = sinatan2(A[i], A[j]);
        }
    }
    clock_gettime(CLOCK_MONOTONIC_RAW, &end);
    delta_t = (end.tv_nsec - start.tv_nsec);
    printf("time sin(atan2(x, y))  = %llu\n", delta_t);    
*/

    return 0;
}
# Parallelize GCC with Threads: Second Evaluation Report


## Tasks accomplished until the Second Evaluation:

These are the tasks which I accomplished until now:

  - Keep on mapping the GIMPLE global variables.

  - Compile a small program in parallel.

Although I aimed to have a correct version of the parallel GCC by this
evaluation, I couldn't archive this result yet.

Here I explain some technical problems that I faced during this phase.

## Chapter 1 -- Global Variables

In this chapter, I discuss some global variables
that I found while debugging my parallel prototype.

By using the "algorithm" that I stated in the first evaluation and sometimes
using helgrind, or aided by gdb with the crash point, I was able to find some
more global variables. See the table in Appendex for more details. There is
still lots of variables that I have not reached yet, and some that fell through
the cracks that I only find references to these after debugging the crash.

## Chapter 2 -- Garbage Collector (GGC)

I inserted a single mutex in GGC to ensure that memory allocation is done
sequentially, therefore avoiding me to rewrite the GGC with threading support
for now. Of course, this will require a better strategy in the future, as doing
memory allocation sequentially might impact performance negatively.

## Chapter 3 -- Object Pools

Some passes use object pools to store objects, avoiding to allocate and
deallocate memory frequently. My previous approach was to have a threadsafe
memory pool that locked a mutex each time a new object is requested or released
to this pool. This turned out to be a bad approach, as (1) it serialized the
requests to this pool, and (2) there were some race conditions that released all
objects of the pool while another thread was still holding pointers to objects
of this pool, causing memory faults.

My current approach is simpler but required some code refactoring. For each
thread, at ::execute time, I allocate a new memory pool, and try to release it
after the pass is done. When it isn't possible, I initialize the thread pool
pointer with NULL, and lazily initialize it. Unfortunately, this last case
causes a memory leak, but it is under control.

## Chapter 4 -- Memory Pools

In memory-block.h and memory-pool.cc, we have a memory pool which is used by
object pools. This again had some race conditions and required locking when
allocating and deallocating objects. The trivial solution was to use a mutex,
however, this drove the compiler to be unusably slow. Therefore I used a
distributed approach: instead of having a singleton memory pool for the entire
program, now we have a singleton memory pool for each thread. If necessary,
then I will implement a merge feature for distinct memory pools.

## Chapter 5 -- Helgrind

After the commit adding the per-thread memory pool, helgrind always crashes cc1
when analyzing GCC. There is no error message other than SEGFAULT, and attaching
GDB to it wasn't helpful.

## Chapter 6 -- Compiling Program

I am attaching a small program which I managed
to get it working on my branch. I have not measured speedups yet, and I won't
be surprised if there is some part of the code generated by the compiler is
wrong.

## Appendex -- Global Variable Table


|       file name            |                 variable                      |
|----------------------------|-----------------------------------------------|
| bitmap.c                   | bitmap_default_obstack                        |
| bitmap.c                   | bitmap_default_obstack_depth                  |
|                            |                                               |
| gimple-match-head.c        | depth (several `static` occurences            |
|                            |                                               |
| tree-ssa-sccvn.c           | last_vuse_ptr                                 |
| tree-ssa-sccvn.c           | vn_walk_kind                                  |
| tree-ssa-sccvn.c           | default_vn_walk_kind                          |
| tree-ssa-sccvn.c           | constant_to_vaule_id                          |
| tree-ssa-sccvn.c           | constant_value_ids                            |
| tree-ssa-sccvn.c           | vn_tables_obstack                             |
| tree-ssa-sccvn.c           | vn_tables_insert_obstack                      |
| tree-ssa-sccvn.c           | shared_lookup_references                      |
| tree-ssa-sccvn.c           | last_inserted_ref                             |
| tree-ssa-sccvn.c           | last_inserted_nary                            |
| tree-ssa-sccvn.c           | last_inserted_phi                             |
| tree-ssa-sccvn.c           | valid_info                                    |
| tree-ssa-sccvn.c           | next_value_id                                 |
| tree-ssa-sccvn.c           | vn_ssa_aux_hash                               |
| tree-ssa-sccvn.c           | vn_ssa_aux_obstack                            |
| tree-ssa-sccvn.c           | rpo_avail                                     |
| tree-ssa-sccvn.c           | vn_context_bb                                 |
| tree-ssa-sccvn.c           | vn_valueize                                   |
|                            |                                               |
| tree-ssa-loop-ivcannon.c   | loops_to_unloop                               |
| tree-ssa-loop-ivcannon.c   | loops_to_unloop_nunroll                       |
| tree-ssa-loop-ivcannon.c   | edges_to_remove                               |
| tree-ssa-loop-ivcannon.c   | peeled_loops                                  |
|                            |                                               |
| tree-scalar-evolution.c    | nb_set_scev                                   |
| tree-scalar-evolution.c    | nb_get_scev                                   |
| tree-scalar-evolution.c    | chrec_not_analyzed_yet                        |
|                            |                                               |
| tree-scalar-evolution.c    | scalar_evolution_info                         |
|                            |                                               |
| tree-ssa-threadedge.c      | stmt_count                                    |
| tree-ssa-threadedge.c      | ssa_name_values                               |
|                            |                                               |
| tree-vrp.c                 | live                                          |
| tree-vrp.c                 | need_assert_for                               |
| tree-vrp.c                 | asserts_for                                   |
|                            |                                               |
| tree-ssa-copy.c            | copy_of                                       |
| tree-ssa-copy.c            | n_copy_of                                     |
|                            |                                               |
| tree-ssa-dce.c             | worklist                                      |
| tree-ssa-dce.c             | processed                                     |
| tree-ssa-dce.c             | last_stmt_necessary                           |
| tree-ssa-dce.c             | bb_contains_live_stmts                        |
| tree-ssa-dce.c             | cd                                            |
| tree-ssa-dce.c             | visited_control_parents                       |
| tree-ssa-dce.c             | cfg_altered                                   |
| tree-ssa-dce.c             | bb_postorder                                  |
| tree-ssa-dce.c             | visited                                       |
| tree-ssa-dce.c             | longest_chain                                 |
| tree-ssa-dce.c             | total_chain                                   |
| tree-ssa-dce.c             | nr_walks                                      |
| tree-ssa-dce.c             | chain_ovfl                                    |
|                            |                                               |
| tree-ssa-dom.c             | cfg_altered                                   |
| tree-ssa-dom.c             | need_eh_cleanup                               |
| tree-ssa-dom.c             | need_noreturn_fixup                           |
| tree-ssa-dom.c             | x_vr_values                                   |
| tree-ssa-dom.c             | opt_stats                                     |
|                            |                                               |
| tree-sra.c                 | access_pool                                   |
| tree-sra.c                 | assign_link_pool                              |
| tree-sra.c                 | base_access_vec                               |
| tree-sra.c                 | candidate_bitmap                              |
| tree-sra.c                 | candidates                                    |
| tree-sra.c                 | should_scalarize_away_bitmap                  |
| tree-sra.c                 | cannot_scalarize_away_bitmap                  |
| tree-sra.c                 | disqualified_constants                        |
| tree-sra.c                 | name_obstacks                                 |
| tree-sra.c                 | work_queue_head                               |
| tree-sra.c                 | func_param_count                              |
| tree-sra.c                 | encountered_apply_args                        |
| tree-sra.c                 | encountered_recursive_call                    |
| tree-sra.c                 | encountered_unchargable_recursive_call        |
| tree-sra.c                 | bb_dereferences                               |
| tree-sra.c                 | final_bbs                                     |
|                            |                                               |
| cfg.c                      | bb_original                                   |
| cfg.c                      | bb_copy                                       |
| cfg.c                      | loop_copy                                     |
| cfg.c                      | original_copy_bb_pool                         |
| cfg.c                      | block_aux_obstack                             |
| cfg.c                      | first_block_aux_obj                           |
| cfg.c                      | edge_aux_obstack                              |
| cfg.c                      | first_edge_aux_obj                            |
| cfg.c                      | initialized (several entries in some function)|
|                            |                                               |
| gimple-ssa-isolate-paths.c | cfg_altered                                   |
|                            |                                               |
| tree-ssa-dse.c             | need_eh_cleanup                               |
| tree-ssa-dse.c             | x_vr_values                                   |
|                            |                                               |
| domwalk.c                  | bb_postorder                                  |
|                            |                                               |
| tree-ssa-reasso.c          | reassoc_insert_powi_p                         |
| tree-ssa-reasso.c          | operand_entry_pool                            |
| tree-ssa-reasso.c          | next_operand_entry_id;                        |
| tree-ssa-reasso.c          | bb_rank                                       |
| tree-ssa-reasso.c          | operand_rank                                  |
| tree-ssa-reasso.c          | reassoc_branch_fixups                         |
| tree-ssa-reasso.c          | plus_negates                                  |
| tree-ssa-reasso.c          | cvec                                          |
| tree-ssa-reasso.c          | repeat_factor_vec                             |
|                            |                                               |
| tree-ssa.c                 | edge_var_maps                                 |
|                            |                                               |
| tree-into-ssa.c            | block_defs_stack                              |
| tree-into-ssa.c            | old_ssa_names                                 |
| tree-into-ssa.c            | new_ssa_names                                 |
| tree-into-ssa.c            | interesting_blocks                            |
| tree-into-ssa.c            | names_to_release                              |
| tree-into-ssa.c            | phis_to_rewrite                               |
| tree-into-ssa.c            | blocks_with_phis_to_rewrite                   |
| tree-into-ssa.c            | update_ssa_initialized_fn                     |
| tree-into-ssa.c            | var_infos                                     |
| tree-into-ssa.c            | info_for_ssa_name                             |
| tree-into-ssa.c            | current_info_for_ssa_name_age                 |
| tree-into-ssa.c            | update_ssa_obstack                            |
| tree-into-ssa.c            | blocks_to_update                              |
| tree-into-ssa.c            | symbols_to_rename_set                         |
| tree-into-ssa.c            | symbols_to_rename                             |
| tree-into-ssa.c            | names_to_release                              |
|                            |                                               |
| tree-ssa-strlen.c          | strinfo_pool                                  |
| tree-ssa-strlen.c          | stridx_to_strinfo                             |
| tree-ssa-strlen.c          | decl_to_strdxlist_htab                        |
| tree-ssa-strlen.c          | strlen_to_stridx                              |
| tree-ssa-strlen.c          | stridx_obstack                                |
|                            |                                               |
| tree-ssa-threadupdate.c    | removed_edges                                 |
| tree-ssa-threadupdate.c    | paths                                         |
| tree-ssa-threadupdate.c    | thread_stats                                  |
| tree-ssa-threadupdate.c    | redirection_data                              |
| tree-ssa-threadupdate.c    | dbds_ce_stop                                  |
|                            |                                               |
| tree-object-size.c         | object_sizes                                  |
| tree-object-size.c         | computed                                      |
| tree-object-size.c         | offset_limit                                  |
|                            |                                               |
| tree-ssa-structalias.c     | fake_var_decl_obstack                         |
| tree-ssa-structalias.c     | variable_info_pool                            |
| tree-ssa-structalias.c     | constraint_pool                               |
|                            |                                               |
| gimple-ssa-strength-red... | cand_vec                                      |
| gimple-ssa-strength-red... | stmt_cand_map                                 |
| gimple-ssa-strength-red... | cand_obstack                                  |
| gimple-ssa-strength-red... | chain_obstack                                 |
| gimple-ssa-strength-red... | incr_vec                                      |
| gimple-ssa-strength-red... | incr_vec_len                                  |
| gimple-ssa-strength-red... | address_arithmetic_p                          |
| gimple-ssa-strength-red... | base_cand_map                                 |
| gimple-ssa-strength-red... | name_expansions                               |
| gimple-ssa-strength-red... | alt_base_map                                  |
|                            |                                               |
| tree-ssa-math-opts.c       | reciprocal_stats                              |
| tree-ssa-math-opts.c       | occ_head                                      |
| tree-ssa-math-opts.c       | occ_pool                                      |
|                            |                                               |
| tree-ssa-uncprop.c         | val_ssa_equiv                                 |
|                            |                                               |
| tree-ssa-pre.c             | next_expression_id                            |
| tree-ssa-pre.c             | expressions                                   |
| tree-ssa-pre.c             | name_to_id                                    |
| tree-ssa-pre.c             | do_partial_partial                            |
| tree-ssa-pre.c             | phi_translate_table                           |
| tree-ssa-pre.c             | inserted_exprs                                |
| tree-ssa-pre.c             | value_expressions                             |
| tree-ssa-pre.c             | bitmap_set_pool                               |
| tree-ssa-pre.c             | grand_bitmap_obstack                          |
|                            |                                               |
| tree-tailcall.c            | m_acc                                         |
| tree-tailcall.c            | a_acc                                         |
| tree-tailcall.c            | live_vars                                     |
| tree-tailcall.c            | live_vars_vec                                 |
|                            |                                               |
| gimple-ssa-sprintf.c       | warn_lev                                      |
| gimple-ssa-sprintf.c       | lsm_tmp_name                                  |
| gimple-ssa-sprintf.c       | lsm_tmp_name_length                           |
|                            |                                               |
| tree-vect-generic.c        | vector_inner_type                             |
| tree-vect-generic.c        | vector_last_type                              |
| tree-vect-generic.c        | vector_last_nunits                            |
|                            |                                               |
| passes.c                   | in_gimple_form                                |
|                            |                                               |
| statistics.c               | statistics_hashes                             |
| statistics.c               | nr_statistics_hashes                          |
|                            |                                               |
| tree-cfg.c                 | eh_error_found                                |
|                            |                                               |
| tree-ssa-loop-im.c         | lim_aux_data_map                              |
| tree-ssa-loop-im.c         | mem_ref_obstack                               |
|                            |                                               |
| cfgloop.c                  | mfb_reis_set                                  |
|                            |                                               |
| tree-ssa-tailmerge.c       | same_succ_htab                                |
| tree-ssa-tailmerge.c       | same_succ_edge_flags                          |
| tree-ssa-tailmerge.c       | deleted_bbs                                   |
| tree-ssa-tailmerge.c       | deleted_bb_preds                              |
| tree-ssa-tailmerge.c       | worklist                                      |
| tree-ssa-tailmerge.c       | all_clusters                                  |
| tree-ssa-tailmerge.c       | update_bbs                                    |
| tree-ssa-tailmerge.c       | ter_bitmap_obstack                            |
|                            |                                               |

Reply via email to