Joon Hee Choi <[email protected]> writes:

>>Did you run in parallel?
>
> Yes. I tested my code with 2 nodes and 4 ppn (processors per node).

Then print what you are passing in and what you get back on each
process.  The PetscSplitOwnership code is short.
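
As a rough guide to what to expect, here is a minimal standalone sketch
(no MPI or PETSc needed) of the split that PetscSplitOwnership is
understood to compute when the local size is left as PETSC_DECIDE:
N/size rows per process, with the first N % size processes getting one
extra. The helper names split_ownership and print_ownership are made up
for illustration; in the real code you would print the actual arguments
and results on each rank.

```cpp
#include <cstdio>
#include <vector>

// Assumed rule: local size for `rank` is N/size, plus one extra row on
// the first (N % size) ranks. This mirrors the PETSC_DECIDE case only.
long split_ownership(long N, int size, int rank) {
    return N / size + ((N % size) > rank ? 1 : 0);
}

// Print the half-open row range [rstart, rend) of every simulated rank
// and return the range boundaries (size + 1 entries, starting at 0).
std::vector<long> print_ownership(long N, int size) {
    std::vector<long> starts(size + 1, 0);
    for (int rank = 0; rank < size; ++rank) {
        long n = split_ownership(N, size, rank);
        starts[rank + 1] = starts[rank] + n;
        std::printf("[%d] N=%ld local n=%ld rows [%ld, %ld)\n",
                    rank, N, n, starts[rank], starts[rank + 1]);
    }
    return starts;
}
```

Calling print_ownership(N, 8) with your global row count shows the
ranges to compare against what each of your 8 processes reports.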

>>> Also, tuples are the total pairs of (row dimension, column dimension,
>>> element), values of sparse matrix, which are read from a file. 
>
>>Using files this way is a massive bottleneck that you'll have to
>>eliminate if you want your code to be scalable.
>
> What do you mean a massive bottleneck? 

Reading from or writing to a file is at least 1000 times slower than
accessing memory, and it does not parallelize as well.

> Actually, I succeeded in setting up the large matrix using seqaij type
> and calculating products between huge matrices. However, I need to use
> multiple machines because I have to run loops of the product many
> times. And I am new to MPI.

I'm not sure what you're asking, but you should create the matrices in
memory, preferably in parallel and mostly where the entries will need to
be stored.

>>> The tuples are sorted and distributed. 
>
>>When you distribute, are you sure that each process really gets the
>>entire row, or would it be possible to cut in the middle of a row?
>
> I think each process gets the entire row. The following is the code for 
> getting tuples from text file:
>
>    ierr = PetscFOpen(PETSC_COMM_WORLD, inputFile, "r", &file); CHKERRQ(ierr);
>    while (fscanf(file, "%d %d %d", &ii, &jj, &vv) == 3)
>    {
>       tups.push_back(std::tr1::make_tuple (ii-1, jj-1, vv));
>       if (ii > I) I = ii;
>       if (jj > J) J = jj;
>    }
>    ierr = PetscFClose(PETSC_COMM_WORLD, file); CHKERRQ(ierr);

Nothing in this code snippet indicates that the row partition is
non-overlapping: it reads every tuple into one vector and never assigns
rows to processes.
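
One way to make the partition explicit is to filter the tuples by the
row range each process owns and then check the result. The sketch below
is a standalone illustration under assumptions: it uses std::tuple
(C++11) rather than std::tr1, the names Tup, select_owned and
rows_owned_locally are hypothetical, and rstart/rend stand for the
half-open ownership range of one process.

```cpp
#include <cstddef>
#include <tuple>
#include <vector>

typedef std::tuple<int, int, int> Tup;  // (row, column, value), 0-based

// Keep only the tuples whose row lies in [rstart, rend), so a row can
// never be cut between two processes.
std::vector<Tup> select_owned(const std::vector<Tup>& all,
                              int rstart, int rend) {
    std::vector<Tup> local;
    for (std::size_t k = 0; k < all.size(); ++k) {
        int row = std::get<0>(all[k]);
        if (row >= rstart && row < rend) local.push_back(all[k]);
    }
    return local;
}

// Return true if every tuple in `local` has its row in [rstart, rend).
bool rows_owned_locally(const std::vector<Tup>& local,
                        int rstart, int rend) {
    for (std::size_t k = 0; k < local.size(); ++k) {
        int row = std::get<0>(local[k]);
        if (row < rstart || row >= rend) return false;
    }
    return true;
}
```

Running rows_owned_locally on the tuples each process actually holds,
with that process's own range, is a cheap test of whether the
distribution really keeps whole rows together.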
