PETSc Developers/Users,

I am having some performance issues with preallocation in a fully unstructured FE code. It would be very helpful if those using FE codes could comment.
For a problem with 100K nodes and 600K tet elements (on 1 CPU):

1. If I calculate the _exact_ number of nonzeros per row (using a running list in Fortran) by looping over nodes and elements, the code takes 17 mins (to calculate the nnz per row, assemble, and solve).
2. If I don't use a running list and simply take the average of the maximum number of nodes a node might be connected to (again by looping over nodes and elements, but without a running list), it takes 8 mins.
3. If I just magically guess the right value calculated in 2 and use that as the average nnz per row, it only takes 25 secs.

Basically, in all cases assembly and solve are very fast (a few seconds), but the nnz calculation itself (in 1 and 2) takes a long time. How can this be cut down? Is there a heuristic way to estimate the number (as done in 3), even if it slightly overestimates the nnz per row, or are there efficient ways to do step 1 or 2? Right now I have

do i=1,num_nodes; do j=1,num_elements ...

which is obviously slow for a large number of nodes/elements.

Thanks in advance,
Tabrez
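To make the question concrete, here is a rough sketch of the kind of counting that would avoid the nested node/element loop by looping over elements only. It is written in C rather than Fortran and is not taken from the code described above. It assumes linear tets (4 nodes per element), one unknown per node, 0-based node numbers, and a flat connectivity array conn of length 4*num_elems. The names nnz_upper_bound, nnz_exact, conn, and NPE are hypothetical. The first routine gives a cheap overestimate in one pass; the second gives the exact count by first building a node-to-element list and then counting distinct neighbours with a marker array.

```c
#include <stdlib.h>

#define NPE 4 /* nodes per element; assumes linear tetrahedra */

/* Cheap overestimate: every element touching node i contributes at most
   NPE-1 other nodes, so nnz[i] <= 1 + (NPE-1) * (elements touching i). */
void nnz_upper_bound(int num_nodes, int num_elems, const int *conn, int *nnz)
{
    for (int i = 0; i < num_nodes; i++) nnz[i] = 1;      /* diagonal entry */
    for (int e = 0; e < num_elems; e++)
        for (int a = 0; a < NPE; a++)
            nnz[conn[NPE * e + a]] += NPE - 1;           /* other nodes of e */
    for (int i = 0; i < num_nodes; i++)
        if (nnz[i] > num_nodes) nnz[i] = num_nodes;      /* row length cap */
}

/* Exact count of distinct nodes coupled to each node (including itself).
   Returns 0 on success, -1 if memory allocation fails. */
int nnz_exact(int num_nodes, int num_elems, const int *conn, int *nnz)
{
    int *start = calloc((size_t)num_nodes + 1, sizeof *start);
    int *elems = malloc(((size_t)NPE * num_elems + 1) * sizeof *elems);
    int *mark  = malloc(((size_t)num_nodes + 1) * sizeof *mark);
    if (!start || !elems || !mark) {
        free(start); free(elems); free(mark);
        return -1;
    }

    /* Pass 1: count elements per node, then prefix-sum into list offsets. */
    for (int k = 0; k < NPE * num_elems; k++) start[conn[k] + 1]++;
    for (int i = 0; i < num_nodes; i++) start[i + 1] += start[i];

    /* Pass 2: fill the node -> element lists (mark[] is the insert cursor). */
    for (int i = 0; i < num_nodes; i++) mark[i] = start[i];
    for (int e = 0; e < num_elems; e++)
        for (int a = 0; a < NPE; a++)
            elems[mark[conn[NPE * e + a]]++] = e;

    /* Pass 3: mark[j] holds the last row that counted node j, so each
       neighbour is counted once per row without any searching. */
    for (int i = 0; i < num_nodes; i++) mark[i] = -1;
    for (int i = 0; i < num_nodes; i++) {
        int count = 1;                                   /* diagonal entry */
        mark[i] = i;
        for (int p = start[i]; p < start[i + 1]; p++)
            for (int a = 0; a < NPE; a++) {
                int j = conn[NPE * elems[p] + a];
                if (mark[j] != i) { mark[j] = i; count++; }
            }
        nnz[i] = count;
    }

    free(start); free(elems); free(mark);
    return 0;
}
```

Both routines do work proportional to the number of elements rather than num_nodes*num_elements. The resulting nnz array is the sort of per-row count that MatSeqAIJSetPreallocation accepts. With several unknowns per node, each count would need to be scaled by the number of unknowns and repeated for each of that node's rows.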
