On 1/11/2015 12:47 AM, Matthew Knepley wrote:
On Sat, Oct 31, 2015 at 11:34 AM, TAY wee-beng <[email protected]> wrote:

    Hi,

    I understand that, as mentioned in the FAQ, the scaling is not
    linear due to the limitations in memory. So I am trying to write a
    proposal to use a supercomputer.

    Its specs are:

    Compute nodes: 82,944 nodes (SPARC64 VIIIfx; 16GB of memory per node)

    8 cores / processor

    Interconnect: Tofu (6-dimensional mesh/torus)

    Each cabinet contains 96 compute nodes.

    One of the requirements is to report the performance of my current
    code with my current set of data, and there is a formula to
    calculate the estimated parallel efficiency when using the new,
    larger set of data.

    There are 2 ways to give performance:
    1. Strong scaling, which is defined as how the elapsed time varies
    with the number of processors for a fixed total problem size.
    2. Weak scaling, which is defined as how the elapsed time varies
    with the number of processors for a fixed problem size per
    processor.

    I ran my cases with 48 and 96 cores on my current cluster, giving
    elapsed times of 140 and 90 minutes respectively. This is classified
    as strong scaling.
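
    As a rough sanity check (using only the textbook strong-scaling
    definitions, not the application form's own formula), the speedup
    and relative parallel efficiency from these two runs work out as in
    the short Python sketch below:

        # Rough check of the two measured runs (48 and 96 cores), using
        # the textbook strong-scaling definitions only; the application
        # form's exact formula may differ.
        n1, t1 = 48, 140.0   # cores, elapsed minutes
        n2, t2 = 96, 90.0

        speedup = t1 / t2                         # ~1.56x from 48 to 96 cores
        ideal_speedup = n2 / n1                   # 2x if scaling were perfect
        rel_efficiency = speedup / ideal_speedup  # ~0.78, i.e. ~78%

        print(f"speedup 48 -> 96 cores : {speedup:.2f}x")
        print(f"relative efficiency    : {rel_efficiency:.1%}")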

    Cluster specs:

    CPU: AMD 6234 2.4GHz

    8 cores / processor (CPU)

    6 CPU / node

    So 48 cores / node

    Not sure about the memory per node


    The parallel efficiency ‘En’ for a given degree of parallelism ‘n’
    indicates how efficiently the program is accelerated by parallel
    processing. ‘En’ is given by the following formulae. Although the
    derivations differ between strong and weak scaling, the derived
    formulae are the same.
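
    The formula itself did not survive in this quote, so the Python
    sketch below assumes a common Amdahl-style form (which may not be
    exactly what the application form uses): fit the parallel fraction
    p from two timings, then evaluate En = 1 / (n*(1 - p) + p).

        # Assumed Amdahl-style estimate (may differ from the application
        # form's actual formula): fit the parallel fraction p from the
        # 48- and 96-core timings, then evaluate En = 1 / (n*(1-p) + p).
        n1, t1 = 48, 140.0   # cores, elapsed minutes
        n2, t2 = 96, 90.0

        # Amdahl: t(n) is proportional to (1 - p) + p/n, so solving
        # t1/t2 = ((1-p) + p/n1) / ((1-p) + p/n2) for p gives:
        r = t1 / t2
        p = (1.0 - r) / (r / n2 - 1.0 / n1 + 1.0 - r)

        def efficiency(n, p):
            """Amdahl efficiency En = S(n)/n, with S(n) = 1/((1-p) + p/n)."""
            return 1.0 / (n * (1.0 - p) + p)

        # Gives p ~ 0.992 and En ~ 56% at 96 cores -- in the same ballpark
        # as, but not necessarily identical to, the 52.7% quoted below.
        print(f"fitted parallel fraction p : {p:.4f}")
        print(f"En at 96 cores             : {efficiency(96, p):.1%}")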

    From the estimated time, my parallel efficiency using Amdahl's law
    on the current (old) cluster was 52.7%.

    So are my results acceptable?

    For the large data set, if using 2205 nodes (2205 x 8 = 17,640
    cores), my expected parallel efficiency is only 0.5%. The proposal
    recommends a value of > 50%.
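
    To illustrate where such a small number can come from: extrapolating
    the assumed Amdahl-style fit sketched above (p of roughly 0.992 from
    the 48/96-core runs) to 17,640 cores already gives an efficiency
    well below 1%, in the same ballpark as the 0.5% figure, although the
    proposal's own formula is not reproduced here.

        # Extrapolate the assumed Amdahl-style fit (p ~ 0.992 from the
        # 48/96-core runs) to 2205 nodes x 8 cores = 17,640 cores.
        # Illustrative only; the proposal's own formula quotes 0.5%.
        p = 0.9917           # parallel fraction fitted in the sketch above
        n = 2205 * 8         # 17,640 cores

        efficiency = 1.0 / (n * (1.0 - p) + p)
        print(f"predicted efficiency at {n} cores: {efficiency:.2%}")  # well under 1%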

The problem with this analysis is that the estimated serial fraction from Amdahl's Law changes as a function of problem size, so you cannot take the strong scaling from one problem and apply it to another without a
model of this dependence.

Weak scaling does model changes with problem size, so I would measure weak scaling on your current cluster, and extrapolate to the big machine. I realize that this does not make sense for many scientific
applications, but neither does requiring a certain parallel efficiency.
OK, I checked the results for my weak scaling, and the expected parallel efficiency is even worse. From the formula used, it is clearly applying some sort of compounding (roughly exponential) decrease in the extrapolation. So unless I can achieve nearly > 90% speedup when I double the cores and problem size on my current 48/96-core setup, extrapolating from my current 96 cores to more than 10,000 cores will give a much lower expected parallel efficiency for the new case.
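
As a rough illustration of that compounding effect (this is my reading of how the extrapolation behaves, not the proposal's actual formula): if each doubling of cores and problem size retains a fraction e of the efficiency, then going from 96 to 17,640 cores is about log2(17640/96) ~ 7.5 doublings, and the losses multiply, as in the sketch below.

    # Assumed compounding model (not the proposal's actual formula): if
    # each doubling of cores + problem size keeps a fraction e of the
    # efficiency, the loss compounds over all the doublings between the
    # current 96 cores and 17,640 cores.
    import math

    doublings = math.log2(17640 / 96)   # ~7.5 doublings
    for e in (0.95, 0.90, 0.80):
        overall = e ** doublings
        print(f"per-doubling efficiency {e:.0%} -> overall ~{overall:.0%}")

Even at 90% retained per doubling, the compounded figure lands around 45%, i.e. below the 50% recommendation, which is why the per-doubling efficiency would have to be very close to ideal.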

However, the FAQ mentions that, due to memory bandwidth limitations, it is impossible to get > 90% speedup when I double the cores and problem size (i.e. a linear increase in performance). Does that mean I cannot get > 90% speedup when doubling the cores and problem size on my current 48/96-core setup?

So is it fair to say that the main problem does not lie in my programming skills, but rather in the way the linear equations are solved?

Thanks.

  Thanks,

     Matt

    Is it possible to achieve this kind of scaling (> 50% parallel
    efficiency) with PETSc when using 17,640 (2205 x 8) cores?

    Btw, I do not have access to the system.

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
