Re: [OMPI users] SM btl slows down bandwidth?

2008-08-13 Thread Ron Brightwell
> [...]
> 
> MPICH2 manages to get about 5GB/s in shared memory performance on the
> Xeon 5420 system.

Does the sm btl use a memcpy with non-temporal stores like MPICH2?
This can be a big win for bandwidth benchmarks that don't actually
touch their receive buffers at all...
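For reference, the technique looks roughly like this -- a minimal
sketch using SSE2 intrinsics, not MPICH2's actual routine, and assuming
16-byte-aligned buffers and a length that's a multiple of 16:

  #include <emmintrin.h>
  #include <stddef.h>

  /* Copy with non-temporal (streaming) stores: the destination
   * bypasses the cache, so the copy never evicts the benchmark's
   * working set -- and nothing gets warmed in cache for a receiver
   * that never reads it. */
  static void memcpy_nt(void *dst, const void *src, size_t n)
  {
      __m128i       *d = (__m128i *) dst;
      const __m128i *s = (const __m128i *) src;
      size_t i;

      for (i = 0; i < n / 16; i++) {
          __m128i v = _mm_load_si128(s + i);  /* normal (cached) load */
          _mm_stream_si128(d + i, v);         /* non-temporal store   */
      }
      _mm_sfence();  /* make the streaming stores globally visible */
  }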

-Ron




Re: [OMPI users] openmpi credits for eager messages

2008-02-04 Thread Ron Brightwell

> Is what George says accurate? If so, it sounds to me like OpenMPI
> does not comply with the MPI standard on the behavior of eager
> protocol. MPICH is getting dinged in this discussion because they
> have complied with the requirements of the MPI standard. IBM MPI
> also complies with the standard.
> 
> If there is any debate about whether the MPI standard does (or
> should) require the behavior I describe below then we should move
> the discussion to the MPI 2.1 Forum and get a clarification.
> [...]

The MPI Standard also says the following about resource limitations:

  Any pending communication operation consumes system resources that are
  limited. Errors may occur when lack of resources prevent the execution
  of an MPI call. A quality implementation will use a (small) fixed amount
  of resources for each pending send in the ready or synchronous mode and
  for each pending receive. However, buffer space may be consumed to store
  messages sent in standard mode, and must be consumed to store messages
  sent in buffered mode, when no matching receive is available. The amount
  of space available for buffering will be much smaller than program data
  memory on many systems. Then, it will be easy to write programs that
  overrun available buffer space.
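For concreteness, here is a minimal sketch (my illustration, not text
from the standard) of the kind of program that paragraph is warning
about: every rank floods rank 0 with standard-mode sends long before
rank 0 posts any receives, so each eagerly-delivered message must be
buffered at the receiver.

  #include <mpi.h>
  #include <unistd.h>

  #define NMSG 1000

  int main(int argc, char **argv)
  {
      int rank, size, i, buf = 0;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      if (rank != 0) {
          /* NMSG * (size - 1) messages arrive unexpected at rank 0 */
          for (i = 0; i < NMSG; i++)
              MPI_Send(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
      } else {
          sleep(30);  /* rank 0 is busy; nothing has been received yet */
          for (i = 0; i < NMSG * (size - 1); i++)
              MPI_Recv(&buf, 1, MPI_INT, MPI_ANY_SOURCE, 0,
                       MPI_COMM_WORLD, MPI_STATUS_IGNORE);
      }

      MPI_Finalize();
      return 0;
  }

Whether this program survives depends entirely on how much eager
buffering the implementation provides per peer -- and at tens of
thousands of processes, any fixed per-peer amount adds up.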

Since I work on MPI implementations that are expected to allow applications
to scale to tens of thousands of processes, I don't want the overhead of
a user-level flow control protocol that penalizes scalable applications in
favor of non-scalable ones.  Nor do I want an MPI implementation that hides
such non-scalable application behavior -- I want one that exposes it at
lower processor counts, preferably in a way that makes application
developers aware of the resource requirements of their code and lets them
make the appropriate choices about its structure, the underlying protocols,
and the amount of buffer resources.

But I work in a place where codes are expected to scale and not just work.
Most of the vendors aren't allowed to have this perspective.

-Ron




[OMPI users] Cluster'07 - Early registration ending

2007-08-30 Thread Ron Brightwell

 Early Registration Ends on Friday, 8/31 

_________________________________________________________________________

   2007 IEEE International Conference on Cluster Computing (Cluster 2007)

   September 17-20, 2007 
   Austin, Texas
Omni Austin Hotel Downtown
http://www.cluster2007.org
_________________________________________________________________________

We cordially invite you to attend Cluster 2007, an open forum where the
challenges, technologies, and innovations in all areas of cluster computing
will be presented and discussed. This year continues the tradition of a strong
technical program punctuated by two keynote talks from luminaries in the field.

Early registration is open now through August 31!

Register online at http://www.cluster2007.org; click on the "Registration" link.

Highlights of this year's conference include:

  - Keynotes:
    - Andreas Bechtolsheim, Sun Microsystems
    - Dr. Mark Seager, Lawrence Livermore National Laboratory
  - 43 technical paper presentations
  - Best Papers session
  - 4 tutorial sessions
  - 1 "Green" workshop
  - Poster session
  - HeteroPar'07 workshop
  - Panel session: "The Future of Multicore Technology"
  - Complimentary social event and live music 

This year's program includes papers on the following topics:

  - Parallel I/O, File Systems & File Management
  - Resource Management
  - Application & Program Paradigms
  - MPI and Networking
  - Power/Thermal Management
  - Scaling in HPC
  - Grid Computing and Grid Clusters

The Sixth International Workshop on Algorithms, Models and Tools for Parallel
Computing on Heterogeneous Networks (HeteroPar'07) will be held in conjunction
with Cluster 2007.

We look forward to your participation! For complete details, please visit:

  http://www.cluster2007.org




[OMPI users] Cluster'07 Call for Participation

2007-08-10 Thread Ron Brightwell

Early Registration Now Open!

_________________________________________________________________________

   2007 IEEE International Conference on Cluster Computing (Cluster 2007)

   September 17-20, 2007 
   Austin, Texas
Omni Austin Hotel Downtown
http://www.cluster2007.org
_________________________________________________________________________

We cordially invite you to attend Cluster 2007, an open forum where the
challenges, technologies, and innovations in all areas of cluster computing
will be presented and discussed. This year continues the tradition of a strong
technical program punctuated by two keynote talks from luminaries in the field.

Early registration is open now through August 24!

Register online at http://www.cluster2007.org; click on the "Registration" link.

Highlights of this year's conference include:

  - Keynotes:
    - Andreas Bechtolsheim, Sun Microsystems
    - Dr. Mark Seager, Lawrence Livermore National Laboratory
  - 43 technical paper presentations
  - Best Papers session
  - 4 tutorial sessions
  - 1 "Green" workshop
  - Poster session
  - HeteroPar'07 workshop
  - Complimentary social event and live music 

This year's program includes papers on the following topics:

  - Parallel I/O, File Systems & File Management
  - Resource Management
  - Application & Program Paradigms
  - MPI and Networking
  - Power/Thermal Management
  - Scaling in HPC
  - Grid Computing and Grid Clusters

The Sixth International Workshop on Algorithms, Models and Tools for Parallel
Computing on Heterogeneous Networks (HeteroPar'07) will be held in conjunction
with Cluster 2007.

We look forward to your participation! For complete details, please visit:

  http://www.cluster2007.org




[OMPI users] CFP: 2007 IEEE International Conference on Cluster Computing (Cluster2007)

2007-03-20 Thread Ron Brightwell
inute papers: 13 Jul 2007


Organization:

General Chair
  Karl W. Schulz, University of Texas, USA
Program Chair
  Kent Milfeld, University of Texas, USA
Program Vice Chairs
  Toni Cortes, Barcelona Supercomputing Center, Spain
  Barney Maccabe, University of New Mexico, USA
  Mitsuhisa Sato, University of Tsukuba, Japan
Steering Committee Liaison
  Daniel S. Katz, Louisiana State University, USA
Tutorial Chair
  Ira Pramanick, Sun Microsystems, USA
Workshop Chair
  Dan Stanzione, Arizona State University, USA
Poster Chair
  Henry Tufo, National Center for Atmospheric Research, USA
Publicity Chair
  Ron Brightwell, Sandia National Labs, USA
Publication Chair
  Marcin Paprzycki, Warsaw School of Social Psychology, Poland
Exhibit/Sponsors Chair
  Ivan Judson, Argonne National Lab, USA
Finance Chair
  Janet McCord, University of Texas, USA
Local Arrangements Chair
  Faith Singer-Villalobos, University of Texas, USA




Re: [O-MPI users] direct openib btl and latency

2006-02-09 Thread Ron Brightwell
> [...]
> 
> From an adoption perspective, though, the ability to shine in
> micro-benchmarks is important, even if it means using an ad-hoc tuning.
> There is some justification for it after all. There are small clusters
> out there (many more than big ones, in fact) so taking maximum advantage
> of a small scale is relevant.

I'm obliged to point out that you jumped to a conclusion -- possibly true
in some cases, but not always.

You assumed that a performance increase for a two-node micro-benchmark
would result in an application performance increase for a small cluster.
Using RDMA for short messages is the default on small clusters *because*
of the two-node micro-benchmark, not because the cluster is small.
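
The scaling problem with that default is easy to quantify: RDMA for
short messages typically means dedicating receive buffers to every
peer, so the memory footprint grows linearly with job size.  A
back-of-the-envelope sketch (the buffer count and size below are
illustrative assumptions, not any implementation's actual defaults):

  #include <stdio.h>

  int main(void)
  {
      const double buf_kb   = 8.0;  /* assumed size of one RDMA buffer (KB) */
      const int    per_peer = 16;   /* assumed buffers dedicated per peer   */
      int n;

      /* per-rank memory consumed by short-message RDMA buffers */
      for (n = 16; n <= 16384; n *= 8)
          printf("%6d ranks: %8.1f MB per rank\n",
                 n, (n - 1) * per_peer * buf_kb / 1024.0);
      return 0;
  }

Under those assumptions it's about 2 MB per rank at 16 ranks and about
2 GB per rank at 16384 -- which is why a two-node result says nothing
about scalability.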

I've seen plenty of cases where doing the scalable thing, rather than
the thing optimized for micro-benchmarks, leads to increases in
application performance even at a small scale.

-Ron