Re: [OMPI users] SEGV in libopal during MPI_Alltoall

2006-07-20 Thread Frank Gruellich
Hi George,

George Bosilca wrote:
> It is what I suspected. You can see that the envio array is smaller than
> it should be. It was created as an array of doubles of size t_max, when
> it should have been created as an array of doubles of size t_max *
> nprocs.

Ah, yes, I see (and even understand).  Great, thank you very much, it
works now.

Kind regards,
-- 
Frank Gruellich
HPC-Techniker

Tel.:   +49 3722 528 42
Fax:    +49 3722 528 15
E-Mail: frank.gruell...@megware.com

MEGWARE Computer GmbH
Vertrieb und Service
Nordstrasse 19
09247 Chemnitz/Roehrsdorf
Germany
http://www.megware.com/


Re: [OMPI users] SEGV in libopal during MPI_Alltoall

2006-07-20 Thread George Bosilca
It is what I suspected. You can see that the envio array is smaller than 
it should be. It was created as an array of doubles of size t_max, when 
it should have been created as an array of doubles of size t_max * 
nprocs. If you look at how the recibe array is created, you can see that 
its size is t_max * nprocs (allocate(recibe(t_max*nproc))). As in the 
all-to-all operation everybody sends and receives exactly the same amount 
of data, the send and receive arrays should have the same size.
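
To put numbers on it (taking a run with 16 processes as an example): at the
last iteration tam reaches t_max = 2**23 bytes, so the collective reads
tam * nprocs = 2**27 bytes from envio, while envio declared as t_max doubles
provides only 8 * 2**23 = 2**26 bytes. The send overruns the buffer, hence
the SEGV.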


I propose the following fix:

- instead of

 double precision,dimension(t_max)::envio
 double precision,dimension(:),allocatable::recibe

declare

 double precision,dimension(:),allocatable::envio
 double precision,dimension(:),allocatable::recibe


- then, where the recibe array is allocated, add the allocation for envio too:

 allocate(recibe(t_max*nproc))
 allocate(envio(t_max*nproc))


Now your program should work just fine.
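
For reference, here is a minimal self-contained sketch of a correctly sized
all-to-all. This is a hypothetical standalone program: the name, the fixed
tam value, and the use of the mpi module (instead of your globales setup)
are my assumptions, not your code.

 program alltoall_demo
   use mpi
   implicit none
   integer :: ierr, nproc, my_id, tam
   double precision,dimension(:),allocatable :: envio, recibe

   call mpi_init(ierr)
   call mpi_comm_size(mpi_comm_world, nproc, ierr)
   call mpi_comm_rank(mpi_comm_world, my_id, ierr)

   ! tam bytes go to every rank, so each buffer must cover tam * nproc
   ! bytes in total; tam*nproc doubles is more than enough for that
   tam = 1024
   allocate(envio(tam*nproc))
   allocate(recibe(tam*nproc))
   envio = 1.0d0

   call mpi_alltoall(envio, tam, mpi_byte, recibe, tam, mpi_byte, &
                     mpi_comm_world, ierr)

   if (my_id == 0) print *, 'all-to-all of', tam, 'bytes per pair done'

   deallocate(envio)
   deallocate(recibe)
   call mpi_finalize(ierr)
 end program alltoall_demo

The same pattern applies inside your module: once envio is allocatable,
allocate it next to recibe and deallocate both when you are done.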

  george.

On Thu, 20 Jul 2006, Frank Gruellich wrote:

> Can you read this?  (Sorry, I can't.)  But the size of envio seems to be
> 2**23 = 8388608 doubles, isn't it?  I don't understand why it should
> depend on the number of MPI nodes, as you said.


"We must accept finite disappointment, but we must never lose infinite
hope."
  Martin Luther King



Re: [OMPI users] SEGV in libopal during MPI_Alltoall

2006-07-20 Thread Frank Gruellich
Hi,

George Bosilca wrote:
> On the all-to-all collective the send and receive buffers have to be able
> to contain all the information you try to send. In this particular case,
> as you initialize the envio variable to a double I suppose it is defined
> as a double array. If that's the case then the error is that the send
> operation will send more data than the amount available in the envio
> variable.
>
> If you want to do the all-to-all correctly in your example, make sure
> the envio variable has a size at least equal to
> tam * sizeof(byte) * NPROCS, where NPROCS is the number of procs
> available on the mpi_comm_world communicator.

I'm unfortunately not much of a Fortran guy.  Maybe it's best to post the
whole function to begin with; it's neither secret nor big:

module alltoall
  use globales
  implicit none

contains
subroutine All_to_all

  integer,parameter :: npuntos = 24
  integer,parameter :: t_max = 2**(npuntos-1)
  integer siguiente,anterior,tam,rep,p_1,p_2,i,j,ndatos,rep2,o,k
  double precision time1,time2,time,ov,tmin,tavg
  double precision,dimension(t_max)::envio
  double precision,dimension(:),allocatable::recibe
  double precision,dimension(npuntos)::m_tmin,m_tavg
  double precision,dimension(npuntos)::tams

  rep2 = 10
  tag1 = 1
  tag2 = 2
  rep = 3

  allocate(recibe(t_max*nproc))
  siguiente = my_id + 1
  if (my_id == nproc -1) siguiente = 0
  anterior = my_id - 1
  if (my_id == 0) anterior = nproc- 1

  do i = 1,npuntos
    print *,'puntos',i
    tam = 2**(i-1)
    tmin = 1e5
    tavg = 0.0d0
    do j = 1,rep
      envio = 8.0d0*j
      call mpi_barrier(mpi_comm_world,ierr)
      time1 = mpi_wtime()
      do k = 1,rep2
        call mpi_alltoall(envio,tam,mpi_byte,recibe,tam,mpi_byte, &
                          mpi_comm_world,ierr)
      end do
      call mpi_barrier(mpi_comm_world,ierr)
      time2 = mpi_wtime()
      time = (time2 - time1)/(rep2)
      if (time < tmin) tmin = time
      tavg = tavg + time
    end do
    m_tmin(i) = tmin
    m_tavg(i) = tavg/rep
  end do
  call mpi_barrier(mpi_comm_world,ierr)
  print *,"acaba"

  if (my_id == 0) then
open (1,file='Alltoall.dat')
write (1,*) "#Prueba All to all entre todos los procesadores(",nproc,")"
write (1,*) "#Precision del reloj:",mpi_wtick()*1.0d6,"(muS)"
do i =1,npuntos
  write(1,900) 2*nproc*2**(i-1),m_tmin(i),m_tavg(i)!,ov
end do
close(1)
  end if
  900 FORMAT(I10,F14.8,F14.8)
  800 FORMAT(I10,F14.8,F14.8)
end subroutine
end module

Can you read this?  (Sorry, I can't.)  But the size of envio seems to be
2**23 = 8388608 doubles, isn't it?  I don't understand why it should
depend on the number of MPI nodes, as you said.

Thanks for your help.  Kind regards,
-- 
Frank Gruellich
HPC-Techniker

Tel.:   +49 3722 528 42
Fax:    +49 3722 528 15
E-Mail: frank.gruell...@megware.com

MEGWARE Computer GmbH
Vertrieb und Service
Nordstrasse 19
09247 Chemnitz/Roehrsdorf
Germany
http://www.megware.com/


Re: [OMPI users] SEGV in libopal during MPI_Alltoall

2006-07-20 Thread Frank Gruellich
Hi,

shen T.T. wrote:
>   Do you have another compiler?  Could you check the error and report it?

I'm not using any other Intel compilers at the moment, but I'm going to give
gfortran a try today.

Kind regards,
-- 
Frank Gruellich
HPC-Techniker

Tel.:   +49 3722 528 42
Fax:    +49 3722 528 15
E-Mail: frank.gruell...@megware.com

MEGWARE Computer GmbH
Vertrieb und Service
Nordstrasse 19
09247 Chemnitz/Roehrsdorf
Germany
http://www.megware.com/