It seems there are some default configuration files that you must
edit manually.
I'm not sure, but it looks like a default name is being used instead of
the real name of the node.

In some steps of the installation, those default names must be changed, but
it seems they still remain unchanged.
Do a "grep" and find out where they (*oscar-rhel5.osl.iu.edu*) come from.



Sincerely yours,
Siavash Ghiasvand



On Tue, Feb 1, 2011 at 16:33, Patrick Schmid <patrick.p.sch...@gmail.com> wrote:

> Hi everybody
>
> It's me again ;)
> First: My Oscar / CentOS 5 cluster is up and running, mostly...
>
> Now I have problems with the parallel environment.
> I tried to run some MPI-based programs like the following:
>
> ---
> [root@lcc102 helloworld]# cat helloworld.c
> #include <stdio.h>
> #include <mpi.h>
>
> int main(int argc, char *argv[]) {
>  int numprocs, rank, namelen;
>  char processor_name[MPI_MAX_PROCESSOR_NAME];
>
>  /* Initialize MPI and query this process's place in the world. */
>  MPI_Init(&argc, &argv);
>  MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
>  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>  MPI_Get_processor_name(processor_name, &namelen);
>
>  printf("Process %d on %s out of %d\n", rank, processor_name, numprocs);
>
>  MPI_Finalize();
>  return 0;
> }
> ---
>
> It's a sample program from Open MPI, so it should work...
>
> I submitted it with the following bash script:
>
> ---
> [root@lcc102 helloworld]# cat openmpi-test
> #!/bin/bash
> #$ -N openmpi-helloworld
> # Here we tell the queue that we want the orte parallel environment
> # and request 5 slots.
> # This option takes the following form: -pe nameOfEnv min-max
> # where you request a min and max number of slots.
> #$ -pe make 6-10
> #$ -cwd
> #$ -j y
> /opt/mpich-ch_p4-gcc-1.2.7/bin/mpirun -n $NSLOTS helloworld
> exit 0
> ---
>
> After submitting the job to SGE, it ends after a few minutes and
> writes the following to the output/error file:
>
> ---
> [root@lcc102 helloworld]# cat openmpi-helloworld.o129
> Warning: no access to tty (Bad file descriptor).
> Thus no job control in this shell.
> p0_5868: (1999.558060) Procgroup:
> p0_5868: (1999.558127)     entry 0: lcc105.ch.power.alstom.com 0 0
> /home/helloworld/helloworld root
> p0_5868: (1999.558137)     entry 1: oscar-rhel5.osl.iu.edu 1 1
> /home/helloworld/helloworld root
> p0_5868: (1999.558144)     entry 2: oscar-rhel5.osl.iu.edu 1 2
> /home/helloworld/helloworld root
> p0_5868: (1999.558151)     entry 3: oscar-rhel5.osl.iu.edu 1 3
> /home/helloworld/helloworld root
> p0_5868: (1999.558158)     entry 4: oscar-rhel5.osl.iu.edu 1 4
> /home/helloworld/helloworld root
> p0_5868: (1999.558165)     entry 5: oscar-rhel5.osl.iu.edu 1 5
> /home/helloworld/helloworld root
> p0_5868: (1999.558172)     entry 6: oscar-rhel5.osl.iu.edu 1 6
> /home/helloworld/helloworld root
> p0_5868: (1999.558179)     entry 7: oscar-rhel5.osl.iu.edu 1 7
> /home/helloworld/helloworld root
> p0_5868: (1999.558186)     entry 8: oscar-rhel5.osl.iu.edu 1 8
> /home/helloworld/helloworld root
> p0_5868: (1999.558192)     entry 9: oscar-rhel5.osl.iu.edu 1 9
> /home/helloworld/helloworld root
> p0_5868:  p4_error: Could not gethostbyname for host
> oscar-rhel5.osl.iu.edu; may be invalid name
> : 1999
> ---
>
> The first two lines are normal, from what I've read, but the rest looks
> strange... Has anybody ever seen an error like this?
> The name "oscar-rhel5.osl.iu.edu" isn't used in my cluster and cannot
> be resolved by any DNS. I don't know where the system gets this
> name from...
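>
> A quick way to confirm that (a sketch; getent walks the same NSS
> resolver path that gethostbyname uses):
>
> ---
> # Both come back empty for the stray name on my cluster:
> getent hosts oscar-rhel5.osl.iu.edu
> grep oscar-rhel5 /etc/hosts
> ---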
>
> The MPI program should print a line like "Process x on <host> out of y"
> for every process.
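>
> For six slots I would expect something like this (a sketch only;
> hostnames taken from my /etc/hosts):
>
> ---
> Process 0 on lcc103.ch.power.alstom.com out of 6
> Process 1 on lcc104.ch.power.alstom.com out of 6
> Process 2 on lcc105.ch.power.alstom.com out of 6
> ...
> ---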
>
> I think it's related to OpenMPI but I'm not sure.
> I'm using the following versions:
>
> ---
> [root@lcc102 helloworld]# rpm -qa | grep oscar-base
> oscar-base-6.0.5-1
> oscar-base-server-6.0.5r9167-1
> oscar-base-lib-6.0.5-1
> oscar-base-scripts-6.0.5-1
> oscar-base-client-6.0.5-1
>
> [root@lcc102 helloworld]# rpm -qa | grep sge
> opkg-sge-server-6.1.4-1
> sge-6.0u9-9oscar
> opkg-sge-6.1.4-1
> sge-modulefile-6.0u9-9oscar
>
> [root@lcc102 helloworld]# rpm -qa | grep mpi
> openmpi-switcher-modulefile-1.2.4-1
> opkg-openmpi-server-1.2.4-1
> mpi-selector-1.0.2-1.el5
> opkg-mpich-1.2.7-9
> opkg-mpich-server-1.2.7-9
> openmpi-libs-1.4-4.el5
> opkg-openmpi-client-1.2.4-1
> mpich-ch_p4-gcc-oscar-module-1.2.7-8
> opkg-openmpi-1.2.4-1
> mpich-ch_p4-gcc-oscar-1.2.7-8
> openmpi-1.4-4.el5
> ---
>
> Can somebody help?
>
> cheers
> Patrick
>
>
> 2011/1/28 Patrick Schmid <patrick.sch...@encodingit.ch>:
> > I could solve the problem by myself, but thanks for your help.
> >
> > The problem was that gethostname returns the FQDN (name + domain),
> > and gethostbyname uses the FQDN to ask for an IP...
> > But the hosts file mapped the IP to only the short name (not the FQDN),
> > so gethostbyname failed because lcc103 isn't the same as
> > lcc103.ch.power.alstom.com.
> >
> > But after I modified the hosts file like this, everything worked:
> >
> > #  addresses
> > 10.128.88.103        lcc103.ch.power.alstom.com lcc103
> > 10.128.88.104        lcc104.ch.power.alstom.com lcc104
> > 10.128.88.105        lcc105.ch.power.alstom.com lcc105
> >
> > (Before, there were only entries for lcc103, lcc104 and lcc105.)
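> >
> > A quick check that both forms now resolve to the same address
> > (sketch):
> >
> > ---
> > # Short name and FQDN should both map to 10.128.88.103 now:
> > getent hosts lcc103
> > getent hosts lcc103.ch.power.alstom.com
> > ---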
> >
> > Thanks, now my cluster is up and running.
> >
> > For everybody who's interested, I wrote a howto about OSCAR (in German)
> > here:
> > http://blog.encodingit.ch/2011/01/linux-high-performance-cluster-mit-oscar/
> >
> > cheers
> > Patrick
> >
> > 2011/1/28 siavash ghiasvand <siavash.ghiyasv...@gmail.com>:
> >> It was my pleasure ;)
> >> I don't know how, but it seems that "gethostname -name" does not work
> >> correctly.
> >> Take a look at "/etc/host.conf" to see if these lines exist:
> >>
> >> # Lookup names via DNS first then fall back to /etc/hosts.
> >> order bind,hosts
> >>
> >> The above line tells "gethostbyname" to check DNS first and then check
> >> the local /etc/hosts entries.
> >> P.S.:
> >> 1- You can remove "bind" from /etc/host.conf to check the correctness
> >> of /etc/hosts (see the sketch after these notes).
> >> 2- "10.128.88.102" is in a private IP range, but
> >> "lcc102.ch.power.alstom.com" (if it exists) will be resolved as a public
> >> IP, so with two IPs (one private and one public) the IP resolution will
> >> permanently fail! (You can change "lcc102.ch.power.alstom.com" to any
> >> other name to correct it.)
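> >>
> >> A minimal sketch of that test (assuming glibc reads /etc/host.conf):
> >>
> >> ---
> >> # /etc/host.conf -- resolve from /etc/hosts only while testing:
> >> order hosts
> >> ---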
> >>
> >> Sincerely yours,
> >> Siavash Ghiasvand
> >>
> >>
> >
> >
> >
> > --
> > Patrick Schmid
> >
> > www.encodingit.ch
> > patrick.sch...@encodingit.ch
> >
>
>
>
> --
> Patrick Schmid
>
>