That all looks ok. From x-math20 could you run "lctl ping [EMAIL PROTECTED]"?

On Jan 2, 2008, at 8:36 AM, Avi Gershon wrote:

Hi, I get this:
***************************************************************************
[EMAIL PROTECTED] ~]# lctl list_nids
[EMAIL PROTECTED]
[EMAIL PROTECTED] ~]# ifconfig -a
eth0 Link encap:Ethernet HWaddr 00:02:B3:2D:A6:BF
inet addr:132.66.176.211 Bcast:132.66.255.255 Mask:255.255.0.0
inet6 addr: fe80::202:b3ff:fe2d:a6bf/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:9448397 errors:0 dropped:0 overruns:0 frame:0
TX packets:194259 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1171910501 (1.0 GiB) TX bytes:40500450 (38.6 MiB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:8180 errors:0 dropped:0 overruns:0 frame:0
TX packets:8180 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:3335243 (3.1 MiB) TX bytes:3335243 (3.1 MiB)

sit0 Link encap:IPv6-in-IPv4
NOARP MTU:1480 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)

[EMAIL PROTECTED] ~]# cat /etc/modprobe.conf
alias eth0 e100
alias usb-controller uhci-hcd
alias scsi_hostadapter ata_piix
alias lustre llite
options lnet networks=tcp0
[EMAIL PROTECTED] ~]#

************************************************************************************************************

On 1/2/08, Aaron Knister <[EMAIL PROTECTED]> wrote:
On the host x-math20, could you run "lctl list_nids" and also "ifconfig -a"? I want to see if lnet is listening on the correct interface. Oh, could you also post the contents of your /etc/modprobe.conf?

Thanks!

-Aaron
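
For readers following along: a NID has the form address@network, so one way to see which interface lnet picked is to compare the address part of each NID from "lctl list_nids" with the addresses shown by "ifconfig -a". Below is a minimal Python sketch of that comparison. The function names are hypothetical, and the example address 192.0.2.10 is a documentation placeholder, because the real NIDs in this thread are redacted.

```python
def split_nid(nid):
    """Split an LNET NID of the form '<address>@<network>' into its two parts."""
    address, sep, network = nid.rpartition("@")
    if not sep or not address or not network:
        raise ValueError(f"not a NID: {nid!r}")
    return address, network


def nid_uses_address(nid, interface_address):
    """Return True if the NID's address part equals the given interface address."""
    return split_nid(nid)[0] == interface_address
```

For example, nid_uses_address("192.0.2.10@tcp", "192.0.2.10") is true, while a NID whose address part is 127.0.0.1 would suggest that lnet bound to the loopback interface instead of eth0.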

On Jan 2, 2008, at 4:42 AM, Avi Gershon wrote:

Hello everyone, and happy new year.
I think I have reduced my problem to this: lctl ping [EMAIL PROTECTED] doesn't work for me for some strange reason,
as you can see:
***********************************************************************************
[EMAIL PROTECTED] ~]# lctl ping [EMAIL PROTECTED]
failed to ping [EMAIL PROTECTED]: Input/output error
[EMAIL PROTECTED] ~]# ping 132.66.176.211
PING 132.66.176.211 (132.66.176.211) 56(84) bytes of data.
64 bytes from 132.66.176.211: icmp_seq=0 ttl=64 time=0.152 ms
64 bytes from 132.66.176.211: icmp_seq=1 ttl=64 time=0.130 ms
64 bytes from 132.66.176.211: icmp_seq=2 ttl=64 time=0.131 ms

--- 132.66.176.211 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2018ms
rtt min/avg/max/mdev = 0.130/0.137/0.152/0.016 ms, pipe 2
[EMAIL PROTECTED] ~]#
*****************************************************************************************
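
A note on the output above: a successful ICMP ping only shows that the host answers at the IP layer. On a tcp network, lctl ping needs a TCP connection to the LNET acceptor on the target. The sketch below checks TCP reachability separately from ICMP. It assumes the acceptor listens on port 988, which is the usual default but is not stated anywhere in this thread, and the function name is hypothetical.

```python
import socket

# Assumption: 988 is the default LNET acceptor port; change it if your site uses another.
LNET_ACCEPTOR_PORT = 988


def tcp_port_open(host, port=LNET_ACCEPTOR_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling tcp_port_open("132.66.176.211") from one of the OSS nodes would show whether anything accepts TCP connections on that port on the MDT/MGS. A False result points at the listener or a filter in between rather than at basic IP connectivity.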


On 12/24/07, Avi Gershon <[EMAIL PROTECTED]> wrote:
Hi,
here are the "iptables -L" results:

 NODE 1 132.66.176.212
Scientific Linux CERN SLC release 4.6 (Beryllium)
[EMAIL PROTECTED]'s password:
Last login: Sun Dec 23 22:01:18 2007 from x-fishelov.tau.ac.il
[EMAIL PROTECTED] ~]#
[EMAIL PROTECTED] ~]#
[EMAIL PROTECTED] ~]# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
************************************************************************************************
 MDT 132.66.176.211

Last login: Mon Dec 24 11:51:57 2007 from dynamic136-91.tau.ac.il
[EMAIL PROTECTED] ~]# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
*************************************************************************

NODE 2 132.66.176.215
Last login: Mon Dec 24 11:01:22 2007 from erezlab.tau.ac.il
[EMAIL PROTECTED] ~]# iptables -L

Chain INPUT (policy ACCEPT)
target     prot opt source               destination
RH-Firewall-1-INPUT  all  --  anywhere             anywhere
Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
RH-Firewall-1-INPUT  all  --  anywhere             anywhere

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain RH-Firewall-1-INPUT (2 references)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere
ACCEPT     icmp --  anywhere             anywhere            icmp any
ACCEPT     ipv6-crypt--  anywhere             anywhere
ACCEPT     ipv6-auth--  anywhere             anywhere
ACCEPT     udp  --  anywhere             224.0.0.251         udp dpt:5353
ACCEPT     udp  --  anywhere             anywhere            udp dpt:ipp
ACCEPT     all  --  anywhere             anywhere            state RELATED,ESTABLISHED
ACCEPT     tcp  --  anywhere             anywhere            state NEW tcp dpts:30000:30101
ACCEPT     tcp  --  anywhere             anywhere            state NEW tcp dpt:ssh
ACCEPT     udp  --  anywhere             anywhere            state NEW udp dpt:afs3-callback
REJECT     all  --  anywhere             anywhere            reject-with icmp-host-prohibited
[EMAIL PROTECTED] ~]#

************************************************************
One more thing:
do you use the TCP protocol, or UDP?

Regards, Avi
P.S. I think this is the beginning of a beautiful friendship :-)
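
A note on the NODE 2 listing above: the only rules that accept NEW TCP connections are for ports 30000-30101 and ssh, and everything else falls through to the final REJECT. The sketch below checks a port against those ranges. It rests on three assumptions that the thread does not confirm: that "ssh" means port 22, that the LNET acceptor uses port 988, and that the first unqualified "ACCEPT all" rule is restricted to an interface that "iptables -L" does not display without -v. The names are hypothetical.

```python
# TCP destination ports accepted for NEW connections in RH-Firewall-1-INPUT,
# transcribed from the NODE 2 listing: dpts:30000:30101 and dpt:ssh.
# Assumption: "ssh" resolves to port 22.
ACCEPTED_TCP_PORTS = [(30000, 30101), (22, 22)]

# Assumption: 988 is the default LNET acceptor port.
LNET_ACCEPTOR_PORT = 988


def new_tcp_allowed(port, accepted=ACCEPTED_TCP_PORTS):
    """Return True if a NEW inbound TCP connection to `port` matches an ACCEPT range."""
    return any(low <= port <= high for low, high in accepted)
```

Under these assumptions new_tcp_allowed(LNET_ACCEPTOR_PORT) is false, so connections initiated toward NODE 2 on that port would be rejected. This concerns inbound traffic to NODE 2 only; it does not explain the failure seen from NODE 1, whose chains are empty.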



On Dec 24, 2007 5:29 PM, Aaron Knister < [EMAIL PROTECTED]> wrote:
That sounds like quite a task! Could you show me the contents of your
firewall rules on the systems mentioned below? (iptables -L) on each.
That would help to diagnose the problem further.

-Aaron

On Dec 24, 2007, at 1:21 AM, Yan Benhammou wrote:

> Hi Aaron, and thank you for your fast answers.
> We (Avi, Meny and I) are working on the Israeli GRID and we need to
> create a single huge file system for this GRID.
>     cheers
>          Yan
>
> ________________________________
>
> From: Aaron Knister [mailto: [EMAIL PROTECTED]
> Sent: Sun 12/23/2007 8:27 PM
> To: Avi Gershon
> Cc: [email protected] ; Yan Benhammou; Meny Ben moshe
> Subject: Re: [Lustre-discuss] help needed.
>
>
> Can you check the firewall on each of those machines (iptables -L)
> and paste that here. Also, is this network dedicated to Lustre?
> Lustre can easily saturate a network interface under load to the
> point it becomes difficult to login to a node if it only has one
> interface. I'd recommend using a different interface if you can.
>
> On Dec 23, 2007, at 11:03 AM, Avi Gershon wrote:
>
>
>       node 1 132.66.176.212
>       node 2 132.66.176.215
>
>       [EMAIL PROTECTED] ~]# ssh 132.66.176.215
>       [EMAIL PROTECTED]'s password:
>       ssh(21957) Permission denied, please try again.
>       [EMAIL PROTECTED]'s password:
>       Last login: Sun Dec 23 14:32:51 2007 from x-math20.tau.ac.il
>       [EMAIL PROTECTED] ~]#  lctl ping [EMAIL PROTECTED]
>       failed to ping [EMAIL PROTECTED]: Input/output error
>       [EMAIL PROTECTED] ~]#  lctl list_nids
>       [EMAIL PROTECTED]
>       [EMAIL PROTECTED] ~]# ssh 132.66.176.212
>       The authenticity of host '132.66.176.212 (132.66.176.212)' can't be established.
>       RSA1 key fingerprint is 85:2a:c1:47:84:b7:b5:a6:cd:c4:57:86:af:ce:7e:74.
>       Are you sure you want to continue connecting (yes/no)? yes
>       ssh(11526) Warning: Permanently added '132.66.176.212' (RSA1) to the list of known hosts.
>       [EMAIL PROTECTED]'s password:
>       Last login: Sun Dec 23 15:24:41 2007 from x-math20.tau.ac.il
>       [EMAIL PROTECTED] ~]# lctl ping [EMAIL PROTECTED]
>       failed to ping [EMAIL PROTECTED]: Input/output error
>       [EMAIL PROTECTED] ~]# lctl list_nids
>       [EMAIL PROTECTED]
>       [EMAIL PROTECTED] ~]#
>
>
>       thanks for helping!!
>       Avi
>
>
> On Dec 23, 2007 5:32 PM, Aaron Knister < [EMAIL PROTECTED]> wrote:
>
>
> On the OSS, can you ping the MDS/MGS using this command:
>
>               lctl ping [EMAIL PROTECTED]
>
> If it doesn't ping, list the nids on each node by running
>
>               lctl list_nids
>
>               and tell me what comes back.
>
>               -Aaron
>
>
>               On Dec 23, 2007, at 9:22 AM, Avi Gershon wrote:
>
>
>                       Hi, I could use some help.
>                       I installed Lustre on 3 computers.
>                        MDT/MGS:
>
>
> ************************************************************************************
>                       [EMAIL PROTECTED] ~]# mkfs.lustre --reformat --fsname spfs --mdt --mgs /dev/hdb
>
>                          Permanent disk data:
>                       Target:     spfs-MDTffff
>                       Index:      unassigned
>                       Lustre FS:  spfs
>                       Mount type: ldiskfs
>                       Flags:      0x75
>                                     (MDT MGS needs_index first_time update )
>                       Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
>                       Parameters:
>
>                       device size = 19092MB
>                       formatting backing filesystem ldiskfs on /dev/hdb
>                               target name  spfs-MDTffff
>                               4k blocks     0
>                               options        -J size=400 -i 4096 -I 512 -q -O dir_index -F
>                       mkfs_cmd = mkfs.ext2 -j -b 4096 -L spfs-MDTffff -J size=400 -i 4096 -I 512 -q -O dir_index -F /dev/hdb
>                       Writing CONFIGS/mountdata
>                       [ [EMAIL PROTECTED] ~]# df
>                       Filesystem           1K-blocks      Used Available Use% Mounted on
>                       /dev/hda1             19228276   4855244  13396284  27% /
>                       none                    127432         0    127432   0% /dev/shm
>                       /dev/hdb              17105436    455152  15672728   3% /mnt/test/mdt
>                       [EMAIL PROTECTED] ~]# cat /proc/fs/lustre/devices
>                         0 UP mgs MGS MGS 5
>                         1 UP mgc [EMAIL PROTECTED] 5f5ba729-6412-3843-2229-1310a0b48f71 5
>                         2 UP mdt MDS MDS_uuid 3
>                         3 UP lov spfs-mdtlov spfs-mdtlov_UUID 4
>                         4 UP mds spfs-MDT0000 spfs-MDT0000_UUID 3
>                       [ [EMAIL PROTECTED] ~]#
> *************************************************************end mdt******************************
>                       So you can see that the MGS is up,
>                       and on the OSTs I get an error! Please help.
>
>                       ost:
>
> **********************************************************************
>                       [ [EMAIL PROTECTED] ~]# mkfs.lustre --reformat --fsname spfs --ost --mgsnode=132.66. [EMAIL PROTECTED] /dev/hdb1
>
>                          Permanent disk data:
>                       Target:     spfs-OSTffff
>                       Index:      unassigned
>                       Lustre FS:  spfs
>                       Mount type: ldiskfs
>                       Flags:      0x72
>                                     (OST needs_index first_time update )
>                       Persistent mount opts: errors=remount-ro,extents,mballoc
>                       Parameters: [EMAIL PROTECTED]
>
>                       device size = 19594MB
>                       formatting backing filesystem ldiskfs on /dev/hdb1
>                               target name  spfs-OSTffff
>                               4k blocks     0
>                               options        -J size=400 -i 16384 -I 256 -q -O dir_index -F
>                       mkfs_cmd = mkfs.ext2 -j -b 4096 -L spfs-OSTffff -J size=400 -i 16384 -I 256 -q -O dir_index -F /dev/hdb1
>                       Writing CONFIGS/mountdata
>                       [ [EMAIL PROTECTED] ~]# /CONFIGS/mountdata
>                       -bash: /CONFIGS/mountdata: No such file or directory
>                       [EMAIL PROTECTED] ~]# mount -t lustre /dev/hdb1 /mnt/test/ost1
>                       mount.lustre: mount /dev/hdb1 at /mnt/test/ost1 failed: Input/output error
>                       Is the MGS running?
> ***********************************************end ost********************************
>
>                       Can anyone point out the problem?
>                       Thanks, Avi.
>
>
>
> _______________________________________________
>                       Lustre-discuss mailing list
>                       [email protected]
>                       https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
>
>
>
>
>
>               Aaron Knister
>               Associate Systems Administrator/Web Designer
>               Center for Research on Environment and Water
>
>               (301) 595-7001
>               [EMAIL PROTECTED]
>
>
>
>
>
>


Aaron Knister
Associate Systems Analyst
Center for Ocean-Land-Atmosphere Studies

(301) 595-7000
[EMAIL PROTECTED]





