No, that doesn't work either :-( Thanks for answering so fast.
Avi
On 1/2/08, Aaron Knister <[EMAIL PROTECTED]> wrote:
>
> That all looks ok. From x-math20 could you run "lctl ping
> [EMAIL PROTECTED]"?
>
> On Jan 2, 2008, at 8:36 AM, Avi Gershon wrote:
>
> *Hi, I get this:*
>
> ************************************************************
> [EMAIL PROTECTED] ~]# lctl list_nids
> [EMAIL PROTECTED]
> [EMAIL PROTECTED] ~]# ifconfig -a
> eth0      Link encap:Ethernet  HWaddr 00:02:B3:2D:A6:BF
>           inet addr:132.66.176.211  Bcast:132.66.255.255  Mask:255.255.0.0
>           inet6 addr: fe80::202:b3ff:fe2d:a6bf/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:9448397 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:194259 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:1171910501 (1.0 GiB)  TX bytes:40500450 (38.6 MiB)
>
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>           RX packets:8180 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:8180 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:3335243 (3.1 MiB)  TX bytes:3335243 (3.1 MiB)
>
> sit0      Link encap:IPv6-in-IPv4
>           NOARP  MTU:1480  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>
> [EMAIL PROTECTED] ~]# cat /etc/modprobe.conf
> alias eth0 e100
> alias usb-controller uhci-hcd
> alias scsi_hostadapter ata_piix
> alias lustre llite
> options lnet networks=tcp0
> [EMAIL PROTECTED] ~]#
>
> ************************************************************
>
> On 1/2/08, Aaron Knister <[EMAIL PROTECTED]> wrote:
> >
> > On the host x-math20 could you run an "lctl list_nids" and also an
> > "ifconfig -a". I want to see if lnet is listening on the correct interface.
> > Oh could you also post the contents of your /etc/modprobe.conf.
> >
> > Thanks!
> >
> > -Aaron
> >
> > On Jan 2, 2008, at 4:42 AM, Avi Gershon wrote:
> >
> > Hello to everyone and happy new year..
> > I think I have reduced my problem to this: lctl ping [EMAIL PROTECTED]'t
> > work for me for some strange reason,
> > as you can see:
> > ************************************************************
> >
> > [EMAIL PROTECTED] ~]# lctl ping [EMAIL PROTECTED]
> > failed to ping [EMAIL PROTECTED]: Input/output error
> > [EMAIL PROTECTED] ~]# ping 132.66.176.211
> > PING 132.66.176.211 (132.66.176.211) 56(84) bytes of data.
> > 64 bytes from 132.66.176.211: icmp_seq=0 ttl=64 time=0.152 ms
> > 64 bytes from 132.66.176.211: icmp_seq=1 ttl=64 time=0.130 ms
> > 64 bytes from 132.66.176.211: icmp_seq=2 ttl=64 time=0.131 ms
> > --- 132.66.176.211 ping statistics ---
> > 3 packets transmitted, 3 received, 0% packet loss, time 2018ms
> > rtt min/avg/max/mdev = 0.130/0.137/0.152/0.016 ms, pipe 2
> > [EMAIL PROTECTED] ~]#
> > ************************************************************
> >
> > On 12/24/07, Avi Gershon <[EMAIL PROTECTED]> wrote:
> > >
> > > Hi,
> > > here are the "iptables -L" results:
> > >
> > > NODE 1 132.66.176.212 <[EMAIL PROTECTED]>
> > > Scientific Linux CERN SLC release 4.6 (Beryllium)
> > > [EMAIL PROTECTED]'s password:
> > > Last login: Sun Dec 23 22:01:18 2007 from x-fishelov.tau.ac.il
> > > [EMAIL PROTECTED] ~]#
> > > [EMAIL PROTECTED] ~]#
> > > [EMAIL PROTECTED] ~]# iptables -L
> > > Chain INPUT (policy ACCEPT)
> > > target     prot opt source               destination
> > >
> > > Chain FORWARD (policy ACCEPT)
> > > target     prot opt source               destination
> > > Chain OUTPUT (policy ACCEPT)
> > > target     prot opt source               destination
> > > ************************************************************
> > >
> > > MDT 132.66.176.211
> > >
> > > Last login: Mon Dec 24 11:51:57 2007 from dynamic136-91.tau.ac.il
> > > [EMAIL PROTECTED] ~]# iptables -L
> > > Chain INPUT (policy ACCEPT)
> > > target     prot opt source               destination
> > >
> > > Chain FORWARD (policy ACCEPT)
> > > target     prot opt source               destination
> > > Chain OUTPUT (policy ACCEPT)
> > > target     prot opt source               destination
> > >
> > > ************************************************************
> > >
> > > NODE 2 132.66.176.215 <[EMAIL PROTECTED]>
> > > Last login: Mon Dec 24 11:01:22 2007 from erezlab.tau.ac.il
> > > [EMAIL PROTECTED] ~]# iptables -L
> > > Chain INPUT (policy ACCEPT)
> > > target     prot opt source               destination
> > > RH-Firewall-1-INPUT  all  --  anywhere             anywhere
> > >
> > > Chain FORWARD (policy ACCEPT)
> > > target     prot opt source               destination
> > > RH-Firewall-1-INPUT  all  --  anywhere             anywhere
> > >
> > > Chain OUTPUT (policy ACCEPT)
> > > target     prot opt source               destination
> > > Chain RH-Firewall-1-INPUT (2 references)
> > > target     prot opt source               destination
> > > ACCEPT     all  --  anywhere             anywhere
> > > ACCEPT     icmp --  anywhere             anywhere            icmp any
> > > ACCEPT     ipv6-crypt--  anywhere             anywhere
> > > ACCEPT     ipv6-auth--  anywhere             anywhere
> > > ACCEPT     udp  --  anywhere             224.0.0.251         udp dpt:5353
> > > ACCEPT     udp  --  anywhere             anywhere            udp dpt:ipp
> > > ACCEPT     all  --  anywhere             anywhere            state RELATED,ESTABLISHED
> > > ACCEPT     tcp  --  anywhere             anywhere            state NEW tcp dpts:30000:30101
> > > ACCEPT     tcp  --  anywhere             anywhere            state NEW tcp dpt:ssh
> > > ACCEPT     udp  --  anywhere             anywhere            state NEW udp dpt:afs3-callback
> > > REJECT     all  --  anywhere             anywhere            reject-with icmp-host-prohibited
> > > [EMAIL PROTECTED] ~]#
> > >
> > > ************************************************************
> > > one more thing....
> > > Do you use TCP protocol? or do you use UDP?
> > >
> > > Regards Avi,
> > > P.S I think a beginning of a beautiful friendship.. :-)
> > >
> > > On Dec 24, 2007 5:29 PM, Aaron Knister < [EMAIL PROTECTED]> wrote:
> > >
> > > > That sounds like quite a task!
> > > > Could you show me the contents of your
> > > > firewall rules on the systems mentioned below? (iptables -L) on each.
> > > > That would help to diagnose the problem further.
> > > >
> > > > -Aaron
> > > >
> > > > On Dec 24, 2007, at 1:21 AM, Yan Benhammou wrote:
> > > >
> > > > > Hi Aaron and thank you for your fast answers.
> > > > > We are working (Avi, Meny and me) on the Israeli GRID and we need to
> > > > > create a single huge file system for this GRID.
> > > > > cheers
> > > > > Yan
> > > > >
> > > > > ________________________________
> > > > >
> > > > > From: Aaron Knister [mailto: [EMAIL PROTECTED]
> > > > > Sent: Sun 12/23/2007 8:27 PM
> > > > > To: Avi Gershon
> > > > > Cc: [email protected] ; Yan Benhammou; Meny Ben moshe
> > > > > Subject: Re: [Lustre-discuss] help needed.
> > > > >
> > > > > Can you check the firewall on each of those machines ( iptables -L )
> > > > > and paste that here. Also, is this network dedicated to Lustre?
> > > > > Lustre can easily saturate a network interface under load to the
> > > > > point it becomes difficult to login to a node if it only has one
> > > > > interface. I'd recommend using a different interface if you can.
> > > > >
> > > > > On Dec 23, 2007, at 11:03 AM, Avi Gershon wrote:
> > > > >
> > > > > node 1 132.66.176.212
> > > > > node 2 132.66.176.215
> > > > >
> > > > > [EMAIL PROTECTED] ~]# ssh 132.66.176.215
> > > > > [EMAIL PROTECTED]'s password:
> > > > > ssh(21957) Permission denied, please try again.
> > > > > [EMAIL PROTECTED]'s password:
> > > > > Last login: Sun Dec 23 14:32:51 2007 from x-math20.tau.ac.il
> > > > > [EMAIL PROTECTED] ~]# lctl ping [EMAIL PROTECTED]
> > > > > failed to ping [EMAIL PROTECTED]: Input/output error
> > > > > [EMAIL PROTECTED] ~]# lctl list_nids
> > > > > [EMAIL PROTECTED]
> > > > > [EMAIL PROTECTED] ~]# ssh 132.66.176.212
> > > > > The authenticity of host '132.66.176.212 (132.66.176.212)' can't be established.
> > > > > RSA1 key fingerprint is 85:2a:c1:47:84:b7:b5:a6:cd:c4:57:86:af:ce:7e:74.
> > > > > Are you sure you want to continue connecting (yes/no)? yes
> > > > > ssh(11526) Warning: Permanently added '132.66.176.212' (RSA1) to the list of known hosts.
> > > > > [EMAIL PROTECTED]'s password:
> > > > > Last login: Sun Dec 23 15:24:41 2007 from x-math20.tau.ac.il
> > > > > [EMAIL PROTECTED] ~]# lctl ping [EMAIL PROTECTED]
> > > > > failed to ping [EMAIL PROTECTED]: Input/output error
> > > > > [EMAIL PROTECTED] ~]# lctl list_nids
> > > > > [EMAIL PROTECTED]
> > > > > [EMAIL PROTECTED] ~]#
> > > > >
> > > > > thanks for helping!!
> > > > > Avi
> > > > >
> > > > > On Dec 23, 2007 5:32 PM, Aaron Knister < [EMAIL PROTECTED]> wrote:
> > > > >
> > > > > On the oss can you ping the mds/mgs using this command--
> > > > >
> > > > > lctl ping [EMAIL PROTECTED]
> > > > >
> > > > > If it doesn't ping, list the nids on each node by running
> > > > >
> > > > > lctl list_nids
> > > > >
> > > > > and tell me what comes back.
> > > > >
> > > > > -Aaron
> > > > >
> > > > > On Dec 23, 2007, at 9:22 AM, Avi Gershon wrote:
> > > > >
> > > > > HI I could use some help.
> > > > > I installed lustre on 3 computers
> > > > > mdt/mgs :
> > > > >
> > > > > ************************************************************
> > > > > [EMAIL PROTECTED] ~]# mkfs.lustre --reformat --fsname spfs --mdt --mgs /dev/hdb
> > > > >
> > > > >    Permanent disk data:
> > > > > Target:     spfs-MDTffff
> > > > > Index:      unassigned
> > > > > Lustre FS:  spfs
> > > > > Mount type: ldiskfs
> > > > > Flags:      0x75
> > > > >               (MDT MGS needs_index first_time update )
> > > > > Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
> > > > > Parameters:
> > > > >
> > > > > device size = 19092MB
> > > > > formatting backing filesystem ldiskfs on /dev/hdb
> > > > >         target name  spfs-MDTffff
> > > > >         4k blocks     0
> > > > >         options        -J size=400 -i 4096 -I 512 -q -O dir_index -F
> > > > > mkfs_cmd = mkfs.ext2 -j -b 4096 -L spfs-MDTffff -J size=400 -i 4096 -I 512 -q -O dir_index -F /dev/hdb
> > > > > Writing CONFIGS/mountdata
> > > > > [ [EMAIL PROTECTED] ~]# df
> > > > > Filesystem           1K-blocks      Used Available Use% Mounted on
> > > > > /dev/hda1             19228276   4855244  13396284  27% /
> > > > > none                    127432         0    127432   0% /dev/shm
> > > > > /dev/hdb              17105436    455152  15672728   3% /mnt/test/mdt
> > > > > [EMAIL PROTECTED] ~]# cat /proc/fs/lustre/devices
> > > > >   0 UP mgs MGS MGS 5
> > > > >   1 UP mgc [EMAIL PROTECTED] 5f5ba729-6412-3843-2229-1310a0b48f71 5
> > > > >   2 UP mdt MDS MDS_uuid 3
> > > > >   3 UP lov spfs-mdtlov spfs-mdtlov_UUID 4
> > > > >   4 UP mds spfs-MDT0000 spfs-MDT0000_UUID 3
> > > > > [ [EMAIL PROTECTED] ~]#
> > > > > ************************* end mdt *************************
> > > > > so you can see that the MGS is up
> > > > > and on the ost's I get an error!! plz help...
> > > > >
> > > > > ost:
> > > > >
> > > > > ************************************************************
> > > > > [ [EMAIL PROTECTED] ~]# mkfs.lustre --reformat --fsname spfs --ost --mgsnode=132.66. [EMAIL PROTECTED] /dev/hdb1
> > > > >
> > > > >    Permanent disk data:
> > > > > Target:     spfs-OSTffff
> > > > > Index:      unassigned
> > > > > Lustre FS:  spfs
> > > > > Mount type: ldiskfs
> > > > > Flags:      0x72
> > > > >               (OST needs_index first_time update )
> > > > > Persistent mount opts: errors=remount-ro,extents,mballoc
> > > > > Parameters: [EMAIL PROTECTED]
> > > > >
> > > > > device size = 19594MB
> > > > > formatting backing filesystem ldiskfs on /dev/hdb1
> > > > >         target name  spfs-OSTffff
> > > > >         4k blocks     0
> > > > >         options        -J size=400 -i 16384 -I 256 -q -O dir_index -F
> > > > > mkfs_cmd = mkfs.ext2 -j -b 4096 -L spfs-OSTffff -J size=400 -i 16384 -I 256 -q -O dir_index -F /dev/hdb1
> > > > > Writing CONFIGS/mountdata
> > > > > [ [EMAIL PROTECTED] ~]# /CONFIGS/mountdata
> > > > > -bash: /CONFIGS/mountdata: No such file or directory
> > > > > [EMAIL PROTECTED] ~]# mount -t lustre /dev/hdb1 /mnt/test/ost1
> > > > > mount.lustre: mount /dev/hdb1 at /mnt/test/ost1 failed: Input/output error
> > > > > Is the MGS running?
> > > > > ************************* end ost *************************
> > > > >
> > > > > can any one point out the problem?
> > > > > thanks Avi.
> > > > >
> > > > > _______________________________________________
> > > > > Lustre-discuss mailing list
> > > > > [email protected]
> > > > > https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
> > > > >
> > > > > Aaron Knister
> > > > > Associate Systems Administrator/Web Designer
> > > > > Center for Research on Environment and Water
> > > > >
> > > > > (301) 595-7001
> > > > > [EMAIL PROTECTED]
> >
> > Aaron Knister
> > Associate Systems Analyst
> > Center for Ocean-Land-Atmosphere Studies
> >
> > (301) 595-7000
> > [EMAIL PROTECTED]
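
Regarding the question in the thread of whether lnet is listening on the correct interface: the quoted /etc/modprobe.conf has "options lnet networks=tcp0", which does not name an interface. A minimal sketch of how the line could name one explicitly is below. The helper name lnet_options_line is hypothetical (not part of Lustre), and tcp0/eth0 are simply the values seen in the output quoted above. Whether this has anything to do with the "lctl ping" failure in this thread is not established.

```shell
#!/bin/sh
# Hypothetical helper (not a Lustre tool): print a modprobe.conf line
# that binds an LNET network to one named interface.
# $1 = LNET network name (e.g. tcp0), $2 = interface name (e.g. eth0).
lnet_options_line() {
    net="$1"
    iface="$2"
    printf 'options lnet networks=%s(%s)\n' "$net" "$iface"
}

# Values taken from the ifconfig and modprobe.conf output quoted in the thread.
lnet_options_line tcp0 eth0
```

The printed line would replace the existing "options lnet" line in /etc/modprobe.conf. The lnet module would then need to be reloaded before re-running "lctl list_nids" and "lctl ping" to compare results.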
