All - I just checked an amd64 server and a ppc client and don't see any of the sock entries from above. In fact, I don't see *any* sock entries from the pvfs2-client process. Did you build your IB with the flag for disabling bmi-tcp? No idea if that could be the culprit, but we have that disabled here. The problem doesn't appear to manifest here over IB.
Kyle Schochenmaier

On Mon, Jan 12, 2009 at 8:40 AM, Phil Carns <[email protected]> wrote:
> Ah, Ok. I didn't realize that you were using infiniband. Can any IB gurus
> on the list confirm if it is responsible for the extra "sock" entries in lsof?
>
> You can always increase the number of available file descriptors in your
> init script before starting pvfs2-client if you need to ("ulimit -n 100000",
> for example). That might be all you need to do as long as the number of
> file descriptors isn't growing without bound.
>
> -Phil
>
> Kumar, Amit H. wrote:
>>
>> Hi Phil,
>> I do see other open files, I just did not include them. Of what you listed
>> I don't see anything related to IPV4. Maybe because I am mounting it over
>> Infiniband. Here it is ...
>> pvfs2-cli 14792 root  cwd  DIR     8,1    4096  196609 /root
>> pvfs2-cli 14792 root  rtd  DIR     8,1    4096       2 /
>> pvfs2-cli 14792 root  txt  REG     8,6   46624 5701636 /opt/pvfs2/sbin/pvfs2-client
>> pvfs2-cli 14792 root  mem  REG     8,1  130304  229708 /lib64/ld-2.5.so
>> pvfs2-cli 14792 root  mem  REG     8,1 1687464  229709 /lib64/libc-2.5.so
>> pvfs2-cli 14792 root  mem  REG     8,1   23360  229710 /lib64/libdl-2.5.so
>> pvfs2-cli 14792 root  mem  REG     8,1  141344  229714 /lib64/libpthread-2.5.so
>> pvfs2-cli 14792 root  mem  REG     8,1  241006 2392128 /usr/lib64/libibverbs.so.1.0.0
>> pvfs2-cli 14792 root   0r  CHR     1,3            1520 /dev/null
>> pvfs2-cli 14792 root   1w  CHR     1,3            1520 /dev/null
>> pvfs2-cli 14792 root   2w  CHR     1,3            1520 /dev/null
>> pvfs2-cli 14792 root   3w  REG     8,5   43162   98311 /tmp/pvfs2-client.log (deleted)
>> pvfs2-cli 14793 root  cwd  DIR     8,1    4096  196609 /root
>> pvfs2-cli 14793 root  rtd  DIR     8,1    4096       2 /
>> pvfs2-cli 14793 root  txt  REG     8,6 2722000 5701637 /opt/pvfs2/sbin/pvfs2-client-core
>> pvfs2-cli 14793 root  mem  REG     8,1  130304  229708 /lib64/ld-2.5.so
>> pvfs2-cli 14793 root  mem  REG     8,1 1687464  229709 /lib64/libc-2.5.so
>> pvfs2-cli 14793 root  mem  REG     8,1   23360  229710 /lib64/libdl-2.5.so
>> pvfs2-cli 14793 root  mem  REG     8,1  141344  229714 /lib64/libpthread-2.5.so
>> pvfs2-cli 14793 root  mem  REG     8,1  241006 2392128 /usr/lib64/libibverbs.so.1.0.0
>> pvfs2-cli 14793 root  mem  CHR 231,192            5658 /dev/infiniband/uverbs0
>> pvfs2-cli 14793 root  mem  REG     8,1  156563 1222167 /usr/lib64/libmlx4-rdmav2.so
>> pvfs2-cli 14793 root  mem  REG     8,1  173084 1222165 /usr/lib64/libmthca-rdmav2.so
>> pvfs2-cli 14793 root  mem  REG     8,1  118406 1222169 /usr/lib64/libcxgb3-rdmav2.so
>> pvfs2-cli 14793 root  mem  REG     8,1   69644 1222174 /usr/lib64/libipathverbs-rdmav2.so
>> pvfs2-cli 14793 root  mem  REG     8,1   68419 1222172 /usr/lib64/libnes-rdmav2.so
>> pvfs2-cli 14793 root  mem  REG     8,1   53880  229404 /lib64/libnss_files-2.5.so
>> pvfs2-cli 14793 root   0r  CHR     1,3            1520 /dev/null
>> pvfs2-cli 14793 root   1w  CHR     1,3            1520 /dev/null
>> pvfs2-cli 14793 root   2w  CHR     1,3            1520 /dev/null
>> pvfs2-cli 14793 root   3w  REG     8,5   43162   98311 /tmp/pvfs2-client.log (deleted)
>> pvfs2-cli 14793 root   4w  REG     8,5   43162   98311 /tmp/pvfs2-client.log (deleted)
>> pvfs2-cli 14793 root   5u  CHR   253,0           12918 /dev/pvfs2-req
>> pvfs2-cli 14793 root   6u  CHR 231,192            5658 /dev/infiniband/uverbs0
>> pvfs2-cli 14793 root   7r  DIR    0,20       0    5654 infinibandevent
>> pvfs2-cli 14793 root   8r  DIR    0,20       0    5654 infinibandevent
>>
>> Thank you,
>> Amit
>>
>>> -----Original Message-----
>>> From: Phil Carns [mailto:[email protected]] On Behalf Of Phil Carns
>>> Sent: Thursday, January 08, 2009 2:25 PM
>>> To: Kumar, Amit H.
>>> Cc: 'Rob Ross'; [email protected]
>>> Subject: Re: [Pvfs2-developers] pvfs2-cli can't identify protocol
>>>
>>> Hi Amit,
>>>
>>> In your lsof output, do you see any other types of open files from
>>> pvfs2-client besides "sock"? The output that you are showing is
>>> unusual. Normally everything that pvfs2-client has open will show up
>>> as IPV4, REG, CHR, or DIR.
>>>
>>> Are you using tcp for PVFS communication?
>>>
>>> -Phil
>>>
>>> Kumar, Amit H. wrote:
>>>>
>>>> Hi Rob,
>>>> I am using the latest version available for download (pvfs2-v2.7.1)
>>>> # netstat -tan
>>>> Active Internet connections (servers and established)
>>>> Proto Recv-Q Send-Q Local Address           Foreign Address            State
>>>> tcp        0      0 0.0.0.0:2049            0.0.0.0:*                  LISTEN
>>>> tcp        0      0 0.0.0.0:677             0.0.0.0:*                  LISTEN
>>>> tcp        0      0 0.0.0.0:57447           0.0.0.0:*                  LISTEN
>>>> tcp        0      0 127.0.0.1:199           0.0.0.0:*                  LISTEN
>>>> tcp        0      0 0.0.0.0:8649            0.0.0.0:*                  LISTEN
>>>> tcp        0      0 0.0.0.0:938             0.0.0.0:*                  LISTEN
>>>> tcp        0      0 0.0.0.0:111             0.0.0.0:*                  LISTEN
>>>> tcp        0      0 127.0.0.1:25            0.0.0.0:*                  LISTEN
>>>> tcp        0      0 0.0.0.0:953             0.0.0.0:*                  LISTEN
>>>> tcp        0      0 127.0.0.1:51598         127.0.0.1:199              ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.237:862          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.223:878          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.221:850          ESTABLISHED
>>>> tcp        0      0 127.0.0.1:199           127.0.0.1:51598            ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.207:675          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.235:949          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.233:677          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.227:708          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.205:1003         ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.243:991          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.213:718          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.249:1023         ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.204:814          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.232:776          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.248:896          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.240:916          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:718       172.25.24.100:2049         ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.226:950          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.224:698          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.216:751          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.250:963          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.220:1009         ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.206:995          ESTABLISHED
>>>> tcp        0      0 172.25.24.251:2049      172.25.24.222:976          ESTABLISHED
>>>> tcp        0      0 :::80                   :::*                       LISTEN
>>>> tcp        0      0 :::22                   :::*                       LISTEN
>>>> tcp        0      0 :::443                  :::*                       LISTEN
>>>> tcp        0      0 :::1311                 :::*                       LISTEN
>>>> tcp        0      0 ::ffff:172.25.24.251:22 ::ffff:172.25.24.210:43811 ESTABLISHED
>>>>
>>>> Thank you,
>>>> Amit
>>>>>
>>>>> -----Original Message-----
>>>>> From: Rob Ross [mailto:[email protected]]
>>>>> Sent: Wednesday, January 07, 2009 5:03 PM
>>>>> To: Kumar, Amit H.
>>>>> Cc: [email protected]
>>>>> Subject: Re: [Pvfs2-developers] pvfs2-cli can't identify protocol
>>>>>
>>>>> Hi Amit,
>>>>>
>>>>> What version of PVFS is this?
>>>>>
>>>>> What does the output of netstat -tan look like?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Rob
>>>>>
>>>>> On Jan 7, 2009, at 2:03 PM, Kumar, Amit H. wrote:
>>>>>
>>>>>> Hello All,
>>>>>> I am trying to understand the following output from "lsof".
>>>>>> All/most of our compute nodes (pvfs2 clients) have the following
>>>>>> output as reported by "lsof".
>>>>>> A few of the pvfs2-client nodes have greater than 1024 open sockets
>>>>>> for just the pvfs2-client process. Current ulimit for maximum number
>>>>>> of open files per process is set to 1024 on all of our compute
>>>>>> nodes. I see this as a potential performance problem. I was
>>>>>> wondering if any of you can help me interpret the output and fix any
>>>>>> issues that this could be causing.
>>>>>> <lsof output>
>>>>>> ...............
>>>>>> pvfs2-cli 27278 root 121u sock 0,5 13574284 can't identify protocol
>>>>>> pvfs2-cli 27278 root 122u sock 0,5 13574285 can't identify protocol
>>>>>> pvfs2-cli 27278 root 123u sock 0,5 13574286 can't identify protocol
>>>>>> pvfs2-cli 27278 root 124u sock 0,5 13574287 can't identify protocol
>>>>>> pvfs2-cli 27278 root 125u sock 0,5 13574288 can't identify protocol
>>>>>> pvfs2-cli 27278 root 126u sock 0,5 13574289 can't identify protocol
>>>>>> pvfs2-cli 27278 root 127u sock 0,5 13574290 can't identify protocol
>>>>>> pvfs2-cli 27278 root 128u sock 0,5 13574291 can't identify protocol
>>>>>> pvfs2-cli 27278 root 129u sock 0,5 13574292 can't identify protocol
>>>>>> pvfs2-cli 27278 root 130u sock 0,5 13574303 can't identify protocol
>>>>>> pvfs2-cli 27278 root 131u sock 0,5 13574304 can't identify protocol
>>>>>> pvfs2-cli 27278 root 132u sock 0,5 13574326 can't identify protocol
>>>>>> pvfs2-cli 27278 root 133u sock 0,5 13574327 can't identify protocol
>>>>>> pvfs2-cli 27278 root 134u sock 0,5 13574328 can't identify protocol
>>>>>> pvfs2-cli 27278 root 135u sock 0,5 13574329 can't identify protocol
>>>>>> pvfs2-cli 27278 root 136u sock 0,5 13574330 can't identify protocol
>>>>>> pvfs2-cli 27278 root 137u sock 0,5 13574331 can't identify protocol
>>>>>> pvfs2-cli 27278 root 138u sock 0,5 13574332 can't identify protocol
>>>>>> pvfs2-cli 27278 root 139u sock 0,5 13574333 can't identify protocol
>>>>>> pvfs2-cli 27278 root 140u sock 0,5 13574334 can't identify protocol
>>>>>> pvfs2-cli 27278 root 141u sock 0,5 13574336 can't identify protocol
>>>>>> pvfs2-cli 27278 root 142u sock 0,5 13574337 can't identify protocol
>>>>>> pvfs2-cli 27278 root 143u sock 0,5 13574338 can't identify protocol
>>>>>> pvfs2-cli 27278 root 144u sock 0,5 13574344 can't identify protocol
>>>>>> pvfs2-cli 27278 root 145u sock 0,5 13574345 can't identify protocol
>>>>>> pvfs2-cli 27278 root 146u sock 0,5 13574346 can't identify protocol
>>>>>> pvfs2-cli 27278 root 147u sock 0,5 13574357 can't identify protocol
>>>>>> ............
>>>>>> </lsof output>
>>>>>> Thank you,
>>>>>> Amit
>>>>>>
>>>>>> _______________________________________________
>>>>>> Pvfs2-developers mailing list
>>>>>> [email protected]
>>>>>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
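
To check whether the number of unidentified sockets is "growing without bound", as Phil puts it, one option is to count them per process from saved lsof output and compare the counts over time. The following is only a rough sketch, assuming lsof text in the whitespace-separated column layout quoted in this thread; the function name count_unidentified_socks and the sample text are made up for illustration and are not part of PVFS or lsof.

```python
from collections import Counter


def count_unidentified_socks(lsof_text):
    """Count lsof 'sock' rows marked "can't identify protocol", per (command, pid)."""
    counts = Counter()
    for line in lsof_text.splitlines():
        fields = line.split()
        # Expect at least: COMMAND PID USER FD TYPE ...
        if len(fields) < 5:
            continue
        command, pid, _user, _fd, ftype = fields[:5]
        if ftype == "sock" and "can't identify protocol" in line:
            counts[(command, pid)] += 1
    return counts


# Hypothetical sample in the same layout as the output quoted above.
sample = """\
pvfs2-cli 27278 root 121u sock 0,5 13574284 can't identify protocol
pvfs2-cli 27278 root 122u sock 0,5 13574285 can't identify protocol
pvfs2-cli 14793 root 5u CHR 253,0 12918 /dev/pvfs2-req
"""

for (command, pid), n in count_unidentified_socks(sample).items():
    print(command, pid, n)
```

Running this against lsof output captured at intervals and comparing the per-PID counts with the limit reported by "ulimit -n" should show whether raising the limit is enough or whether descriptors are leaking.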
