Packet send failed to 62.101.92.90(137) ERRNO=Network is unreachable

[2004/05/03 10:27:45, 0] nmbd/nmbd_packets.c:reply_netbios_packet(975)



Hi list,
once again I'm looking for some good advice. I apologize for the long email and for any 
mistakes I may have made.

I'm running Samba 3.0.2-6.3E on White Box Linux 3.0, kernel 2.4.21-4.EL (a clone of Red 
Hat EL 3.0), in a Win 2k ADS domain. Users access the shares (just common folders, no 
home dirs) through winbind. Everything works fine for days, but a couple of times the 
server froze and I was unable to log in even from the console. The last time it 
happened I could look at the processes running on the box because I already had a 
shell open, and I found winbindd was using 149M of memory. Pings to and from the box 
mostly failed (about 1 success in 6), and restarting winbindd and the network solved 
the problem. Now everything is working fine, and the server has not been restarted.

I had a look at the logs, and this is what I think is most important:

log.smbd

a lot of:
[2004/05/05 14:54:54, 0] lib/util_sock.c:get_peer_addr(952)

getpeername failed. Error was Transport endpoint is not connected   



I read that this means a broken connection, but then how can it be that simply 
restarting the network services gets everything working again? Could Samba's 
interaction with the network somehow have brought the network down?



log.winbind

[2004/05/05 14:50:25, 1] libads/ldap.c:ads_connect(222)

Failed to get ldap server info

[2004/05/05 14:55:54, 1] libads/ldap.c:ads_connect(222)

Failed to get ldap server info

[2004/05/05 14:57:16, 0] rpc_client/cli_pipe.c:rpc_api_pipe(424)

cli_pipe: return critical error. Error was Call timed out: server did not respond 
after 10000 milliseconds

[2004/05/05 15:00:47, 1] libads/ldap.c:ads_connect(222)

Failed to get ldap server info

[2004/05/05 15:01:13, 1] libsmb/cliconnect.c:cli_start_connection(1372)

failed negprot

[2004/05/05 15:01:23, 1] libsmb/cliconnect.c:cli_start_connection(1372)

failed negprot

[2004/05/05 15:01:37, 1] libsmb/cliconnect.c:cli_start_connection(1372)

failed negprot

[2004/05/05 15:05:48, 1] libads/ldap.c:ads_connect(222)

Failed to get ldap server info

[2004/05/05 15:13:10, 0] rpc_client/cli_pipe.c:rpc_api_pipe(424)

cli_pipe: return critical error. Error was Call timed out: server did not respond 
after 10000 milliseconds

[2004/05/05 15:17:32, 0] rpc_client/cli_pipe.c:cli_nt_session_open(1437)

cli_nt_session_open: cli_nt_create failed on pipe \NETLOGON to machine FBCSRVDC01. 
Error was Call timed out: server did not respond after 10000 milliseconds

[2004/05/05 15:21:02, 0] lib/pidfile.c:pidfile_create(84)

ERROR: winbindd is already running. File /var/run/winbindd.pid exists and process id 
1127 is running.

[2004/05/05 15:21:32, 1] libads/ldap.c:ads_connect(222)

Failed to get ldap server info

[2004/05/05 15:22:25, 1] libads/ldap.c:ads_connect(222)

Failed to get ldap server info

[2004/05/05 15:22:25, 1] libads/ldap_utils.c:ads_do_search_retry(77)

ads_search_retry: failed to reconnect (Can't contact LDAP server)

[2004/05/05 15:22:25, 1] libads/ads_ldap.c:ads_name_to_sid(58)

name_to_sid ads_search: Can't contact LDAP server

[2004/05/05 15:22:25, 1] nsswitch/winbindd_group.c:winbindd_getgroups(954)

user 'fbcrompc116$' does not exist

[2004/05/05 15:22:25, 0] lib/fault.c:fault_report(36)

===============================================================

[2004/05/05 15:22:25, 0] lib/fault.c:fault_report(37)

INTERNAL ERROR: Signal 11 in pid 1127 (3.0.2-6.3E)

Please read the appendix Bugs of the Samba HOWTO collection

[2004/05/05 15:22:25, 0] lib/fault.c:fault_report(39)

===============================================================

[2004/05/05 15:22:25, 0] lib/util.c:smb_panic(1422)

PANIC: internal error

[2004/05/05 15:22:25, 0] lib/util.c:smb_panic(1430)

BACKTRACE: 14 stack frames:

#0 winbindd(smb_panic+0x13f) [0x80cb96f]

#1 winbindd [0x80b7428]

#2 /lib/tls/libc.so.6 [0xb73b3c08]

#3 winbindd(ads_name_to_sid+0x5c) [0x8181a6c]

#4 winbindd [0x8084980]

#5 winbindd [0x807912c]

#6 winbindd(winbindd_lookup_sid_by_name+0x66) [0x8074cf6]

#7 winbindd(winbindd_getpwnam+0x249) [0x806f0f9]

#8 winbindd(strftime+0x14bc) [0x806d6d8]

#9 winbindd(winbind_process_packet+0x2f) [0x806da2f]

#10 winbindd(strftime+0x2197) [0x806e3b3]

#11 winbindd(main+0x43e) [0x806e97e]

#12 /lib/tls/libc.so.6(__libc_start_main+0xf8) [0xb73a1748]

#13 winbindd(chroot+0x35) [0x806cdf1]

[2004/05/05 15:23:18, 1] libads/ldap.c:ads_connect(222)

Failed to get ldap server info

[2004/05/05 15:23:38, 1] libsmb/cliconnect.c:cli_start_connection(1372)

failed negprot

[2004/05/05 15:24:55, 0] rpc_client/cli_pipe.c:rpc_api_pipe(424)

cli_pipe: return critical error. Error was Call timed out: server did not respond 
after 10000 milliseconds

[2004/05/05 15:24:55, 1] nsswitch/winbindd_util.c:add_trusted_domain(166)

Added domain FBCMEDIA FBCMEDIA.COM S-0-0

[2004/05/05 15:24:55, 1] libsmb/clikrb5.c:ads_krb5_mk_req(269)

krb5_cc_get_principal failed (No credentials cache found)



Is it OK that the SID for the domain is shown as FBCMEDIA.COM S-0-0? If I run 
net getlocalsid fbcmedia I get S-1-5-21-735.....and so on. 

All net commands and group mappings are working, and wbinfo is OK.



messages.log

May 5 14:52:44 fbcsrvsmb01 smbd[8786]: write_socket_data: write failure. Error = 
Broken pipe 

May 5 14:52:44 fbcsrvsmb01 smbd[8786]: [2004/05/05 14:52:44, 0] 
lib/util_sock.c:write_socket(413) 

May 5 14:52:44 fbcsrvsmb01 smbd[8786]: write_socket: Error writing 61503 bytes to 
socket 5: ERRNO = Broken pipe 

May 5 14:52:44 fbcsrvsmb01 smbd[8786]: [2004/05/05 14:52:44, 0] 
lib/util_sock.c:send_smb(605) 

May 5 14:52:44 fbcsrvsmb01 smbd[8786]: Error writing 61503 bytes to client. -1. 
(Broken pipe) 

May 5 14:52:50 fbcsrvsmb01 smbd[8915]: [2004/05/05 14:52:50, 0] 
lib/util_sock.c:read_socket_data(342) 

May 5 14:52:50 fbcsrvsmb01 smbd[8915]: read_socket_data: recv failure for 4. Error = 
Connection reset by peer 

May 5 14:53:29 fbcsrvsmb01 smbd[3587]: [2004/05/05 14:53:28, 0] 
lib/util_sock.c:read_socket_data(342) 

May 5 14:53:29 fbcsrvsmb01 smbd[3587]: read_socket_data: recv failure for 4. Error = 
Connection reset by peer 

May 5 14:54:25 fbcsrvsmb01 smbd[8953]: [2004/05/05 14:54:25, 0] 
lib/util_sock.c:read_socket_data(342) 

May 5 14:54:25 fbcsrvsmb01 smbd[8953]: read_socket_data: recv failure for 4. Error = 
Connection reset by peer 

May 5 14:54:34 fbcsrvsmb01 smbd[8959]: [2004/05/05 14:54:34, 0] 
lib/util_sock.c:read_socket_data(342) 

May 5 14:54:34 fbcsrvsmb01 smbd[8959]: read_socket_data: recv failure for 4. Error = 
Connection reset by peer 

May 5 14:54:54 fbcsrvsmb01 smbd[8969]: [2004/05/05 14:54:54, 0] 
lib/util_sock.c:get_peer_addr(952) 

May 5 14:54:54 fbcsrvsmb01 smbd[8969]: getpeername failed. Error was Transport 
endpoint is not connected 

May 5 14:54:54 fbcsrvsmb01 smbd[8969]: [2004/05/05 14:54:54, 0] 
lib/util_sock.c:get_peer_addr(952) 

May 5 14:54:54 fbcsrvsmb01 smbd[8969]: getpeername failed. Error was Transport 
endpoint is not connected 

May 5 14:54:54 fbcsrvsmb01 smbd[8969]: [2004/05/05 14:54:54, 0] 
lib/access.c:check_access(328) 

May 5 14:54:54 fbcsrvsmb01 smbd[8969]: [2004/05/05 14:54:54, 0] 
lib/util_sock.c:get_peer_addr(952) 

May 5 14:54:54 fbcsrvsmb01 smbd[8969]: getpeername failed. Error was Transport 
endpoint is not connected 

May 5 14:54:54 fbcsrvsmb01 smbd[8969]: Denied connection from (0.0.0.0) 

May 5 14:54:54 fbcsrvsmb01 smbd[8969]: [2004/05/05 14:54:54, 0] 
lib/util_sock.c:get_peer_addr(952) 

May 5 14:54:54 fbcsrvsmb01 smbd[8969]: getpeername failed. Error was Transport 
endpoint is not connected 

May 5 14:54:54 fbcsrvsmb01 smbd[8969]: Connection denied from 0.0.0.0 



What does "Connection denied from 0.0.0.0" mean? I also have a 0.0.0.0.log file in the 
log dir; what is it for?



I have been searching the mailing list and googling for the last two days, but I 
couldn't find a definitive answer. It looks like it could be related to network 
problems (though I wouldn't expect restarting the network service to fix those) or to 
iptables, but it looks and behaves like a random issue. It had been working fine for 
many days, and nothing has been changed lately.
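
To check whether the failures really are random or cluster around the moment of the freeze, the entries in log.smbd and log.winbind could be tallied per minute and per reporting function. The following is only a minimal sketch in Python: it assumes the native Samba log layout shown above (a header line such as "[2004/05/05 15:22:25, 1] libads/ldap.c:ads_connect(222)" followed by the message text), the name tally is hypothetical, and it does not handle the syslog-prefixed lines from messages.log.

```python
import re
from collections import Counter

# Header line of a native Samba log entry, e.g.
# [2004/05/05 15:22:25, 1] libads/ldap.c:ads_connect(222)
HEADER = re.compile(
    r"^\[(?P<minute>\d{4}/\d{2}/\d{2} \d{2}:\d{2}):\d{2},\s*(?P<level>\d+)\]\s+"
    r"(?P<source>\S+):(?P<func>\w+)\((?P<line>\d+)\)"
)


def tally(lines):
    """Count log entries per minute and per reporting function.

    Hypothetical helper: lines that are not entry headers (message text,
    blank lines, backtrace frames) are skipped.
    """
    per_minute = Counter()
    per_func = Counter()
    for line in lines:
        match = HEADER.match(line.strip())
        if match is None:
            continue
        per_minute[match.group("minute")] += 1
        per_func[match.group("func")] += 1
    return per_minute, per_func
```

Passing an open log file to tally() and printing per_minute.most_common(10) would show the busiest minutes; a sharp peak just before the freeze would argue against a purely random cause.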

If you're still there, thanks for reading. Any idea is really welcome, and even more 
welcome, if possible, would be a hint on how to monitor the Linux box (for example, how 
can I find out what froze the network?) and which tools to use (I can figure out 
myself how to use them, I'm not asking for a tutorial), so that I can be more useful to 
the list than just asking for help  ;-)
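
As one possible starting point for that kind of monitoring, the memory use of winbindd could be sampled at intervals so that growth towards the 149M seen during the freeze shows up before the box locks up. This is a rough sketch, not an existing Samba tool: it assumes a Linux /proc filesystem that exposes a VmRSS line in /proc/<pid>/status, it takes the pid file path /var/run/winbindd.pid from the log above, and the function names are hypothetical.

```python
import time


def rss_kb_from_status(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:    152576 kB"
            return int(line.split()[1])
    return None


def read_pid(pidfile="/var/run/winbindd.pid"):
    """Read the daemon's pid from its pid file (path taken from the log)."""
    with open(pidfile) as handle:
        return int(handle.read().strip())


def sample_rss_kb(pid):
    """Read the current resident set size of a process, in kB."""
    with open("/proc/%d/status" % pid) as handle:
        return rss_kb_from_status(handle.read())


def watch(pid, interval_seconds=60, samples=10):
    """Print a timestamped RSS reading every interval_seconds."""
    for _ in range(samples):
        stamp = time.strftime("%Y/%m/%d %H:%M:%S")
        print(stamp, pid, sample_rss_kb(pid), flush=True)
        time.sleep(interval_seconds)
```

Calling watch(read_pid()) with its output redirected to a file would leave a record that survives a freeze and can be compared with the log timestamps afterwards.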



Thanks for your time

Simone







