Hi Dan,

Thanks for the reply. I downloaded the NetworkManager source, version 0.7.0, from GNOME's FTP server. I've tried this NM-0.7 with both dhcpcd-4.0.10 and dhclient-3.1.0.

Following your advice, I added some logging and found that device->state in <nm_dhcp_manager_begin_transaction> is either DHC_NBI (0) or DHC_END (15). If I read the code correctly, neither value passes the state check in that function, so the previous transaction is not cancelled there.

With dhcpcd-4.0.10, I see log messages like "wlan0: dhcpcd already running on pid 1626 (/var/run/dhcpcd-wlan0.pid)", and NM loses track of some dhcpcd processes. I also see messages like "dhcpcd[1630]: write signal 15: Bad file descriptor". Does this suggest that the dhcpcd process can't write the pid file?

With dhclient-3.1.0, <dhcp_watch_cb> was never called during my testing (unlike with dhcpcd-4.0.10), but messages like "nm_dhcp_manager_handle_event(): Received DHCP event from unexpected PID 19924 (expected 0)" still appear. I saw only one message like "dhclient: There is already a pid file /usr/var/run/dhclient-wlan0.pid with pid 19911". In this case NM loses track of dhclient processes more easily: when I switch between APs, the number of dhclient processes quickly grows to 4 or 5, which does not happen with dhcpcd-4.0.10.

Do these results suggest that both DHCP clients have problems? Is the pid file reliable? (<nm_dhcp_manager_cancel_transaction_real> uses device->pid to kill the DHCP process, but <nm_dhcp_client> uses the pid file.)

BTW, thanks for your work. I will try the code tagged 0.7.1rc1 later.

2009/2/19 Dan Williams <[email protected]>

> On Wed, 2009-02-18 at 17:10 +0800, cee1 wrote:
> > Hi,
> > I'm using NetworkManager-0.7 (using dhcpcd) with wifi, and see a
> > lot of messages like "Received DHCP event from unexpected PID xxxx
> > (expected xxxx)" in the syslog.
>
> What version of NM specifically? The final NM release from late
> November 2008, or a snapshot from some other point in time?
>
> > I also see more than one dhcpcd process running. Sometimes
> > only one dhcpcd process is left in the end, but sometimes not.
> > (Searching Google, I found that someone has reported the same problem:
> > https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/255829)
> > Following the source code (NM-0.7), in the function
> > <nm_dhcp_client_start> (nm-dhcp-dhcpcd.c), it kills the previous
> > dhcpcd process and saves the new one's pid in device->pid. But the
> > function <dhcp_watch_cb>, on learning that the previous dhcpcd
> > process has terminated, sets device->pid to zero. So the current dhcp
> > process's pid is lost, and <nm_dhcp_manager_handle_event> will print
> > a log message like
> > "Received DHCP event from unexpected PID 8812 (expected 0)".
>
> Obviously the previous one isn't getting cleaned up correctly. It might
> be that dhcpcd returns a state NM doesn't expect, and thus isn't getting
> cleaned up in nm_dhcp_manager_begin_transaction():
>
>     if (state_is_bound (device->state) || (device->state == DHC_START)) {
>         /* Cancel any DHCP transaction already in progress */
>         nm_dhcp_manager_cancel_transaction_real (device);
>     }
>
> Basically, it should *always* be the case that a previous dhcp client
> instance is cleaned up before starting a new one, so the watch callback
> should certainly be removed before starting a new dhcp client run.
>
> > Also, the previous dhcpcd process's timeout handler may expire
> > and affect the currently running dhcpcd process?
>
> The timeout gets cleared whenever DHCP options get returned, or when the
> dhcp device gets destroyed, or when nm_dhcp_manager_cancel_transaction()
> is called to clean up any ongoing DHCP transaction.
> cancel_transaction() gets called when the device is deactivated.
>
> In any case, I've fixed up a few issues with the DHCP client handling
> code and made it somewhat more robust against this issue. If you can
> possibly test the latest git, or at least figure out what state the DHCP
> client is left in (device->state) when
> nm_dhcp_manager_begin_transaction() gets called, that would be great.
>
> Dan
>
> > It seems the function <supplicant_interface_acquire>
> > (nm-device-wifi.c) registers a function
> > <supplicant_iface_connection_state_cb> as the handler of the signal
> > "connection-state".
> > And:
> > <supplicant_iface_connection_state_cb> schedules
> > <supplicant_iface_connection_state_cb_handler>, which calls
> > <nm_device_activate_schedule_stage3_ip_config_start>, which schedules
> > <nm_device_activate_stage3_ip_config_start>, which calls
> > <real_act_stage3_ip_config_start>, which calls
> > <nm_dhcp_manager_begin_transaction>.
> > So it is possible that more than one dhcpcd process is running
> > at the same time.
> >
> > To fix this, should I:
> > 1) Protect the corresponding data struct
> > or
> > 2) Add a cleanup step when a second dhcpcd process is about to be started
> >
> > Any ideas? Thanks.
> > _______________________________________________
> > NetworkManager-list mailing list
> > [email protected]
> > http://mail.gnome.org/mailman/listinfo/networkmanager-list
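
To make the pid race described above concrete, here is a minimal sketch in plain C. It is not NM code: the struct and both function names are hypothetical, and it only models the bookkeeping, not process spawning or the GLib child watch. The idea is that the exit handler clears the stored pid only when the process that exited is the one currently expected, so a late exit notification from the old client cannot wipe out the new client's pid.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for NM's per-device DHCP state; only the pid matters here. */
struct dhcp_device {
    pid_t pid; /* pid of the client we currently expect events from, 0 if none */
};

/* Record a newly spawned DHCP client as the current one (hypothetical helper). */
void dhcp_device_set_client(struct dhcp_device *dev, pid_t new_pid)
{
    assert(dev != NULL);
    dev->pid = new_pid;
}

/*
 * Handle a child-exit notification (hypothetical helper).
 * Returns 1 if the exited process was the current client and the pid was
 * cleared, or 0 if the notification was stale and was ignored.
 */
int dhcp_device_child_exited(struct dhcp_device *dev, pid_t exited_pid)
{
    assert(dev != NULL);
    if (dev->pid != exited_pid)
        return 0; /* an older client exited; keep the current pid */
    dev->pid = 0;
    return 1;
}
```

With this guard, an exit notification for the old client that arrives after the new client has been recorded leaves the new pid in place, so later events from the new client would still match. Whether the real fix belongs in the watch callback or in removing the watch before restarting is a separate question.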
