Re: ModemManager: MMCallbackInfo scheduled twice when device removed

2011-06-06 Thread Aleksander Morgado

 All the stuff in the protect-callbacks branch looks good, please push.
 I couldn't trigger crashes on a number of devices just pulling them out
 in the middle of a tight loop hammering the modem for info.  Good work.
 

This has been pushed to both git master and MM_05.

Cheers,

-- 
Aleksander

___
networkmanager-list mailing list
networkmanager-list@gnome.org
http://mail.gnome.org/mailman/listinfo/networkmanager-list


Re: ModemManager: MMCallbackInfo scheduled twice when device removed

2011-06-03 Thread Dan Williams
On Tue, 2011-05-17 at 09:57 +0200, Aleksander Morgado wrote:
 Hi Dan,
 

Each MMCallbackInfo holds a weak reference to the MMModem to which the
AT command was sent. When the MMModem is destroyed,
mm-callback-info.c::modem_destroyed_cb() gets called and the modem
pointer in the callback info is reset to NULL, to avoid having a pointer
to an already disposed MMModem. See [1].

But, in addition to setting the modem pointer to NULL, an explicit error
is just set directly in the callback info, and it gets scheduled. The
problem here is that after destroying the modem, the ports it held also
get closed as part of the destruction, and during port closing there's
the task of finalizing all pending commands (see
mm-serial-port.c::mm_serial_port_close()), which is done by passing an
error to the specific response handler, and which will then (usually)
propagate that error to the MMCallbackInfo and schedule it right away.

That ends up in a memory leak (most response handlers will just set
info-error without assuming there may already be a GError set); plus a
critical in the second call to mm_callback_info_schedule() because a
callback info can't be scheduled twice.

 
 
 Got the new approach developed in the 'protect-callbacks' branch in the
 following Gitorious repo, ready for review:
 git://gitorious.org/lanedo/modemmanager.git
  
 It includes the following fixes:
 
  * A new mm_callback_info_check_modem_removed() to be used instead of
 mm_modem_check_removed(). All response callbacks should use this new
 method to check whether info-modem is NULL and just return if so,
 without scheduling the info.
 https://gitorious.org/lanedo/modemmanager/commit/a3d9298391b39445755b230144b1fc5cda5c60ed
 
  * A fix in modem_destroyed_cb() to ensure that ERROR_REMOVED is always
 set.
 https://gitorious.org/lanedo/modemmanager/commit/3acce64ad591a1a436f2ea29c85f706a2da24e4c
 
  * A fix in several plugins to avoid using a custom DisableInfo
 structure, and use a MMCallbackInfo instead. I ended up doing this using
 a custom invoke method, which calls parent disable passing the callback,
 instead of the set_data()/get_data() approach. Not sure if you'll like
 this :-).
 https://gitorious.org/lanedo/modemmanager/commit/19395ffa8f5a5927bca2e441cd135b72248c9a15

All the stuff in the protect-callbacks branch looks good, please push.
I couldn't trigger crashes on a number of devices just pulling them out
in the middle of a tight loop hammering the modem for info.  Good work.

Dan


___
networkmanager-list mailing list
networkmanager-list@gnome.org
http://mail.gnome.org/mailman/listinfo/networkmanager-list


Re: ModemManager: MMCallbackInfo scheduled twice when device removed

2011-05-12 Thread Aleksander Morgado

  
  Each MMCallbackInfo holds a weak reference to the MMModem to which the
  AT command was sent. When the MMModem is destroyed,
  mm-callback-info.c::modem_destroyed_cb() gets called and the modem
  pointer in the callback info is reset to NULL, to avoid having a pointer
  to an already disposed MMModem. See [1].
  
  But, in addition to setting the modem pointer to NULL, an explicit error
  is just set directly in the callback info, and it gets scheduled. The
  problem here is that after destroying the modem, the ports it held also
  get closed as part of the destruction, and during port closing there's
  the task of finalizing all pending commands (see
  mm-serial-port.c::mm_serial_port_close()), which is done by passing an
  error to the specific response handler, and which will then (usually)
  propagate that error to the MMCallbackInfo and schedule it right away.
  
  That ends up in a memory leak (most response handlers will just set
  info-error without assuming there may already be a GError set); plus a
  critical in the second call to mm_callback_info_schedule() because a
  callback info can't be scheduled twice.
  
  I've been looking at this problem some time, and the best way to fix it
  seems to be to just avoid setting the first error and scheduling in
  modem_destroyed_cb():
  
  diff --git a/src/mm-callback-info.c b/src/mm-callback-info.c
  index 1986bb5..8f8ecca 100644
  --- a/src/mm-callback-info.c
  +++ b/src/mm-callback-info.c
  @@ -53,13 +53,9 @@ modem_destroyed_cb (gpointer data, GObject
  *destroyed)
   {
   MMCallbackInfo *info = data;
   
  +/* Just set the modem pointer to NULL, do not further process the
  + * callback info */
   info-modem = NULL;
  -if (!info-pending_id) {
  -info-error = g_error_new_literal (MM_MODEM_ERROR,
  -   MM_MODEM_ERROR_REMOVED,
  -   The modem was removed.);
  -mm_callback_info_schedule (info);
  -}
   }
  
  Comments?
 
 How does that play into the mm_modem_check_removed() checks which I
 think are one of the more common places this error is looked for?  I
 think the motivation here was to be *sure* that any callback handler
 waiting for the result got an error.  The thing we want to ensure here
 is that if a D-Bus client of MM made a request when the modem was pulled
 out or crashed (which can happen pretty often) that the client gets a
 reply instead of a timed-out D-Bus method return error.  I'm sure
 there's room for cleanup here, but that was the original motivation.  Do
 we just need to audit the callback handlers to make sure they do the
 right thing with a ERROR_REMOVED?
 

You're right, removing the error setting in modem_destroyed_cb() is not
the right thing to do, if we want to keep the ERROR_REMOVED logic
around.

I will try to explain the issue with an example flow of execution, and
suggest another approach below:

 (1) MM sends an AT command to the modem, like the periodic ones to get
signal strength.

 (2) Modem is disconnected, reply to the sent AT command was not
received yet, so it gets disconnected in the middle of a send-receive
operation.

 (3) MMModem object destruction starts. Due to the weak reference to the
modem in the MMCallbackInfo, modem_destroyed_cb() gets called.
info-modem is set to NULL, info-error is set to ERROR_REMOVED and the
callback info is scheduled.

 (4) During modem object destruction (in the same loop iteration as step
3!), ports in the modem also get destroyed. During port destruction, all
pending commands which didn't get reply are processed, and a SEND_FAILED
error is passed to the response-received callback. This callback then
uses mm_modem_check_removed() (well, actually not all use it), and as
info-modem is NULL, it will overwrite info-error with a newly created
ERROR_REMOVED error. Once the new error is set, the callback info is
scheduled, again.

So always ends up with a memleak and a re-schedule of the callback info.

From what I see, there are some things to fix here then:

 (a) Not all response-received callbacks use mm_modem_check_removed(),
they should all use it.

 (b) The response-received callback should not overwrite info-error if
already set (avoiding memleak).

 (c) If ERROR_REMOVED is already set in the MMCallbackInfo, the
response-received callback should *not* re-schedule the callback info,
it should assume it was already scheduled. Another option would be to
modify mm_callback_info_schedule() to just allow being called twice,
silently returning without doing anything the second time it gets
called.

What do you think of this?

Cheers!

-- 
Aleksander

___
networkmanager-list mailing list
networkmanager-list@gnome.org
http://mail.gnome.org/mailman/listinfo/networkmanager-list


Re: ModemManager: MMCallbackInfo scheduled twice when device removed

2011-05-11 Thread Dan Williams
On Wed, 2011-05-11 at 16:20 +0200, Aleksander Morgado wrote:
 Hi all,
 
 Each MMCallbackInfo holds a weak reference to the MMModem to which the
 AT command was sent. When the MMModem is destroyed,
 mm-callback-info.c::modem_destroyed_cb() gets called and the modem
 pointer in the callback info is reset to NULL, to avoid having a pointer
 to an already disposed MMModem. See [1].
 
 But, in addition to setting the modem pointer to NULL, an explicit error
 is just set directly in the callback info, and it gets scheduled. The
 problem here is that after destroying the modem, the ports it held also
 get closed as part of the destruction, and during port closing there's
 the task of finalizing all pending commands (see
 mm-serial-port.c::mm_serial_port_close()), which is done by passing an
 error to the specific response handler, and which will then (usually)
 propagate that error to the MMCallbackInfo and schedule it right away.
 
 That ends up in a memory leak (most response handlers will just set
 info-error without assuming there may already be a GError set); plus a
 critical in the second call to mm_callback_info_schedule() because a
 callback info can't be scheduled twice.
 
 I've been looking at this problem some time, and the best way to fix it
 seems to be to just avoid setting the first error and scheduling in
 modem_destroyed_cb():
 
 diff --git a/src/mm-callback-info.c b/src/mm-callback-info.c
 index 1986bb5..8f8ecca 100644
 --- a/src/mm-callback-info.c
 +++ b/src/mm-callback-info.c
 @@ -53,13 +53,9 @@ modem_destroyed_cb (gpointer data, GObject
 *destroyed)
  {
  MMCallbackInfo *info = data;
  
 +/* Just set the modem pointer to NULL, do not further process the
 + * callback info */
  info-modem = NULL;
 -if (!info-pending_id) {
 -info-error = g_error_new_literal (MM_MODEM_ERROR,
 -   MM_MODEM_ERROR_REMOVED,
 -   The modem was removed.);
 -mm_callback_info_schedule (info);
 -}
  }
 
 Comments?

How does that play into the mm_modem_check_removed() checks which I
think are one of the more common places this error is looked for?  I
think the motivation here was to be *sure* that any callback handler
waiting for the result got an error.  The thing we want to ensure here
is that if a D-Bus client of MM made a request when the modem was pulled
out or crashed (which can happen pretty often) that the client gets a
reply instead of a timed-out D-Bus method return error.  I'm sure
there's room for cleanup here, but that was the original motivation.  Do
we just need to audit the callback handlers to make sure they do the
right thing with a ERROR_REMOVED?

Dan

___
networkmanager-list mailing list
networkmanager-list@gnome.org
http://mail.gnome.org/mailman/listinfo/networkmanager-list