Pete Wyckoff wrote:
[EMAIL PROTECTED] wrote on Tue, 04 Mar 2008 17:35 -0600:
It looks like the IB BMI layer is ending up double-freeing the method_addr structure in the BMI_ib_set_info function, but it only happens when the metadata server is also a data server.

If you look at the following GDB output, the last two entries have the same method_addr, and I can't figure out a good way to tell in BMI_set_info whether the method_addr has already been freed. It also looks like the id_string has been mangled or freed somewhere earlier as well.
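
In C there is no portable way to ask a pointer whether its memory has already been freed, so the usual workaround is to make ownership explicit. The following is only a minimal sketch of reference counting under that assumption; struct demo_addr and the demo_addr_* functions are hypothetical names for illustration and are not part of BMI.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a method address; NOT the real BMI struct. */
struct demo_addr {
    int   refcount;   /* number of live holders */
    char *id_string;  /* owned copy of the address string */
};

/* Counts how many times the underlying memory was really released. */
static int demo_free_count = 0;

/* Allocate an address with a single holder; returns NULL on failure. */
static struct demo_addr *demo_addr_new(const char *id)
{
    struct demo_addr *a = malloc(sizeof(*a));
    if (!a)
        return NULL;
    a->id_string = malloc(strlen(id) + 1);
    if (!a->id_string) {
        free(a);
        return NULL;
    }
    strcpy(a->id_string, id);
    a->refcount = 1;
    return a;
}

/* Register one more holder of the same address. */
static void demo_addr_get(struct demo_addr *a)
{
    assert(a->refcount > 0);
    a->refcount++;
}

/* Drop one holder. Memory is released only when the last holder lets go.
 * Returns 1 if this call released the memory, 0 otherwise. */
static int demo_addr_put(struct demo_addr *a)
{
    assert(a->refcount > 0);
    if (--a->refcount > 0)
        return 0;
    free(a->id_string);
    free(a);
    demo_free_count++;
    return 1;
}
```

With this shape, two references that resolve to the same address each call demo_addr_put() once, and only the second call frees anything. Whether that fits the existing reference list code is a separate question.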

All your deadref values were different there, so I'm not seeing the
double-free aspect.  But I have no doubt that you're on to something
in here.  Also, at this location, the id_string and method_addr have
already been freed, so we shouldn't count on them having reasonable
values in them.

I've always had a hard time keeping these references straight.  Can
you verify that you're getting to these spots via dealloc_ref_st(),
and maybe a couple steps up from there, for sanity?

Trying to figure out what other devices do in the DROP_ADDR handler.
MX goes and calls bmi_method_addr_forget_callback() in there, but
that doesn't seem right, as it will just wind around through
dealloc_ref_st() again.  It looks like TCP is doing more or less
what IB is doing.

Is your situation funky because you're using the comma-list notation
for addresses, perhaps?  I think that's pretty uncommon these days.

                -- Pete

It does appear the comma-list notation is causing some strangeness.
The situation I had before was that there were 4 calls to dealloc_ref_st, and the last two pointed to the same method_addr. (I've verified this with some debug() prints in BMI_ib_set_info as well.)
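
For anyone wanting to reproduce the check, a small tracker along these lines makes a repeated pointer stand out in the logs. This is a rough sketch and not the actual debug code; every demo_* name and the table size are made up for illustration. Note that an allocator can hand the same address out again, so a repeat only means a double free if the pointer was not reallocated in between, which is why the sketch also has a hook to call on allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_MAX_TRACKED 64   /* arbitrary table size for the sketch */

/* Pointers that have been released and not reallocated since. */
static const void *demo_released[DEMO_MAX_TRACKED];
static size_t demo_released_count = 0;

/* Record a pointer that is about to be released.
 * Returns 1 if the same pointer is already recorded, else 0. */
static int demo_note_release(const void *p, const char *where)
{
    size_t i;

    assert(where != NULL);
    for (i = 0; i < demo_released_count; i++) {
        if (demo_released[i] == p) {
            fprintf(stderr, "%s: %p released again\n", where, (void *)p);
            return 1;
        }
    }
    /* Silently stop recording once the table is full. */
    if (demo_released_count < DEMO_MAX_TRACKED)
        demo_released[demo_released_count++] = p;
    return 0;
}

/* Call when an address is (re)allocated, so that a pointer value the
 * allocator reuses is not misreported as a double release later. */
static void demo_note_alloc(const void *p)
{
    size_t i;

    for (i = 0; i < demo_released_count; i++) {
        if (demo_released[i] == p) {
            /* Remove by moving the last entry into this slot. */
            demo_released[i] = demo_released[--demo_released_count];
            return;
        }
    }
}
```

Calling demo_note_release(ptr, "some label") right before each free would then flag the fourth call in the sequence described above.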

If the comma-list notation is that problematic, we should probably remove support for it.
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
