Sasha Khapyorsky wrote: I'm working on the changes below. I will send them all as one patch
EZ > Hi Eitan, > > On 15:19 Thu 15 Jun , Eitan Zahavi wrote: > >>>>+/* >>>>+* PARAMETERS >>>>+* p_physp >>>>+* [in] Pointer to an osm_physp_t object. >>>>+* >>>>+* RETURN VALUES >>>>+* The pointer to the P_Key table object. >>>>+* >>>>+* NOTES >>>>+* >>>>+* SEE ALSO >>>>+* Port, Physical Port >>>>+*********/ >>>>+ >>> >>> >>>Is not this simpler to remove 'const' from existing >>>osm_physp_get_pkey_tbl() function instead of using new one? >> >>There are plenty of const functions using this function internally >>so I would have need to fix them too. > > > You are right. Maybe separate patch for this? > I think it is preferable to keep the const function. > >>>>@@ -118,14 +121,29 @@ void osm_pkey_tbl_sync_new_blocks( >>>> p_block = cl_ptr_vector_get(&p_pkey_tbl->blocks, b); >>>> if ( b < new_blocks ) >>>> p_new_block = cl_ptr_vector_get(&p_pkey_tbl->new_blocks, b); >>>>- else { >>>>+ else >>>>+ { >>>> p_new_block = (ib_pkey_table_t *)malloc(sizeof(*p_new_block)); >>>> if (!p_new_block) >>>> break; >>>>+ cl_ptr_vector_set(&((osm_pkey_tbl_t >>>>*)p_pkey_tbl)->new_blocks, + >>>> b, >>>>p_new_block); >>>>+ } >>>>+ >>>> memset(p_new_block, 0, sizeof(*p_new_block)); >>>>- cl_ptr_vector_set(&((osm_pkey_tbl_t *)p_pkey_tbl)->new_blocks, b, >>>>p_new_block); >>>> } >>>>- memcpy(p_new_block, p_block, sizeof(*p_new_block)); >>>>+} >>> >>> >>>You changed this function so it does not do any sync anymore. Should >>>function name be changed too? >> >>Yes correct I will change it. Is a better name: >>osm_pkey_tbl_init_new_blocks ? > > > Great name. > > >>>>+ to show that on the "old" blocks >>>>+*/ >>>>+int >>>>+osm_pkey_tbl_set_new_entry( >>>>+ IN osm_pkey_tbl_t *p_pkey_tbl, >>>>+ IN uint16_t block_idx, >>>>+ IN uint8_t pkey_idx, >>>>+ IN uint16_t pkey) >>>>+{ >>>>+ ib_pkey_table_t *p_old_block; >>>>+ ib_pkey_table_t *p_new_block; >>>>+ >>>>+ if (osm_pkey_tbl_make_block_pair( >>>>+ p_pkey_tbl, block_idx, &p_old_block, &p_new_block)) >>>>+ return 1; >>>>+ >>>>+ cl_map_insert( &p_pkey_tbl->keys, >>>>+ ib_pkey_get_base(pkey), >>>>+ >>>>&(p_old_block->pkey_entry[pkey_idx])); >>> >>> >>>Here you map potentially empty pkey entry. Why? "old block" will be >>>remapped anyway on pkey receiving. >> >>The reason I did this was that if the GetResp will fail I still want to >>represent >>the settings in the map.But actually it might be better not to do that so >>next >>time we run we will not find it without a GetResp. > > > Agree. > > >>>>+ IN uint16_t *p_pkey, >>>>+ OUT uint32_t *p_block_idx, >>>>+ OUT uint8_t *p_pkey_index) >>>>+{ >>>>+ uint32_t num_of_blocks; >>>>+ uint32_t block_index; >>>>+ ib_pkey_table_t *block; >>>>+ >>>>+ CL_ASSERT( p_pkey_tbl ); >>>>+ CL_ASSERT( p_block_idx != NULL ); >>>>+ CL_ASSERT( p_pkey_idx != NULL ); >>> >>> >>>Why last two CL_ASSERTs? What should be problem with uninitialized >>>pointers here? >>> >> >>These are the outputs of the function. It does not make sense to call the >>functions with >>null output pointers (calling by ref) . Anyway instead of putting the check >>in the free build >>I used an assert > > > I see. Actually I've overlooked that addresses and not values are > checked. Please ignore this comment. > > >>>>+ >>>>+ p_pkey_tbl = osm_physp_get_mod_pkey_tbl( p_physp ); >>>>+ if (! p_pkey_tbl) >>> >>> ^^^^^^^^^^^^^ >>>Is it possible? >> >>Yes it is ! I run into it during testing. The port did not have any pkey >>table. > > > static inline osm_pkey_tbl_t * > osm_physp_get_mod_pkey_tbl( IN osm_physp_t* const p_physp ) > { > ... > return( &p_physp->pkeys ); > }; > > This returns the address of physp's pkeys field. Right? > Then if ( &p_physp->pkeys == NULL ) p_physp pointer should be equal to > unsigned equivalent of -(offset of pkey field in physp struct). Correct. I will remove the check. > > >>>>+ "Fail to allocate new pending pkey >>>>entry for node " >>>>+ "0x%016" PRIx64 " port %u\n", >>>>+ cl_ntoh64( osm_node_get_node_guid( >>>>p_node ) ), >>>>+ osm_physp_get_port_num( p_physp ) ); >>>>+ return; >>>>+ } >>>>+ p_pending->pkey = pkey; >>>>+ p_orig_pkey = cl_map_get( &p_pkey_tbl->keys, ib_pkey_get_base( pkey >>>>) ); >>>>+ if ( !p_orig_pkey || >>>>+ (ib_pkey_get_base(*p_orig_pkey) != ib_pkey_get_base(pkey) >>>>)) >>> >>> >>>There the cases of new pkey and updated pkey membership is mixed. Why? >> >>I am not following your question. >>The specific case I am trying to catch is the one that for some reason the >>map points to >>a pkey entry that was modified somehow and is different then the one you >>would expect by >>the map. > > > Didn't understand it at first pass, now it is clearer. > > If pkey entry was modified somehow (how? bugs?), the assumption is that > mapping still be valid? Then it is not new entry (or we will change > pkey's index in the real table). > PKey table mismatch between the block and map should never happen. I will remove the check and replace that with an ASSERT so I catch the bug if we hit it. > >>>>+ { >>>>+ p_pending->is_new = TRUE; >>>>+ cl_qlist_insert_tail(&p_pkey_tbl->pending, >>>>(cl_list_item_t*)p_pending); >>>>+ stat = "inserted"; >>>>+ } >>>>+ else >>>>+ { >>>>+ p_pending->is_new = FALSE; >>>>+ if (osm_pkey_tbl_get_block_and_idx(p_pkey_tbl, p_orig_pkey, >>>>+ >>>>&p_pending->block, &p_pending->index)) >>> >>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>AFAIK in this function there were CL_ASSERTs which check for uinitialized >>>pointers. >> >>True. So the asserts are not required in this case. > > > Up to you. Actually this my comment may be ignored, as stated above I > didn't read this correctly. > > >>> >>>>+ { >>>>+ osm_log( p_log, OSM_LOG_ERROR, >>>>+ "pkey_mgr_process_physical_port: >>>>ERR 0503: " >>>>+ "Fail to obtain P_Key 0x%04x >>>>block and index for node " >>>>+ "0x%016" PRIx64 " port %u\n", >>>>+ cl_ntoh64( >>>>osm_node_get_node_guid( p_node ) ), >>>>+ osm_physp_get_port_num( >>>>p_physp ) ); >>>>+ return; >>>>+ } >>>>+ cl_qlist_insert_head(&p_pkey_tbl->pending, >>>>(cl_list_item_t*)p_pending); >>>>+ stat = "updated"; >>> >>> >>>Is it will be updated? It is likely "already there" case. No? >>> >>>Also in this case you can already put the pkey in new_block instead of >>>holding it in pending list. Then later you will only need to add new >>>pkeys. This may simplify the flow and even save some mem. >> >>True but in my mind it does not simplify - on the contrary it makes the >>partition between >>populating each port pending list and actually setting the pkey tables >>mixed. > > > I meant new_block filling, not actual setting. You will be able to > remove whole if { } else { } flow, as well as is_new, block and index > fields from 'pending' structure (actually only pkey value itself will > matter) - is it not nice simplification? I still prefer the clear staging: append to list when scanning the partitions and filling in the tables when looping on all ports. > > >>I do not think the memory impact deserves this mix of staging >> >> >>> > >>>>+ max_num_of_blocks = pkey_mgr_get_physp_max_blocks( p_req->p_subn, >>>>p_physp ); >>>>+ if ( p_pkey_tbl->max_blocks > max_num_of_blocks ) >>>> { >>>>- block = osm_pkey_tbl_new_block_get( p_pkey_tbl, block_index ); >>>>- for ( i = 0; i < IB_NUM_PKEY_ELEMENTS_IN_BLOCK; i++ ) >>>>+ osm_log( p_log, OSM_LOG_INFO, >>>>+ "pkey_mgr_update_port: " >>>>+ "Max number of blocks reduced from >>>>%u to %u " + "for node 0x%016" >>>>PRIx64 " >>>>port %u\n", >>>>+ p_pkey_tbl->max_blocks, >>>>max_num_of_blocks, >>>>+ cl_ntoh64( osm_node_get_node_guid( >>>>p_node ) ), >>>>+ osm_physp_get_port_num( p_physp ) ); >>>>+ } >>>>+ p_pkey_tbl->max_blocks = max_num_of_blocks; >>>>+ >>>>+ osm_pkey_tbl_sync_new_blocks( p_pkey_tbl ); >>>>+ cl_map_remove_all( &p_pkey_tbl->keys ); >>> >>> >>>What is the reason to drop map here? AFAIK it will be reinitialized later >>>anyway when pkey blocks will be received. >> >>What if it is not received? > > > Then we will have unreliable data there. > > Maybe I know why you wanted this - this is part of "use pkey tables > before sending/receiving to/from ports" idea? > > >>>>@@ -255,24 +443,36 @@ pkey_mgr_update_peer_port( >>>> if (enforce == FALSE) >>>> return FALSE; >>>> >>>>- p_pkey_tbl = osm_physp_get_pkey_tbl( p ); >>>>- p_peer_pkey_tbl = osm_physp_get_pkey_tbl( peer ); >>>>+ p_pkey_tbl = osm_physp_get_pkey_tbl( p_physp ); >>>>+ p_peer_pkey_tbl = osm_physp_get_mod_pkey_tbl( peer ); >>>> num_of_blocks = osm_pkey_tbl_get_num_blocks( p_pkey_tbl ); >>>>- if ( num_of_blocks > osm_pkey_tbl_get_num_blocks( p_peer_pkey_tbl ) ) >>>>- num_of_blocks = osm_pkey_tbl_get_num_blocks( p_peer_pkey_tbl ); >>>>+ peer_max_blocks = pkey_mgr_get_physp_max_blocks( p_subn, peer ); >>>>+ if (peer_max_blocks < p_pkey_tbl->used_blocks) >>>>+ { >>>>+ osm_log( p_log, OSM_LOG_ERROR, >>>>+ "pkey_mgr_update_peer_port: ERR >>>>0508: " >>>>+ "not enough entries (%u < %u) on >>>>switch 0x%016" PRIx64 >>>>+ " port %u\n", >>>>+ peer_max_blocks, num_of_blocks, >>>>+ cl_ntoh64( osm_node_get_node_guid( >>>>p_node ) ), >>>>+ osm_physp_get_port_num( peer ) ); >>>>+ return FALSE; >>> >>> >>>Do you think it is the best way, just to skip update - partitions are >>>enforced already on the switch. May be better to truncate pkey tables >>>in order to meet peer's capabilities? >> >>You are right about that - Its a bug! >>I think the best approach here is to turn off the enforcement on the switch. >>If we truncate the table we actually impact connectivity of the fabric. >>I prefer a softer approach - an error in the log. > > > Yes this should be good way to handle this. > > >>> >>>>+ } >>>> >>>>- for ( block_index = 0; block_index < num_of_blocks; block_index++ ) >>>>+ p_peer_pkey_tbl->used_blocks = p_pkey_tbl->used_blocks; >>>>+ for ( block_index = 0; block_index < p_pkey_tbl->used_blocks; >>>>block_index++) >>>> { >>>> block = osm_pkey_tbl_new_block_get( p_pkey_tbl, block_index ); >>>> peer_block = osm_pkey_tbl_block_get( p_peer_pkey_tbl, block_index ); >>>> if ( memcmp( peer_block, block, sizeof( *peer_block ) ) ) >>>> { >>>>+ osm_pkey_tbl_set(p_peer_pkey_tbl, block_index, >>>>block); >>> >>> >>>Why this (osm_pkey_tbl_set())? This will be called by receiver. >> >>Same as the above note about updating the map >>I wanted to avoid to wait for the GetResp. >>I think it is a mistake and we can actually remove it. > > > Agree. > > Sasha. _______________________________________________ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general