Hello Ben, Hannes I'm sorry for the late reply.
> You can't just get the wwid with no work (look at all work uev_add_path > does, specifically alloc_path_with_pathinfo). Now you could reorder > this, but there isn't much point, since it is doing useful things, like > checking if this is a spurious uevent, and necessary things, like > figuring out the device type and using that that the configuration to > figure out HOW to get the wwid. IMO, WWID can be geted from uevent, since it has a ID_SERIAL filed in uevent body. > It seems like what you want to do is to > call uev_add_path multiple times, but defer most of the work that > ev_add_path does (creating or updating the multipath device), until > you've processed all that paths. Not exactly, the input parameter "struct uevent *uev" is a batch of merged uevents, so uev_add_path() is called once to process the merged uevent. > split off uev_add_path() and > ev_add_path(). > Then uev_add_path could generate a list of fully-formed paths which > ev_add_path() would process. > IE generalize coalesce_paths() to work on a passed-in list rather than > the normal vecs->pathvec. Hannes, I think my thoughts are close to your idea now, In uev_add_path(), we get all information of the merged paths, and then call ev_add_path() to create or update the multipath device. Maybe the different between us is "processing all the same type uevents(add, etc.) in ev_add_path()" or "processing all the same type uevents(add, etc.) which came from the same LUN in ev_add_path()". I think your idea can also reach the goal, and reduce the DM reload times. So we will try to code as your idea, and in list_merger_uevents(&uevq_tmp) only merger the same type(add, etc.) uevents, add stop mergering when another type uevents are occured. Thanks all, Tang 发件人: Hannes Reinecke <h...@suse.de> 收件人: Benjamin Marzinski <bmarz...@redhat.com>, tang.jun...@zte.com.cn, 抄送: dm-devel@redhat.com, zhang.ka...@zte.com.cn, Martin Wilck <mwi...@suse.com>, Bart Van Assche <bart.vanass...@sandisk.com> 日期: 2016/11/28 23:46 主题: Re: [dm-devel] Improve processing efficiency for addition and deletion of multipath devices 发件人: dm-devel-boun...@redhat.com On 11/28/2016 04:25 PM, Benjamin Marzinski wrote: > On Mon, Nov 28, 2016 at 10:19:15AM +0800, tang.jun...@zte.com.cn wrote: >> Hello Christophe, Ben, Hannes, Martin, Bart, >> I am a member of host-side software development team of ZXUSP storage >> project >> in ZTE Corporation. Facing the market demand, our team decides to write >> code to >> promote multipath efficiency next month. The whole idea is in the mail >> below.We >> hope to participate in and make progress with the open source community, >> so any >> suggestion and comment would be welcome. > > Like I mentioned before, I think this is a good idea in general, but the > devil is in the details here. > >> >> Thanks, >> Tang >> >> ------------------------------------------------------------------------------------------------------------------------------ >> ------------------------------------------------------------------------------------------------------------------------------ >> 1. Problem >> In these scenarios, multipath processing efficiency is low: >> 1) Many paths exist in each multipath device, >> 2) Devices addition or deletion during iSCSI login/logout or FC link >> up/down. > > <snip> > >> 4. Proposal >> Other than processing uevents one by one, uevents which coming from the >> same LUN devices can be mergered to one, and then uevent processing >> thread only needs to process it once, and it only produces one DM addition >> uevent which could reduce system resource consumption. >> >> The example in Chapter 2 is continued to use to explain the proposal: >> 1) Multipath receives block device addition uevents from udev: >> UDEV [89068.806214] add >> /devices/platform/host3/session44/target3:0:0/3:0:0:0/block/sdc (block) >> UDEV [89068.909457] add >> /devices/platform/host3/session44/target3:0:0/3:0:0:2/block/sdg (block) >> UDEV [89068.944956] add >> /devices/platform/host3/session44/target3:0:0/3:0:0:1/block/sde (block) >> UDEV [89068.959215] add >> /devices/platform/host5/session46/target5:0:0/5:0:0:0/block/sdh (block) >> UDEV [89068.978558] add >> /devices/platform/host5/session46/target5:0:0/5:0:0:2/block/sdk (block) >> UDEV [89069.004217] add >> /devices/platform/host5/session46/target5:0:0/5:0:0:1/block/sdj (block) >> UDEV [89069.486361] add >> /devices/platform/host4/session45/target4:0:0/4:0:0:1/block/sdf (block) >> UDEV [89069.495194] add >> /devices/platform/host4/session45/target4:0:0/4:0:0:0/block/sdd (block) >> UDEV [89069.511628] add >> /devices/platform/host4/session45/target4:0:0/4:0:0:2/block/sdi (block) >> UDEV [89069.716292] add >> /devices/platform/host6/session47/target6:0:0/6:0:0:0/block/sdl (block) >> UDEV [89069.748456] add >> /devices/platform/host6/session47/target6:0:0/6:0:0:1/block/sdm (block) >> UDEV [89069.789662] add >> /devices/platform/host6/session47/target6:0:0/6:0:0:2/block/sdn (block) >> >> 2) Multipath merges these 12 uevents to 3 internal uvents >> UEVENT add sdc sdh sdd sdl >> UEVENT add sde sdj sdf sdm >> UEVENT add sdg sdk sdi sdn >> >> 3) Multipath process these 3 uevents one by one, and only produce 3 >> addition >> DM uvents, no dm change uevent exists. >> KERNEL[89068.899614] add /devices/virtual/block/dm-2 (block) >> KERNEL[89068.955364] add /devices/virtual/block/dm-3 (block) >> KERNEL[89069.018903] add /devices/virtual/block/dm-4 (block) > > Just because I'm pedantic: There will, of cource, be dm change events. > Without them, you couldn't have a multipath device. Whenever you load a > table in a dm device (including during the initial creation), you get a > change event. > >> 4) Udev process these uevents above, and transfer it to multipath >> UDEV [89068.926428] add /devices/virtual/block/dm-2 (block) >> UDEV [89069.007511] add /devices/virtual/block/dm-3 (block) >> UDEV [89069.098054] add /devices/virtual/block/dm-4 (block) > > multipathd ignores add events for dm devices (look at uev_trigger). A dm > device isn't set up until it's initial change event happens. > >> 5) Multipath processes these uevents above, and finishes the creation of >> multipath >> devices. >> >> 5. Coding >> After taking over uevents form uevent listening thread, uevent processing >> thread can >> merger these uevents before processing�� >> int uevent_dispatch(int (*uev_trigger)(struct uevent *, void * >> trigger_data), >> void * trigger_data) >> { >> ... >> while (1) { >> ... >> list_splice_init(&uevq, &uevq_tmp); >> ... >> list_merger_uevents(&uevq_tmp); >> service_uevq(&uevq_tmp); >> } >> ... >> } >> In structure of ��struct uevent�� , an additional member of ��char >> wwid[WWID_SIZE]�� is >> added to record each device WWID for addition or change uevent to identify >> whether >> these uevents coming from the same LUN, and an additional member of >> ��struct list_head merger_node�� is added to record the list of uevents >> which having been >> merged with this uevent: >> struct uevent { >> struct list_head node; >> struct list_head merger_node; >> char wwid[WWID_SIZE] >> struct udev_device *udev; >> ... >> }; > > You can't just get the wwid with no work (look at all work uev_add_path > does, specifically alloc_path_with_pathinfo). Now you could reorder > this, but there isn't much point, since it is doing useful things, like > checking if this is a spurious uevent, and necessary things, like > figuring out the device type and using that that the configuration to > figure out HOW to get the wwid. It seems like what you want to do is to > call uev_add_path multiple times, but defer most of the work that > ev_add_path does (creating or updating the multipath device), until > you've processed all that paths. > Which is kinda what I've proposed; split off uev_add_path() and ev_add_path(). Then uev_add_path could generate a list of fully-formed paths which ev_add_path() would process. IE generalize coalesce_paths() to work on a passed-in list rather than the normal vecs->pathvec. Or consider using vecs->pathvec to hold 'orphan' paths when processing uevents; then we could add the paths to the pathvec in uev_add_path() and have another thread going through the list of paths and trying to call coalesce_paths() for the changed mpps. Will be a tad tricky, as we'd need to identify which mpps needs to be updated; but nothing too awkward. Meanwhile I'm working on converting vecs->pathvec and vecs->mpvec to RCU lists; that should get rid of the need for most locks, which I suspect being the main reason for the scaling issues. Cheers, Hannes -- Dr. Hannes Reinecke Teamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
-- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel