Of course. I wanted to make another change before submitting the patch. The patch currently also change update_session() to connect targets in serial as it was a faster change for a testing the issue but it is a problem when you have many targets. it fills slow and forever to finish. I want to change this to leave target connections in parallel (as it is now) but still wait for all the connections to finish before releasing the mutex for another discoveryd fork to enter. I didn't get into it since sending the email but I'll do it this week and some testing. I"ll submit a patch this week.
On Sat, Aug 22, 2015 at 6:49 AM The Lee-Man <[email protected]> wrote: > Roi: Could you supply your changes? > > > On Monday, August 10, 2015 at 11:14:48 PM UTC-7, Roi Dayan wrote: > >> I also played with discoveryd. found an issue when you have many portals. >> can be reproduced with iscsiadm. >> >> The netlink response from the kernel is using multicast. So when we have >> more than 1 portal there is an issue that multiple forks of the discoveryd >> handle the same response. >> I noticed that when configured multiple portals and noticed we tried to >> handle session id 1 more than once and started to get errors about it. >> seems like iscsid is not handling the error very well and becomes >> unresponsive and need to kill the process. >> >> I then used a named mutex in discoveryd to protect the calls >> to __do_st_disc_and_login() and waited for the connections to complete in >> update_sessions(). >> This way I had 1 discoveryd fork working at a time. This solved the issue >> with discoveryd only >> >> You can reproduce the issue with creating multiple discoveryd portals or >> with iscsiadm if you run it in the background and discovery multiple >> portals. >> I used this to reproduce the issue with iscsiadm: >> >> >> portals="11.212.164.1:3261 11.212.164.1:3262 11.212.164.1:3263 >> 11.212.164.1:3264" >> for p in $portals; do >> iscsiadm -m discovery -t st -I iser -p $p -l & >> done >> >> >> >> >> >> >> >> On Mon, Aug 10, 2015 at 10:57 PM, The Lee-Man <[email protected]> >> wrote: >> > On Tuesday, August 4, 2015 at 8:36:14 AM UTC-7, Mike Christie wrote: >>>> >>>> On 08/04/2015 10:33 AM, Mike Christie wrote: >>>> > On 08/04/2015 09:45 AM, Roi Dayan wrote: >>>> >> >>>> >> >>>> >> On Friday, July 24, 2015 at 3:38:00 AM UTC+3, The Lee-Man wrote: >>>> >> >>>> >> On Wednesday, July 22, 2015 at 1:43:59 PM UTC-7, Mike Christie >>>> wrote: >>>> >> >>>> >> On 07/22/2015 10:24 AM, The Lee-Man wrote: >>>> >> > On Tuesday, July 21, 2015 at 9:29:19 PM UTC-7, Mike >>>> Christie >>>> >> wrote: >>>> >> > >>>> >> > On 07/21/2015 05:47 PM, [email protected] >>>> <javascript:> >>>> >> wrote: >>>> >> > > From: Lee Duncan <[email protected] <javascript:>> >>>> >> > > >>>> >> > > This patch allows iser transport to be used for the >>>> >> discovery >>>> >> > > daemon. Otherwise, iscsid core dumps when attempting >>>> this. >>>> >> > > --- >>>> >> > > usr/discoveryd.c | 5 ----- >>>> >> > > 1 file changed, 5 deletions(-) >>>> >> > > >>>> >> > > diff --git a/usr/discoveryd.c b/usr/discoveryd.c >>>> >> > > index 1e149771a50b..2d3ccbcd722f 100644 >>>> >> > > --- a/usr/discoveryd.c >>>> >> > > +++ b/usr/discoveryd.c >>>> >> > > @@ -1034,11 +1034,6 @@ static void >>>> >> __do_st_disc_and_login(struct >>>> >> > discovery_rec *drec) >>>> >> > > drec->u.sendtargets.reopen_max = 0; >>>> >> > > >>>> >> > > iface_link_ifaces(&setup_ifaces); >>>> >> > > - /* >>>> >> > > - * disc code assumes this is not set and >>>> wants >>>> >> to use >>>> >> > > - * the userspace IO code. >>>> >> > > - */ >>>> >> > > - ipc = NULL; >>>> >> > > >>>> >> > > rc = >>>> >> idbm_bind_ifaces_to_nodes(discovery_sendtargets, drec, >>>> >> > > >>>> &setup_ifaces, >>>> >> &rec_list); >>>> >> > > >>>> >> > >>>> >> > Do you need this patch for offload support too, and >>>> does >>>> >> it work ok now >>>> >> > too, or was that already working? >>>> >> > >>>> >> > >>>> >> > That was already working. With this patch, offload via >>>> IB/iSER >>>> >> seems >>>> >> > to be working for us. >>>> >> > >>>> >> >>>> >> For offload, like bnx2i, was it doing discovery through the >>>> offload >>>> >> engine or in software for you? I thought it would crash in >>>> >> iscsi_create_leading_conn when it references the ipc pointer >>>> here: >>>> >> >>>> >> conn->socket_fd = ipc->ctldev_open(); >>>> >> >>>> >> for bnx2i. >>>> >> >>>> >> Your patch is correct. I am just trying to figure out why I >>>> >> wrote that >>>> >> "disc code assumes" comment above. It seems like my comment >>>> in >>>> >> the code >>>> >> is very very wrong, because if CAP_TEXT_NEGO, like with >>>> >> bnx2i/cxgb/be2iscsi and in newer kernels where we now set >>>> that >>>> >> bit iser, >>>> >> then we want a valid ipc pointer. >>>> >> >>>> >> >>>> >> I have never tried running the discovery daemon with bnx2i. >>>> Regular >>>> >> discovery >>>> >> through bnx2i works fine without this patch. >>>> >> >>>> >> So I set it up just now and confirmed: using the discovery >>>> daemon >>>> >> for bnx2i >>>> >> also gets a core dump without this patch. And regular discovery >>>> is >>>> >> verified >>>> >> as working with or without this patch using bnx2i. >>>> >> >>>> >> >>>> >> >>>> >> Hi, >>>> >> >>>> >> any update about this patch ? >>>> >> >>>> > >>>> > Needs more testing and review of the offload code it also enables. I >>>> am >>>> > trying to get to it. >>>> > >>>> >>>> If offload discoveryd support has gone through a distro QA cycle, let >>>> me >>>> know. It would be helpful. >>>> >>> >>> I did some more testing today. I created a target with discoveryd >>> enabled. >>> >>> First, I verified it was working normally by setting startup to >>> automatic, logging out of the target, and letting the discovery daemon >>> recreate the session for me. All was working normally. >>> >>> Then I logged out of my session from the command line, and deleted the >>> "node" record. >>> >>> When the discovery daemon next ran for that target, it re-logged in, >>> though it did not create the "node" entry in the database. >>> >>> So it looks like it worked correct after a node deletion. >>> >> -- >>> You received this message because you are subscribed to a topic in the >>> Google Groups "open-iscsi" group. >>> To unsubscribe from this topic, visit >>> https://groups.google.com/d/topic/open-iscsi/RyQZI1DsDw8/unsubscribe. >>> >> To unsubscribe from this group and all its topics, send an email to >>> [email protected]. >>> To post to this group, send email to [email protected]. >>> >> >>> Visit this group at http://groups.google.com/group/open-iscsi. >>> For more options, visit https://groups.google.com/d/optout. >>> >> -- > You received this message because you are subscribed to a topic in the > Google Groups "open-iscsi" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/open-iscsi/RyQZI1DsDw8/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > [email protected]. > To post to this group, send email to [email protected]. > Visit this group at http://groups.google.com/group/open-iscsi. > For more options, visit https://groups.google.com/d/optout. > -- You received this message because you are subscribed to the Google Groups "open-iscsi" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at http://groups.google.com/group/open-iscsi. For more options, visit https://groups.google.com/d/optout.
