I also played with discoveryd and found an issue when many portals are
configured. It can also be reproduced with iscsiadm.
The netlink response from the kernel is sent as multicast, so when we have
more than one portal, multiple discoveryd forks end up handling the same
response.
I noticed this after configuring multiple portals: we tried to handle
session id 1 more than once and started to get errors about it.
iscsid does not seem to handle the error well. It becomes unresponsive
and the process has to be killed.
I then used a named mutex in discoveryd to protect the calls
to __do_st_disc_and_login() and waited for the connections to complete in
update_sessions().
This way only one discoveryd fork was working at a time. This solved the
issue, but for discoveryd only.
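To illustrate the serialization idea, here is a minimal standalone sketch, not the actual change to discoveryd. It assumes an flock() lock file in place of the named mutex, and the lock path, fake_disc_and_login() and run_serialized_workers() are all hypothetical names made up for this example. Each fork opens the lock file itself, since a descriptor inherited across fork() would share the lock.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock path; the name used in my test patch is not shown here. */
#define DISC_LOCK_PATH "/tmp/discoveryd_sketch.lock"

/* Counters shared by all forks so that any overlap can be observed. */
struct overlap_stats {
	int inside;	/* workers currently in the critical section */
	int max_inside;	/* highest value "inside" ever reached */
};

/* Stand-in for __do_st_disc_and_login() plus waiting for the logins. */
static void fake_disc_and_login(struct overlap_stats *st)
{
	st->inside++;
	if (st->inside > st->max_inside)
		st->max_inside = st->inside;
	usleep(20000);	/* pretend to wait for the connections to complete */
	st->inside--;
}

/*
 * Fork one worker per portal and run the fake discovery under an
 * exclusive lock. Returns the peak number of workers seen inside the
 * critical section, or -1 on failure.
 */
int run_serialized_workers(int nworkers)
{
	struct overlap_stats *st;
	int i, status, failed = 0, peak;

	assert(nworkers > 0);

	st = mmap(NULL, sizeof(*st), PROT_READ | PROT_WRITE,
		  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (st == MAP_FAILED)
		return -1;
	st->inside = 0;
	st->max_inside = 0;

	for (i = 0; i < nworkers; i++) {
		pid_t pid = fork();

		if (pid < 0) {
			failed = 1;
			break;
		}
		if (pid == 0) {
			/* Open after fork so each worker owns its own lock. */
			int fd = open(DISC_LOCK_PATH, O_CREAT | O_RDWR, 0600);

			if (fd < 0 || flock(fd, LOCK_EX) != 0)
				_exit(1);
			fake_disc_and_login(st);
			flock(fd, LOCK_UN);
			close(fd);
			_exit(0);
		}
	}

	/* Reap every worker and note any that did not exit cleanly. */
	while (wait(&status) > 0)
		if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
			failed = 1;

	peak = st->max_inside;
	munmap(st, sizeof(*st));
	unlink(DISC_LOCK_PATH);
	return failed ? -1 : peak;
}
```

Calling run_serialized_workers(4) should return 1, meaning no two workers were ever inside the protected section together. Without the flock() calls the peak would typically climb above 1, which is the overlap the multicast responses run into.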
You can reproduce the issue by configuring multiple discoveryd portals, or
with iscsiadm by running discovery against multiple portals in the
background.
I used this to reproduce the issue with iscsiadm:
portals="11.212.164.1:3261 11.212.164.1:3262 11.212.164.1:3263 11.212.164.1:3264"
for p in $portals; do
    iscsiadm -m discovery -t st -I iser -p "$p" -l &
done
On Mon, Aug 10, 2015 at 10:57 PM, The Lee-Man <[email protected]>
wrote:
> On Tuesday, August 4, 2015 at 8:36:14 AM UTC-7, Mike Christie wrote:
>>
>> On 08/04/2015 10:33 AM, Mike Christie wrote:
>> > On 08/04/2015 09:45 AM, Roi Dayan wrote:
>> >>
>> >>
>> >> On Friday, July 24, 2015 at 3:38:00 AM UTC+3, The Lee-Man wrote:
>> >>
>> >> On Wednesday, July 22, 2015 at 1:43:59 PM UTC-7, Mike Christie
>> wrote:
>> >>
>> >> On 07/22/2015 10:24 AM, The Lee-Man wrote:
>> >> > On Tuesday, July 21, 2015 at 9:29:19 PM UTC-7, Mike Christie
>> >> wrote:
>> >> >
>> >> > On 07/21/2015 05:47 PM, [email protected] wrote:
>> >> > > From: Lee Duncan <[email protected]>
>> >> > >
>> >> > > This patch allows iser transport to be used for the
>> >> discovery
>> >> > > daemon. Otherwise, iscsid core dumps when attempting
>> this.
>> >> > > ---
>> >> > > usr/discoveryd.c | 5 -----
>> >> > > 1 file changed, 5 deletions(-)
>> >> > >
>> >> > > diff --git a/usr/discoveryd.c b/usr/discoveryd.c
>> >> > > index 1e149771a50b..2d3ccbcd722f 100644
>> >> > > --- a/usr/discoveryd.c
>> >> > > +++ b/usr/discoveryd.c
>> >> > > @@ -1034,11 +1034,6 @@ static void
>> >> __do_st_disc_and_login(struct
>> >> > discovery_rec *drec)
>> >> > > drec->u.sendtargets.reopen_max = 0;
>> >> > >
>> >> > > iface_link_ifaces(&setup_ifaces);
>> >> > > - /*
>> >> > > - * disc code assumes this is not set and
>> wants
>> >> to use
>> >> > > - * the userspace IO code.
>> >> > > - */
>> >> > > - ipc = NULL;
>> >> > >
>> >> > > rc =
>> >> idbm_bind_ifaces_to_nodes(discovery_sendtargets, drec,
>> >> > >
>> &setup_ifaces,
>> >> &rec_list);
>> >> > >
>> >> >
>> >> > Do you need this patch for offload support too, and does
>> >> it work ok now
>> >> > too, or was that already working?
>> >> >
>> >> >
>> >> > That was already working. With this patch, offload via
>> IB/iSER
>> >> seems
>> >> > to be working for us.
>> >> >
>> >>
>> >> For offload, like bnx2i, was it doing discovery through the
>> offload
>> >> engine or in software for you? I thought it would crash in
>> >> iscsi_create_leading_conn when it references the ipc pointer
>> here:
>> >>
>> >> conn->socket_fd = ipc->ctldev_open();
>> >>
>> >> for bnx2i.
>> >>
>> >> Your patch is correct. I am just trying to figure out why I
>> >> wrote that
>> >> "disc code assumes" comment above. It seems like my comment in
>> >> the code
>> >> is very very wrong, because if CAP_TEXT_NEGO, like with
>> >> bnx2i/cxgb/be2iscsi and in newer kernels where we now set that
>> >> bit iser,
>> >> then we want a valid ipc pointer.
>> >>
>> >>
>> >> I have never tried running the discovery daemon with bnx2i.
>> Regular
>> >> discovery
>> >> through bnx2i works fine without this patch.
>> >>
>> >> So I set it up just now and confirmed: using the discovery daemon
>> >> for bnx2i
>> >> also gets a core dump without this patch. And regular discovery is
>> >> verified
>> >> as working with or without this patch using bnx2i.
>> >>
>> >>
>> >>
>> >> Hi,
>> >>
>> >> any update about this patch ?
>> >>
>> >
>> > Needs more testing and review of the offload code it also enables. I am
>> > trying to get to it.
>> >
>>
>> If offload discoveryd support has gone through a distro QA cycle, let me
>> know. It would be helpful.
>>
>
> I did some more testing today. I created a target with discoveryd enabled.
>
> First, I verified it was working normally by setting startup to automatic,
> logging out of the target, and letting the discovery daemon recreate the
> session for me. All was working normally.
>
> Then I logged out of my session from the command line, and deleted the
> "node" record.
>
> When the discovery daemon next ran for that target, it re-logged in,
> though it did not create the "node" entry in the database.
>
> So it looks like it worked correctly after a node deletion.
>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "open-iscsi" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/open-iscsi/RyQZI1DsDw8/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/open-iscsi.
> For more options, visit https://groups.google.com/d/optout.
>