> On Nov 21, 2016, at 6:14 AM, Hannes Reinecke <[email protected]> wrote:
>
> On 11/19/2016 08:46 PM, The Lee-Man wrote:
>> In this wonderful new world of systemd, I have an issue with stopping
>> the iscsid service when the daemon has died or been killed.
>>
>> My setup:
>> * I have an iscsid.socket unit file, which is enabled and started
>> * I have an iscsid.service unit file, which controls the iscsid daemon.
>> This is disabled and not started
>>
>> Normally, if I run a command like "iscsiadm -m discovery -t st -p
>> SOME-TARGET", systemd notices that iscsiadm is trying to talk to iscsid
>> through the socket, and it starts up iscsid. This is the cool part
>> (IMHO) of systemd socket activation.
>>
>> When I want to stop iscsid, I can just tell systemctl to do it via
>> "systemctl stop iscsid", and it runs the "ExecStop=" command in the
>> service unit file, which is "iscsiadm -k 0 2" before terminating the
>> daemon process(es).
>>
>> [NOTE: the "2" here in this command actually does nothing and is
>> ignored, but I copied this from someplace else long ago, and the "2" was
>> present there.]
>>
>> It is of importance, in this case, that the ExecStop command actually
>> sends an IPC message to the daemon (iscsid) requesting it to cleanly
>> shut itself down. Herein lies the rub.
>>
>> All of this works great until the daemon happens not to be running. You
>> can simulate this with "kill -TERM $(pidof iscsid)" when the daemon is
>> running. Now you are in a situation where systemd started the service
>> and knows it is now not running, so it seems to send the ExecStop
>> command to cleanly shut it down. This command hangs! It seems to be
>> stuck in an infinite loop trying to send the shutdown command to the
>> daemon, waiting for it to timeout, then trying again. The daemon starts
>> up, sees an IPC error, and exits.
>>
>> While this clearly seems like a systemd issue, I have found a workaround
>> that looks clean. Well, as clean as the shutdown command is, anyway.
>
> No, that's not the case.
> This is a plain issue with iscsiadm; it's waiting in 'recv' to receive
> an response packet from iscsid which of course never will come.
> The reason why this becomes an issue now is that systemd holds the
> socket for us, so the socket is available; hence 'recv()' will not
> return with ENOTCONN or somesuch, indicating that the socket isn't present.
> The socket _is_ present, it's just not sending any responses back.
>
> Attached is a patch for open-iscsi to use 'poll()' before recv(), and
> terminating it if we didn't get a response in time.
>
> This doesn't do anything from the mentioned socket activation issue for
> iscsid, but at least iscsiadm doesn't hang anymore.
>
> Cheers,
>
> Hannes
One issue with the patch. It looks like you might need a “break” added
after the “log_error()” when poll() fails, or it could be an infinite loop:
+ while (len) {
+ struct pollfd pfd;
+
+ pfd.fd = fd;
+ pfd.events = POLLIN;
+ err = poll(&pfd, 1, timeout);
+ if (!err) {
+ if (poll_wait)
+ continue;
+ return ISCSI_ERR_ISCSID_NOTCONN;
+ } else if (err < 0) {
+ if (errno == EINTR)
+ continue;
+ log_error("got poll error (%d/%d), daemon died?",
+ err, errno);
+ return ISCSI_ERR_ISCSID_COMM_ERR;
+ } else if (!pfd.revents & POLLIN)
+ continue;
+ err = recv(fd, rsp, sizeof(*rsp), MSG_WAITALL);
+ if (err < 0) {
+ log_error("got read error (%d/%d), daemon died?",
+ err, errno);
break;
<<===== Add a break here?
+ } else {
+ len -= err;
+ iscsi_err = rsp->err;
+ }
+ }
I can make that adjustment, if you agree it’s needed, when applying the patch.
--
The Lee-Man
Prepare to be unprepared. -- Me
--
You received this message because you are subscribed to the Google Groups
"open-iscsi" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/open-iscsi.
For more options, visit https://groups.google.com/d/optout.