On 02/28/2011 06:45 AM, Open SA Forum AIS Services mailing list wrote: > All signals are registered by calling "signal" so at least on Linux, > EINTR is NEVER returned. > > Open SA Forum AIS Services mailing list wrote: >> On Mon, Feb 28, 2011 at 6:41 AM, Open SA Forum AIS Services mailing >> list <[email protected]> wrote: >>> If the ring id file for the processor is less then 8 bytes, totemsrp would >>> assert. Our speculation is that this condition happens during a fencing >>> operation or local filesystem corruption. >>> >>> With this patch, OpenAIS will create fresh ring id file data when the >>> incorrect number of bytes are read from the ring id. >>> >>> Backport of Corosync 7471c883464ccee31424e9a58c71fc5f948f41f9 >> >> As far as the backport goes, it looks good. >> >>> --- >>> branches/whitetank/exec/totemsrp.c | 34 >>> ++++++++++++++++++++-------------- >>> 1 files changed, 20 insertions(+), 14 deletions(-) >>> >>> diff --git a/branches/whitetank/exec/totemsrp.c >>> b/branches/whitetank/exec/totemsrp.c >>> index 20de8c9..a30dd30 100644 >>> --- a/branches/whitetank/exec/totemsrp.c >>> +++ b/branches/whitetank/exec/totemsrp.c >>> @@ -3003,31 +3003,37 @@ static void memb_ring_id_create_or_load ( >>> struct memb_ring_id *memb_ring_id) >>> { >>> int fd; >>> - int res; >>> + int res = 0; >>> char filename[256]; >>> >>> sprintf (filename, "%s/ringid_%s", >>> rundir, totemip_print (&instance->my_id.addr[0])); >>> fd = open (filename, O_RDONLY, 0700); >>> - if (fd > 0) { >>> - res = read (fd, &memb_ring_id->seq, sizeof (unsigned long >>> long)); >>> - assert (res == sizeof (unsigned long long)); >>> + /* >>> + * If file can be opened and read, read the ring id >>> + */ >>> + if (fd != -1) { >>> + res = read (fd, &memb_ring_id->seq, sizeof (uint64_t)); >> >> There is at least one case where read() may fail but it should be tried >> again. >>
We just went through this exercise with bugzilla #: https://bugzilla.redhat.com/show_bug.cgi?id=679534 TLDR is that linux atleast restarts interrupted system calls by default >> do { >> res = read (fd, &memb_ring_id->seq, sizeof (uint64_t)); >> } while (res == -1 && errno == EINTR); >> >>> close (fd); >>> - } else >>> - if (fd == -1 && errno == ENOENT) { >>> + } >>> + /* >>> + * If file could not be opened or read, create a new ring id >>> + */ >>> + if ((fd == -1) || (res != sizeof (uint64_t))) { >>> memb_ring_id->seq = 0; >>> umask(0); >>> fd = open (filename, O_CREAT|O_RDWR, 0700); >>> - if (fd == -1) { >>> + if (fd != -1) { >>> + res = write (fd, &memb_ring_id->seq, sizeof >>> (uint64_t)); >> >> The same thing applies here. I would modify it to try again if errno == >> EINTR. >> >>> + close (fd); >>> + if (res == -1) { >>> + log_printf >>> (instance->totemsrp_log_level_warning, >>> + "Couldn't write ringid file '%s' >>> %s\n", filename, strerror (errno)); >>> + } >>> + } else { >>> log_printf (instance->totemsrp_log_level_warning, >>> - "Couldn't create %s %s\n", filename, >>> strerror (errno)); >>> + "Couldn't create ringid file %s %s\n", >>> filename, strerror (errno)); >>> } >>> - res = write (fd, &memb_ring_id->seq, sizeof (unsigned long >>> long)); >>> - assert (res == sizeof (unsigned long long)); >>> - close (fd); >>> - } else { >>> - log_printf (instance->totemsrp_log_level_warning, >>> - "Couldn't open %s %s\n", filename, strerror >>> (errno)); >>> } >>> >>> totemip_copy(&memb_ring_id->rep, &instance->my_id.addr[0]); >>> -- >>> 1.7.3.4 >>> >>> _______________________________________________ >>> Openais mailing list >>> [email protected] >>> https://lists.linux-foundation.org/mailman/listinfo/openais >>> >> _______________________________________________ >> Openais mailing list >> [email protected] >> https://lists.linux-foundation.org/mailman/listinfo/openais > > _______________________________________________ > Openais mailing list > [email protected] > https://lists.linux-foundation.org/mailman/listinfo/openais _______________________________________________ Openais mailing list [email protected] https://lists.linux-foundation.org/mailman/listinfo/openais
