On Mon, Feb 28, 2011 at 6:41 AM, Open SA Forum AIS Services mailing
list <[email protected]> wrote:
> If the ring id file for the processor is less then 8 bytes, totemsrp would
> assert. Our speculation is that this condition happens during a fencing
> operation or local filesystem corruption.
>
> With this patch, OpenAIS will create fresh ring id file data when the
> incorrect number of bytes are read from the ring id.
>
> Backport of Corosync 7471c883464ccee31424e9a58c71fc5f948f41f9
As far as the backport goes, it looks good.
> ---
> branches/whitetank/exec/totemsrp.c | 34 ++++++++++++++++++++--------------
> 1 files changed, 20 insertions(+), 14 deletions(-)
>
> diff --git a/branches/whitetank/exec/totemsrp.c
> b/branches/whitetank/exec/totemsrp.c
> index 20de8c9..a30dd30 100644
> --- a/branches/whitetank/exec/totemsrp.c
> +++ b/branches/whitetank/exec/totemsrp.c
> @@ -3003,31 +3003,37 @@ static void memb_ring_id_create_or_load (
> struct memb_ring_id *memb_ring_id)
> {
> int fd;
> - int res;
> + int res = 0;
> char filename[256];
>
> sprintf (filename, "%s/ringid_%s",
> rundir, totemip_print (&instance->my_id.addr[0]));
> fd = open (filename, O_RDONLY, 0700);
> - if (fd > 0) {
> - res = read (fd, &memb_ring_id->seq, sizeof (unsigned long
> long));
> - assert (res == sizeof (unsigned long long));
> + /*
> + * If file can be opened and read, read the ring id
> + */
> + if (fd != -1) {
> + res = read (fd, &memb_ring_id->seq, sizeof (uint64_t));
There is at least one case where read() may fail but it should be tried again.
do {
res = read (fd, &memb_ring_id->seq, sizeof (uint64_t));
} while (res == -1 && errno == EINTR);
> close (fd);
> - } else
> - if (fd == -1 && errno == ENOENT) {
> + }
> + /*
> + * If file could not be opened or read, create a new ring id
> + */
> + if ((fd == -1) || (res != sizeof (uint64_t))) {
> memb_ring_id->seq = 0;
> umask(0);
> fd = open (filename, O_CREAT|O_RDWR, 0700);
> - if (fd == -1) {
> + if (fd != -1) {
> + res = write (fd, &memb_ring_id->seq, sizeof
> (uint64_t));
The same thing applies here. I would modify it to try again if errno == EINTR.
> + close (fd);
> + if (res == -1) {
> + log_printf
> (instance->totemsrp_log_level_warning,
> + "Couldn't write ringid file '%s'
> %s\n", filename, strerror (errno));
> + }
> + } else {
> log_printf (instance->totemsrp_log_level_warning,
> - "Couldn't create %s %s\n", filename, strerror
> (errno));
> + "Couldn't create ringid file %s %s\n",
> filename, strerror (errno));
> }
> - res = write (fd, &memb_ring_id->seq, sizeof (unsigned long
> long));
> - assert (res == sizeof (unsigned long long));
> - close (fd);
> - } else {
> - log_printf (instance->totemsrp_log_level_warning,
> - "Couldn't open %s %s\n", filename, strerror (errno));
> }
>
> totemip_copy(&memb_ring_id->rep, &instance->my_id.addr[0]);
> --
> 1.7.3.4
>
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais
>
_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais