Rick van der Zwet wrote:
I have been (re)searching and reading about the options for H(igh) A(vailability) file storage on FreeBSD, but cannot yet find a properly working solution. Any advice is welcome!

I would like to be able to mirror a complete disk between two servers, so that in case of hardware failure of server A (master), server B (slave) immediately takes over without any loss of data. The network configuration is easy using ucarp/VRRP, but the file system is the hard part. Paths I have investigated:

a) ggate & gmirror: Export the disk on Server B to Server A with ggated, and use gmirror on Server A to keep both disks identical. When ggated on Server B actually goes down, the whole setup freezes until ggated is back up again. Second, on network delays gmirror drops out of sync and has to resynchronize from scratch, leaving the machine at risk in the meantime.

The freezing has come to an end with the attached patch, but is the
patch the right way to go (C coding is not my strongest point)?

To test:
  # Create backup filesystem & export it
  serverB$ truncate -s100m /root/ha-slave.img
  serverB$ echo "192.168.33.41 RW /root/ha-slave.img" > /etc/gg.exports
  serverB$ ggated

  # Apply attached patch
  serverA$ cd /usr/src/sbin/ggate/ggatec
  serverA$ patch < %%ATTACHED_FILE%%
  serverA$ make clean install
  # Local file image
  serverA$ truncate -s 100m /root/ha-master.img
  serverA$ mdconfig -a -t vnode -f /root/ha-master.img
  # Remote file image
  serverA$ ggatec create 192.168.33.42 /root/ha-slave.img
  # Mirror building
  serverA$ gmirror label hamirror ggate0 md0
  serverA$ newfs /dev/mirror/hamirror
  serverA$ mount /dev/mirror/hamirror /mnt

Note: if you have _not_ applied the patch and you kill ggated on
serverB, you will notice serverA freeze when you try to write to
something on /mnt or call `gmirror status'. The same applies if you kill ggatec on serverA without the patch.

I use net/ucarp to detect failure of serverA; serverB then terminates
ggated and mounts the image locally.
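
The takeover step on serverB could be sketched roughly as below. This is only an illustration under stated assumptions, not the script actually in use: the function name `takeover`, the `run` helper, the `DRY_RUN` switch and the /mnt mount point are hypothetical, and it assumes geom_mirror is not loaded on serverB, so the image is attached with mdconfig and mounted directly. By default it only prints the commands it would run.

```sh
#!/bin/sh
# Hypothetical ucarp up-script for serverB (a sketch, not a tested setup).
# Assumptions: image path taken from the test steps, /mnt as mount point,
# geom_mirror not loaded on serverB, DRY_RUN=1 unless overridden.

IMG="${IMG:-/root/ha-slave.img}"
MNT="${MNT:-/mnt}"
DRY_RUN="${DRY_RUN:-1}"

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

takeover() {
    # Stop exporting the image so the old master can no longer write to it.
    run pkill ggated

    # Attach the image as a memory disk; mdconfig prints the unit name.
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: mdconfig -a -t vnode -f $IMG"
        md="md0"   # placeholder device name for the dry run
    else
        md=$(mdconfig -a -t vnode -f "$IMG") || return 1
    fi

    # The master may have died mid-write, so check the file system first.
    run fsck -t ufs -y "/dev/$md"
    run mount "/dev/$md" "$MNT"
}

takeover
```

Such a script would be handed to ucarp on serverB through its --upscript option, with DRY_RUN=0 set in ucarp's environment once the printed commands look right.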
/Rick

--- ggatec.c.orig	2009-07-09 18:27:12.000000000 +0200
+++ ggatec.c	2009-07-14 10:15:34.000000000 +0200
@@ -156,7 +156,7 @@
 			break;
 		if (data != sizeof(hdr)) {
 			g_gate_log(LOG_ERR, "Lost connection 1.");
-			reconnect = 1;
+			reconnect = 0;
 			pthread_kill(recvtd, SIGUSR1);
 			break;
 		}
@@ -168,7 +168,7 @@
 				break;
 			if (data != ggio.gctl_length) {
 				g_gate_log(LOG_ERR, "Lost connection 2 (%zd != %zd).", data, (ssize_t)ggio.gctl_length);
-				reconnect = 1;
+				reconnect = 0;
 				pthread_kill(recvtd, SIGUSR1);
 				break;
 			}
@@ -177,6 +177,7 @@
 		}
 	}
 	g_gate_log(LOG_DEBUG, "%s: Died.", __func__);
+	g_gate_destroy(unit, 1);
 	return (NULL);
 }
 
@@ -203,7 +204,7 @@
 			if (data == -1 && errno == EAGAIN)
 				continue;
 			g_gate_log(LOG_ERR, "Lost connection 3.");
-			reconnect = 1;
+			reconnect = 0;
 			pthread_kill(sendtd, SIGUSR1);
 			break;
 		}
@@ -223,7 +224,7 @@
 			g_gate_log(LOG_DEBUG, "Received data packet.");
 			if (data != ggio.gctl_length) {
 				g_gate_log(LOG_ERR, "Lost connection 4.");
-				reconnect = 1;
+				reconnect = 0;
 				pthread_kill(sendtd, SIGUSR1);
 				break;
 			}
@@ -235,6 +236,7 @@
 		g_gate_ioctl(G_GATE_CMD_DONE, &ggio);
 	}
 	g_gate_log(LOG_DEBUG, "%s: Died.", __func__);
+	g_gate_destroy(unit, 1);
 	pthread_exit(NULL);
 }
 
@@ -410,8 +412,7 @@
 static void
 signop(int sig __unused)
 {
-
-	/* Do nothing. */
+	g_gate_destroy(unit,1);
 }
 
 static void
@@ -420,6 +421,7 @@
 	struct g_gate_ctl_cancel ggioc;
 
 	signal(SIGUSR1, signop);
+	signal(SIGINT, signop);
 	for (;;) {
 		g_gatec_start();
 		g_gate_log(LOG_NOTICE, "Disconnected [%s %s]. Connecting...",
_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"
