Thank you very much for your answers ...
If you need any tests on future versions, I'm available to run them ...
Soon,
Davide
On 27/10/2010 8.16, MORITA Kazutaka wrote:
At Tue, 26 Oct 2010 12:39:06 +0200,
Davide Casale wrote:
Hi to all,
I've installed the Sheepdog daemon, version 0.1.0 (with corosync 1.2.0, svn
rev. 2637), on Ubuntu 10.04 LTS.
The relevant part of the corosync.conf file is:
---
compatibility: whitetank

totem {
        version: 2
        secauth: off
        threads: 0
        token: 3000
        consensus: 5000

        interface {
                ringnumber: 0
                bindnetaddr: 192.168.7.x
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}
---
I've installed everything on three machines with the default redundancy (that's 3,
is that correct? I launch sheepdog with the default /etc/init.d/sheepdog start).
Yes, it's the default redundancy.
I've got 20GB of KVM virtual machine images ..
The questions are :
- is it correct that if a single node crashes (or if I stop the sheepdog
processes with "killall sheep"), when I relaunch sheepdog ALL the data
is rebuilt from scratch from the other two nodes (each time it restarts
from zero bytes and goes up to 20GB)?
I thought that only the changed blocks (4MB each) would be resynchronized .... ??
Yes, it's the correct behavior. Sheepdog cannot detect which objects were
updated since the previous node membership change, so to be safe it
receives all objects from the already joined nodes. However, as you
say, it's worth considering optimizing this.
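Just to illustrate the kind of optimization that would be needed (a sketch
only, under the assumption of a per-object version counter; this is not
sheepdog's current code): if every object carried a version number bumped
on each write, a rejoining node could compare versions and fetch only the
objects whose remote copy is newer than its local one:

/*
 * Illustration only, not sheepdog code.  Assumes a hypothetical
 * per-object version field that is incremented on every write.
 */
#include <stdint.h>
#include <stdio.h>

struct obj_info {
        uint64_t oid;      /* object id */
        uint32_t version;  /* assumed: bumped on every write */
};

/* Return 1 if the local copy is stale (or missing) and must be re-fetched. */
static int need_refetch(const struct obj_info *local,
                        const struct obj_info *remote)
{
        return local == NULL || local->version < remote->version;
}

int main(void)
{
        struct obj_info local  = { .oid = 42, .version = 7 };
        struct obj_info remote = { .oid = 42, .version = 9 };

        if (need_refetch(&local, &remote))
                printf("object %llu: version %u -> %u, fetch it\n",
                       (unsigned long long)local.oid,
                       local.version, remote.version);
        else
                printf("object %llu: up to date\n",
                       (unsigned long long)local.oid);
        return 0;
}

With something like this, a restarted node would transfer only the objects
that actually changed, instead of the full 20GB.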
- is it correct that when synchronization is running on a node, all
the others are frozen (and the KVM virtual machines are frozen as well)
until the synchronization is completed?
Yes. Currently, if a virtual machine accesses an object which is not
placed on the right nodes (which can happen because of node membership
changes), sheepdog blocks the access until the object has been moved to
the right node. But this behavior should be fixed as soon as
possible, I think.
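In rough, illustrative C (not the actual sheepdog I/O path), the behavior
described above amounts to parking the request until recovery of that
object finishes:

/*
 * Sketch of the behavior described above, not sheepdog's real code:
 * a request for an object that is not yet on the right node is held
 * back until that object has been recovered locally.
 */
#include <stdio.h>

#define NR_OBJS 4

static int recovered[NR_OBJS];   /* 1 once the object is on this node */

/* Stands in for copying the object from another node. */
static void recover_object(int oid)
{
        recovered[oid] = 1;
        printf("recovered object %d\n", oid);
}

/* A guest I/O request: stalls until its object is recovered. */
static void handle_io(int oid)
{
        while (!recovered[oid]) {
                /* In the real daemon the request is queued and the VM's
                 * I/O stalls here; in this sketch we recover on demand. */
                printf("I/O on object %d blocked, recovering first\n", oid);
                recover_object(oid);
        }
        printf("I/O on object %d served\n", oid);
}

int main(void)
{
        handle_io(2);   /* stalls until object 2 is local */
        handle_io(2);   /* already recovered, served immediately */
        return 0;
}

This is why the virtual machines appear frozen: their I/O simply waits
until recovery of the objects they touch has completed.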
And perhaps this is a little bug:
if during the synchronization I run the command 'collie node info' on the
node being synchronized, the command hangs after its first output. If I
stop it with CTRL+C, once the synchronization has ended one of the sheep
processes crashes, and if I relaunch sheepdog the synchronization starts
again from the beginning (from zero bytes) ...
The reason 'collie node info' sleeps is the same as above. The problem
of sheep crashing should be fixed by the following patch. Thanks for
your feedback.
=
From: MORITA Kazutaka <morita.kazut...@lab.ntt.co.jp>
Subject: [PATCH] sheep: call free_request() after decrementing reference
counters
We cannot call free_request() here because client_decref() accesses
req->ci.
Signed-off-by: MORITA Kazutaka <morita.kazut...@lab.ntt.co.jp>
---
sheep/sdnet.c | 7 ++++++-
1 files changed, 6 insertions(+), 1 deletions(-)
diff --git a/sheep/sdnet.c b/sheep/sdnet.c
index 9ad0bc7..6d7e7a3 100644
--- a/sheep/sdnet.c
+++ b/sheep/sdnet.c
@@ -271,12 +271,17 @@ static void free_request(struct request *req)
 
 static void req_done(struct request *req)
 {
+	int dead = 0;
+
 	list_add(&req->r_wlist, &req->ci->done_reqs);
 	if (conn_tx_on(&req->ci->conn)) {
 		dprintf("connection seems to be dead\n");
-		free_request(req);
+		dead = 1;
 	}
 	client_decref(req->ci);
+
+	if (dead)
+		free_request(req);
 }
 
 static void init_rx_hdr(struct client_info *ci)
--
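For context, the bug the patch addresses is a use-after-free:
free_request() released the request, but client_decref() still
dereferences req->ci afterwards. A minimal standalone sketch of the
pattern and of the fix (illustration only, not sheepdog code):

/*
 * Illustration only: freeing an object before the last code path that
 * still dereferences it is a use-after-free.  The fix is to remember
 * the decision and free only after the last access.
 */
#include <stdlib.h>
#include <stdio.h>

struct client_info { int refcnt; };
struct request     { struct client_info *ci; };

static void client_decref(struct client_info *ci)
{
        if (--ci->refcnt == 0) {
                printf("client released\n");
                free(ci);
        }
}

static void req_done(struct request *req, int conn_dead)
{
        int dead = 0;

        if (conn_dead)
                dead = 1;               /* don't free req yet ... */

        client_decref(req->ci);         /* ... because this still uses req->ci */

        if (dead)
                free(req);              /* safe: req->ci no longer needed */
}

int main(void)
{
        struct client_info *ci = malloc(sizeof(*ci));
        struct request *req = malloc(sizeof(*req));

        ci->refcnt = 1;
        req->ci = ci;
        req_done(req, 1);
        return 0;
}

Same idea as the patch: note that the connection is dead, drop the
reference first, and only then free the request.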
----------------------------------
DAVIDE CASALE
Security Engineer
mailto:cas...@shorr-kan.com
SHORR KAN IT ENGINEERING Srl
www.shorr-kan.com
Via Sestriere 28/a
10141 Torino
Phone: +39 011 382 8358
Fax: +39 011 384 2028
----------------------------------
--
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog