Hi folks,

We've long been using CTDB and Samba for our NAS service, serving ~500 users. Over the last few weeks we've been suffering from CTDB performance problems, likely triggered either by an upgrade of Samba from 3.5 to 3.6 (and enabling SMB2 as a result), or possibly by additional users coming on with a new workload.

We run CTDB 1.0.114.4-1 (from SerNet) and samba3-3.6.12-44 (again, from SerNet). Before we roll back, we'd like to make sure we can't fix the problem and stick with Samba 3.6 (and we don't even know that a rollback would fix the issue).

The symptoms are a complete freeze of the service for CIFS users for 10-60 seconds, and on the servers a corresponding spawning of large numbers of CTDB processes, which seem to be created in a "big bang", and then do what they do and exit in the subsequent 10-60 seconds.
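
For anyone wanting to correlate the process bursts with the freezes, here is a rough sketch of how the process count could be sampled and the bursts picked out. It is only an illustration, not something taken from our setup: it assumes a Linux /proc with per-process `comm` files, that the spawned processes show up under the name `ctdbd`, and the function names and threshold are made up for the example.

```python
import os
import time


def count_procs(name="ctdbd", proc_root="/proc"):
    """Count processes whose command name (/proc/<pid>/comm) equals `name`.

    Assumption: the spawned CTDB processes keep the name "ctdbd".
    """
    n = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    n += 1
        except OSError:
            continue  # process exited between listdir() and open()
    return n


def sample(duration_s=3600, interval_s=1.0, name="ctdbd"):
    """Collect (timestamp, process_count) pairs for `duration_s` seconds."""
    samples = []
    end = time.time() + duration_s
    while time.time() < end:
        samples.append((time.time(), count_procs(name)))
        time.sleep(interval_s)
    return samples


def find_bursts(samples, threshold):
    """Return (start_ts, end_ts, peak_count) for each run above `threshold`."""
    bursts = []
    start = last = peak = None
    for ts, count in samples:
        if count > threshold:
            if start is None:
                start, peak = ts, count  # a new burst begins
            else:
                peak = max(peak, count)
            last = ts
        elif start is not None:
            bursts.append((start, last, peak))  # burst ended
            start = None
    if start is not None:
        bursts.append((start, last, peak))  # burst still open at the end
    return bursts
```

Running `find_bursts(sample(), threshold=20)` on each node for an hour would give a list of burst start/end times that can be compared against the times users report freezes; the threshold of 20 is arbitrary and would need tuning against the normal process count.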

We also serve up NFS from the same ctdb-managed frontends, and GPFS from the cluster - and these are both fine throughout.

This was happening 5-10 times per hour, though not at exact intervals. When we added a third node to the CTDB cluster, it "got worse"; when we dropped the CTDB cluster down to a single node, everything started behaving fine - which is where we are now.

So, I've got a bunch of questions!

- does anyone know why ctdb would be spawning these processes, and if there's anything we can do to stop it needing to do it?
- has anyone done any more general performance / config optimisation of CTDB?

And - more generally - does anyone else actually use ctdb/samba/gpfs on the scale of ~500 users or higher? If so - how do you find it?


--
            --
   Dr Orlando Richards
  Information Services
IT Infrastructure Division
       Unix Section
    Tel: 0131 650 4994

The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
_______________________________________________
gpfsug-discuss mailing list
[email protected]
http://gpfsug.org/mailman/listinfo/gpfsug-discuss