Have you tried putting the ctdb files onto a separate gpfs filesystem?

Vic Cornell
[email protected]


On 12 Apr 2013, at 16:43, Orlando Richards <[email protected]> wrote:

> On 12/04/13 15:43, Bob Cregan wrote:
>> Hi Orlando,
>> We use ctdb/samba for CIFS, and CNFS for NFS
>> (GPFS version 3.4.0-13). Current versions are:
>> 
>> ctdb - 1.0.99
>> samba 3.5.15
>> 
>> Both compiled from source. We have about 300+ users normally.
>> 
> 
> We have suspicions that 3.6 has put additional "chatter" into the ctdb 
> database stream, which has pushed us over the edge. Barry Evans has found 
> that the clustered locking databases, in particular, prove to be a 
> scalability/usability limit for ctdb.
> 
> 
>> We have had no issues with this setup apart from CNFS, which had 2 or 3
>> bad moments over the last year. These have gone away since we
>> fixed a bug with our 10G NIC drivers (Emulex cards, kernel module
>> be2net) which led to occasional dropped packets for jumbo frames. There
>> have been no issues with samba/ctdb.
>> 
>> The only comment I can make is that during initial investigations into
>> an upgrade of samba to 3.6.x we discovered that the 3.6 code would not
>> compile against ctdb 1.0.99 (compilation requires the ctdb source),
>> failing with error messages like:
>> 
>>  configure: checking whether cluster support is available
>> checking for ctdb.h... yes
>> checking for ctdb_private.h... yes
>> checking for CTDB_CONTROL_TRANS3_COMMIT declaration... yes
>> checking for CTDB_CONTROL_SCHEDULE_FOR_DELETION declaration... no
>> configure: error: "cluster support not available: support for
>> SCHEDULE_FOR_DELETION control missing"
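When comparing several ctdb source trees against one samba release, it can help to pull the failed declaration checks out of the configure output automatically. The following is only a minimal sketch: it assumes the output uses the "checking for NAME declaration... yes/no" wording shown above, and the function name `missing_declarations` is made up for illustration.

```python
import re

# Matches configure lines such as:
#   checking for CTDB_CONTROL_TRANS3_COMMIT declaration... yes
CHECK_RE = re.compile(r"checking for (\S+) declaration\.\.\. (yes|no)")


def missing_declarations(configure_output):
    """Return the names whose declaration check answered 'no'."""
    missing = []
    for line in configure_output.splitlines():
        # search() rather than match(), so quoted or indented lines also parse.
        match = CHECK_RE.search(line)
        if match and match.group(2) == "no":
            missing.append(match.group(1))
    return missing
```

Feeding it the output quoted above would be expected to report only the SCHEDULE_FOR_DELETION control, which is the one the final error message complains about.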
>> 
>> 
>> What occurs to me is that this message seems to indicate that it is
>> possible to run a ctdb version that is incompatible with samba 3.6.
>> That would imply that an upgrade to a higher version of ctdb might
>> help, though of course it might not, and it could make backing out harder.
> 
> Certainly 1.0.114 builds fine - I've not tried 2.0, I'm too scared! The 
> versioning in CTDB has proved hard for me to fathom...
> 
>> 
>> A compile against ctdb 2.0 works fine. We will soon be running in this
>> upgrade, but I'm waiting to see what the samba people say at the UG
>> meeting first!
>> 
> 
> It has to be said - the timing is good!
> Cheers,
> Orlando
> 
>> 
>> Thanks
>> 
>> Bob
>> 
>> 
>> On 12 April 2013 13:37, Orlando Richards <[email protected]> wrote:
>> 
>>    Hi folks,
>> 
>>    We've long been using CTDB and Samba for our NAS service, servicing
>>    ~500 users. We've been suffering from some problems with the CTDB
>>    performance over the last few weeks, likely triggered either by an
>>    upgrade of samba from 3.5 to 3.6 (and enabling of SMB2 as a result),
>>    or possibly by additional users coming on with a new workload.
>> 
>>    We run CTDB 1.0.114.4-1 (from sernet) and samba3-3.6.12-44 (again,
>>    from sernet). Before we roll back, we'd like to make sure we can't
>>    fix the problem and stick with Samba 3.6 (and we don't even know
>>    that a roll back would fix the issue).
>> 
>>    The symptoms are a complete freeze of the service for CIFS users for
>>    10-60 seconds, and on the servers a corresponding spawning of large
>>    numbers of CTDB processes, which seem to be created in a "big bang",
>>    and then do what they do and exit in the subsequent 10-60 seconds.
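To correlate the freezes with the process bursts described above, the bursts could be timestamped with something like the following. This is a minimal sketch under stated assumptions, not a tested diagnostic: it assumes a Linux host that exposes /proc/<pid>/comm, and that the spawned processes appear under the command name "ctdbd". The threshold and interval are hypothetical values to be tuned per site.

```python
import os
import time


def count_processes(name, proc_root="/proc"):
    """Count processes whose command name (read from <pid>/comm) equals `name`."""
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-PID entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    count += 1
        except OSError:
            # The process exited between listdir() and open(); ignore it.
            continue
    return count


def watch(name="ctdbd", interval=1.0, threshold=50):
    """Print a timestamped line whenever the process count exceeds `threshold`."""
    while True:
        n = count_processes(name)
        if n > threshold:
            print(time.strftime("%Y-%m-%d %H:%M:%S"), name, n)
        time.sleep(interval)
```

Running `watch()` on each frontend and lining up the timestamps against user-reported freezes would show whether the "big bang" precedes or follows the stall.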
>> 
>>    We also serve up NFS from the same ctdb-managed frontends, and GPFS
>>    from the cluster - and these are both fine throughout.
>> 
>>    This was happening 5-10 times per hour, though not at exact
>>    intervals. When we added a third node to the CTDB cluster, it "got
>>    worse", and when we dropped the CTDB cluster down to a single node
>>    everything started behaving fine - which is where we are now.
>> 
>>    So, I've got a bunch of questions!
>> 
>>      - does anyone know why ctdb would be spawning these processes, and
>>    if there's anything we can do to stop it needing to do it?
>>      - has anyone done any more general performance / config
>>    optimisation of CTDB?
>> 
>>    And - more generally - does anyone else actually use ctdb/samba/gpfs
>>    on the scale of ~500 users or higher? If so - how do you find it?
>> 
>> 
>>    --
>>                 --
>>        Dr Orlando Richards
>>       Information Services
>>    IT Infrastructure Division
>>            Unix Section
>>         Tel: 0131 650 4994
>> 
>>    The University of Edinburgh is a charitable body, registered in
>>    Scotland, with registration number SC005336.
>>    _________________________________________________
>>    gpfsug-discuss mailing list
>>    [email protected]
>>    http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>> 
>> 
>> 
>> 
>> --
>> 
>> Bob Cregan
>> 
>> Senior Storage Systems Administrator
>> 
>> ACRC
>> 
>> Bristol University
>> 
>> Tel:     +44 (0) 117 331 4406
>> 
>> skype:  bobcregan
>> 
>> Mobile: +44 (0) 7712388129
>> 
> 
> 
