Hi Michael, I sincerely appreciate your detailed response.
We migrated from Active Directory 2008 + idmap backend=RID to Active Directory 2003 + ServiceForUnix + idmap backend=AD, and Samba clients can now authenticate and mount shares from the CTDB-managed servers. However, we are now seeing a strange case of file corruption when NFS and Samba update the same file, even though the file is consistent in the underlying clustered file system. The case is documented in detail at:

https://bugzilla.samba.org/show_bug.cgi?id=6019

Thoughts/inputs would be greatly appreciated. Please see my comments inline with your suggestions below (to make sure I have covered all the bases).

> You need to have compiled samba with the configure option
> "--with-cluster-support". Without this, none of the things below
> will work!

[Tim] Yes, Samba is compiled with "--with-cluster-support".

> All nodes should be able to authenticate and provide idmappings
> themselves. The nodes keep a local cache of the id mappings, but for
> instance the idmap_tdb2 database is distributed via ctdb to all nodes
> (idmap backend = tdb2), so there is no need to proxy anything over
> one node. There is no such thing as a primary ctdb node. Only one
> node at a time is data master (has the authoritative copy of the tdb
> data) but this changes as nodes try to write to a tdb file.

[Tim] Nodes authenticate via Active Directory (AD), with idmap backend=AD. So, nodes that are able to successfully join the AD domain can authenticate via AD.

> To have samba+winbind working correctly you need to put winbind into
> your /etc/nsswitch.conf file (this has nothing to do with ctdb and
> applies to non-clustered setups as well), e.g.:
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> passdb: files winbind
> group: files winbind
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[Tim] I presume you meant passwd instead of passdb. I have the following settings:

passwd: files winbind
shadow: files winbind
group: files winbind

> Now let's look at your configs and possible ways to diagnose your
> problem:
>
> * What version of samba did you use?
samba-3.2.3-ctdb.50
ctdb-1.0-68

I built the packages from the source RPMs, with the necessary clustering options, from the link below:
http://ctdb.samba.org/packages/redhat/RHEL5/

> * CTDB should be up and running on all nodes when you do the join!

Yes.

> (Or else you might have created the join data only in local
> non-clustered fallback copies of the tdbs, depending on the
> precise samba version...)
>
> Verify that ctdb is running ok by issueing "ctdb status".

[r...@d1950-01 ~]# ctdb status
Number of nodes:2
pnn:0 172.16.2.252 OK
pnn:1 172.16.2.253 OK (THIS NODE)
Generation:2006239815
Size:2
hash:0 lmaster:0
hash:1 lmaster:1
Recovery mode:NORMAL (0)
Recovery master:1

> * Please verify that all nodes are correctly joined to AD by
> running (e.g.) "onnode all net ads testjoin" and (after winbind
> has been started on all nodes) "onnode all wbinfo -t".

[r...@d1950-01 ~]# onnode all net ads testjoin
>> NODE: 172.16.2.252 <<
Join is OK
>> NODE: 172.16.2.253 <<
Join is OK

[r...@d1950-01 ~]# onnode all wbinfo -t
>> NODE: 172.16.2.252 <<
checking the trust secret via RPC calls succeeded
>> NODE: 172.16.2.253 <<
checking the trust secret via RPC calls succeeded

> * Restart winbind and smb services after the join.

Yes, did this.

> * Your samba config looks basically ok. The only strange thing is
> the idmap ranges: the global range and the TESTDOMAIN range overlap
> in a strange way. You might want to try with only the global
> (default) idmap config first.

[Tim] Yes, the new smb.conf has only the global idmap config. Please see attached.
> * After you verified that all nodes are joined with "wbinfo -t", you
> should verify that winbindd is working correctly with some wbinfo
> commands (on all nodes)
>
> - authentication:
>   wbinfo -a TESTDOMAIN+user%password

[r...@d1950-01 ~]# wbinfo -a TESTDOMAIN2+testuserd%abc1234
plaintext password authentication succeeded
challenge/response password authentication succeeded

> raw id/name mapping:
> - name-to-sid:
>   wbinfo -n TESTDOMAIN+user
>   wbinfo -n TESTDOMAIN+group
> - id <username>

[Tim] All works fine:

[r...@d1950-01 ~]# wbinfo -n TESTDOMAIN2+testuserd
S-1-5-21-3023865537-465988374-1439864529-1136 User (1)

[r...@d1950-01 ~]# id TESTDOMAIN2+testuserd
uid=11005(TESTDOMAIN2+testuserd) gid=20001(TESTDOMAIN2+win_users) groups=20001(TESTDOMAIN2+win_users),20002(TESTDOMAIN2+domain users)

> Also verify that they still work after flushing the idmap cache with
> "net cache flush".

[Tim]

[r...@d1950-01 ~]# net cache flush
[r...@d1950-01 ~]# id TESTDOMAIN2+testuserd
uid=11005(TESTDOMAIN2+testuserd) gid=20001(TESTDOMAIN2+win_users) groups=20001(TESTDOMAIN2+win_users),20002(TESTDOMAIN2+domain users)

> * Next you can try connecting to samba shares. You can test
> connectivity with smbclient:
>
> smbclient //some_public_ip/global-share -UTESTDOMAIN+user%password
>
> This should work for all public addresses.

[Tim] Works fine.

#Mount to SMB server1
[r...@d2950-11 ~]# smbclient -U TESTDOMAIN2+testuserd '\\192.168.97.5\global-share'
Enter TESTDOMAIN2+testuserd's password:
Domain=[TESTDOMAIN2] OS=[Unix] Server=[Samba 3.2.3]
smb: \> exit

#Mount to SMB server2
[r...@d2950-11 ~]# smbclient -U TESTDOMAIN2+testuserd '\\192.168.97.6\global-share'
Enter TESTDOMAIN2+testuserd's password:
Domain=[TESTDOMAIN2] OS=[Unix] Server=[Samba 3.2.3]
smb: \> exit

> Then test with windows clients, with the explorer and with "net use".
> At any point where this fails, we would need level 10 server logs of
> the failing connection.

[Tim] Works fine.

> I hope this gives you a new start in sorting out the problem.

Thanks again.
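For repeat runs, the cluster-wide checks above can be collected into a small script. This is only a minimal sketch: it assumes the output formats are exactly as pasted in this mail, and the function names check_ctdb_status and check_testjoin are made up for illustration (they are not part of ctdb or Samba).

```shell
#!/bin/sh
# Sketch only: both functions read command output on stdin and set the
# exit status; they assume the output formats shown in this thread.

# Succeeds if every "pnn:" line of `ctdb status` reports OK and the
# recovery mode is NORMAL.
check_ctdb_status() {
    awk '
        /^pnn:/           { nodes++; if ($3 != "OK") bad++ }
        /^Recovery mode:/ { mode_seen = 1; if ($0 !~ /NORMAL/) bad++ }
        END               { exit !(nodes > 0 && mode_seen && bad == 0) }
    '
}

# Succeeds if every ">> NODE: ... <<" header printed by
# `onnode all net ads testjoin` is matched by a "Join is OK" line.
check_testjoin() {
    awk '
        /^>> NODE:/  { nodes++ }
        /Join is OK/ { ok++ }
        END          { exit !(nodes > 0 && nodes == ok) }
    '
}

# Usage on a cluster node (commands taken from this thread):
#   ctdb status | check_ctdb_status && echo "ctdb: all nodes OK"
#   onnode all net ads testjoin | check_testjoin && echo "AD join OK on all nodes"
```

Parsing the text rather than relying on exit codes keeps the sketch independent of how onnode propagates failures from the individual nodes.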
What remains now is to find out why the files are getting corrupted. Any help/thoughts/comments are welcome.

Regards,
-Tim
