On 12/20/2017 2:27 PM, Anjana Kar wrote:
> Currently we have 4 servers running openafs-server-1.4.14-el5.1.1,
> 2 of them being database servers.

According to your database servers the psc.edu cell has three database
servers:

>vos eachvl -cell psc.edu -noauth -execute \
    -format "rxdebug %s 7003 -version"
Trying all endpoints for daphne.psc.edu:
Version: OpenAFS 1.4.14 built 2010-12-27
Trying all endpoints for velma.psc.edu:
Version: OpenAFS 1.4.14 built 2010-12-27
Trying all endpoints for shaggy.psc.edu:
Version: OpenAFS 1.4.14 built 2010-12-27

>udebug daphne.psc.edu -coord
[128.182.66.185]:7003 is not the coordinator
Contacting coordinator at [128.182.66.184]:7003
First response received from [128.182.66.184]:7003
Host's addresses are: 128.182.66.184 10.32.5.186
Host's time is Fri Dec 22 15:23:31 2017
Local time is Fri Dec 22 15:23:31 2017 (time differential 0 secs)
Last yes vote for 128.182.66.184 was 3 secs ago (coordinator);
Last vote started 3 secs ago (at Fri Dec 22 15:23:28 2017)
Local db version is 1502154074.168010
I am coordinator until 56 secs from now (at Fri Dec 22 15:24:27 2017) (3 servers)
Recovery state 1f (have best db version; sync complete; db modified)
The last trans I handled was 1502154074.28237654
Coordinator's db version is 1502154074.168010
0 locked pages, 0 of them for write
Last time a new db version was labelled was:
         11820137 secs ago (at Mon Aug 07 21:01:14 2017)

Server (128.182.66.185 10.32.5.185): (db 1502154074.168010)
    last vote rcvd 4 secs ago (at Fri Dec 22 15:23:27 2017),
    last beacon sent 3 secs ago (at Fri Dec 22 15:23:28 2017), last vote was yes
    dbcurrent=1, up=1 beaconSince=1

Server (128.182.59.182): (db 1502154074.168010)
    last vote rcvd 3 secs ago (at Fri Dec 22 15:23:28 2017),
    last beacon sent 3 secs ago (at Fri Dec 22 15:23:28 2017), last vote was yes
    dbcurrent=1, up=1 beaconSince=1

Each of these servers is also a fileserver. There is one more server,
128.182.59.77, which is only a fileserver.

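The udebug output above can also be checked mechanically before and after each server is replaced. What follows is only a minimal sketch in Python, not an OpenAFS tool: the names parse_udebug and out_of_sync are hypothetical, and the regular expressions assume the line layout seen in the quoted output, which may differ between releases. The sample text is abridged from that output.

```python
import re

# Abridged from the udebug output quoted above.
SAMPLE = """\
Local db version is 1502154074.168010
Coordinator's db version is 1502154074.168010
Server (128.182.66.185 10.32.5.185): (db 1502154074.168010)
    last vote was yes
    dbcurrent=1, up=1 beaconSince=1
Server (128.182.59.182): (db 1502154074.168010)
    last vote was yes
    dbcurrent=1, up=1 beaconSince=1
"""

# Assumed layout: "Server (<addr> ...): (db <version>)" followed later
# by a "dbcurrent=<n>, up=<n>" line for the same server.
SERVER_RE = re.compile(
    r"Server \(([\d.]+)[^)]*\): \(db ([\d.]+)\).*?dbcurrent=(\d), up=(\d)",
    re.DOTALL,
)


def parse_udebug(text):
    """Return (coordinator_db_version, {first_address: (db, dbcurrent, up)})."""
    coord = re.search(r"Coordinator's db version is ([\d.]+)", text)
    servers = {}
    for addr, db, current, up in SERVER_RE.findall(text):
        servers[addr] = (db, current == "1", up == "1")
    return (coord.group(1) if coord else None), servers


def out_of_sync(text):
    """List remote servers whose db version or status differs from the coordinator's view."""
    coord, servers = parse_udebug(text)
    return sorted(
        addr
        for addr, (db, current, up) in servers.items()
        if db != coord or not current or not up
    )


if __name__ == "__main__":
    print(out_of_sync(SAMPLE))
```

An empty list means every remote server reported the coordinator's database version with dbcurrent=1 and up=1, as in the output above.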
As these systems are 1.4.14 they could not be rekeyed to support AES
Kerberos service keys in order to address

  "Brute force DES attack permits compromise of AFS cell"
  http://www.openafs.org/pages/security/#OPENAFS-SA-2013-003

> The new VM servers have openafs-server-1.6.16-1.el7.centos.x86_64,
> and we'd like to configure them so they can replace the 1.4 servers.

Clients and admin tools find the psc.edu database servers through

  the CellServDB file published by grand.central.org and distributed
  with OpenAFS:

    >psc.edu            #PSC (Pittsburgh Supercomputing Center)
    128.182.59.182      #shaggy.psc.edu
    128.182.66.184      #velma.psc.edu
    128.182.66.185      #daphne.psc.edu

  DNS SRV records:

    _afs3-vlserver._udp.psc.edu     SRV service location:
        priority       = 0
        weight         = 0
        port           = 7003
        svr hostname   = daphne.psc.edu
    _afs3-vlserver._udp.psc.edu     SRV service location:
        priority       = 0
        weight         = 0
        port           = 7003
        svr hostname   = shaggy.psc.edu
    _afs3-vlserver._udp.psc.edu     SRV service location:
        priority       = 0
        weight         = 0
        port           = 7003
        svr hostname   = velma.psc.edu

  DNS AFSDB records:

    psc.edu AFSDB subtype = 1, AFS db server = shaggy.psc.edu
    psc.edu AFSDB subtype = 1, AFS db server = velma.psc.edu
    psc.edu AFSDB subtype = 1, AFS db server = daphne.psc.edu

in that order.

When using CellServDB file data the OpenAFS UNIX clients only use the
IP addresses and the Windows clients only use the DNS host names.

Clients will remember the resolved IP addresses until either:

 1. they are restarted

 2. "fs newcell" is issued to update the address list for the cell

As such, when replacing database servers it is very important that the
host names and IP addresses be preserved across the replacement.
Otherwise, there will be an impact on the clients.

> The first problem we've run into is with the KeyFile. "bos create" also
> gives the same message.
>
> [root@afs-vma etc]# bos listkeys afs-vma.psc.edu -localauth
> bos: ticket contained unknown key version number error encountered while
> listing keys
>
> Question is do we need to create a new KeyFile?

The "KeyFile" from the existing servers should be copied to the new
servers as is.

> Are there any documentation or steps we can follow for this migration?

Due to bugs in the 1.4.14 database server implementation, it is
important that a database server remain off for at least five minutes
before it is restarted each time the server is shut down. Failure to do
so can result in corruption of the replicated database.

The 1.6.16 release (like all versions before 1.6.22) is vulnerable to a
remote denial of service attack that can result in server panics. I
strongly advise deploying 1.6.22 or later.

Once all of the servers have been upgraded to at least 1.6.22, it is
critical that the DES cell key be replaced with an
AES256-CTS-HMAC-SHA1-96 Kerberos service key. Failure to do so leaves
the cell vulnerable to brute force attacks.

AuriStor provides professional OpenAFS support services to assist
organizations such as PSC when upgrading cells.

  https://www.auristor.com/openafs/

Jeffrey Altman