Re: [xcat-user] How can I migrate to a new xCAT MN in a hierarchical environment?

2016-06-16 Thread Xiao Peng Wang
The communication between 'xcatclient' and 'xcatd', and between 'xcatd' and 'xcatd' (MN <-> SN), uses certificates for mutual authentication.
 
These credentials can be updated with `updatenode -P`. If you want to continue to use the old ones, copy the following directories from the old MN:
 
/etc/xcat/ca/
/etc/xcat/cert/
/root/.xcat/ 
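
A minimal sketch of that copy, meant to be run on the new MN. The hostname `old-mn` is a placeholder, and using rsync over ssh is an assumption; any method that preserves ownership and permissions should do. By default the function only prints the commands.

```shell
#!/bin/sh
# Copy the xCAT credential directories listed above from the old MN.
# OLD_MN is a placeholder hostname (assumption); set it to your old MN.
OLD_MN="${OLD_MN:-old-mn}"

copy_xcat_credentials() {
    # Pass "run" as the first argument to execute; otherwise only print.
    for dir in /etc/xcat/ca /etc/xcat/cert /root/.xcat; do
        if [ "${1:-}" = "run" ]; then
            # -a preserves ownership, permissions and timestamps
            rsync -a "${OLD_MN}:${dir}/" "${dir}/"
        else
            echo "rsync -a ${OLD_MN}:${dir}/ ${dir}/"
        fi
    done
}

# Dry run: show what would be copied.
copy_xcat_credentials
```

Calling `copy_xcat_credentials run` performs the copy instead of printing it.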
Thanks
Best Regards
--
Wang Xiaopeng (王晓朋)
IBM China System Technology Laboratory
Tel: 86-10-82453455
Email: w...@cn.ibm.com
Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District Beijing P.R.China 100193
 
 
- Original message -
From: Josh Nielsen
To: xCAT Users Mailing list
Cc:
Subject: Re: [xcat-user] How can I migrate to a new xCAT MN in a hierarchical environment?
Date: Fri, Jun 17, 2016 3:34 AM
Well, I should have looked in the logs first. There were more detailed messages in /var/log/messages on the MN: 
Jun 16 14:10:14 xcat-master xcat[30550]: Error dispatching request to xcat-serv1:3001, trying other service nodes: Connection failure: SSL connect attempt failed because of handshake problems error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca at /opt/xcat/lib/perl/xCAT/Client.pm line 265.
Jun 16 14:10:15 xcat-master xcat[30550]: Error dispatching request to xcat-serv2:3001, trying other service nodes: Connection failure: SSL connect attempt failed because of handshake problems error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca at /opt/xcat/lib/perl/xCAT/Client.pm line 265.

Which SSL cert or key is involved in this connection? Although I copied over the rsa keys in /root/.ssh from the old MN to the new one I did not do the same for either /etc/xcat/cert/ or /etc/ssh/. Might a missing key or cert from either of those directories be responsible for that error?

Thanks,
Josh
 
On Thu, Jun 16, 2016 at 2:23 PM, Josh Nielsen  wrote:

Xiao,

Okay, so I followed those four steps with some modifications. I did 1 & 4 as instructed with no issues. The service nodes are getting their database access from the new MN now, and I updated the SN object definitions to point xcatmaster, tftpserver, and other relevant parameters to the new MN.

I avoided step #2 because I just copied the old /root/.ssh/id_rsa and corresponding .pub file to the new MN and passwordless logon works fine. I also tested this from the two service nodes to make sure they could fetch the host keys: "USEOPENSSLFORXCAT=yes XCATSERVER=:3001 /xcatpost/getcredentials.awk ssh_rsa_hostkey". Is that sufficient for the key step?

And lastly for #3 I only selectively updated certain packages on the SNs like syslog and NTP, because I didn't want to run all of the packages and in particular the servicenode postscript.

So, I was able to use updatenode with no issues from the new MN to update the SNs, however when I try to update any cluster client nodes it is having problems dispatching to the service nodes in the hierarchy:

# updatenode node0010 -P addsiteyum
Error: Failed to dispatch command to any of the following service nodes: xcat-serv1,xcat-serv2

What is most likely causing that issue?

Thanks,
Josh
 
On Fri, Jun 3, 2016 at 7:01 AM, Xiao Peng Wang  wrote:

I think we should approach it the opposite way: how to make the MN use the new SN.
 
The following steps are necessary to switch an SN:
1. rerun 'mysqlsetup -f' to grant the SN permission to access the DB on the MN
2. run 'updatenode -k ' to set up the ssh key
3. run 'updatenode -P' to update the SN
4. change the 'servicenode' attribute of the compute nodes accordingly.
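
The four steps could be strung together roughly as below. This is a sketch only: the group names `service` and `compute`, the host file path, and the `-u` flag passed to mysqlsetup are assumptions not stated in this thread, so check the man pages before running anything. The service node names are the ones that appear in the logs. Commands are printed rather than executed unless RUN=1 is set.

```shell
#!/bin/sh
# Dry-run sketch of the four steps. Group names, the host file path and
# the -u flag are assumptions; nothing is executed unless RUN=1.
SN_GROUP="${SN_GROUP:-service}"                  # hypothetical service node group
CN_GROUP="${CN_GROUP:-compute}"                  # hypothetical compute node group
SN_HOSTFILE="${SN_HOSTFILE:-/tmp/servicenodes}"  # hypothetical file listing the SNs
SN_LIST="${SN_LIST:-xcat-serv1,xcat-serv2}"

step() {
    # Print the command; run it only when RUN=1.
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

switch_service_nodes() {
    step mysqlsetup -u -f "$SN_HOSTFILE" &&   # 1. DB access for the SNs
    step updatenode "$SN_GROUP" -k &&         # 2. ssh keys
    step updatenode "$SN_GROUP" -P &&         # 3. rerun postscripts on the SNs
    step chdef -t node -o "$CN_GROUP" "servicenode=$SN_LIST"   # 4.
}

switch_service_nodes
```

Each step is chained with `&&`, so with RUN=1 a failure stops the sequence before later steps run.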
 
Thanks
Best Regards
--
Wang Xiaopeng (王晓朋)
IBM China System Technology Laboratory
Tel: 86-10-82453455
Email: w...@cn.ibm.com
Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District Beijing P.R.China 100193
 
 
- Original message -
From: Josh Nielsen
To: xCAT Users Mailing list
Cc:
Subject: Re: [xcat-user] How can I migrate to a new xCAT MN in a hierarchical environment?
Date: Thu, Jun 2, 2016 3:49 AM

Can anyone verify if simply updating cfgloc should be all I need to do for the SNs to start using the new MN? By pointing it to the new MN's MySQL instance, which has a site table with the new MN specified as the xcatmaster, it should even update the xcatmaster value shown in an 'lsdef' of the service nodes automatically, right?

Thanks,
Josh
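
For reference, editing the database host in cfgloc by hand might look like the sketch below. It assumes the common MySQL form of the file, a single line such as `mysql:dbname=xcatdb;host=10.0.0.1|xcatadmin|secret`; the addresses and password shown are made up, and the function name is hypothetical. It covers only the file edit, not the database permissions or certificates discussed elsewhere in this thread.

```shell
#!/bin/sh
# set_cfgloc_host NEW_HOST [FILE]
# Rewrite the host= field of an xCAT cfgloc file in place, keeping a backup.
# Assumes a line like: mysql:dbname=xcatdb;host=10.0.0.1|xcatadmin|secret
set_cfgloc_host() {
    new_host="$1"
    file="${2:-/etc/xcat/cfgloc}"
    if [ -z "$new_host" ] || [ ! -f "$file" ]; then
        echo "usage: set_cfgloc_host NEW_HOST [FILE]" >&2
        return 2
    fi
    # Keep a copy of the original, then rewrite from that copy.
    cp -p "$file" "$file.bak" &&
    # The host= value runs up to the next ';' or '|'.
    sed "s/host=[^;|]*/host=${new_host}/" "$file.bak" > "$file"
}
```

For example, `set_cfgloc_host 10.0.0.2` would edit /etc/xcat/cfgloc and leave the previous version in /etc/xcat/cfgloc.bak.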
 
On Tue, May 17, 2016 at 3:42 PM, Josh Nielsen  wrote:

A correction below for something I wrote previously.

"...and the SNs then shouldn't need newly generated keys (right?)..."
 
On Tue, May 17, 2016 at 3:36 PM, Josh Nielsen  wrote:

I looked at the 'servicenode' postscript and it does _way_ too much for what I want to accomplish. I don't think 

Re: [xcat-user] How can I migrate to a new xCAT MN in a hierarchical environment?

2016-06-16 Thread Josh Nielsen
Well, I should have looked in the logs first. There were more detailed
messages in /var/log/messages on the MN:

Jun 16 14:10:14 xcat-master xcat[30550]: Error dispatching request to
xcat-serv1:3001, trying other service nodes: Connection failure: SSL
connect attempt failed because of handshake problems error:14094418:SSL
routines:SSL3_READ_BYTES:tlsv1 alert unknown ca at
/opt/xcat/lib/perl/xCAT/Client.pm line 265.
Jun 16 14:10:15 xcat-master xcat[30550]: Error dispatching request to
xcat-serv2:3001, trying other service nodes: Connection failure: SSL
connect attempt failed because of handshake problems error:14094418:SSL
routines:SSL3_READ_BYTES:tlsv1 alert unknown ca at
/opt/xcat/lib/perl/xCAT/Client.pm line 265.

Which SSL cert or key is involved in this connection? Although I copied
over the rsa keys in /root/.ssh from the old MN to the new one I did not do
the same for either /etc/xcat/cert/ or /etc/ssh/. Might a missing key or
cert from either of those directories be responsible for that error?
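
The "unknown ca" alert means the peer rejected a certificate because it was not signed by a CA it trusts. One way to test for that, sketched here, is to verify a certificate from one host against the CA file of the other with `openssl verify`. The function name is hypothetical, and both file paths are supplied by the caller, since the exact file names under /etc/xcat/cert/ are not given in this thread.

```shell
#!/bin/sh
# cert_signed_by CA_FILE CERT_FILE
# Succeeds only if CERT_FILE verifies against CA_FILE.
cert_signed_by() {
    if [ "$#" -ne 2 ]; then
        echo "usage: cert_signed_by CA_FILE CERT_FILE" >&2
        return 2
    fi
    # Older openssl releases may exit 0 even when verification fails,
    # so judge by the output text rather than the exit status.
    result=$(openssl verify -CAfile "$1" "$2" 2>&1)
    echo "$result"
    case "$result" in
        *": OK"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

For example, `cert_signed_by ca-from-sn.pem cert-from-mn.pem`, where both files are copies you have fetched yourself from the service node and the new MN.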

Thanks,
Josh

On Thu, Jun 16, 2016 at 2:23 PM, Josh Nielsen 
wrote:

> Xiao,
>
> Okay, so I followed those four steps with some modifications. I did 1 & 4
> as instructed with no issues. The service nodes are getting their database
> access from the new MN now, and I updated the SN object definitions to
> point xcatmaster, tftpserver, and other relevant parameters to the new MN.
>
> I avoided step #2 because I just copied the old /root/.ssh/id_rsa and
> corresponding .pub file to the new MN and passwordless logon works fine. I
> also tested this from the two service nodes to make sure they could fetch
> the host keys: "USEOPENSSLFORXCAT=yes XCATSERVER=:3001
> /xcatpost/getcredentials.awk ssh_rsa_hostkey". Is that sufficient for the
> key step?
>
> And lastly for #3 I only selectively updated certain packages on the SNs
> like syslog and NTP, because I didn't want to run all of the packages and
> in particular the servicenode postscript.
>
> So, I was able to use updatenode with no issues from the new MN to update
> the SNs, however when I try to update any cluster client nodes it is having
> problems dispatching to the service nodes in the hierarchy:
>
>
> # updatenode node0010 -P addsiteyum
> Error: Failed to dispatch command to any of the following service nodes:
> xcat-serv1,xcat-serv2
>
> What is most likely causing that issue?
>
> Thanks,
> Josh
>
> On Fri, Jun 3, 2016 at 7:01 AM, Xiao Peng Wang  wrote:
>
>> I think we should approach it the opposite way: how to make the MN use
>> the new SN.
>>
>> The following steps are necessary to switch an SN:
>> 1. rerun 'mysqlsetup -f' to grant the SN permission to access the
>> DB on the MN
>> 2. run 'updatenode -k ' to set up the ssh key
>> 3. run 'updatenode -P' to update the SN
>> 4. change the 'servicenode' attribute of the compute nodes accordingly.
>>
>>
>> Thanks
>> Best Regards
>> --
>> Wang Xiaopeng (王晓朋)
>> IBM China System Technology Laboratory
>> Tel: 86-10-82453455
>> Email: w...@cn.ibm.com
>> Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road,
>> Haidian District Beijing P.R.China 100193
>>
>>
>>
>> - Original message -
>> From: Josh Nielsen 
>> To: xCAT Users Mailing list 
>> Cc:
>> Subject: Re: [xcat-user] How can I migrate to a new xCAT MN in a
>> hierarchical environment?
>> Date: Thu, Jun 2, 2016 3:49 AM
>>
>> Can anyone verify if simply updating cfgloc should be all I need to do for
>> the SNs to start using the new MN? By pointing it to the new MN's MySQL
>> instance, which has a site table with the new MN specified as the
>> xcatmaster, it should even update the xcatmaster value
>> shown in an 'lsdef' of the service nodes automatically, right?
>>
>> Thanks,
>> Josh
>>
>> On Tue, May 17, 2016 at 3:42 PM, Josh Nielsen 
>> wrote:
>>
>> A correction below for something I wrote previously.
>>
>> "...and the SNs then shouldn't need newly generated keys (right?)..."
>>
>> On Tue, May 17, 2016 at 3:36 PM, Josh Nielsen 
>> wrote:
>>
>> I looked at the 'servicenode' postscript and it does _way_ too much for
>> what I want to accomplish. I don't think the script was written with
>> changes or upgrades in mind. It looks like it freshly copies everything to
>> the SNs' $installdir/postscripts and /etc/xcat on the service node and
>> generates (new?) keys. The SNs don't need those updates/changes in my case.
>> From looking at the following comment in the 'servicenode' postscript and
>> the code I'm wondering if all I need to do is manually
>> modify /etc/xcat/cfgloc to update the IP for the new MN database location
>> and if everything else will be fine. The keys should already be in place
>> because I am copying the same keys from the old MN onto the new MN server,
>> and the SNs then shouldn't need 

Re: [xcat-user] How can I migrate to a new xCAT MN in a hierarchical environment?

2016-06-16 Thread Josh Nielsen
Xiao,

Okay, so I followed those four steps with some modifications. I did 1 & 4
as instructed with no issues. The service nodes are getting their database
access from the new MN now, and I updated the SN object definitions to
point xcatmaster, tftpserver, and other relevant parameters to the new MN.

I avoided step #2 because I just copied the old /root/.ssh/id_rsa and
corresponding .pub file to the new MN and passwordless logon works fine. I
also tested this from the two service nodes to make sure they could fetch
the host keys: "USEOPENSSLFORXCAT=yes XCATSERVER=:3001
/xcatpost/getcredentials.awk ssh_rsa_hostkey". Is that sufficient for the
key step?

And lastly for #3 I only selectively updated certain packages on the SNs
like syslog and NTP, because I didn't want to run all of the packages and
in particular the servicenode postscript.

So, I was able to use updatenode with no issues from the new MN to update
the SNs, however when I try to update any cluster client nodes it is having
problems dispatching to the service nodes in the hierarchy:


# updatenode node0010 -P addsiteyum
Error: Failed to dispatch command to any of the following service nodes:
xcat-serv1,xcat-serv2

What is most likely causing that issue?

Thanks,
Josh

On Fri, Jun 3, 2016 at 7:01 AM, Xiao Peng Wang  wrote:

> I think we should approach it the opposite way: how to make the MN use
> the new SN.
>
> The following steps are necessary to switch an SN:
> 1. rerun 'mysqlsetup -f' to grant the SN permission to access the
> DB on the MN
> 2. run 'updatenode -k ' to set up the ssh key
> 3. run 'updatenode -P' to update the SN
> 4. change the 'servicenode' attribute of the compute nodes accordingly.
>
>
> Thanks
> Best Regards
> --
> Wang Xiaopeng (王晓朋)
> IBM China System Technology Laboratory
> Tel: 86-10-82453455
> Email: w...@cn.ibm.com
> Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road,
> Haidian District Beijing P.R.China 100193
>
>
>
> - Original message -
> From: Josh Nielsen 
> To: xCAT Users Mailing list 
> Cc:
> Subject: Re: [xcat-user] How can I migrate to a new xCAT MN in a
> hierarchical environment?
> Date: Thu, Jun 2, 2016 3:49 AM
>
> Can anyone verify if simply updating cfgloc should be all I need to do for
> the SNs to start using the new MN? By pointing it to the new MN's MySQL
> instance, which has a site table with the new MN specified as the
> xcatmaster, it should even update the xcatmaster value
> shown in an 'lsdef' of the service nodes automatically, right?
>
> Thanks,
> Josh
>
> On Tue, May 17, 2016 at 3:42 PM, Josh Nielsen 
> wrote:
>
> A correction below for something I wrote previously.
>
> "...and the SNs then shouldn't need newly generated keys (right?)..."
>
> On Tue, May 17, 2016 at 3:36 PM, Josh Nielsen 
> wrote:
>
> I looked at the 'servicenode' postscript and it does _way_ too much for
> what I want to accomplish. I don't think the script was written with
> changes or upgrades in mind. It looks like it freshly copies everything to
> the SNs' $installdir/postscripts and /etc/xcat on the service node and
> generates (new?) keys. The SNs don't need those updates/changes in my case.
> From looking at the following comment in the 'servicenode' postscript and
> the code I'm wondering if all I need to do is manually
> modify /etc/xcat/cfgloc to update the IP for the new MN database location
> and if everything else will be fine. The keys should already be in place
> because I am copying the same keys from the old MN onto the new MN server,
> and the SNs then shouldn't need to keys (right?). Please let me know if you
> see any problems with this.
>
> The comment in the code:
>
>  For Linux:
>    It calls xcatserver and xcatclient script to get the ssh keys, ssl
>    credentials and cfgloc file and transfer from the MN to the SN
>    to be able to access the
>    database,  setup ssh keys on the nodes and have daemon to daemon
>    communication between the SN and MN and have the SN access the DB.
>
>
> P.S. Also would just giving the new MN the same IP and hostname (even as
> an alias to a different primary hostname) more or less prevent any changes
> from needing to be made on the SNs at all (no postscripts run nor manual
> modifications of files)?
>
> Thanks,
> Josh
>
> On Thu, May 5, 2016 at 11:42 AM, Josh Nielsen 
> wrote:
>
> Hi Christian,
>
> Thanks for the response. So do I actually have to reinstall the SNs and/or
> rerun the service node postscript? If rerunning the SN post script just
> makes some minor adjustments but doesn't clear the dhcpd.leases and the
> .conf files for named and dhcp, as I have them configured, then that would
> be fine, but if it blows all that away and starts over that would qualify
> as disruptive for my environment since