I believe the problems described in VCL-839 are fixed in trunk. I did not make any changes to vcl-install.sh, but I updated the backend code so it no longer relies on the management node's private IP address. Please test it out. The changes only affect Linux images. When testing, be sure to verify that the firewall is correct at each of these stages:
-after an image loads, before post_load is executed (22 should be open to all IPs)
-after post_load is executed (22 should be closed, all of management node's IPs should be allowed to connect to any port)
-after a user clicks Connect but before connecting (22 should be open to any IP, management node access shouldn't change)
-after a user connects (22 should be locked down to user's IP, management node access shouldn't change)
-after a user clicks Connect and the reservation times out due to no initial connection, after sanitize is executed (22 should be closed to all, management node should still be allowed from any of its IPs to any port)
-after pre_capture is executed (22 should be open to all)
-after a user clicks Connect from a different remote IP address (22 should be allowed from user's original and new remote IP)

It is also beneficial to test the outcome if the management node is only allowed to connect on port 22. Manually change iptables and check the various stages. Under no condition should the management node be locked out.

We are essentially running 2.4 in production right now. I'll update all of our management nodes tomorrow morning to trunk and we will watch things closely. If no problems are identified, I think a release candidate could be created late in the day tomorrow.

Thanks,
Andy

On Wed, Mar 18, 2015 at 4:09 PM, Andy Kurth <[email protected]> wrote:

>
> On Wed, Mar 18, 2015 at 11:08 AM, Aaron Coburn <[email protected]>
> wrote:
>
>> I'm in favor of whatever would be least confusing to users. And that
>> probably means waiting until a 2.4.1 release before announcing it on the
>> a.o mailing list.
>>
>
> Agree.
>
> Regarding 2.4.1, the problem discovered yesterday has been fixed in
> trunk. I tested a few 15-VM reservations using the code in trunk and
> cluster_info was correct.
>
> However, I found another problem described in
> https://issues.apache.org/jira/browse/VCL-839. Using a slightly modified
> vcl-install.sh, I installed a new CentOS 6.5 VM with VCL 2.4 and then
> updated it with the code in trunk. I was able to create a CentOS 6.5 base
> image and make reservations without any problems. When I attempted to
> capture one of the reservations, it failed because the management node had
> locked itself out after the first user connection was detected. This is
> described ad nauseam in the Jira issue.
>
> The problem is partially due to vcl-install.sh using localhost by default
> as the management node name. We could change the script to use something
> else. Regardless, the management node name must resolve to the private IP
> address or problems will occur. The script should add an entry to
> /etc/hosts so the MN's hostname in vcld.conf and the management node table
> resolves to the MN's private IP address. Josh primarily developed the
> script but is travelling this week. I can try to address the issues with
> the script tomorrow.
>
> This will fix the install script but there are still problems with the
> code. A management node should never lock itself out. These problems can
> be pushed off in my opinion but we need to add to the install documentation
> a step to make sure the MN's hostname resolves to its private IP address.
>
> Thought?
>
> Regards,
> Andy
>
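
For anyone who wants to keep track of the firewall checklist at the top of this message while testing, here is a rough Python sketch that encodes the expected port 22 state per stage. It is only a bookkeeping aid, not VCL code: the stage names, the `check_stage` function, and the way the observed state is passed in are all made up for illustration, and it does not read iptables itself. You would fill in what you actually see on the computer at each stage.

```python
# Hypothetical helper for recording test results; none of these names come
# from the VCL code base.
ANYWHERE = "0.0.0.0/0"

ANY = "any"        # port 22 open to all IPs
CLOSED = "closed"  # port 22 closed to everyone except the management node
USER = "user"      # port 22 limited to the user's remote IP(s)

# Expected port 22 state for each stage in the checklist.
EXPECTED_SSH = {
    "image_loaded": ANY,           # after load, before post_load
    "post_load": CLOSED,
    "connect_clicked": ANY,        # user clicked Connect, not yet connected
    "user_connected": USER,
    "timeout_sanitized": CLOSED,   # no initial connection, after sanitize
    "pre_capture": ANY,
    "connect_from_new_ip": USER,   # original and new remote IP both allowed
}


def check_stage(stage, ssh_sources, user_ips=frozenset(), mn_reachable=True):
    """Return a list of problems for one stage (an empty list means OK).

    ssh_sources:  sources observed to be allowed on port 22, NOT counting
                  the management node's own rules.
    user_ips:     remote IPs the user has connected from.
    mn_reachable: whether the management node could still connect.
    """
    problems = []
    # The management node must never be locked out, at any stage.
    if not mn_reachable:
        problems.append("management node is locked out")

    expected = EXPECTED_SSH[stage]
    sources = set(ssh_sources)
    if expected == ANY and ANYWHERE not in sources:
        problems.append("port 22 should be open to any IP")
    elif expected == CLOSED and sources:
        problems.append("port 22 should be closed to all non-management IPs")
    elif expected == USER and sources != set(user_ips):
        problems.append("port 22 should be limited to the user's IP(s)")
    return problems
```

For example, `check_stage("user_connected", {ANYWHERE}, {"203.0.113.7"})` reports a problem because 22 is still open to everyone after the user has connected.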

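Regarding the point in the quoted message about the management node's hostname resolving to its private IP address: a quick way to check this on an install that still depends on it might look like the sketch below. This is an assumption-laden illustration, not part of vcl-install.sh. The function names are hypothetical, and it only uses Python's standard `socket` and `ipaddress` modules to see what the name resolves to on the local machine.

```python
import ipaddress
import socket


def classify_address(addr):
    """Classify an IP address string as 'loopback', 'private' or 'public'.

    Loopback is tested first because Python's ipaddress module also
    reports 127.0.0.0/8 as private.
    """
    ip = ipaddress.ip_address(addr)
    if ip.is_loopback:
        return "loopback"
    if ip.is_private:
        return "private"
    return "public"


def check_mn_hostname(hostname):
    """Resolve hostname locally and describe the result.

    Returns (address, kind), or (None, 'unresolved') if the lookup fails.
    A management node name such as 'localhost' would come back as
    'loopback', which is the situation described in the quoted message.
    """
    try:
        addr = socket.gethostbyname(hostname)
    except socket.gaierror:
        return None, "unresolved"
    return addr, classify_address(addr)
```

Calling `check_mn_hostname()` with the hostname configured in vcld.conf should come back as 'private' if /etc/hosts (or DNS) is set up the way the quoted message asks for.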