>From this thread:

http://vcl.markmail.org/thread/2jyeef4dt3zfz5iq

Set the output of the SSH call to be verbose, and am getting "connection
timed out during banner exchange"[1] [2].  Dmitri has changed his to 10
seconds with VCL 2.3 [3], but we haven't made any changes to the code and
this suddenly started happening.

7804|5:5|new| ---- WARNING ----
|7804|5:5|new| 2014-05-29
10:28:33|7804|5:5|new|utils.pm:run_ssh_command(5010)|attempt
1/3: failed to execute SSH command on vcl02: 'echo "testing ssh on vcl02"',
exit status: 255, SSH exits with the exit status of the remote command or
with 255 if an error occurred, output:
|7804|5:5|new| ssh output (echo "test...): Connection timed out during
banner exchange
|7804|5:5|new| ( 0) utils.pm, run_ssh_command (line: 5010)
|7804|5:5|new| (-1) OS.pm, is_ssh_responding (line: 668)
|7804|5:5|new| (-2) VMware.pm, initialize (line: 303)
|7804|5:5|new| (-3) Module.pm, new (line: 207)
|7804|5:5|new| (-4) Module.pm, create_provisioning_object (line: 522)
|7804|5:5|new| (-5) State.pm, initialize (line: 136)

[1] http://vcl.markmail.org/thread/kixneixigivlgqjh
[2] http://vcl.markmail.org/thread/bulmesaak4bqe44a
[3] http://vcl.markmail.org/thread/qzxovcgfklmsfce5

Is network congestion the most likely issue?  vcl-private is on its own,
isolated network between two servers -- there's not even a switch in
between...

Thanks,
Mike


On Thu, May 29, 2014 at 1:14 PM, Mike Haudenschild <[email protected]>
wrote:

> FWIW, seeing this across multiple management nodes and multiple VM hosts.
>
>
> On Thu, May 29, 2014 at 1:07 PM, Mike Haudenschild <[email protected]>
> wrote:
>
>> Yes, I'm able to connect manually via SSH using
>>
>> ssh -i /etc/vcl/vcl.key <vmhostname>
>>
>> By IP and by name with and without DNS suffix.
>>
>> Thanks,
>> Mike
>>
>>
>> On Thu, May 29, 2014 at 1:04 PM, Aaron Coburn <[email protected]>
>> wrote:
>>
>>> Mike,
>>>
>>> For vmware hosts, vcld will always try to connect over the vSphere API
>>> first. If that fails, it will try ssh. It appears the problem you're
>>> encountering relates to the VM hosts not responding to ssh. Are you able to
>>> manually connect to the host via ssh?
>>>
>>> Aaron Coburn
>>>
>>>
>>> On May 29, 2014, at 11:49 AM, Mike Haudenschild <[email protected]>
>>> wrote:
>>>
>>> > Good morning --
>>> >
>>> > We're unable to create reservations, or even migrate VMs from VM
>>> hosts, via the VCL Web UI.  Seeing the following in the logs, which seem to
>>> suggest that the management node is trying to use the vSphere SDK to
>>> control the ESXi host, which is not desired.  The username/password fields
>>> in the VM host profile assigned to these VM hosts is EMPTY, which I believe
>>> is the criterion on which VCL decides what method to use to control ESXi.
>>>  I confirmed that these fields are NULL in the database for this host
>>> profile, as well.
>>> >
>>> > I would create a new/clean host profile and move the VM hosts there,
>>> but since they have VMs attached to them (and I can't migrate them off),
>>> VCL won't allow me to change the individual VM hosts' profiles to a new one.
>>> >
>>> > Thanks very much,
>>> > Mike
>>> >
>>> >
>>> >
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|Module.pm:create_mn_os_object(381)|VCL::Module::OS::Linux::ManagementNode
>>> OS object created, address: 97e1ae0
>>> > 2014-05-29 08:38:57|7330|109:109|new|Module.pm:new(192)|VCL::new
>>> object created for state new, address: 8f50650
>>> > 2014-05-29 08:38:57|7330|109:109|new|State.pm:initialize(88)|obtained
>>> a database handle for this state process, stored as $ENV{dbh}
>>> > 2014-05-29 08:38:57|7330|109:109|new|State.pm:check_image_os(771)|no
>>> corrections need to be made, not an imaging request, returning 1
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|DataStructure.pm:is_parent_reservation(1004)|returning
>>> true: parent reservation ID for this request: 109
>>> > 2014-05-29 
>>> > 08:38:57|7330|109:109|new|utils.pm:rename_vcld_process(6752)|reservation
>>> count: 1
>>> > 2014-05-29 
>>> > 08:38:57|7330|109:109|new|utils.pm:rename_vcld_process(6771)|PARENTIMAGE:
>>> 1
>>> > 2014-05-29 
>>> > 08:38:57|7330|109:109|new|utils.pm:rename_vcld_process(6772)|SUBIMAGE:
>>> 0
>>> > 2014-05-29 
>>> > 08:38:57|7330|109:109|new|utils.pm:rename_vcld_process(6790)|renamed
>>> process to 'vcld 109:109 new labvm13>vcl01.whittier.edu
>>> vmwarewin7-Windows7Base3-v2 admin'
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|DataStructure.pm:is_parent_reservation(1004)|returning
>>> true: parent reservation ID for this request: 109
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|DataStructure.pm:is_parent_reservation(1004)|returning
>>> true: parent reservation ID for this request: 109
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|Module.pm:create_os_object(308)|VCL::Module::OS::Windows::Version_6::7
>>> module loaded
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|Module.pm:new(196)|VCL::Module::OS::Windows::Version_6::7
>>> object created for image vmwarewin7-Windows7Base3-v2, address: 97ea9a0
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|Windows.pm:initialize(214)|beginning Windows
>>> module initialization
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|Windows.pm:initialize(224)|Windows module
>>> initialization complete
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|Module.pm:create_os_object(314)|VCL::Module::OS::Windows::Version_6::7
>>> OS object created, address: 97ea9a0
>>> > 2014-05-29 08:38:57|7330|109:109|new|State.pm:initialize(125)|computer
>>> is a VM, attempting to create VM host OS object
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(619)|computer ID
>>> argument was specified, retrieving data for computer ID: 1
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(643)|image ID
>>> argument was specified: 1, DataStructure object will contain image
>>> information for the production imagerevision of this image
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(674)|retrieved data
>>> for imagerevision ID: 1
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(690)|retrieved data
>>> for image ID: 1
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|Module.pm:create_vmhost_os_object(454)|attempting
>>> to load VM host OS module: VCL::Module::OS::Linux::UnixLab (image: 1)
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|Module.pm:create_vmhost_os_object(460)|VM host OS
>>> module loaded: VCL::Module::OS::Linux::UnixLab
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|Module.pm:new(196)|VCL::Module::OS::Linux::UnixLab
>>> object created for image noimage, address: 9ead8e0
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|Module.pm:create_vmhost_os_object(465)|VCL::Module::OS::Linux::UnixLab
>>> OS object created, address: 9ead8e0
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|Module.pm:create_provisioning_object(519)|VCL::Module::Provisioning::VMware::VMware
>>> module loaded
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|Module.pm:new(200)|VCL::Module::Provisioning::VMware::VMware
>>> object created for computer labvm13, address: 91447d0
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|VMware.pm:initialize(267)|initializing
>>> VCL::Module::Provisioning::VMware::VMware object
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(619)|computer ID
>>> argument was specified, retrieving data for computer ID: 1
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(643)|image ID
>>> argument was specified: 1, DataStructure object will contain image
>>> information for the production imagerevision of this image
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(674)|retrieved data
>>> for imagerevision ID: 1
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(690)|retrieved data
>>> for image ID: 1
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|VMware.pm:get_vmhost_datastructure(1397)|created
>>> DataStructure object for VM host: vcl01
>>> > 2014-05-29 08:38:57|7330|109:109|new|VMware.pm:initialize(287)|VM
>>> profile assigned to vcl01: WhittierLocal - profile
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(619)|computer ID
>>> argument was specified, retrieving data for computer ID: 1
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(643)|image ID
>>> argument was specified: 1, DataStructure object will contain image
>>> information for the production imagerevision of this image
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(674)|retrieved data
>>> for imagerevision ID: 1
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(690)|retrieved data
>>> for image ID: 1
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|VMware.pm:get_vmhost_datastructure(1397)|created
>>> DataStructure object for VM host: vcl01
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|VMware.pm:get_vmhost_api_object(1500)|attempting
>>> to load VMware control module:
>>> VCL::Module::Provisioning::VMware::vSphere_SDK
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|VMware.pm:get_vmhost_api_object(1506)|loaded
>>> VMware control module: VCL::Module::Provisioning::VMware::vSphere_SDK
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|Module.pm:new(200)|VCL::Module::Provisioning::VMware::vSphere_SDK
>>> object created for computer labvm13, address: a305240
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|vSphere_SDK.pm:initialize(98)|vSphere SDK for
>>> Perl does not appear to be installed on this managment node, unable to load
>>> VMware vSphere SDK Perl modules, error:
>>> > |7330|109:109|new| Can't locate VMware/VIRuntime.pm in @INC (@INC
>>> contains: /usr/local/vcl/bin/../../../.. /usr/local/vcl/bin/../..
>>> /usr/local/vcl/bin/../../.. /usr/local/vcl/bin/../../../../..
>>> /usr/local/vcl/bin/.. /usr/local/vcl/bin/../lib
>>> /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi
>>> /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl
>>> /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi
>>> /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl
>>> /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/5.8.8 .) at
>>> (eval 231) line 1.
>>> > |7330|109:109|new| BEGIN failed--compilation aborted at (eval 231)
>>> line 1.
>>> > 2014-05-29
>>> 08:38:57|7330|109:109|new|VMware.pm:get_vmhost_api_object(1520)|API object
>>> could not be created: VCL::Module::Provisioning::VMware::vSphere_SDK
>>> > 2014-05-29 08:39:01|7166|vcld:main(167)|lastcheckin time updated for
>>> management node 2: 2014-05-29 08:39:01
>>> > 2014-05-29 08:39:06|7166|vcld:main(167)|lastcheckin time updated for
>>> management node 2: 2014-05-29 08:39:06
>>> > 2014-05-29 08:39:11|7166|vcld:main(167)|lastcheckin time updated for
>>> management node 2: 2014-05-29 08:39:11
>>> > 2014-05-29
>>> 08:39:12|7330|109:109|new|OS.pm:is_ssh_responding(682)|vcl01 is NOT
>>> responding to SSH, SSH command failed, port 22: open, port 24: closed
>>> > 2014-05-29 08:39:12|7330|109:109|new|VMware.pm:initialize(307)|unable
>>> to control OS of VM host vcl01 using VCL::Module::OS::Linux::UnixLab OS
>>> object because VM host is not responding to SSH
>>> >
>>> > |7330|109:109|new| ---- WARNING ----
>>> > |7330|109:109|new| 2014-05-29
>>> 08:39:12|7330|109:109|new|Module.pm:create_provisioning_object(530)|provisioning
>>> object could not be created, returning 0
>>> > |7330|109:109|new| ( 0) Module.pm, create_provisioning_object (line:
>>> 530)
>>> > |7330|109:109|new| (-1) State.pm, initialize (line: 136)
>>> > |7330|109:109|new| (-2) Module.pm, new (line: 207)
>>> > |7330|109:109|new| (-3) vcld, make_new_child (line: 567)
>>> > |7330|109:109|new| (-4) vcld, main (line: 350)
>>> >
>>> >
>>> > |7330|109:109|new| ---- WARNING ----
>>> > |7330|109:109|new| 2014-05-29
>>> 08:39:12|7330|109:109|new|State.pm:initialize(145)|failed to create
>>> provisioning object
>>> > |7330|109:109|new| ( 0) State.pm, initialize (line: 145)
>>> > |7330|109:109|new| (-1) Module.pm, new (line: 207)
>>> > |7330|109:109|new| (-2) vcld, make_new_child (line: 567)
>>> > |7330|109:109|new| (-3) vcld, main (line: 350)
>>> >
>>> > 2014-05-29 
>>> > 08:39:12|7330|109:109|new|utils.pm:delete_computerloadlog_reservation(6396)|removing
>>> computerloadlog entries matching loadstate = begin
>>> > 2014-05-29 
>>> > 08:39:12|7330|109:109|new|utils.pm:delete_computerloadlog_reservation(6443)|deleted
>>> rows from computerloadlog for reservation id=109
>>> > 2014-05-29 08:39:12|7330|109:109|new|State.pm:DESTROY(929)|VCL::new
>>> process duration: 15 seconds
>>> > 2014-05-29
>>> 08:39:12|7330|109:109|new|DataStructure.pm:get_computer_private_ip_address(1641)|attempting
>>> to retrieve private IP address for computer: labvm13
>>> > 2014-05-29
>>> 08:39:12|7330|109:109|new|OS.pm:get_file_contents(1897)|retrieved 77 lines
>>> from file on vclman02: '/etc/hosts'
>>> > 2014-05-29
>>> 08:39:12|7330|109:109|new|DataStructure.pm:get_computer_private_ip_address(1697)|returning
>>> IP address from /etc/hosts file: 192.168.1.13
>>> > 2014-05-29 08:39:12|7330|109:109|new|utils.pm:getnewdbh(2474)|unable
>>> to use database handle stored in $ENV{dbh}
>>> > 2014-05-29 08:39:12|7330|109:109|new|utils.pm:getnewdbh(2531)|database
>>> handle stored in $ENV{dbh}
>>> > 2014-05-29 08:39:12|7330|109:109|new|utils.pm:is_inblockrequest(5760)|zero
>>> rows were returned from database select
>>> > 2014-05-29
>>> 08:39:12|7330|109:109|new|DataStructure.pm:get_image_affiliation_name(2121)|image
>>> owner id: 1
>>> > 2014-05-29
>>> 08:39:12|7330|109:109|new|DataStructure.pm:retrieve_user_data(1412)|attempting
>>> to retrieve and store data for user: user.id = '1'
>>> > 2014-05-29
>>> 08:39:12|7330|109:109|new|DataStructure.pm:retrieve_user_data(1475)|data
>>> has been retrieved for user: admin (id: 1)
>>> > 2014-05-29 08:39:16|7166|vcld:main(167)|lastcheckin time updated for
>>> management node 2: 2014-05-29 08:39:16
>>> > 2014-05-29 08:39:22|7166|vcld:main(167)|lastcheckin time updated for
>>> management node 2: 2014-05-29 08:39:21
>>> > 2014-05-29 08:39:27|7166|vcld:main(167)|lastcheckin time updated for
>>> management node 2: 2014-05-29 08:39:27
>>> > 2014-05-29 08:39:32|7166|vcld:main(167)|lastcheckin time updated for
>>> management node 2: 2014-05-29 08:39:32
>>> > 2014-05-29 08:39:37|7166|vcld:main(167)|lastcheckin time updated for
>>> management node 2: 2014-05-29 08:39:37
>>> > 2014-05-29 08:39:42|7166|vcld:main(167)|lastcheckin time updated for
>>> management node 2: 2014-05-29 08:39:42
>>> > 2014-05-29 08:39:43|7330|109:109|new|utils.pm:mail(1253)|SUCCESS --
>>> Sending mail To: [email protected], PROBLEM --
>>> vclman02|109:109|new|vcld|labvm13>vcl01.whittier.edu
>>> |vmwarewin7-Windows7Base3-v2|admin
>>> >
>>> > |7330|109:109|new| ---- CRITICAL ----
>>> > |7330|109:109|new| 2014-05-29
>>> 08:39:12|7330|109:109|new|vcld:make_new_child(574)|VCL::new object could
>>> not be created and initialized
>>> > |7330|109:109|new| ( 0) vcld, make_new_child (line: 574)
>>> > |7330|109:109|new| (-1) vcld, main (line: 350)
>>>
>>>
>>
>

Reply via email to