Hello, I've recently been hitting a memory leak with the vcl daemon (VCL version 2.3.2). The problem appears to be happening in the computer_not_being_used subroutine within new.pm (see attached log)
The problem appears to start when during a reload there was an issue communicating with the VMWare server. This caused the VM to be left on the VMHost in a powered off state along with the deletion of the entries from the computerloadlog table *|6309|19812:19812|reload| ---- CRITICAL ---- * *|6309|19812:19812|reload| 2014-09-03 09:45:50|6309|19812:19812|reload|vcld:die_handler(639)|:1: parser error : Start tag expected, '<' not found* *|6309|19812:19812|reload| Can't connect to vcl2:443 (Connection refused)* *|6309|19812:19812|reload| ^* *|6309|19812:19812|reload| ( 0) vcld, die_handler (line: 639)* *|6309|19812:19812|reload| (-1) LibXML.pm, (eval) (line: 378)* *|6309|19812:19812|reload| (-2) LibXML.pm, parse_string (line: 378)* *|6309|19812:19812|reload| (-3) VICommon.pm, (eval) (line: 2194)* *|6309|19812:19812|reload| (-4) VICommon.pm, request (line: 2194)* *|6309|19812:19812|reload| (-5) (eval 29660), RetrieveProperties (line: 172)* *|6309|19812:19812|reload| (-6) VICommon.pm, update_view_data (line: 1663)* *|6309|19812:19812|reload| (-7) VICommon.pm, get_view (line: 1512)* *|6309|19812:19812|reload| (-8) vSphere_SDK.pm, _get_file_info (line: 2471)* *|6309|19812:19812|reload| (-9) vSphere_SDK.pm, find_files (line: 2096)* *|6309|19812:19812|reload| (-10) VMware.pm, remove_existing_vms (line: 1594)* *|6309|19812:19812|reload| (-11) VMware.pm, load (line: 469)* *|6309|19812:19812|reload| (-12) new.pm <http://new.pm>, reload_image (line: 671)* *|6309|19812:19812|reload| (-13) new.pm <http://new.pm>, process (line: 291)* *|6309|19812:19812|reload| (-14) vcld, make_new_child (line: 571)* *2014-09-03 09:45:51|6309|19812:19812|reload|utils.pm:delete_computerloadlog_reservation(6396)|removing computerloadlog entries matching loadstate = begin* *2014-09-03 09:45:51|6309|19812:19812|reload|utils.pm:delete_computerloadlog_reservation(6443)|deleted rows from computerloadlog for reservation id=19812* Now when a new reservation comes in and the same vm is allocated for the reservation, the computer_not_being_used subroutine calls the $self->code_loop_timeout(sub{return !reservation_being_processed(@_)}, [$competing_reservation_id], $message, $total_wait_seconds, $attempt_delay_seconds)) section (on line # 815 in new.pm) it receives a 0 from reservation_being_processed with message *"2014-10-07 11:45:54|8175|23084:23084|new|utils.pm:reservation_being_processed(8634)|computerloadlog 'begin' entry does NOT exist for reservation 19812" * The vcl daemon thinks that the reload has completed. This causes the same reservation to be processed over an over within computer_not_being_used causing memory spike's and eventually killing that vcld thread. Any ideas how the reservation_being_processed can handle the lack of "begin" entries when used along with the code_loop_timeout from computer_not_being_used or the DESTROY handler can make sure such reservations are purged, so it doesn't cause this issue? Please let me know if you need any further clarification. Thanks. -- Junaid Ali
|8175|23084:23084|new| ---- WARNING ---- |8175|23084:23084|new| 2014-10-07 11:44:36|8175|23084:23084|new|new.pm:computer_not_being_used(736)|vcl-vm-33 state is reloading, checking if any conflicting reservations are active |8175|23084:23084|new| ( 0) new.pm, computer_not_being_used (line: 736) |8175|23084:23084|new| (-1) new.pm, computer_not_being_used (line: 821) |8175|23084:23084|new| (-2) new.pm, computer_not_being_used (line: 821) |8175|23084:23084|new| (-3) new.pm, process (line: 127) |8175|23084:23084|new| (-4) vcld, make_new_child (line: 571) |8175|23084:23084|new| (-5) vcld, main (line: 350) 2014-10-07 11:44:36|8175|23084:23084|new|new.pm:computer_not_being_used(742)|retrieving info for reservations assigned to vcl-vm-33 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_user_info(7540)|UID value is not configured for user mousa, setting UID to VCL user ID: mousa, standalone: 1 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_user_info(7540)|UID value is not configured for user mousa, setting UID to VCL user ID: mousa, standalone: 1 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_connect_method_info(10059)|attempting to retrieve connect method info: |8175|23084:23084|new| imagerevision: 70 - vmwarewin7-IITWin7OTS137-v5 |8175|23084:23084|new| OS: 35 - vmwarewin7 |8175|23084:23084|new| OS type: 1 - windows 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_connect_method_info(10123)|RDP: connectmethodid=2, OStypeid=1, OSid=NULL, imagerevisionid=NULL, disabled=0 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_user_info(7540)|UID value is not configured for user vclreload, setting UID to VCL user ID: vclreload, standalone: 1 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_user_info(7540)|UID value is not configured for user vclreload, setting UID to VCL user ID: vclreload, standalone: 1 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_request_by_computerid(5991)|retrieved info and DataStructure object for 19812:19812 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_user_info(7540)|UID value is not configured for user mousa, setting UID to VCL user ID: mousa, standalone: 1 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_user_info(7540)|UID value is not configured for user mousa, setting UID to VCL user ID: mousa, standalone: 1 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_connect_method_info(10059)|attempting to retrieve connect method info: |8175|23084:23084|new| imagerevision: 70 - vmwarewin7-IITWin7OTS137-v5 |8175|23084:23084|new| OS: 35 - vmwarewin7 |8175|23084:23084|new| OS type: 1 - windows 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_connect_method_info(10123)|RDP: connectmethodid=2, OStypeid=1, OSid=NULL, imagerevisionid=NULL, disabled=0 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_user_info(7540)|UID value is not configured for user cgordon2, setting UID to VCL user ID: cgordon2, standalone: 1 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_user_info(7540)|UID value is not configured for user cgordon2, setting UID to VCL user ID: cgordon2, standalone: 1 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_request_by_computerid(5991)|retrieved info and DataStructure object for 21170:21170 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_user_info(7540)|UID value is not configured for user mousa, setting UID to VCL user ID: mousa, standalone: 1 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_user_info(7540)|UID value is not configured for user mousa, setting UID to VCL user ID: mousa, standalone: 1 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_connect_method_info(10059)|attempting to retrieve connect method info: |8175|23084:23084|new| imagerevision: 70 - vmwarewin7-IITWin7OTS137-v5 |8175|23084:23084|new| OS: 35 - vmwarewin7 |8175|23084:23084|new| OS type: 1 - windows 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_connect_method_info(10123)|RDP: connectmethodid=2, OStypeid=1, OSid=NULL, imagerevisionid=NULL, disabled=0 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_user_info(7540)|UID value is not configured for user cgordon2, setting UID to VCL user ID: cgordon2, standalone: 1 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_user_info(7540)|UID value is not configured for user cgordon2, setting UID to VCL user ID: cgordon2, standalone: 1 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:get_request_by_computerid(5991)|retrieved info and DataStructure object for 23084:23084 2014-10-07 11:44:36|8175|23084:23084|new|new.pm:computer_not_being_used(779)|checking reservation assigned to vcl-vm-33: |8175|23084:23084|new| request:reservation ID: 19812:19812 |8175|23084:23084|new| request state: pending/reload |8175|23084:23084|new| request start time: 2014-09-03 09:45:15 |8175|23084:23084|new| request end time: 2014-09-03 10:15:15 2014-10-07 11:44:36|8175|23084:23084|new|new.pm:computer_not_being_used(808)|reservation 19812 is currently loading vcl-vm-33 with the correct image: vmwarewin7-IITWin7OTS137-v5, waiting for the other reload process to complete 2014-10-07 11:44:36|8175|23084:23084|new|Module.pm:code_loop_timeout(887)|reload reservation 19812:19812 is still loading vcl-vm-33 with vmwarewin7-IITWin7OTS137-v5, maximum of 1200 seconds 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:reservation_being_processed(8634)|computerloadlog 'begin' entry does NOT exist for reservation 19812 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:run_command(8706)|executed command: pgrep -fl 'vcld [0-9]+:19812 ', pid: 8184, exit status: 1, output: 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:is_management_node_process_running(8939)|process is NOT running, identifier: 'vcld [0-9]+:19812 ' 2014-10-07 11:44:36|8175|23084:23084|new|utils.pm:reservation_being_processed(8655)|reservation is NOT currently being processed 2014-10-07 11:44:36|8175|23084:23084|new|Module.pm:code_loop_timeout(900)|reload reservation 19812:19812 is still loading vcl-vm-33 with vmwarewin7-IITWin7OTS137-v5, code returned true 2014-10-07 11:44:36|8175|23084:23084|new|new.pm:computer_not_being_used(816)|reload reservation 19812 finished loading vcl-vm-33 with vmwarewin7-IITWin7OTS137-v5 2014-10-07 11:44:36|8175|23084:23084|new|new.pm:computer_not_being_used(820)|calling this subroutine again to retrieve the current list of competing reservations assigned to vcl-vm-33 2014-10-07 11:44:36|8175|23084:23084|new|DataStructure.pm:get_computer_state_name(2436)|attempting to retrieve current state of computer vcl-vm-33 from the database 2014-10-07 11:44:36|8175|23084:23084|new|DataStructure.pm:get_computer_state_name(2467)|retrieved current state of computer vcl-vm-33 from the database: reloading 2014-10-07 11:44:36|8175|23084:23084|new|DataStructure.pm:_automethod(847)|data structure updated, hash path: $self->request_data->{reservation}{23084}{computer}{state}{name}, data identifier: computer_state_name, data: |8175|23084:23084|new| : "reloading" |8175|23084:23084|new| ---- WARNING ---- |8175|23084:23084|new| 2014-10-07 11:44:36|8175|23084:23084|new|new.pm:computer_not_being_used(736)|vcl-vm-33 state is reloading, checking if any conflicting reservations are active |8175|23084:23084|new| ( 0) new.pm, computer_not_being_used (line: 736) |8175|23084:23084|new| (-1) new.pm, computer_not_being_used (line: 821) |8175|23084:23084|new| (-2) new.pm, computer_not_being_used (line: 821) |8175|23084:23084|new| (-3) new.pm, computer_not_being_used (line: 821) |8175|23084:23084|new| (-4) new.pm, process (line: 127) |8175|23084:23084|new| (-5) vcld, make_new_child (line: 571)