Hello, VCL dev -- I've seen three instances of "relaim" errors on my VCL instance today -- machines that vcld couldn't seem to reach, although they appeared running on the ESXi console.
It started when I began creating block allocations (which did not appear to work). In both cases there seemed to be text in the log regarding incorrect SQL syntax. After the errors, the affected machines showed as "failed" in the VCL front end, but eventually reloaded and appear to be working fine. Regular, individual reservations appear to be working fine. I've attached both snippets of log and would appreciate any feedback. Many thanks, Mike
------------------------------------------------------------------------ RECENT LOG ENTRIES FOR THIS PROCESS: |28851|193:193|deleted| request state: deleted |28851|193:193|deleted| request laststate: reserved |28851|193:193|deleted| computer state: available |28851|193:193|deleted| computer type: virtualmachine 2012-01-24 09:37:20|28851|193:193|deleted|reclaim.pm:process(124)|request laststate is reserved, checking if computer table current image matches image currently loaded on richmond41 |28851|193:193|deleted| ---- WARNING ---- |28851|193:193|deleted| 2012-01-24 11:38:35|28851|193:193|deleted|OS.pm:get_currentimage_txt_contents(161)|failed to retrieve currentimage.txt from richmond41, exit status: 255, output: |28851|193:193|deleted| ssh_exchange_identification: read: Connection reset by peer |28851|193:193|deleted| ( 0) OS.pm, get_currentimage_txt_contents (line: 161) |28851|193:193|deleted| (-1) OS.pm, get_current_image_name (line: 197) |28851|193:193|deleted| (-2) reclaim.pm, process (line: 133) |28851|193:193|deleted| (-3) vcld, make_new_child (line: 568) |28851|193:193|deleted| (-4) vcld, main (line: 346) |28851|193:193|deleted| ---- WARNING ---- |28851|193:193|deleted| 2012-01-24 11:38:35|28851|193:193|deleted|OS.pm:get_current_image_name(201)|failed to retrieve currentimage.txt contents from richmond41 |28851|193:193|deleted| ( 0) OS.pm, get_current_image_name (line: 201) |28851|193:193|deleted| (-1) reclaim.pm, process (line: 133) |28851|193:193|deleted| (-2) vcld, make_new_child (line: 568) |28851|193:193|deleted| (-3) vcld, main (line: 346) |28851|193:193|deleted| ---- WARNING ---- |28851|193:193|deleted| 2012-01-24 11:38:35|28851|193:193|deleted|reclaim.pm:process(135)|failed to retrieve name of image currently loaded on richmond41, computer will be reloaded |28851|193:193|deleted| ( 0) reclaim.pm, process (line: 135) |28851|193:193|deleted| (-1) vcld, make_new_child (line: 568) |28851|193:193|deleted| (-2) vcld, main (line: 346) 2012-01-24 11:38:35|28851|193:193|deleted|utils.pm:get_management_predictive_info(5849)|management node info retrieved from database 2012-01-24 11:38:35|28851|193:193|deleted|DataStructure.pm:get_next_image_dataStructure(1187)|attempting to load predictive loading module: VCL::Module::Predictive::Level_1 2012-01-24 11:38:35|28851|193:193|deleted|Module.pm:create_mn_os_object(335)|management node OS object has already been created, address: 1ac5a880, returning 1 2012-01-24 11:38:35|28851|193:193|deleted|Module.pm:new(203)|VCL::Module::Predictive::Level_1 object created, address: 1b199aa0 2012-01-24 11:38:35|28851|193:193|deleted|DataStructure.pm:get_next_image_dataStructure(1197)|VCL::Module::Predictive::Level_1 predictive loading object successfully created 2012-01-24 11:38:35|28851|193:193|deleted|Level_1.pm:get_next_image(91)|predictive_reload_Level_1 : starting predictive_reload_level_1 for 45 2012-01-24 11:38:35|28851|193:193|deleted|utils.pm:is_inblockrequest(6163)|zero rows were returned from database select 2012-01-24 11:38:35|28851|193:193|deleted|Level_1.pm:get_next_image(247)|predictive_reload_Level_1 : computer_short_name= richmond41 type= virtualmachine 2012-01-24 11:38:35|28851|193:193|deleted|Level_1.pm:get_next_image(248)|predictive_reload_Level_1 : avail= 25 notavail= 25 online= 50 timeframe= 1 DAY 2012-01-24 11:38:35|28851|193:193|deleted|Level_1.pm:get_next_image(357)|predictive_reload_Level_1 : 1 available images can go on richmond41 2012-01-24 11:38:35|28851|193:193|deleted|Level_1.pm:get_next_image(379)|predictive_reload_Level_1 : 1 of 1 available images loaded at least once 2012-01-24 11:38:35|28851|193:193|deleted|Level_1.pm:get_next_image(391)|predictive_reload_Level_1 : 0 of 1 total images available for selection |28851|193:193|deleted| ---- WARNING ---- |28851|193:193|deleted| 2012-01-24 11:38:35|28851|193:193|deleted|utils.pm:database_select(4008)|could not execute statement, |28851|193:193|deleted| SELECT |28851|193:193|deleted| COUNT(imageid) AS cnt, |28851|193:193|deleted| imageid |28851|193:193|deleted| FROM |28851|193:193|deleted| log |28851|193:193|deleted| WHERE |28851|193:193|deleted| imageid IN () |28851|193:193|deleted| AND start > (NOW() - INTERVAL 1 DAY) |28851|193:193|deleted| GROUP BY imageid |28851|193:193|deleted| ORDER BY cnt DESC |28851|193:193|deleted| LIMIT 1 |28851|193:193|deleted| , You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ') |28851|193:193|deleted| AND start > (NOW() - INTERVAL 1 DAY) |28851|193:193|deleted| GROUP BY imageid |28851|193:193|deleted| ' at line 7 |28851|193:193|deleted| ( 0) utils.pm, database_select (line: 4008) |28851|193:193|deleted| (-1) Level_1.pm, get_next_image (line: 408) |28851|193:193|deleted| (-2) DataStructure.pm, get_next_image_dataStructure (line: 1198) |28851|193:193|deleted| (-3) reclaim.pm, insert_reload_and_exit (line: 191) |28851|193:193|deleted| (-4) reclaim.pm, process (line: 136) |28851|193:193|deleted| (-5) vcld, make_new_child (line: 568) |28851|193:193|deleted| ---- WARNING ---- |28851|193:193|deleted| 2012-01-24 11:38:35|28851|193:193|deleted|Level_1.pm:get_next_image(410)|predictive_reload_Level_1 : failed to fetch preferred image for computer_id 45 |28851|193:193|deleted| ( 0) Level_1.pm, get_next_image (line: 410) |28851|193:193|deleted| (-1) DataStructure.pm, get_next_image_dataStructure (line: 1198) |28851|193:193|deleted| (-2) reclaim.pm, insert_reload_and_exit (line: 191) |28851|193:193|deleted| (-3) reclaim.pm, process (line: 136) |28851|193:193|deleted| (-4) vcld, make_new_child (line: 568) |28851|193:193|deleted| (-5) vcld, main (line: 346) |28851|193:193|deleted| ---- WARNING ---- |28851|193:193|deleted| 2012-01-24 11:38:35|28851|193:193|deleted|DataStructure.pm:get_next_image_dataStructure(1204)|predictive loading module failed to retrieve image information, returning current reservation image information |28851|193:193|deleted| ( 0) DataStructure.pm, get_next_image_dataStructure (line: 1204) |28851|193:193|deleted| (-1) reclaim.pm, insert_reload_and_exit (line: 191) |28851|193:193|deleted| (-2) reclaim.pm, process (line: 136) |28851|193:193|deleted| (-3) vcld, make_new_child (line: 568) |28851|193:193|deleted| (-4) vcld, main (line: 346) 2012-01-24 11:38:35|28851|193:193|deleted|DataStructure.pm:_automethod(834)|data structure updated: $self->request_data->{reservation}{193}{imagerevision}{imagename} |28851|193:193|deleted| image_name = vmwarewinxp-longsighttest037-v0 2012-01-24 11:38:35|28851|193:193|deleted|DataStructure.pm:_automethod(834)|data structure updated: $self->request_data->{reservation}{193}{image}{id} |28851|193:193|deleted| image_id = 7 2012-01-24 11:38:35|28851|193:193|deleted|DataStructure.pm:_automethod(834)|data structure updated: $self->request_data->{reservation}{193}{imagerevision}{id} |28851|193:193|deleted| imagerevision_id = 7 2012-01-24 11:38:35|28851|193:193|deleted|reclaim.pm:insert_reload_and_exit(203)|next image: vmwarewinxp-longsighttest037-v0, image id=7, imagerevision id=7 2012-01-24 11:38:35|28851|193:193|deleted|utils.pm:insertloadlog(3848)|computerloadlog entry not inserted, reservation has been deleted |28851|193:193|deleted| ---- WARNING ---- |28851|193:193|deleted| 2012-01-24 11:38:35|28851|193:193|deleted|utils.pm:insert_reload_request(8450)|req=193: current computer state is reloading, reload request will not be inserted |28851|193:193|deleted| ( 0) utils.pm, insert_reload_request (line: 8450) |28851|193:193|deleted| (-1) reclaim.pm, insert_reload_and_exit (line: 206) |28851|193:193|deleted| (-2) reclaim.pm, process (line: 136) |28851|193:193|deleted| (-3) vcld, make_new_child (line: 568) |28851|193:193|deleted| (-4) vcld, main (line: 346) 2012-01-24 11:38:35|28851|193:193|deleted|DataStructure.pm:get_computer_private_ip_address(1581)|attempting to retrieve private IP address for computer: richmond41 2012-01-24 11:38:35|28851|193:193|deleted|DataStructure.pm:get_computer_private_ip_address(1585)|retrieved contents of /etc/hosts on this management node, contains 60 lines 2012-01-24 11:38:35|28851|193:193|deleted|DataStructure.pm:get_computer_private_ip_address(1645)|returning IP address from /etc/hosts file: 10.100.0.67 2012-01-24 11:38:35|28851|193:193|deleted|utils.pm:is_inblockrequest(6163)|zero rows were returned from database select 2012-01-24 11:38:35|28851|193:193|deleted|DataStructure.pm:get_image_affiliation_name(2035)|image owner id: 1 2012-01-24 11:38:35|28851|193:193|deleted|utils.pm:getnewdbh(2709)|database requested (information_schema) does not match handle stored in $ENV{dbh} (vcl:127.0.0.1) 2012-01-24 11:38:35|28851|193:193|deleted|utils.pm:getnewdbh(2760)|database handle stored in $ENV{dbh} 2012-01-24 11:38:36|28851|193:193|deleted|DataStructure.pm:retrieve_user_data(1352)|attempting to retrieve and store data for user: user.id = '1' 2012-01-24 11:38:36|28851|193:193|deleted|utils.pm:getnewdbh(2709)|database requested (vcl) does not match handle stored in $ENV{dbh} (information_schema:127.0.0.1) 2012-01-24 11:38:36|28851|193:193|deleted|utils.pm:getnewdbh(2760)|database handle stored in $ENV{dbh} 2012-01-24 11:38:36|28851|193:193|deleted|DataStructure.pm:retrieve_user_data(1415)|data has been retrieved for user: admin (id: 1)
Use of uninitialized value in concatenation (.) or string at /usr/local/vcl/bin/../lib/VCL/blockrequest.pm line 167 (#1) (W uninitialized) An undefined value was used as if it were already defined. It was interpreted as a "" or a 0, but maybe it was a mistake. To suppress this warning assign a defined value to your variables. To help you figure out what was undefined, perl tells you what operation you used the undefined value in. Note, however, that perl optimizes your program and the operation displayed in the warning may not necessarily appear literally in your program. For example, "that $foo" is usually optimized into "that " . $foo, and the warning will refer to the concatenation (.) operator, even though there is no . in your program. |28413|blockrequest| ---- WARNING ---- |28413|blockrequest| 2012-01-24 12:01:04|28413|blockrequest|vcld:warning_handler(610)|Use of uninitialized value in concatenation (.) or string at /usr/local/vcl/bin/../lib/VCL/blockrequest.pm line 167. |28413|blockrequest| ( 0) vcld, warning_handler (line: 610) |28413|blockrequest| (-1) blockrequest.pm, process (line: 167) |28413|blockrequest| (-2) vcld, make_new_child (line: 568) |28413|blockrequest| (-3) vcld, main (line: 448) 2012-01-24 12:01:04|28413|blockrequest|blockrequest.pm:process(167)|help address: 2012-01-24 12:01:04|28413|blockrequest|blockrequest.pm:process(312)|Removed computers from blockComputers table for blocktime_id=5 2012-01-24 12:01:04|28413|blockrequest|blockrequest.pm:process(315)|Removed blocktime_id=5 from blockTimes table 2012-01-24 12:01:04|28413|blockrequest|utils.pm:check_blockrequest_time(1033)|block request end time has been reached (-1 minutes from now), returning 'end' 2012-01-24 12:01:04|4488|vcld:REAPER(718)|VCL process exited for reservation <unknown>, PID: 28409, signal: CHLD 2012-01-24 12:01:04|4488|vcld:main(167)|lastcheckin time updated for management node 1: 2012-01-24 12:01:04 2012-01-24 12:01:04|28413|blockrequest|blockrequest.pm:process(343)|Removed processing flag on blockrequest_id 5 2012-01-24 12:01:04|28413|blockrequest|State.pm:DESTROY(829)|VCL::blockrequest destructor called, address: 1aa2f010 2012-01-24 12:01:04|28413|blockrequest|State.pm:DESTROY(848)|number of database handles state process created: 1 2012-01-24 12:01:04|4488|vcld:REAPER(718)|VCL process exited for reservation <unknown>, PID: 28413, signal: CHLD