I've just tried this...
> * Make sure time on management node is correct.
Time is correct.
> * double check the blockRequest table for the given allocation, make
> sure processing is set to 0
It was set to 1, but setting it back to 0 triggered lots of traffic in
the log, see below.
> * make sure the related BlockTimes processing entry in the BlockTimes
> table is set to 0
That was 0.
On the way I uncovered a few more error conditions, involving Perl
libraries that I'd overlooked before, which are now adjusted to call our
purpose-build perl. But I'm back to the error I was getting before:
[...]
2012-10-04
13:21:53|21007|blockrequest|utils.pm:xmlrpc_call(9105)|argument_string=
XMLRPCprocessBlockTime 19 1
Can't locate object method "type" via package "RPC::XML::Client::send_request:
HTTP server error: Can't connect to vlab.uchicago.edu:443 (certificate
verify failed)" (perhaps you forgot to load "RPC::XML::Client::send_request:
HTTP server error: Can't connect to vlab.uchicago.edu:443 (certificate verify
failed)"?) at /usr/local/vcl-2.2.1/bin/../lib/VCL/utils.pm line 9121 (#1)
Uncaught exception from user code:
Can't locate object method "type" via package
"RPC::XML::Client::send_request: HTTP server error: Can't connect to
vlab.uchicago.edu:443 (certificate verify failed)" (perhaps you forgot to load
"RPC::XML::Client::send_request: HTTP server error: Can't connect to
vlab.uchicago.edu:443 (certificate verify failed)"?) at
/usr/local/vcl-2.2.1/bin/../lib/VCL/utils.pm line 9121.
at /usr/local/vcl-2.2.1/bin/../lib/VCL/utils.pm line 9121
VCL::utils::xmlrpc_call('XMLRPCprocessBlockTime', 19, 1) called at
/usr/local/vcl-2.2.1/bin/../lib/VCL/blockrequest.pm line 373
VCL::blockrequest::process_block_time(19) called at
/usr/local/vcl-2.2.1/bin/../lib/VCL/blockrequest.pm line 193
VCL::blockrequest::process('VCL::blockrequest=HASH(0x102e528)') called
at /usr/local/vcl/bin/vcld line 568
VCL::vcld::make_new_child('HASH(0x208fe70)') called at
/usr/local/vcl/bin/vcld line 448
VCL::vcld::main() called at /usr/local/vcl/bin/vcld line 98
2012-10-04 13:21:53|21007|blockrequest|State.pm:DESTROY(829)|VCL::blockrequest
destructor called, address: 102e528
2012-10-04 13:21:53|21007|blockrequest|State.pm:DESTROY(848)|number of database
handles state process created: 1
2012-10-04 13:21:53|14005|vcld:REAPER(718)|VCL process exited for reservation
<unknown>, PID: 21007, signal: CHLD
So maybe we do have a cert issue. Could it be that the CA bundle that
VCL uses (wherever that is) doesn't like Comodo/AddTrust?
On Thu, Oct 04, 2012 at 11:21:57AM -0400, Aaron Peeler wrote:
> You shouldn't need to wait 45 minutes, the start time only needs to be
> between 6hrs and 30 minutes from now. So as long as the start time is
> between this window, you should see something immediately.
>
> Some things to check:
> * Make sure time on management node is correct.
> * double check the blockRequest table for the given allocation, make
> sure processing is set to 0
> * make sure the related BlockTimes processing entry in the BlockTimes
> table is set to 0
>
> Note, resetting these to 0 will re-processing the blockallocation to
> help debugging a particular block.
>
> -A
>
> On Thu, Oct 4, 2012 at 11:04 AM, Michael Jinks <[email protected]> wrote:
> > Hrm. Well, I made my block reservation, waited my 45 minutes, and still
> > no mention of it in the log. I'll try again with a longer lead time.
> >
> >
> > On Thu, Oct 04, 2012 at 09:01:34AM -0500, Michael Jinks wrote:
> >> Thanks Aaron.
> >>
> >> It looks like we're in good shape as far as this stuff is concerned...
> >>
> >> $ /usr/local/vcl/bin/perl -MRPC::XML -e 'print "success\n";'
> >> success
> >>
> >> $ /usr/local/vcl/bin/perl -MLWP -e 'print LWP->VERSION ."\n";'
> >> 6.04
> >>
> >> ...but I wonder if it's possible that under some circumstances vcld is
> >> calling a perl other than the one we purpose-installed for it. vcld
> >> itself was adjusted to call that instance but a couple of other perl
> >> scripts in vcl/bin weren't, so I've fixed that and scheduled another
> >> reservation 45 minutes from now, will see if that makes any difference.
> >>
> >>
> >>
> >> On Thu, Oct 04, 2012 at 01:42:47PM +0000, Aaron Coburn wrote:
> >> > I have seen something like this before.
> >> >
> >> > I would recommend ensuring that the RPC::XML perl module is properly
> >> > installed on the management node. Try this command:
> >> >
> >> > perl -MRPC::XML -e 'print "success\n";'
> >> >
> >> > If you get errors from this, try manually installing the RPC::XML
> >> > module from here:
> >> >
> >> > [1]http://search.cpan.org/~rjray/RPC-XML-0.77/
> >> >
> >> > The only thing to note is that the current version of RPC::XML
> >> > requires
> >> > at least version 5.834 of LWP. To check which version of LWP you
> >> > currently have installed, run this command:
> >> >
> >> > perl -MLWP -e 'print LWP->VERSION ."\n";'
> >> >
> >> > If this reports any version before 5.834 (as my system does), then you
> >> > may need to use version 0.72 of the RPC::XML module, which requires
> >> > only version 5.801 of LWP.
> >> >
> >> > Also, when I say 'manually install', I do not mean using the cpan
> >> > interactive interface; rather, download the package with wget, and
> >> > then
> >> > use the standard perl installation sequence:
> >> >
> >> > perl Makefile.PL
> >> >
> >> > make
> >> >
> >> > make test
> >> >
> >> > make install
> >> >
> >> > Aaron
> >> >
> >> > --
> >> > Aaron Coburn
> >> > Systems Administrator and Programmer
> >> > Academic Technology Services, Amherst College
> >> > [2][email protected]
> >> > On Oct 4, 2012, at 9:05 AM, Michael Jinks <[3][email protected]>
> >> > wrote:
> >> >
> >> > Okay, this makes sense. Thanks for clearing that up; I'd wondered
> >> > if
> >> > vcld had logic for ramping up to a reservation.
> >> > I made a reservation last night, scheduled for 08:00 this morning.
> >> > At
> >> > 02:00, I have errors in my log, which I'll paste below. I haven't
> >> > started trying to investigate what's wrong; the error suggests a bad
> >> > cert. Our cert is signed by Comodo, maybe that's the problem? I
> >> > don't
> >> > know what vcld uses as a trust chain.
> >> > There could be several other things going on too. The reference to
> >> > 'Can't locate object method "type"' looks ugly, but I wonder if it's
> >> > just a result of the SSL error further up?
> >> > Log excerpt follows.
> >> > [...]
> >> > 2012-10-04
> >> > 02:00:02|3322|blockrequest|blockrequest.pm:process(192)|processing
> >> > blocktime_id= 15 pass 1
> >> > 2012-10-04
> >> > 02:00:02|3322|blockrequest|utils.pm:xmlrpc_call(9105)|argument_strin
> >> > g= XMLRPCprocessBlockTime 15 1
> >> > Can't locate object method "type" via package
> >> > "RPC::XML::Client::send_request:
> >> > HTTP server error: Can't connect to [4]vlab.uchicago.edu:443
> >> > (certificate verify failed)" (perhaps you forgot to load
> >> > "RPC::XML::Client::send_request: HTTP server error: Can't connect to
> >> > [5]vlab.uchicago.edu:443 (certificate verify failed)"?) at
> >> > /usr/local/vcl-2.2.1/bin/../lib/VCL/utils.pm line 9121 (#1)
> >> > Uncaught exception from user code:
> >> > Can't locate object method "type" via package
> >> > "RPC::XML::Client::send_request: HTTP server error: Can't connect to
> >> > [6]vlab.uchicago.edu:443 (certificate verify failed)" (perhaps you
> >> > forgot to load "RPC::XML::Client::send_request: HTTP server error:
> >> > Can't connect to [7]vlab.uchicago.edu:443 (certificate verify
> >> > failed)"?) at /usr/local/vcl-2.2.1/bin/../lib/VCL/utils.pm line
> >> > 9121.
> >> > at /usr/local/vcl-2.2.1/bin/../lib/VCL/utils.pm line 9121
> >> > VCL::utils::xmlrpc_call('XMLRPCprocessBlockTime', 15, 1)
> >> > called at /usr/local/vcl-2.2.1/bin/../lib/VCL/blockrequest.pm line
> >> > 373
> >> > VCL::blockrequest::process_block_time(15) called at
> >> > /usr/local/vcl-2.2.1/bin/../lib/VCL/blockrequest.pm line 193
> >> > VCL::blockrequest::process('VCL::blockrequest=HASH(0x358bb68)
> >> > ') called at /usr/local/vcl/bin/vcld line 568
> >> > VCL::vcld::make_new_child('HASH(0x34740f0)') called at
> >> > /usr/local/vcl/bin/vcld line 448
> >> > VCL::vcld::main() called at /usr/local/vcl/bin/vcld line 98
> >> > 2012-10-04
> >> > 02:00:02|3322|blockrequest|State.pm:DESTROY(829)|VCL::blockrequest
> >> > destructor called, address: 358bb68
> >> > 2012-10-04 02:00:02|3322|blockrequest|State.pm:DESTROY(848)|number
> >> > of database handles state process created: 1
> >> > 2012-10-04 02:00:02|20597|vcld:REAPER(718)|VCL process exited for
> >> > reservation <unknown>, PID: 3322, signal: CHLD
> >> > On Thu, Oct 04, 2012 at 08:48:34AM -0400, Aaron Peeler wrote:
> >> >
> >> > Hello Michael,
> >> > The start time for block allocations need to be requested > 30
> >> > minutes
> >> > in advance.
> >> > If folks think this should be adjusted let us know. The original
> >> > thinking was to allow the enough time for the machines to get
> >> > prepared
> >> > for 30+ machines, which is what we've experienced for the average
> >> > number of machines for Blocks allocations.
> >> > Aaron
> >> > On Wed, Oct 3, 2012 at 6:03 PM, Michael Jinks
> >> > <[8][email protected]> wrote:
> >> >
> >> > Thanks, Aaron and Josh.
> >> > I did have some problems in my vcld.conf; the vclsystem account
> >> > hadn't
> >> > been properly configured. So that was clearly part of my trouble.
> >> > Having fixed that, though, I'm still not getting a successful block
> >> > reservation. Based on Josh's remarks about vcld reserving the slots
> >> > "several hours" in advance, I wonder if that's an issue. In order
> >> > to
> >> > test this in a reasonable amount of time, I've been creating a time
> >> > slot
> >> > a few minutes in the future, which then comes and goes with no
> >> > activity
> >> > in the vcld log except 'lastcheckin' lines. I also find, in the
> >> > database:
> >> > mysql> select processed from blockTimes;
> >> > +-----------+
> >> > | processed |
> >> > +-----------+
> >> > | 0 |
> >> > | 0 |
> >> > +-----------+
> >> > 2 rows in set (0.00 sec)
> >> > How far advance do I need to do a block allocation in order for it
> >> > to
> >> > take effect?
> >> > On Wed, Oct 03, 2012 at 03:03:46PM -0400, Josh Thompson wrote:
> >> >
> >> > -----BEGIN PGP SIGNED MESSAGE-----
> >> > Hash: SHA1
> >> > Michael,
> >> > Aaron gave some good info.
> >> > One thing I'll point out is that when you view the status of a block
> >> > allocation through the web interface as you've described, the number
> >> > listed for "Failed" comes from
> >> > (number of computers for block allocation) - (number of computers in
> >> > blockComputers table)
> >> > It doesn't necessarily reflect that VCL attempted to load the
> >> > machines
> >> > and that those loads failed. If vcld never processes the block
> >> > allocation, it will always show the full count in the Failed row.
> >> > When the block allocation is created, an entry is created in the
> >> > blockTimes table for each time slot in which the block allocation
> >> > will
> >> > occur. However, no computers are actually allocated for each time
> >> > slot until several hours before a time slot starts. Several hours
> >> > before any block time, vcld processes that block time to allocate
> >> > the
> >> > computers and populate the blockComputers table for that blockTimes
> >> > entry. It does this by making an XMLRPC call to the web code. You
> >> > can check that vcld has actually processed a block time entry by
> >> > looking at the blockTimes.processed field in the database for a
> >> > given
> >> > time slot.
> >> > Josh
> >> > On 10/03/12 13:59, Aaron Coburn wrote:
> >> >
> >> > Michael,
> >> > You should make sure that /etc/vcl/vcld.conf has the proper values
> >> > set for:
> >> > xmlrpc_username xmlrpc_pass xmlrpc_url
> >> > xmlrpc_username should be set as vclsystem@Local xmlrpc_url should
> >> > be something like:
> >> > [9]https://vcl.myinstitution.edu/index.php?mode=xmlrpccall
> >> > This is all described in the vcld.conf file as well as on this
> >> > page:
> >> > https://cwiki.apache.org/confluence/display/VCL/VCL+2.3+Management+N
> >> > ode+Installation#VCL2.3ManagementNodeInstallation-Configurevcld.conf
> >> > Then be sure to set the password for the user identified above
> >> > (e.g. vclsystem@Local). Instructions are here:
> >> > https://cwiki.apache.org/confluence/display/VCL/VCL+2.3+Management+N
> >> > ode+Installation#VCL2.3ManagementNodeInstallation-Setthevclsystemacc
> >> > ountpasswordforxmlrpcapi
> >> > (The instructions are for version 2.3, but they will work for
> >> > 2.2.1 as well)
> >> > Finally, restart vcld so that it can pick up the new information.
> >> > If that doesn't help, then try inspecting the vcld logfile -- it
> >> > will tell you more about why the reservations fail.
> >> > Aaron
> >> > -- Aaron Coburn Systems Administrator and Programmer Academic
> >> > Technology Services, Amherst College [email protected]
> >> > On Oct 3, 2012, at 1:30 PM, Michael Jinks <[email protected]>
> >> > wrote:
> >> >
> >> > Hi List.
> >> > We have a VCL 2.2 instance running, most things work, but one
> >> > glaring problem is still thwarting me. When I try to create a
> >> > block allocation, the computers always come up "Failed"; this in
> >> > spite of the fact that I can reserve the same image individually
> >> > with no issues.
> >> > I've tried this so far with the admin@Local account, as well as
> >> > my user account which has full admin privileges (or at least as
> >> > close as we can get by assigning rights in the Privileges page).
> >> > I go to the "Block Allocations" page, click "Create New Block
> >> > Allocation", fill in the form, choosing a known-working image in
> >> > the "Environment" dropdown, and a "List of Dates/Times" with one
> >> > time span in the near future. "Submit New Block Allocation" pops
> >> > up a confirmation window, that goes fine, and I have an entry
> >> > "Block Allocations" page under "Manage Block Allocations". Looks
> >> > fine.
> >> > But, if I click an entry under "Your Active Block Allocations",
> >> > under "Current status of computers," the number by "Failed" is
> >> > equal to the number of seats I reserved, and as far as I can tell
> >> > the system never tries to deploy a VM.
> >> > I'm guessing that there is some permission setting or some other
> >> > item within the management node that I've overlooked. Any hints
> >> > on where to look?
> >> > Thanks, -j
> >> > -- Michael Jinks :: [email protected] University of Chicago IT
> >> > Services
> >> >
> >> > - --
> >> > - -------------------------------
> >> > Josh Thompson
> >> > VCL Developer
> >> > North Carolina State University
> >> > my GPG/PGP key can be found at [10]pgp.mit.edu
> >> > All electronic mail messages in connection with State business which
> >> > are sent to or received by this account are subject to the NC Public
> >> > Records Law and may be disclosed to third parties.
> >> > -----BEGIN PGP SIGNATURE-----
> >> > Version: GnuPG v2.0.17 (GNU/Linux)
> >> > iEYEARECAAYFAlBsjBIACgkQV/LQcNdtPQOcyQCfTQE1TzGXVFsIqitpPFrzJrM/
> >> > SoIAn3p/PMlH+mjI0jWpHdwOC12/hwyu
> >> > =gnM2
> >> > -----END PGP SIGNATURE-----
> >> >
> >> > --
> >> > Michael Jinks :: [11][email protected] :: 773-469-9688
> >> > University of Chicago IT Services
> >> >
> >> > --
> >> > Aaron Peeler
> >> > Program Manager
> >> > Virtual Computing Lab
> >> > NC State University
> >> > All electronic mail messages in connection with State business which
> >> > are sent to or received by this account are subject to the NC Public
> >> > Records Law and may be disclosed to third parties.
> >> >
> >> > --
> >> > Michael Jinks :: [12][email protected] :: 773-469-9688
> >> > University of Chicago IT Services
> >> >
> >> > References
> >> >
> >> > 1. http://search.cpan.org/~rjray/RPC-XML-0.77/
> >> > 2. mailto:[email protected]
> >> > 3. mailto:[email protected]
> >> > 4. http://vlab.uchicago.edu/
> >> > 5. http://vlab.uchicago.edu/
> >> > 6. http://vlab.uchicago.edu/
> >> > 7. http://vlab.uchicago.edu/
> >> > 8. mailto:[email protected]
> >> > 9. https://vcl.myinstitution.edu/index.php?mode=xmlrpccall
> >> > 10. http://pgp.mit.edu/
> >> > 11. mailto:[email protected]
> >> > 12. mailto:[email protected]
> >>
> >> --
> >> Michael Jinks :: [email protected] :: 773-469-9688
> >> University of Chicago IT Services
> >
> > --
> > Michael Jinks :: [email protected] :: 773-469-9688
> > University of Chicago IT Services
>
>
>
> --
> Aaron Peeler
> Program Manager
> Virtual Computing Lab
> NC State University
>
> All electronic mail messages in connection with State business which
> are sent to or received by this account are subject to the NC Public
> Records Law and may be disclosed to third parties.
--
Michael Jinks :: [email protected] :: 773-469-9688
University of Chicago IT Services