Our testing steps are these:

- make the reservation/request
- wait for connect button
- click connect button
- connect to the machine
- touch a file for uniqueness purposes
- then you can hit the create image button
- indicate we want to update the image (not create a new image from this one).

Still looking into why the child create to handle the original imaging reservation/request continues to try to ssh continuously instead of marking the imaging event as in use like it should. That log is attached; it's towards the end, and really shows itself with repeated messages like:

|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:21|31467|93:93|reserved| utils.pm:run_ssh_command(6807)|failed to run SSH command after 3 attempts, command: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -l root -p 22 -x test_vm 'who' 2>&1, exit status: 255, output: |31467|93:93|reserved| ssh output (who): ssh: connect to host test_vm port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6807)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1931)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)


Brian




Brian Bouterse
Secure Open Systems Initiative
919.698.8796




On Feb 6, 2009, at 3:41 PM, Aaron Peeler wrote:

What are your steps when you start the imaging process?

You have to do the following:
- make the reservation/request
- wait for connect button
- click connect button
- connect to the machine
- then you can hit the create image button to start the imaging process.

If your in the reserved state as reported in the log file below - the process is polling the node looking for the user to connect. Once connected then it proceeds to the inuse state.

This is actually a user interface bug. We're assuming users will actually login to the remote node and install something - but in your case if your just testing I can see going straight to the image creation mode. Created JIRA issue VCL-84.

Under normal operation you don't need to clean on the computerloadlog. Since your developing and likely having to break things, it's going to behave a little bit different.

Aaron

--On February 6, 2009 3:12:16 PM -0500 Andrew Brown <brow...@gmail.com> wrote:

So we have things mostly working except for one detail: When we check out a computer for capture through "update image", all that goes fine. But
when we go and hit "create" to actually capture the image, vcld hangs
with the log messages that Brian posted, repeating.

If we go in and remove the database entry in the computerloadlog table,
the capture goes through; however, there seems to be a process that
lingers and keeps trying to ssh to the machine even when our capture is
proceeding. It gives us log entries like this:
2009-02-06
14:56:57|31467|93:93|reserved|utils.pm:check_connection(1828)| checking
for connection by administrator on test_vm, attempt 4
2009-02-06
14:56:57|31467|93:93|reserved|utils.pm:run_ssh_command(6729)| executing
SSH command on test_vm: netstat -an
2009-02-06
14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6748)|$?: 0,
signal: 0, core dump: 0, exit status: 0
2009-02-06
14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6799)|SSH command
executed on test_vm: netstat -an, returning (0, output)
2009-02-06
14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6729)| executing
SSH command on test_vm: who
2009-02-06
14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6748)|$?: 0,
signal: 0, core dump: 0, exit status: 0
2009-02-06
14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6796)| run_ssh_comm
and output: admin pts/0 Feb 6 15:59 (152.14.244.35)
2009-02-06
14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6799)|SSH command
executed on test_vm: who, returning (0, output)
(the pid being different from the one that is going into our capture
routine)

This may be unrelated, I don't know. But we need to fix it so we don't need to manually delete the computerloadlog entry each time, as well as
fix it so this second process... goes away or something.

-Andrew

On Fri, Feb 6, 2009 at 1:23 PM, Aaron Peeler <aaron_pee...@ncsu.edu>
wrote:

Actually it's computerloadlog

The computerloadlog table has three purposes.
It also has two dependent tables computerloadflow and computerloadstate


1) Provides information to the user about what state user request is in,
where it's at in loading, performing post install tasks, adding user
acct, etc. This information is seen on the current reservations page by
the user clicking on the pending link. This link opens a ajax based
window and will provided the state the load is in. At some point during
your load routine - you'll want to add these type of calls in the
appropriate places.

insertloadlog($reservation_id, $computer_id, "startload",
"$computer_node_name $image_name");

But this isn't required to get a provision module functional. It's a nice feature, so the user can be aware of what's going on. We'll put together
information on how to best use the insertloadlog.

2) If a new|reload request is started and the computer state is reload or reloading. The forked request process will use that table to determine what to do. Either take over, fail, or let the current owning process
complete.

3) It is also used in vcld to prevent duplicate or competing processes.

It gets cleaned out on a per-reservation basis. When a reservations goes
into the inuse state and when a reservation is removed from the db.

In your case, since your just testing your module, just manually empty
the computerloadlog table as needed.

Aaron



--On February 6, 2009 11:20:47 AM -0500 Brian Bouterse
<bmbou...@ncsu.edu> wrote:

What is the purpose of the loadlog table?  Currently the ESX
provisioning
module doesn't update any information in the loadlog table, and the
provisioning works fine. We're working on the capture() portions now,
and when we go through the web interface to have it "capture" our
reservation, vcld warns saying:

| 25052|72:72|image| ---- WARNING ----
| 25052|72:72|image| 2009-02-06
| 11:19:56|25052|72:72|image|vcld:main(272)|reservation 72 is already | being processed 25052|72:72|image| ( 0) utils.pm, notify (line: 683)
| 25052|72:72|image| (-1) vcld, main (line: 272)

Is this normal for the "image update" process?

Best,
Brian

Brian Bouterse
Secure Open Systems Initiative
919.698.8796



\

Reply via email to