I've been running HOD with hadoop-0.20.0 successfully on our Torque cluster and it runs map-reduce jobs as advertised. The only hiccup comes when I try to deallocate the cluster. I run the following command:

> hod deallocate path/to/my/cluster_config

The HOD executable cleans up the cluster_config directory, but it doesn't seem to terminate the HOD job in the Torque queue. The job still shows up when I run "qstat" and doesn't go away, even after waiting a few minutes. Eventually, I have to kill it with "qdel".

Shouldn't the deallocate operation automatically stop the HOD job? Is this a bug, or something I haven't configured correctly? If it is a bug, how might I go about debugging it?

This obviously isn't mission-critical, since everything else is working correctly and qdel does the trick either way. Mostly I'm just curious.

More details and version information:

Fedora 7
hadoop-0.20.0
torque-2.1.10-1.fc7
torque-client-2.1.10-1.fc7
torque-scheduler-2.1.10-1.fc7
libtorque-2.1.10-1.fc7
torque-gui-2.1.10-1.fc7
torque-docs-2.1.10-1.fc7
torque-server-2.1.10-1.fc7

Thanks,
Brian
