Hi, I searched the web and found that other people have also reported that HOD doesn't automatically delete the jobs it submitted, which means these jobs can still be seen with the "qstat" command.
Is this normal?

Regards
Song Liu

On Mon, Mar 15, 2010 at 8:52 PM, Song Liu <lamfeeli...@gmail.com> wrote:

> Thanks Peeyush,
> I tried your solution, and it sometimes works. I guess it is a bug in
> Torque, because I also see jobs in the queue after I delete them using
> qdel. I will keep an eye on this issue.
>
> The second is a little tough. I tried to allocate more than 3 nodes, but
> the error remains. I compared the two clusters and found they use different
> Torque versions. The one where HOD works uses Torque 2.3.3, but the other
> one uses 2.3.7, and Python 2.6 is installed on that cluster.
>
> BTW, I'm a bit confused about the Cluster Name in the config file. It says
> to take the value of "Cluster Name", but where can I see this attribute for
> my cluster? I used the showbf command, and it shows:
>
> Partition Tasks Nodes StartOffset  Duration     StartDate
> --------- ----- ----- ------------ ------------ --------------
> ALL         497    72 00:00:00     INFINITY     20:50:22_03/15
> main        265    43 00:00:00     INFINITY     20:50:22_03/15
> test        232    29 00:00:00     INFINITY     20:50:22_03/15
>
> Does the "Partition" column give the cluster name?
>
> Thanks a lot!
>
> Regards
> Song Liu
>
> On Mon, Mar 15, 2010 at 6:03 PM, Peeyush Bishnoi
> <peeyu...@yahoo-inc.com> wrote:
>
>> Song,
>>
>> For answer to question 1.
>> Before deallocating the nodes (hod deallocate ...), are you setting
>> HADOOP_CONF_DIR to the hod-allocated directory?
>>
>> For answer to question 2.
>> Yes, hod needs a minimum of 3 nodes: one node runs the JobTracker, the
>> second runs the NameNode, and the third and any further nodes run a
>> TaskTracker and DataNode.
>>
>> If you have any questions please let me know :-)
>>
>> Thanks,
>> ---
>> Peeyush
>> ________________________________________
>> From: Song Liu [lamfeeli...@gmail.com]
>> Sent: Monday, March 15, 2010 8:25 PM
>> To: common-user@hadoop.apache.org
>> Subject: 2 HOD Questions
>>
>> Hi all, I have two questions about HOD
>>
>> 1. I configured and set up HOD on one cluster, and it works fine, but
>> when I finished my jobs and deallocated the nodes, I found my job ID
>> could still be seen using "qstat" until I killed it using "qdel". Is
>> this normal? Or do I have to do this manually, or just live with it?
>>
>> 2. I failed to configure HOD on another cluster using the approach
>> shown in the user guide. It fails when allocating the nodes and shows
>> this error:
>>
>> Uncaught Exception : need more than 2 values to unpack
>>
>> Has anyone seen this message before?
>>
>> BTW, to make the hod script work, I commented out line 576, which is
>> "finally".
>>
>> Thanks
>>
>> Song Liu
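
For reference on the "need more than 2 values to unpack" message in question 2: that is the wording Python 2 uses for a ValueError raised when a sequence with two items is unpacked into three or more names. The sketch below only reproduces that class of error; `split_fields` and `try_split` are hypothetical names, and nothing here is taken from the HOD scripts or shows where HOD raises the exception.

```python
# Hypothetical illustration only; this is not code from HOD.

def split_fields(text, sep=":"):
    """Split text into exactly three fields, or raise ValueError."""
    # Unpacking into three names fails if split() returns any other count.
    first, second, third = text.split(sep)
    return first, second, third


def try_split(text):
    """Return the three fields, or None when the input has the wrong shape."""
    try:
        return split_fields(text)
    except ValueError as exc:
        # Python 2 words this as "need more than 2 values to unpack";
        # Python 3 uses a different message for the same error.
        print("could not unpack %r: %s" % (text, exc))
        return None


ok = try_split("a:b:c")   # three fields, unpacks cleanly
bad = try_split("a:b")    # two fields, triggers the ValueError path
```

If the failure in HOD has the same shape, some value it parses on the second cluster probably has fewer fields than the code expects, which would fit with the two clusters running different Torque versions. That is a guess, not a diagnosis.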