Hello all, has anyone seen this behaviour in their cluster?
I rotate the accounting log every day, so I have a different file for each day. The files are totally different:

# wc -l accounting-20140414 accounting-20140415
44984 accounting-20140414
76448 accounting-20140415
# md5sum accounting-20140414 accounting-20140415
91f3dd1c7c71515fe913591c3ede063d accounting-20140414
8255190e16e87b54cea30670be56f1b6 accounting-20140415

But many entries (106720) are present in both files:

# qacct -j 8786048 -f accounting-20140414 |md5sum
d529d3e3c309069f5a1cd753e6f86f75 -
# qacct -j 8786048 -f accounting-20140415 |md5sum
d529d3e3c309069f5a1cd753e6f86f75 -
# qacct -j 8786048 -f accounting-20140414|grep time
qsub_time Sat Apr 12 16:19:42 2014
start_time Sat Apr 12 16:20:03 2014
end_time Sat Apr 12 16:21:14 2014
ru_utime 69.866
ru_stime 0.629
# qacct -j 8786048 -f accounting-20140415|grep time
qsub_time Sat Apr 12 16:19:42 2014
start_time Sat Apr 12 16:20:03 2014
end_time Sat Apr 12 16:21:14 2014
ru_utime 69.866
ru_stime 0.629

I have looked for any log reference to this job on the master:

[root@ant-master2 qmaster]# zgrep 8786048 messages-201404*
[root@ant-master2 qmaster]#

and on the node (hostname node-hp0214):

[root@node-hp0214 ~]# zgrep 8786048 /var/spool/gridengine/node-hp0214/messages-201404*
[root@node-hp0214 ~]#

Shouldn't I find some log entry for the job on the node? What could be generating these duplicated entries in the accounting file?

TIA,
Arnau
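
For anyone who wants to check their own rotated files, here is a rough sketch of one way to list the records that appear verbatim in both of them. It uses only sort, grep and comm. The function name common_records is made up for illustration, and it assumes that lines starting with '#' are header comments that can be skipped. Treating field 6 as the job number in the usage lines is also an assumption, based on my reading of accounting(5).

```shell
# Print the records that appear verbatim in both accounting files.
# Assumption: lines starting with '#' are header comments and are skipped.
common_records() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    # Sort and de-duplicate each file separately so comm can compare them.
    grep -v '^#' "$1" | LC_ALL=C sort -u > "$a"
    grep -v '^#' "$2" | LC_ALL=C sort -u > "$b"
    # -12 suppresses lines unique to either file, leaving only shared ones.
    LC_ALL=C comm -12 "$a" "$b"
    rm -f "$a" "$b"
}

# Usage (commented out so the block runs without the files present):
# common_records accounting-20140414 accounting-20140415 | wc -l
# Job numbers of the shared records, assuming field 6 is job_number:
# common_records accounting-20140414 accounting-20140415 | cut -d: -f6 | sort -un
```

Comparing whole lines avoids relying on any particular field layout. It only reports records that are byte-for-byte identical, so an entry rewritten with different values would not show up.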
