Yes, I am running both at the same time. At first I tried only version 1.1.0, to
check the performance. But due to its unstable behaviour I had to run DUCC
1.0.0 and DUCC 1.1.0 at the same time. I am using DUCC 1.0.0 for
running jobs and DUCC 1.1.0 for testing.
Do I need to increase the heartbeat timing to more than 60 seconds?
Reshu.
On 12/05/2014 05:57 PM, Lou DeGenaro wrote:
You can fetch the latest code containing the bug fix from SVN and build
your own snapshot. However, this bug is of minimal impact so there is no
pressing need to do so.
Are you trying to run 1.0 and 1.1 at the same time? This can be very
tricky. You need to be sure of no overlaps. I highly recommend that you
pick one or the other.
Lou.
On Fri, Dec 5, 2014 at 6:31 AM, reshu.agarwal <[email protected]>
wrote:
Dear Lou,
Thanks for confirming this.
Is a version with the bug fix available for use?
What could be causing the delay in heartbeats? The machines were
not able to send heartbeats within 60 seconds, so the nodes went down in DUCC
1.1.0, but DUCC 1.0.0 is working fine on the same machines.
My master node is physical and the client is virtual. Could this be a reason
for the delayed heartbeats as well as the increased job processing time?
Thanks.
Reshu.
On 12/05/2014 04:45 PM, Lou DeGenaro wrote:
Each node has a DUCC Agent daemon that sends heartbeats.
There was a bug discovered after the release of 1.1 whereby the share
calculation is incorrect (a rounding up problem that you observe). The
impact of this bug should be minimal. The bug has been fixed.
Lou.
On Fri, Dec 5, 2014 at 12:41 AM, reshu.agarwal <[email protected]>
wrote:
Lou,
How does a node send heartbeats in DUCC? If you can tell me this, I will be
able to identify why my nodes are going down.
The other problem which I am facing is:
Memory(GB):total : 31
Memory(GB):usable : 16
Shares:total : 8
Shares:inuse : 9
This means the RAM actually available is 30 GB, so the shares available should
be 15 (2 GB per share), but it is showing Memory(GB):usable : 16 and
Shares:total : 8.
In DUCC 1.0.0, I don't face this problem.
Please explain the reason.
Reshu.
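The share arithmetic in the message above can be sketched as follows. This is only an illustration of the numbers quoted, assuming whole shares are obtained by integer division of memory by the 2 GB share size; the function name `shares_for` is hypothetical and this is not DUCC's actual calculation.

```python
SHARE_SIZE_GB = 2  # share size quoted in the message (2 GB per share)


def shares_for(memory_gb, share_size_gb=SHARE_SIZE_GB):
    """Whole shares that fit in memory_gb (assumed: floor division)."""
    return memory_gb // share_size_gb


# What the reporter expected from roughly 30 GB of available RAM.
expected_shares = shares_for(30)

# What follows from the reported Memory(GB):usable value of 16.
shares_from_usable = shares_for(16)

# Reported Shares:inuse was 9, which exceeds the reported Shares:total.
reported_inuse = 9
overcommitted = reported_inuse - shares_from_usable

print("expected shares:", expected_shares)
print("shares from usable memory:", shares_from_usable)
print("in use beyond total:", overcommitted)
```

Under this assumption the reported Shares:total of 8 matches the usable figure of 16 GB rather than the total of 31 GB, and the in-use count is one more than the total.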
On 12/04/2014 06:42 PM, Lou DeGenaro wrote:
Which of these are not understandable? If you hover over the column heading,
a little more explanation is given (though not much).
For example, if you hover over Heartbeat(last) you'll see "The elapsed time
(in seconds) since the last heartbeat". This should usually be around 60
seconds. On the system I'm looking at live presently, I see a range from 9
to 66. If the number gets too large, the DUCC system will consider the
node down. As best as I can tell, it looks like your numbers are 58 & 59,
which is perfect.
Lou.
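The rule described above (a node is considered down once the elapsed time since its last heartbeat grows too large) might be sketched like this. The cutoff `NODE_DOWN_AFTER_SECONDS` and the function `node_status` are hypothetical: the thread does not say what threshold DUCC actually uses, and other factors may also mark a node down.

```python
# Hypothetical cutoff; the thread only says "if the number gets too large".
NODE_DOWN_AFTER_SECONDS = 120


def node_status(seconds_since_heartbeat, down_after=NODE_DOWN_AFTER_SECONDS):
    """Classify a node from its Heartbeat(last) value, in seconds."""
    if seconds_since_heartbeat > down_after:
        return "down"
    return "up"


# Values mentioned in the thread (9, 58, 59, 66) plus one clearly stale value.
for elapsed in (9, 58, 59, 66, 300):
    print(elapsed, node_status(elapsed))
```

With any cutoff above 66 seconds, all of the values quoted in the thread would count as healthy heartbeats.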
On Thu, Dec 4, 2014 at 7:41 AM, reshu.agarwal <[email protected]>
wrote:
Hi,
Please look at these stats:

Status  Name  Memory(GB):usable  Memory(GB):total  Swap(GB):inuse  Swap(GB):free  Alien PIDs  Shares:total  Shares:inuse  Heartbeat(last)
Total         58                 70                0               29             9           29            3
up      S144  36                 39                0               20             8           18            2             59
down    S143  22                 31                0               9              1           11            11            58

I am not able to understand these stats.
Please help.
Reshu.