I find the Job Status page increasingly difficult to use. Here are the
problems I have with it:

 

1)       It shows a lot of Jobs, but there is no way of knowing (other
than drawing possibly incorrect conclusions from the file name) what
service these jobs relate to

2)       It distinguishes between file replication and data replication
but does not tell me what the difference is

3)       It shows a start time and a stop time and they are always the
same

4)       When a status 'failed' is reported, the error/warning column is
empty

5)       It does not show what server / host the file is replicated to

6)       It does not let me do anything about failed replications

7)       It is possible to manually initiate replication by re-sending
profiles (either for a server or for devices). The Job Status page does
not say so and does not provide an easy way of getting to the right
place

 

Isn't it true that replications can fail because the server the file or
data needs to be replicated to is temporarily unavailable?  What is the
recovery procedure?

 

With an increasing number of 'Jobs' I think it would be better to use
this page to show only errors or warnings. Failure or warning status
needs to be correlated with the service, process and host involved in
the replication. The admin should be given info about the recovery
procedure and status. If auto-recovery is available, the admin should be
told that no action is necessary other than getting the server
back online. If manual intervention is required, it should be possible
to take action right from the Job Status screen (e.g. if replication
needs to be manually re-initiated, there should be a button to do so).
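
To make the proposal above more concrete, here is a rough sketch of the
information a job entry could carry and how the page could filter on it.
This is not how sipXconfig models jobs today; ReplicationJob, Recovery,
jobs_needing_attention and admin_hint are hypothetical names used only
for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Recovery(Enum):
    """How a failed replication is expected to be recovered (assumed model)."""
    AUTOMATIC = "automatic"  # system retries once the host is reachable again
    MANUAL = "manual"        # admin has to re-initiate the replication


@dataclass
class ReplicationJob:
    """Hypothetical job record with the fields the page currently lacks."""
    name: str
    service: str             # service the replicated file or data belongs to
    host: str                # server the file or data is replicated to
    status: str              # "completed", "warning" or "failed"
    message: str = ""        # error/warning text shown to the admin
    recovery: Optional[Recovery] = None


def jobs_needing_attention(jobs: List[ReplicationJob]) -> List[ReplicationJob]:
    """Keep only the jobs the admin has to look at."""
    return [job for job in jobs if job.status in ("failed", "warning")]


def admin_hint(job: ReplicationJob) -> str:
    """Tell the admin what, if anything, needs to be done for a job."""
    if job.recovery is Recovery.AUTOMATIC:
        return f"No action necessary other than getting {job.host} back online."
    if job.recovery is Recovery.MANUAL:
        return f"Re-send profiles to {job.host} to re-initiate replication."
    return "Recovery procedure unknown."
```

With records like these, the page would list only the entries returned
by jobs_needing_attention and show admin_hint next to each of them, with
a button for the manual case.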

 

In a clustered system the desired behavior would be that it tries, to
the utmost extent possible, to self-recover from problems. In
particular, if hosts that participate in the cluster are temporarily
unavailable for whatever reason, the cluster management system should
recover automatically and not require any admin intervention. An alarm
should be raised to notify the admin of the fact that a host was
unavailable and needed recovery.
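
As a minimal sketch of the behavior described above, assuming that a
replication attempt against an unavailable host fails with a connection
error: retry with a growing delay, raise an alarm when the host is first
found unavailable, raise another when it has recovered, and only ask for
manual intervention once the retries are exhausted. The function name,
its parameters and the alarm texts are hypothetical; nothing here
reflects the existing sipXecs code.

```python
import time


def replicate_with_recovery(send, host, max_attempts=5, base_delay=1.0,
                            raise_alarm=print, wait=time.sleep):
    """Try to replicate to `host`, retrying automatically on failure.

    `send` is any callable that performs one replication attempt and
    raises ConnectionError when the host is unreachable (an assumption
    of this sketch). Returns True on success, False if all attempts fail.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send(host)
        except ConnectionError as exc:
            if attempt == 1:
                # Notify the admin once, when the problem is first seen.
                raise_alarm(f"{host} unavailable, retrying automatically: {exc}")
            if attempt < max_attempts:
                # Exponential backoff between attempts: 1x, 2x, 4x, ...
                wait(base_delay * 2 ** (attempt - 1))
            continue
        if attempt > 1:
            raise_alarm(f"{host} recovered after {attempt - 1} failed attempt(s)")
        return True
    raise_alarm(f"{host} still unavailable, manual intervention required")
    return False
```

Passing raise_alarm and wait as parameters keeps the sketch independent
of any particular alarm or scheduling mechanism.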

 

Comments?

 

--martin

 

_______________________________________________
sipx-dev mailing list [email protected]
List Archive: http://list.sipfoundry.org/archive/sipx-dev
Unsubscribe: http://list.sipfoundry.org/mailman/listinfo/sipx-dev
sipXecs IP PBX -- http://www.sipfoundry.org/
