Hi all,

For a few weeks now, we have repeatedly had problems with one Coda client that doesn't seem to push its updates to the server. We have monitoring on every client and get a call when the number of CML entries goes over 25. I've found what I think is a local/global conflict. I'll post some info below; I'm not sure what you need to be able to point me in the right direction.

We have two servers and currently about 8 clients. The problem client is called cmp06, and the volume with the conflict is named cmpprod. This has happened before. The last two times, we resorted to stopping all apps using files in /coda, stopping venus, uninstalling venus, running "rm -rf /var/log/coda /var/lib/coda /var/cache/coda" and then reinstalling venus from scratch. This worked for a while: modifications were correctly pushed to the servers and showed up on other clients.

Output of commands run on cmp06:

root@cmp06:/# ctokens
Tokens held by the Cache Manager for root:
    @nkh.spup.net
        Coda user id:    10001
        Expiration time: Sat Apr  2 21:37:02 2011
root@cmp06:/# cfs cs
Contacting servers .....
All servers up
root@cmp06:/# cfs lv /coda/nkh.spup.net/cmpprod
  Status of volume 7f000004 (2130706436) named "cmpprod"
  Volume type is ReadWrite
  Connection State is Reachable
  Reintegration age: 0 sec, time 15.000 sec
  Minimum quota is 0, maximum quota is unlimited
  Current blocks used are 2965098
  The partition has 7823104 blocks available out of 11756312
  *** There are pending conflicts in this volume ***
  There are 30 CML entries pending for reintegration (3617288 bytes)
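
To make the numbers above easier to check automatically, here is a minimal sketch of how a check could pull the CML backlog and the conflict flag out of `cfs lv` output. It is only an illustration based on the output shown above, not our actual monitoring script: the function names, the default threshold of 25 and the assumption that a missing CML line means zero pending entries are all mine.

```python
import re

# Pattern taken from the "cfs lv" output shown above.
CML_RE = re.compile(
    r"There are (\d+) CML entries pending for reintegration \((\d+) bytes\)"
)

def parse_cfs_lv(output: str) -> dict:
    """Extract the CML backlog and the conflict flag from `cfs lv` output.

    Assumption: if no CML line is present, nothing is pending.
    """
    match = CML_RE.search(output)
    return {
        "cml_entries": int(match.group(1)) if match else 0,
        "cml_bytes": int(match.group(2)) if match else 0,
        "pending_conflicts": "pending conflicts in this volume" in output,
    }

def needs_alert(status: dict, threshold: int = 25) -> bool:
    """Alert on any pending conflict or on a CML backlog above the threshold."""
    return status["pending_conflicts"] or status["cml_entries"] > threshold

# Sample input copied from the cmp06 output above.
SAMPLE = """\
  Status of volume 7f000004 (2130706436) named "cmpprod"
  Volume type is ReadWrite
  Connection State is Reachable
  *** There are pending conflicts in this volume ***
  There are 30 CML entries pending for reintegration (3617288 bytes)
"""

status = parse_cfs_lv(SAMPLE)
print(status, needs_alert(status))
```

In practice the text would come from running `cfs lv <volume path>` on the client; the sample string only stands in for that.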

The command cfs listlocal /coda/nkh.spup.net/cmpprod never returns and gives no output at all (I waited a little over 30 minutes).

The directory containing the conflict shows:
root@cmp06:/coda/nkh.spup.net/cmpprod/voicemail/company8184892/217684# ls -alFh 20110401-130733-31546631891-1301656053.482641-386.wav
lrw-r--r-- 1 root nogroup 29 Apr 1 20:41 20110401-130733-31546631891-1301656053.482641-386.wav -> @7f000004.000035ce.00002610@n
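
As a sanity check on the symlink target, here is a small sketch that splits it into its parts. I am assuming the target has the form @volume.vnode.unique@realm with hexadecimal fields; that reading, and the function name, are my assumptions and not something I have confirmed. The first field does match the volume id that cfs lv reports for cmpprod (7f000004, i.e. 2130706436).

```python
import re

# Assumed layout of a conflict symlink target: @<volume>.<vnode>.<unique>@<realm>
LINK_RE = re.compile(r"@([0-9a-fA-F]+)\.([0-9a-fA-F]+)\.([0-9a-fA-F]+)@(.*)")

def parse_conflict_link(target: str):
    """Split a conflict symlink target into numeric fields, or return None."""
    match = LINK_RE.fullmatch(target)
    if match is None:
        return None
    volume, vnode, unique = (int(field, 16) for field in match.groups()[:3])
    return {"volume": volume, "vnode": vnode, "unique": unique,
            "realm": match.group(4)}

# Target copied from the ls output above (the realm part appears cut off at "n").
info = parse_conflict_link("@7f000004.000035ce.00002610@n")
print(info)

# The volume field should equal the decimal id shown by "cfs lv".
print(info["volume"] == 2130706436)
```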

The client has coda-client 6.9.5 installed from your Debian package, the servers have coda-server and coda-update Debian packages with version 6.9.4.

The /var/log/coda/venus.log is filled with entries like these:

[ W(177) : 0000 : 21:08:54 ] WAIT OVER, elapsed = 5005.9
[ W(177) : 0000 : 21:08:54 ] WAITING(VOL): cmpprod, state = Reachable, [0, 0], counts = [0 0 5 0]
[ W(177) : 0000 : 21:08:54 ] CML= [30, 103], Res = 0
[ W(177) : 0000 : 21:08:54 ] WAITING(VOL): shrd_count = 0, excl_count = 0, excl_pgid = 0

And the /var/log/coda/venus.err contains:
21:00:02 volume cmpprod has unrepaired local subtree(s), skip checkpointing CML!
21:02:27 DispatchWorker: signal received (seq = 654736)
21:10:02 volume cmpprod has unrepaired local subtree(s), skip checkpointing CML!

So I executed repair with the following transcript:
root@cmp06:/coda/nkh.spup.net/cmpprod/voicemail/company8184892/217684# repair
This repair tool ... <cropped> ... the current repair session.
repair > beginrepair
Pathname of object in conflict? []: /coda/nkh.spup.net/cmpprod/voicemail/company8184892/217684/20110401-130733-31546631891-1301656053.482641-386.wav

And it does not give any results; I have already waited for over 10 minutes now. The directory listing doesn't show any expanded replicas, only the broken symlink. The other clients all show the above-mentioned file with a size of 0 bytes.

I'm not sure whether this is too much, too little or "sufficient" debug info. If anyone needs more info, please let me know so I can provide it.

Thank you very much in advance for your effort.

Kind regards,
Simon de Hartog
Special Technical Services
SpeakUp B.V.
http://www.speakup.nl/
