Hi all,
For a few weeks now, we have repeatedly had problems with one Coda client
that doesn't seem to push its updates to the server. We have monitoring
on every client and get a call when the number of CML entries goes over 25.
I've found what I think is a local/global conflict. I'll just post some
info, as I'm not sure what you need to be able to point me in the right direction.
We have two servers and currently about 8 clients. The problem client is
called cmp06. The volume with the conflict is named cmpprod. This
has already happened before. The last two times, we resorted to stopping
all apps using files in /coda, stopping venus, uninstalling venus,
running "rm -rf /var/log/coda /var/lib/coda /var/cache/coda", and then
reinstalling venus from scratch. This worked for a while:
modifications were correctly pushed to the servers and showed up on
other clients.
Output of commands run on cmp06:
root@cmp06:/# ctokens
Tokens held by the Cache Manager for root:
@nkh.spup.net
Coda user id: 10001
Expiration time: Sat Apr 2 21:37:02 2011
root@cmp06:/# cfs cs
Contacting servers .....
All servers up
root@cmp06:/# cfs lv /coda/nkh.spup.net/cmpprod
Status of volume 7f000004 (2130706436) named "cmpprod"
Volume type is ReadWrite
Connection State is Reachable
Reintegration age: 0 sec, time 15.000 sec
Minimum quota is 0, maximum quota is unlimited
Current blocks used are 2965098
The partition has 7823104 blocks available out of 11756312
*** There are pending conflicts in this volume ***
There are 30 CML entries pending for reintegration (3617288 bytes)
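For reference, the kind of monitoring check mentioned earlier (alert when the CML goes over 25 entries) can be sketched roughly as below. This is only an illustrative sketch, not our actual monitoring script: the function name parse_cfs_lv and the result keys are made up for this example, and it assumes the "cfs lv" output wording shown above. Treating a missing CML line as zero entries is also an assumption.

```python
import re

# Patterns copied from the "cfs lv" output shown above; other Coda
# versions may word these lines differently (assumption).
CML_RE = re.compile(
    r"There are (\d+) CML entries pending for reintegration \((\d+) bytes\)"
)
CONFLICT_MARKER = "There are pending conflicts in this volume"


def parse_cfs_lv(output, threshold=25):
    """Extract the CML state from the text printed by 'cfs lv <volume>'.

    'threshold' is the alerting limit; 25 is the value from this post.
    """
    match = CML_RE.search(output)
    # Assumption: if the CML line is absent, nothing is pending.
    entries = int(match.group(1)) if match else 0
    size = int(match.group(2)) if match else 0
    return {
        "cml_entries": entries,
        "cml_bytes": size,
        "has_conflicts": CONFLICT_MARKER in output,
        "alert": entries > threshold,
    }


if __name__ == "__main__":
    sample = (
        "*** There are pending conflicts in this volume ***\n"
        "There are 30 CML entries pending for reintegration (3617288 bytes)\n"
    )
    print(parse_cfs_lv(sample))
```

In practice the text would come from running "cfs lv /coda/nkh.spup.net/cmpprod" on the client and passing its standard output to the function.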
The command "cfs listlocal /coda/nkh.spup.net/cmpprod" never returns and
gives no output at all (I waited for a little over 30 minutes).
The directory containing the conflict shows:
root@cmp06:/coda/nkh.spup.net/cmpprod/voicemail/company8184892/217684# ls -alFh 20110401-130733-31546631891-1301656053.482641-386.wav
lrw-r--r-- 1 root nogroup 29 Apr 1 20:41 20110401-130733-31546631891-1301656053.482641-386.wav -> @7f000004.000035ce.00002610@n
The client has coda-client 6.9.5 installed from your Debian package, the
servers have coda-server and coda-update Debian packages with version 6.9.4.
The /var/log/coda/venus.log is filled with entries like these:
[ W(177) : 0000 : 21:08:54 ] WAIT OVER, elapsed = 5005.9
[ W(177) : 0000 : 21:08:54 ] WAITING(VOL): cmpprod, state = Reachable, [0, 0], counts = [0 0 5 0]
[ W(177) : 0000 : 21:08:54 ] CML= [30, 103], Res = 0
[ W(177) : 0000 : 21:08:54 ] WAITING(VOL): shrd_count = 0, excl_count = 0, excl_pgid = 0
And the /var/log/coda/venus.err contains:
21:00:02 volume cmpprod has unrepaired local subtree(s), skip checkpointing CML!
21:02:27 DispatchWorker: signal received (seq = 654736)
21:10:02 volume cmpprod has unrepaired local subtree(s), skip checkpointing CML!
So I executed repair with the following transcript:
root@cmp06:/coda/nkh.spup.net/cmpprod/voicemail/company8184892/217684# repair
This repair tool ... <cropped> ... the current repair session.
repair > beginrepair
Pathname of object in conflict? []:
/coda/nkh.spup.net/cmpprod/voicemail/company8184892/217684/20110401-130733-31546631891-1301656053.482641-386.wav
It does not give any results; I have already waited for over 10 minutes
now. The directory listing doesn't show any expanded replicas, only the
broken symlink. The other clients all show the above-mentioned file with
a size of 0 bytes.
I'm not sure whether this is too much, too little or "sufficient" debug
info. If anyone needs more info, please let me know so I can provide it.
Thank you very much in advance for your effort.
Kind regards,
Simon de Hartog
Special Technical Services
SpeakUp B.V.
http://www.speakup.nl/