Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-11-27 Thread István Kuklin
No problem, I think it's very hard to find out what the problem is if you
don't have direct access to the affected machines.
I reinstalled one of the machines, so there is one machine left running
Debian Jessie, and it is affected by the bug. Nobody uses this machine,
so I won't reinstall it.
Both machines were affected, and since one of them is still running Jessie,
it is still affected, so it is an ideal test subject :)


2014-11-20 4:09 GMT+01:00 Benjamin Kaduk ka...@mit.edu:

 On Sun, 16 Nov 2014, István Kuklin wrote:

  I reinstalled my machine and switched from Debian to Ubuntu. It seems that
  it isn't affected by this bug; shutdown is quick now.

 This is unsurprising, but good to hear confirmed.
 I'm sorry that you're having such troubles with Debian and I'm failing to
 debug them effectively.

  I still have two machines running Debian testing, so I am still able
  to collect logs if you wish.

 Just to clarify: these two machines also suffer from the slow
 shutdown/reboot?


 I will try to convince some systemd experts to help out on this bug, as I
 seem to be at the limits of my understanding.

 -Ben


Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-11-20 Thread Benjamin Kaduk
Asking around on IRC, I was linked to
http://freedesktop.org/wiki/Software/systemd/Debugging/#shutdowncompleteseventually
which has a little bit of advice for debugging issues such as this.

-Ben


-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-11-19 Thread Benjamin Kaduk
On Sun, 16 Nov 2014, István Kuklin wrote:

 I reinstalled my machine and switched from Debian to Ubuntu. It seems that
 it isn't affected by this bug; shutdown is quick now.

This is unsurprising, but good to hear confirmed.
I'm sorry that you're having such troubles with Debian and I'm failing to
debug them effectively.

 I still have two machines running Debian testing, so I am still able
 to collect logs if you wish.

Just to clarify: these two machines also suffer from the slow
shutdown/reboot?


I will try to convince some systemd experts to help out on this bug, as I
seem to be at the limits of my understanding.

-Ben

Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-11-16 Thread István Kuklin
I reinstalled my machine and switched from Debian to Ubuntu. It seems that
it isn't affected by this bug; shutdown is quick now.
I still have two machines running Debian testing, so I am still able
to collect logs if you wish.

2014-11-09 14:57 GMT+01:00 István Kuklin kuklins...@gmail.com:

 I have an rc.local file, but it contains only an
 exit 0
 line and some comments before it.
 I tried to specify Before=umount.target, but apparently nothing has
 changed.
 http://pastebin.com/XckqcvR2
 User number 5000 is a user with an AFS home directory.
 Now I'm considering switching back to Ubuntu; maybe it doesn't have the bug,
 just as I had no problem with Wheezy.


 2014-11-06 21:06 GMT+01:00 Benjamin Kaduk ka...@mit.edu:

 Thanks for that.

 It looks like the builtin systemd umount.target is trying to unmount /afs
 before or in parallel with the openafs-client.service commands to unmount
 /afs, which is not the desired ordering.  We should be able to specify
 Before=umount.target in openafs-client.service to get a different
 behavior.  (I'm not confident enough in my understanding of systemd yet to
 claim that this should fix the issue.)

 If you want, you can copy /lib/systemd/system/openafs-client.service to
 /etc/systemd/system and make that change locally (the /etc version
 overrides the /lib version), but I will try to get this in a new upload as
 well.

 I'm still confused by the lines:
 nov 04 18:15:36 kingdom-play systemd[1]: user@5000.service stop-sigterm timed out. Killing.
 nov 04 18:15:36 kingdom-play systemd[1]: Stopped User Manager for UID 5000.
 nov 04 18:15:36 kingdom-play systemd[1]: Unit user@5000.service entered failed state.

 which account for more than a minute of the delay, but the larger portion
 of the delay seems attributable to the bits which are obviously
 AFS-related.

 I did find http://forums.fedoraforum.org/archive/index.php/t-298680.html ,
 which seems to implicate an rc.local file.  Do you have one in place?

 -Ben





Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-11-09 Thread István Kuklin
I have an rc.local file, but it contains only an
exit 0
line and some comments before it.
I tried to specify Before=umount.target, but apparently nothing has
changed.
http://pastebin.com/XckqcvR2
User number 5000 is a user with an AFS home directory.
Now I'm considering switching back to Ubuntu; maybe it doesn't have the bug,
just as I had no problem with Wheezy.


2014-11-06 21:06 GMT+01:00 Benjamin Kaduk ka...@mit.edu:

 Thanks for that.

 It looks like the builtin systemd umount.target is trying to unmount /afs
 before or in parallel with the openafs-client.service commands to unmount
 /afs, which is not the desired ordering.  We should be able to specify
 Before=umount.target in openafs-client.service to get a different
 behavior.  (I'm not confident enough in my understanding of systemd yet to
 claim that this should fix the issue.)

 If you want, you can copy /lib/systemd/system/openafs-client.service to
 /etc/systemd/system and make that change locally (the /etc version
 overrides the /lib version), but I will try to get this in a new upload as
 well.

 I'm still confused by the lines:
 nov 04 18:15:36 kingdom-play systemd[1]: user@5000.service stop-sigterm timed out. Killing.
 nov 04 18:15:36 kingdom-play systemd[1]: Stopped User Manager for UID 5000.
 nov 04 18:15:36 kingdom-play systemd[1]: Unit user@5000.service entered failed state.

 which account for more than a minute of the delay, but the larger portion
 of the delay seems attributable to the bits which are obviously
 AFS-related.

 I did find http://forums.fedoraforum.org/archive/index.php/t-298680.html ,
 which seems to implicate an rc.local file.  Do you have one in place?

 -Ben



Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-11-06 Thread Benjamin Kaduk
Thanks for that.

It looks like the builtin systemd umount.target is trying to unmount /afs
before or in parallel with the openafs-client.service commands to unmount
/afs, which is not the desired ordering.  We should be able to specify
Before=umount.target in openafs-client.service to get a different
behavior.  (I'm not confident enough in my understanding of systemd yet to
claim that this should fix the issue.)

If you want, you can copy /lib/systemd/system/openafs-client.service to
/etc/systemd/system and make that change locally (the /etc version
overrides the /lib version), but I will try to get this in a new upload as
well.
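
A minimal sketch of the local edit described above, written in Python and not taken from the package: it inserts Before=umount.target into the [Unit] section of a unit file's text. The function name add_unit_directive and the sample unit contents are hypothetical; the real openafs-client.service will look different, and whether this ordering actually fixes the slow shutdown is not confirmed in this thread.

```python
def add_unit_directive(unit_text, directive):
    """Return unit_text with `directive` placed right after the [Unit] header.

    If the directive is already present, the text is returned unchanged.
    """
    lines = unit_text.splitlines()
    if directive in (line.strip() for line in lines):
        return unit_text
    for index, line in enumerate(lines):
        if line.strip() == "[Unit]":
            lines.insert(index + 1, directive)
            return "\n".join(lines) + "\n"
    raise ValueError("no [Unit] section found")


# Hypothetical stand-in for the contents of openafs-client.service.
SAMPLE_UNIT = """\
[Unit]
Description=OpenAFS client (hypothetical sample)

[Service]
ExecStart=/bin/true
"""

patched = add_unit_directive(SAMPLE_UNIT, "Before=umount.target")
print(patched)
```

On a real system the patched text would be saved as /etc/systemd/system/openafs-client.service (the /etc copy overrides the /lib one, as noted above), followed by `systemctl daemon-reload` so that systemd rereads its unit files.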

I'm still confused by the lines:
nov 04 18:15:36 kingdom-play systemd[1]: user@5000.service stop-sigterm timed out. Killing.
nov 04 18:15:36 kingdom-play systemd[1]: Stopped User Manager for UID 5000.
nov 04 18:15:36 kingdom-play systemd[1]: Unit user@5000.service entered failed state.

which account for more than a minute of the delay, but the larger portion
of the delay seems attributable to the bits which are obviously
AFS-related.

I did find http://forums.fedoraforum.org/archive/index.php/t-298680.html ,
which seems to implicate an rc.local file.  Do you have one in place?

-Ben





Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-11-04 Thread Kuklin István
Here we go:
http://pastebin.com/aCREDCR8


On Sun, 2 Nov 2014 at 23:11, Benjamin Kaduk wrote:
 On Fri, 31 Oct 2014, Kuklin István wrote:
 
  I've upgraded the openafs-client package to unstable, but it hasn't
  solved the problem.
  I tested the OpenVPN problem by stopping the openvpn service before
  logging in but the machine hung on shutdown anyway, so it looks like the
  problem isn't with the VPN.
  Maybe I just misconfigured something, because it seems like my case is
  quite special :)
 
 Hard to say.  I should probably mark the bug as re-opened, if you're still
 seeing it with the version from unstable.
 
 Can you grab the journalctl log of a hung shutdown with the new package,
 please?
 
 Thanks,
 
 Ben



smime.p7s
Description: S/MIME cryptographic signature


Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-11-02 Thread Benjamin Kaduk
On Fri, 31 Oct 2014, Kuklin István wrote:

 I've upgraded the openafs-client package to unstable, but it hasn't
 solved the problem.
 I tested the OpenVPN problem by stopping the openvpn service before
 logging in but the machine hung on shutdown anyway, so it looks like the
 problem isn't with the VPN.
 Maybe I just misconfigured something, because it seems like my case is
 quite special :)

Hard to say.  I should probably mark the bug as re-opened, if you're still
seeing it with the version from unstable.

Can you grab the journalctl log of a hung shutdown with the new package,
please?

Thanks,

Ben

Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-10-31 Thread Kuklin István
I've upgraded the openafs-client package to unstable, but it hasn't
solved the problem.
I tested the OpenVPN problem by stopping the openvpn service before
logging in but the machine hung on shutdown anyway, so it looks like the
problem isn't with the VPN.
Maybe I just misconfigured something, because it seems like my case is
quite special :)

On Fri, 31 Oct 2014 at 15:17, Benjamin Kaduk wrote:
 On Fri, 31 Oct 2014, Kuklin István wrote:
 
  Hello there,
 
  Here we go:
  http://pastebin.com/uQ3n21CY
 
 Thanks.  Interestingly, this trace seems to show that the openafs-client
 shut down successfully, as those are the normal shutdown messages and
 systemd seems to think the client shut down successfully.
 
 There are two delays,
 okt 31 16:40:21
 okt 31 16:41:16
 okt 31 16:41:47 kingdom-play systemd[1]: user@5000.service stop-sigterm
 timed out. Killing.
 
 I don't see an obvious cause for the first gap, but the second one is
 clearly a systemd process that is hanging and has to wait for a timeout.
 I gather this user@UID.service job relates to any running systemd
 --user invocation, with configuration in ~/.systemd/.  It's unclear
 whether the user session would be trying to write to ~/.systemd/ at that
 point, though.
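
The size of the two gaps listed above can be checked with a short calculation. This is a sketch in Python and not part of the original trace; only the three times of day are taken from the message, and the helper name gaps_in_seconds is hypothetical.

```python
from datetime import datetime


def gaps_in_seconds(times):
    """Return the differences, in seconds, between consecutive HH:MM:SS strings."""
    parsed = [datetime.strptime(t, "%H:%M:%S") for t in times]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(parsed, parsed[1:])]


# Times of day from the journal excerpt (all on the same day).
stamps = ["16:40:21", "16:41:16", "16:41:47"]
gaps = gaps_in_seconds(stamps)
print(gaps)
```

The first value is the unexplained gap and the second is the wait that ends with the stop-sigterm timeout on user@5000.service.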
 
  I think I've noticed something:
  all the machines run OpenVPN, and the AFS server machine's LAN IP address
  is 192.168.0.2, but it has another IP address on the VPN: 192.168.99.1.
  Although I've defined the server's IP address
  in /etc/openafs/CellServDB, as far as I remember I saw somewhere in
  the logs AFS looking for the OpenVPN address...
  So, I think OpenVPN terminates before AFS, so AFS cannot find the
  server on 192.168.99.1, and this causes the system to hang.
 
  What can I do, if that's the problem? Is that possible, even with the
  correct CellServDB file?
 
 It's not clear that that's the problem, but the only thing that comes to
 mind would be to put a NetInfo file on the server so that the wrong
 address isn't registered in the vldb.
 (http://docs.openafs.org/Reference/5/NetInfo.html)
 
 I believe that the systemd unit file for openafs-client in sid (I uploaded
 a new version yesterday that fixes the bug I mentioned) should have the
 ordering directives needed to ensure that the client is shut down after
 user sessions (what I mentioned above) and before the network is shut down
 (which Andrew mentioned previously on the ticket).
 
 -Ben
 
 P.S. I see that we dropped the bug address from the cc list, which is
 reasonable given the pastebin that was linked here.  It's probably best to
 forward at least the rest of these messages onto the bug so the history is
 recorded, though.





Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-10-28 Thread Benjamin Kaduk
On Sat, 6 Sep 2014, Kuklin István wrote:

 Okay, if the bug has gone, I'll report it.

The 1.6.10-1 in unstable has a unit file for the client.  (It also has an
RC bug against it; stopping the client before taking the upgrade should be
a valid workaround.)


However, before you take the upgrade, I figured out (while reading up on
systemd) how to collect the shutdown messages I was interested in.  With
systemd, the shutdown messages are logged in the journal, which can be
stored across boot, depending on the configuration of journald.  In the
default configuration, you need to create the directory /var/log/journal
for the logs to be stored.

Starting from the next boot after that, the logs should be kept, so after
the following reboot (and hang), you can then use 'journalctl --full
--system -b -1' to get logs from the previous boot and shutdown.  I would
be interested in seeing those, in the vicinity of the openafs-client
shutdown messages, if you can.
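
The sequence above can be summarised as a small check. This is only a sketch in Python: the directory and the journalctl arguments are the ones named in this message, while the helper names journal_is_persistent and next_step are hypothetical and nothing here touches journald itself.

```python
import os

JOURNAL_DIR = "/var/log/journal"  # directory named in the message above
PREVIOUS_BOOT_CMD = ["journalctl", "--full", "--system", "-b", "-1"]


def journal_is_persistent(journal_dir=JOURNAL_DIR):
    """True if the directory needed for on-disk journal storage exists."""
    return os.path.isdir(journal_dir)


def next_step(journal_dir=JOURNAL_DIR):
    """Describe what to do next to capture the previous shutdown's log."""
    if not journal_is_persistent(journal_dir):
        return ("create " + journal_dir +
                " as root, reboot, then reproduce the slow shutdown")
    return "run: " + " ".join(PREVIOUS_BOOT_CMD)


print(next_step())
```

Once the directory exists and the machine has gone through a slow shutdown, the printed journalctl command shows the log of the previous boot, including its shutdown.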

Thanks,

Ben

Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-09-06 Thread Kuklin István
Okay, if the bug has gone, I'll report it.

Thank you for your help!
István

On Fri, 5 Sep 2014 at 14:39, Benjamin Kaduk wrote:
 On Fri, 5 Sep 2014, Kuklin István wrote:
 
  Last time the console wrote:
  [ ***  ] A stop job is running for User Manager for 5000
  Does that mean something?
 
 I don't know what it means, offhand.
 
  
   At this point, I feel like the best step forward is going to be to use a
   proper systemd unit file for the client, instead of relying on the
   compatibility shims for sysvinit scripts, since there doesn't seem to be
   an obvious way to further debug exactly what's happening at the moment.  I
   don't think I have an ETA for when that might happen, though.
  How should I do that exactly? :)
 
 I think it's something that the package maintainers need to do, not
 something that you need to do.  Some future version of the package will
 include a systemd unit file for the openafs-client, and we will want to
 come back to this ticket and re-test the shutdown behavior with that new
 version of the package.
 
 -Ben



signature.asc
Description: This is a digitally signed message part


Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-09-05 Thread Kuklin István
I have already removed the quiet parameter from /etc/default/grub, and
setting LogLevel=debug hasn't changed anything, it doesn't give more
information.
Last time the console wrote:
[ ***  ] A stop job is running for User Manager for 5000
Does that mean something?

On Thu, 4 Sep 2014 at 22:07, Benjamin Kaduk wrote:
 On Thu, 4 Sep 2014, Kuklin István wrote:
 
  I've found another clue:
  The shutdown problem occurs only if I cd to the afs share
  (after kinit and aklog). Without that, shutdown is quick.
  Here are some links to some pictures I took:
  I see this on shutdown if I mount the share from a tty with a local
  account (using kinit with a central account, aklog, then cd):
  http://pbrd.co/1o246kL
  When it hangs, it looks like this (sorry for the quality):
  http://pasteboard.co/2Nv06Lqd.jpg
  Once it looked like this:
  http://pasteboard.co/2Nv6A7q0.jpg
  http://pasteboard.co/2Nv7OMtt.jpg
  Here is a video: http://youtu.be/sAc44PtsJds
 
 Thanks for putting in the time to capture all this data, I really
 appreciate the effort.  I don't see an obvious smoking gun, but there
 are at least a couple of hints.
 I have removed 'quiet' from my kernel command line (/etc/default/grub)
 and set LogLevel=debug in /etc/systemd/system.conf to try and get
 more/better diagnostics.
 
 Also, using shutdown -H should leave the last messages visible without
 powering off.  (I'm not sure that the very last messages are going to be
 helpful, though.)
 
  In the video I'm logged in with a central profile (I use PAM modules for
  AFS home directories), which can sudo on that machine. When I'm shutting
  it down, you see what happens. It's not best quality and a couple of
  lines are missing from the picture at the ending so if you wish, I can
  record it again, for example the whole screen without moving the
  camcorder.
  Note that I'm using the same machine in the video as before, I've just
  replaced the machine name earlier to client1 for better understanding.
 
  Please try to reproduce the problem by cd-ing to the afs share.
 
 I had cd-ed into /afs in my previous attempts, though maybe I did not have
 an active shell still there during the reboot attempts.
 
Note that I don't have to stay in the directory, it is enough to cd into
it once, so it's okay if you don't have an active shell with that
directory.
 Even now, when I halt the system with root's shell in /afs/..., I do not
 see a noticeably longer shutdown time than when AFS has not been used.  I
 can, however, reproduce some of the hints I mentioned above.  Well,
 sometimes.  It doesn't seem fully deterministic.
 
 In particular, there is a diagnostic about unmounting /afs failing, and
 later on a note that a cold shutdown is being performed (these two are
 related).  You had a message "AFS isn't unmounted yet! Call aborted",
 which is another indicator of this, since it is what happens when a
 shutdown syscall is issued but shutdown is already in progress (but
 incomplete).
 
 At this point, I feel like the best step forward is going to be to use a
 proper systemd unit file for the client, instead of relying on the
 compatibility shims for sysvinit scripts, since there doesn't seem to be
 an obvious way to further debug exactly what's happening at the moment.  I
 don't think I have an ETA for when that might happen, though.
How should I do that exactly? :)
I'm sorry, I always find new things in Linux.
 
 -Ben





Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-09-05 Thread Benjamin Kaduk
On Fri, 5 Sep 2014, Kuklin István wrote:

 Last time the console wrote:
 [ ***  ] A stop job is running for User Manager for 5000
 Does that mean something?

I don't know what it means, offhand.

 
  At this point, I feel like the best step forward is going to be to use a
  proper systemd unit file for the client, instead of relying on the
  compatibility shims for sysvinit scripts, since there doesn't seem to be
  an obvious way to further debug exactly what's happening at the moment.  I
  don't think I have an ETA for when that might happen, though.
 How should I do that exactly? :)

I think it's something that the package maintainers need to do, not
something that you need to do.  Some future version of the package will
include a systemd unit file for the openafs-client, and we will want to
come back to this ticket and re-test the shutdown behavior with that new
version of the package.

-Ben

Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-09-05 Thread Andrew Deason
On Fri, 5 Sep 2014 14:39:03 -0400
Benjamin Kaduk ka...@mit.edu wrote:

 On Fri, 5 Sep 2014, Kuklin István wrote:
 
  Last time the console wrote:
  [ ***  ] A stop job is running for User Manager for 5000
  Does that mean something?
 
 I don't know what it means, offhand.

If it's not clear, this isn't a message from openafs; it's something for
systemd (I think) but I'm not exactly sure what it means.

And sorry if I'm butting in without reading this in detail, but maybe
one possibility is that we're hanging on trying to access the net, and
the local interface is down. Specifically, the shutdown process tries to
stop the openafs-client service, and it fails (because something is
accessing /afs). Later on, the shutdown process stops all processes,
which includes stopping networkmanager which takes down the interface.
Then we try to umount all filesystems, which means umounting /afs, which
can mean hitting the net (giving up callbacks, or flushing certain
things). And the afs client hangs on trying to access the net for a
while.

I'm not sure at the moment of an easy way of verifying if that is what
is going on, but it's just an idea.

-- 
Andrew Deason
adea...@sinenomine.net





Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-09-04 Thread Kuklin István
I've found another clue:
The shutdown problem occurs only if I cd to the afs share
(after kinit and aklog). Without that, shutdown is quick.
Here are some links to some pictures I took:
I see this on shutdown if I mount the share from a tty with a local
account (using kinit with a central account, aklog, then cd):
http://pbrd.co/1o246kL
When it hangs, it looks like this (sorry for the quality):
http://pasteboard.co/2Nv06Lqd.jpg
Once it looked like this:
http://pasteboard.co/2Nv6A7q0.jpg
http://pasteboard.co/2Nv7OMtt.jpg
Here is a video: http://youtu.be/sAc44PtsJds
In the video I'm logged in with a central profile (I use PAM modules for
AFS home directories), which can sudo on that machine. When I'm shutting
it down, you see what happens. It's not best quality and a couple of
lines are missing from the picture at the ending so if you wish, I can
record it again, for example the whole screen without moving the
camcorder.
Note that I'm using the same machine in the video as before, I've just
replaced the machine name earlier to client1 for better understanding.

Please try to reproduce the problem by cd-ing to the afs share.

Thank you for your help!
István


On Wed, 3 Sep 2014 at 00:30, Benjamin Kaduk wrote:
 On Tue, 2 Sep 2014, Kuklin István wrote:
 
  Okay, here is a complete one from the booting to shutting down:
  http://pastebin.com/tApVAfM1
 
 Thanks for this.  At first glance, I don't see anything that looks
 suspicious or particularly relevant.  It looks like the syslog has stopped
 when the shutdown started, so anything that may have happened after that
 didn't make it to the log.  Of course, those are just the parts that we
 would be most interested in.
 
 Can you arrange to be watching the console of an affected machine during
 the hang?
 
 My local jessie VM seems to reboot quickly after having had AFS mounted,
 so I don't seem to be able to reproduce the issue at the moment.
 
 -Ben





Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-09-04 Thread Benjamin Kaduk
On Thu, 4 Sep 2014, Kuklin István wrote:

 I've found another clue:
 The shutdown problem occurs only if I cd to the afs share
 (after kinit and aklog). Without that, shutdown is quick.
 Here are some links to some pictures I took:
 I see this on shutdown if I mount the share from a tty with a local
 account (using kinit with a central account, aklog, then cd):
 http://pbrd.co/1o246kL
 When it hangs, it looks like this (sorry for the quality):
 http://pasteboard.co/2Nv06Lqd.jpg
 Once it looked like this:
 http://pasteboard.co/2Nv6A7q0.jpg
 http://pasteboard.co/2Nv7OMtt.jpg
 Here is a video: http://youtu.be/sAc44PtsJds

Thanks for putting in the time to capture all this data, I really
appreciate the effort.  I don't see an obvious smoking gun, but there
are at least a couple of hints.
I have removed 'quiet' from my kernel command line (/etc/default/grub)
and set LogLevel=debug in /etc/systemd/system.conf to try and get
more/better diagnostics.

Also, using shutdown -H should leave the last messages visible without
powering off.  (I'm not sure that the very last messages are going to be
helpful, though.)

 In the video I'm logged in with a central profile (I use PAM modules for
 AFS home directories), which can sudo on that machine. When I'm shutting
 it down, you see what happens. It's not best quality and a couple of
 lines are missing from the picture at the ending so if you wish, I can
 record it again, for example the whole screen without moving the
 camcorder.
 Note that I'm using the same machine in the video as before, I've just
 replaced the machine name earlier to client1 for better understanding.

 Please try to reproduce the problem by cd-ing to the afs share.

I had cd-ed into /afs in my previous attempts, though maybe I did not have
an active shell still there during the reboot attempts.

Even now, when I halt the system with root's shell in /afs/..., I do not
see a noticeably longer shutdown time than when AFS has not been used.  I
can, however, reproduce some of the hints I mentioned above.  Well,
sometimes.  It doesn't seem fully deterministic.

In particular, there is a diagnostic about unmounting /afs failing, and
later on a note that a cold shutdown is being performed (these two are
related).  You had a message "AFS isn't unmounted yet! Call aborted",
which is another indicator of this, since it is what happens when a
shutdown syscall is issued but shutdown is already in progress (but
incomplete).

At this point, I feel like the best step forward is going to be to use a
proper systemd unit file for the client, instead of relying on the
compatibility shims for sysvinit scripts, since there doesn't seem to be
an obvious way to further debug exactly what's happening at the moment.  I
don't think I have an ETA for when that might happen, though.

-Ben

Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-09-02 Thread Kuklin István
Thank you for your answer. Unfortunately, I won't be able to answer so
quickly, but I'll do my best.

I think I found something in /var/log/messages, this line appears 4
times:
Sep  2 08:06:17 client1 kernel: [  113.230480] afs: byte-range locks
only enforced for processes on this machine (pid 2430 (zeitgeist-daemo),
user 5000, fid 536870921.300.500).
Sep  2 08:06:36 client1 kernel: [  132.165684] afs: byte-range locks
only enforced for processes on this machine (pid 2409 (tracker-store),
user 5000, fid 536870921.660.863186).
Sep  2 08:06:52 client1 kernel: [  148.373147] afs: byte-range locks
only enforced for processes on this machine (pid 2692 (localStorage DB),
user 5000, fid 536870921.22438.708412).
Sep  2 08:09:09 client1 kernel: [  284.963197] afs: byte-range locks
only enforced for processes on this machine (pid 2692 (localStorage DB),
user 5000, fid 536870921.22438.708412).

And I have to correct myself: the whole rebooting process takes 2 or 3
minutes, according to this line:
Sep  2 08:09:25 client1 rsyslogd: [origin software=rsyslogd
swVersion=8.4.0 x-pid=759 x-info=http://www.rsyslog.com;] exiting
on signal 15.
Sep  2 08:11:47 client1 rsyslogd: [origin software=rsyslogd
swVersion=8.4.0 x-pid=759 x-info=http://www.rsyslog.com;] start
Anyway, shutting down the machine still takes too long...

May I copy a bigger part of /var/log/messages?

István

On Mon, 1 Sep 2014 at 00:42, Benjamin Kaduk wrote:
 On Sun, 31 Aug 2014, Kuklin István wrote:
 
  There is a network with central LDAP+Kerberos+AFS users.
  If a central user tries to access an afs share, shutting down the client
  is going to take about 3 minutes. It can be done using PAM modules, or
  with a local (non-central) user using kinit ldap+krb5-username, then
  aklog commands.
  If user logs out correctly using unlog and kdestroy, it doesn't solve
  the problem, shutting down is going to take about 3 minutes.
  If I stop openafs-client service and umount /afs before shutdown, it
  doesn't help.
  It affects rebooting as well.
  It seems that the system is trying to stop some User Manager job at
  shutdown as far as I remember.
  This problem affects Debian Jessie, shutdown was quite quick on Wheezy.
  It affects all the client machines.
  I'm writing this report from a client machine.
 
 The kernel messages during the hang (ideally with timestamps) would be
 quite helpful for understanding what's going on here.
 
 I'll have to double-check, but I may only have wheezy and sid machines
 sitting around.  I would expect any issues to also be present on sid, but
 one never knows...
 
 -Ben





Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-09-02 Thread Benjamin Kaduk
On Tue, 2 Sep 2014, Kuklin István wrote:

 Thank you for your answer. Unfortunately, I won't be able to answer so
 quickly, but I'll do my best.

 I think I found something in /var/log/messages, this line appears 4
 times:
 Sep  2 08:06:17 client1 kernel: [  113.230480] afs: byte-range locks
 only enforced for processes on this machine (pid 2430 (zeitgeist-daemo),
 user 5000, fid 536870921.300.500).
 Sep  2 08:06:36 client1 kernel: [  132.165684] afs: byte-range locks
 only enforced for processes on this machine (pid 2409 (tracker-store),
 user 5000, fid 536870921.660.863186).
 Sep  2 08:06:52 client1 kernel: [  148.373147] afs: byte-range locks
 only enforced for processes on this machine (pid 2692 (localStorage DB),
 user 5000, fid 536870921.22438.708412).
 Sep  2 08:09:09 client1 kernel: [  284.963197] afs: byte-range locks
 only enforced for processes on this machine (pid 2692 (localStorage DB),
 user 5000, fid 536870921.22438.708412).

I don't think these are helpful; they are normal messages, and the
timestamps are outside of the reboot window you derived from the syslog
messages below.
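
As a rough check of that statement, the reboot window and the kernel message times can be compared directly. This is a sketch in Python, not part of the original exchange; the times of day are copied from the quoted log lines, and the helper name parse is hypothetical.

```python
from datetime import datetime


def parse(time_of_day):
    """Parse an HH:MM:SS string (all entries are from the same day)."""
    return datetime.strptime(time_of_day, "%H:%M:%S")


# rsyslogd "exiting on signal 15" and rsyslogd "start" from the quoted log.
stop = parse("08:09:25")
start = parse("08:11:47")
window = int((start - stop).total_seconds())

# Times of the four "byte-range locks" kernel messages quoted above.
kernel_msgs = ["08:06:17", "08:06:36", "08:06:52", "08:09:09"]
inside = [t for t in kernel_msgs if stop <= parse(t) <= start]

print(window, inside)
```

The window comes out at 142 seconds (2 minutes 22 seconds), and none of the four kernel messages falls inside it; the latest one precedes the rsyslogd exit by 16 seconds.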

 And I have to correct myself: the whole rebooting process takes 2 or 3
 minutes, according to this line:
 Sep  2 08:09:25 client1 rsyslogd: [origin software=rsyslogd
 swVersion=8.4.0 x-pid=759 x-info=http://www.rsyslog.com;] exiting
 on signal 15.
 Sep  2 08:11:47 client1 rsyslogd: [origin software=rsyslogd
 swVersion=8.4.0 x-pid=759 x-info=http://www.rsyslog.com;] start
 Anyway, shutting down the machine still takes too long...

 May I copy a bigger part of /var/log/messages?

Please do.  I will see if I can pull up a VM for testing.

-Ben





Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-09-02 Thread Kuklin István
Okay, here is a complete log from boot to shutdown:
http://pastebin.com/tApVAfM1

On Tue, 2 Sep 2014 at 11:17, Benjamin Kaduk wrote:
 On Tue, 2 Sep 2014, Kuklin István wrote:
 
  Thank you for your answer. Unfortunately, I won't be able to answer so
  quickly, but I'll do my best.

  I think I found something in /var/log/messages, this line appears 4
  times:
  Sep  2 08:06:17 client1 kernel: [  113.230480] afs: byte-range locks
  only enforced for processes on this machine (pid 2430 (zeitgeist-daemo),
  user 5000, fid 536870921.300.500).
  Sep  2 08:06:36 client1 kernel: [  132.165684] afs: byte-range locks
  only enforced for processes on this machine (pid 2409 (tracker-store),
  user 5000, fid 536870921.660.863186).
  Sep  2 08:06:52 client1 kernel: [  148.373147] afs: byte-range locks
  only enforced for processes on this machine (pid 2692 (localStorage DB),
  user 5000, fid 536870921.22438.708412).
  Sep  2 08:09:09 client1 kernel: [  284.963197] afs: byte-range locks
  only enforced for processes on this machine (pid 2692 (localStorage DB),
  user 5000, fid 536870921.22438.708412).
 
 I don't think these are helpful; they are normal messages, and the
 timestamps are outside of the reboot window you derived from the syslog
 messages below.
 
  And I have to correct myself: the whole rebooting process takes 2 or 3
  minutes, according to this line:
  Sep  2 08:09:25 client1 rsyslogd: [origin software=rsyslogd
  swVersion=8.4.0 x-pid=759 x-info=http://www.rsyslog.com;] exiting
  on signal 15.
  Sep  2 08:11:47 client1 rsyslogd: [origin software=rsyslogd
  swVersion=8.4.0 x-pid=759 x-info=http://www.rsyslog.com;] start
  Anyway, shutting down the machine still takes too long...

  May I copy a bigger part of /var/log/messages?
 
 Please do.  I will see if I can pull up a VM for testing.
 
 -Ben
 





Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-09-02 Thread Benjamin Kaduk
On Tue, 2 Sep 2014, Kuklin István wrote:

 Okay, here is a complete one from the booting to shutting down:
 http://pastebin.com/tApVAfM1

Thanks for this.  At first glance, I don't see anything that looks
suspicious or particularly relevant.  It looks like the syslog has stopped
when the shutdown started, so anything that may have happened after that
didn't make it to the log.  Of course, those are just the parts that we
would be most interested in.

Can you arrange to be watching the console of an affected machine during
the hang?

My local jessie VM seems to reboot quickly after having had AFS mounted,
so I don't seem to be able to reproduce the issue at the moment.

-Ben

Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-08-31 Thread Kuklin István
Package: openafs-client
Version: 1.6.9-1
Severity: important
Tags: upstream

There is a network with central LDAP+Kerberos+AFS users.
If a central user accesses an AFS share, shutting down the client
takes about 3 minutes. The access can happen through PAM modules, or
as a local (non-central) user running kinit ldap+krb5-username and then
aklog.
Logging out correctly with unlog and kdestroy doesn't solve the problem;
shutdown still takes about 3 minutes.
Stopping the openafs-client service and unmounting /afs before shutdown
doesn't help either.
It affects rebooting as well.
As far as I remember, the system seems to be trying to stop some User
Manager job at shutdown.
This problem affects Debian Jessie, shutdown was quite quick on Wheezy.
It affects all the client machines.
I'm writing this report from a client machine.



-- System Information:
Debian Release: jessie/sid
  APT prefers testing-updates
  APT policy: (500, 'testing-updates'), (500, 'testing')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.14-2-amd64 (SMP w/2 CPU cores)
Locale: LANG=hu_HU.UTF-8, LC_CTYPE=hu_HU.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages openafs-client depends on:
ii  debconf [debconf-2.0]  1.5.53
ii  libc6                  2.19-9
ii  libcomerr2             1.42.11-2
ii  libk5crypto3           1.12.1+dfsg-7
ii  libkrb5-3              1.12.1+dfsg-7
ii  libncurses5            5.9+20140712-2
ii  libtinfo5              5.9+20140712-2

Versions of packages openafs-client recommends:
ii  lsof  4.86+dfsg-1
ii  openafs-modules-dkms  1.6.9-1

Versions of packages openafs-client suggests:
pn  openafs-doc   <none>
ii  openafs-krb5  1.6.9-1

-- debconf information:
* openafs-client/cachesize: 5
  openafs-client/afsdb: true
* openafs-client/thiscell: lo
  openafs-client/crypt: true
  openafs-client/cell-info:
  openafs-client/dynroot: Yes
  openafs-client/run-client: true
  openafs-client/fakestat: true




Bug#760063: openafs-client: Acessing afs share causes slow shutdown/reboot (about 3 minutes) on Debian Jessie

2014-08-31 Thread Benjamin Kaduk
On Sun, 31 Aug 2014, Kuklin István wrote:

 There is a network with central LDAP+Kerberos+AFS users.
 If a central user accesses an AFS share, shutting down the client
 takes about 3 minutes. The access can happen through PAM modules, or
 as a local (non-central) user running kinit ldap+krb5-username and then
 aklog.
 Logging out correctly with unlog and kdestroy doesn't solve the problem;
 shutdown still takes about 3 minutes.
 Stopping the openafs-client service and unmounting /afs before shutdown
 doesn't help either.
 It affects rebooting as well.
 As far as I remember, the system seems to be trying to stop some User
 Manager job at shutdown.
 This problem affects Debian Jessie, shutdown was quite quick on Wheezy.
 It affects all the client machines.
 I'm writing this report from a client machine.

The kernel messages during the hang (ideally with timestamps) would be
quite helpful for understanding what's going on here.

I'll have to double-check, but I may only have wheezy and sid machines
sitting around.  I would expect any issues to also be present on sid, but
one never knows...

-Ben