Hi Andrew,
The filer was restarted early this a.m.  We've seen
this come up semi-frequently in the past 6 months,
so I expect to see it again.

As for the timestamp from vos status fileserver,
after I ran endtrans, it increased continuously.
I cannot say whether it was unchanging before I ran
endtrans.  I'll try to gather this on the next
volume.
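
For gathering that, here is a rough sketch of what I have in mind. It
is an assumption-laden example, not something tested against a live
server: it assumes vos status prints one "lastActiveTime:" line per
transaction, and the helper names (last_active_times,
timestamps_changed, poll) are made up for illustration.

```python
import re
import subprocess
import time

# Assumed output format: each transaction block in `vos status` output
# contains a line of the form "lastActiveTime: <timestamp text>".
LAST_ACTIVE = re.compile(r"^\s*lastActiveTime:[ \t]*(.*)$", re.MULTILINE)


def last_active_times(status_text):
    """Return every lastActiveTime value found in one `vos status` output."""
    return [value.strip() for value in LAST_ACTIVE.findall(status_text)]


def timestamps_changed(samples):
    """Return True if lastActiveTime differs between consecutive samples."""
    extracted = [last_active_times(sample) for sample in samples]
    return any(a != b for a, b in zip(extracted, extracted[1:]))


def poll(server, count=3, interval=10):
    """Run `vos status -server <server>` several times and keep the output.

    Hypothetical helper: depending on the site, extra options such as
    -localauth may be needed for the command to succeed.
    """
    outputs = []
    for i in range(count):
        result = subprocess.run(
            ["vos", "status", "-server", server],
            capture_output=True,
            text=True,
            check=True,
        )
        outputs.append(result.stdout)
        if i < count - 1:
            time.sleep(interval)
    return outputs
```

Running poll() on the fileserver before issuing endtrans, and passing
the result to timestamps_changed(), should say whether the timestamp
was already moving on its own.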

I'll also see what I can do about generating
traces to determine whether this is some
sort of Rx timeout, RPC hang, an overcount
error, or something else.

Thanks for all the help everyone!
Kris

On 4/20/13 5:01 PM, Andrew Deason wrote:
> On Fri, 19 Apr 2013 18:51:12 -0600
> "Kristen J. Webb" <[email protected]> wrote:
>
> > I can say for sure that server that issued the vos dump has been
> > rebooted since the transaction started.  The other thing I am
> > observing is that repeated vos status on the fileserver shows the
> > lastActiveTime as current (increasing).
>
> It would be set to the current time when you ran the 'vos endtrans'
> command. I assume you just saw it increase once, and not increasing
> constantly.
>
> So again, everything you've said suggests there is an RPC holding a
> reference to that transaction, which is why it's not going away. So
> either:
>
>   - The Rx call is still alive. Even if the client is gone, for some
>     reason the Rx call is not dying (i.e. some bug in Rx; a timer not
>     going off or something).
>
>   - The Rx call died, but the RPC is still running. Maybe the volser RPC
>     is hanging on some lock or some other thing.
>
>   - There is no RPC still running, but the transaction still says someone
>     is using it. We have a bug with a reference overcount on that
>     transaction.
>
> The only way to know which it is is to look at a stack trace or core of
> the volserver process, or maybe 'rxdebug' would show a stuck call if the
> problem is a stuck Rx call.
>
> There have been some bugs in the past with the volserver not accessing
> transactions from multiple threads correctly (not locking things right).
> It could be something like that (though I don't think I've seen that
> _specific_ manifestation), or it could be something else entirely.


--
This message is NOT encrypted
--------------------------------
Mr. Kristen J. Webb
Chief Technology Officer
Teradactyl LLC.
2450 Baylor Dr. S.E.
Albuquerque, New Mexico 87106
Phone: 1-505-338-6000
Email: [email protected]
Web: http://www.teradactyl.com

Providers of Scalable Backup Solutions
   for Unique Data Environments

--------------------------------
_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info