Re: [OpenAFS] Re: DB servers quorum and OpenAFS tools

2014-01-24 Thread Harald Barth
The problem is that you want the client to scan quickly to find a server that is up, but because networks are not perfectly reliable and drop packets all the time, it cannot know that a server is not up until that server has failed to respond to multiple retransmissions of the request. Those

Re: [OpenAFS] Re: DB servers quorum and OpenAFS tools

2014-01-24 Thread Simon Wilkinson
On 24 Jan 2014, at 07:48, Harald Barth h...@kth.se wrote: You are completely right if one must talk to that server. But I think that AFS/RX sometimes hangs too long waiting for one server instead of trying the next one. For example, for questions that could be answered by any VLDB. I'm

[OpenAFS] OpenAFS 1.7.2900 for windows report

2014-01-24 Thread Lars Schimmer
Hi! I just want to write a short report of success with OpenAFS 1.7.2900 for Windows. We have had some problems from time to time with roaming profiles not syncing on logout; the system stayed on the logging-out screen forever. We did not find any evidence of this in the Windows logs. But

RE: [OpenAFS] Re: mkdir() performance on AFS client

2014-01-24 Thread milek
The Windows cache manager even takes things a step further by maintaining a negative cache for EACCES errors on {FID, user}. This has avoided hitting the abort threshold limits triggered by Windows that assumes that if it can list a directory it must be able to read the status of all the

Re: [OpenAFS] Re: DB servers quorum and OpenAFS tools

2014-01-24 Thread Peter Grandi
For example in an ideal world putting more or less DB servers in the client 'CellServDB' should not matter, as long as one that belongs to the cell is up; again if the logic were for all types of client: scan quickly the list of potential DB servers, find one that is up and belongs to the

Re: [OpenAFS] Re: mkdir() performance on AFS client

2014-01-24 Thread Jeffrey Altman
On 1/24/2014 8:56 AM, mi...@task.gda.pl wrote: The Windows cache manager even takes things a step further by maintaining a negative cache for EACCES errors on {FID, user}. This has avoided hitting the abort threshold limits triggered by Windows that assumes that if it can list a directory

Re: [OpenAFS] DB servers quorum and OpenAFS tools

2014-01-24 Thread Neil Davies
Peter, to solve this you can't just use the round trip in its raw form; you need to understand it in terms of how the delay and loss accrued. It's a bit too long (and potentially off-topic) for this list, but briefly the way we perform this sort of analysis (in my day job) is to view it as quality

[OpenAFS] Re: DB servers quorum and OpenAFS tools

2014-01-24 Thread Andrew Deason
On Thu, 23 Jan 2014 21:55:15 + p...@afs.list.sabi.co.uk (Peter Grandi) wrote: Otherwise, when your network becomes congested, the retransmission of dropped packets will act as a runaway positive feedback loop, making the congestion worse and saturating the network. I am sorry I

Re: [OpenAFS] Re: DB servers quorum and OpenAFS tools

2014-01-24 Thread Harald Barth
I have long thought that we should be using multi for vldb lookups, specifically to avoid the problems with down database servers. The situation is a little bit different for cache managers who can remember which servers are down and command line tools which normally discover how the world

Re: [OpenAFS] Re: DB servers quorum and OpenAFS tools

2014-01-24 Thread Jeffrey Hutzelman
On Fri, 2014-01-24 at 08:01 +, Simon Wilkinson wrote: On 24 Jan 2014, at 07:48, Harald Barth h...@kth.se wrote: You are completely right if one must talk to that server. But I think that AFS/RX sometimes hangs too long waiting for one server instead of trying the next one. For

Re: [OpenAFS] Re: DB servers quorum and OpenAFS tools

2014-01-24 Thread Brandon Allbery
On Fri, 2014-01-24 at 11:41 -0500, Jeffrey Hutzelman wrote: The problem is the one-off clients that make _one RPC_ and then exit. They have no opportunity to remember what didn't work last time. It Has it been considered to write a cache file somewhere (even a user dotfile) that could be used

Re: [OpenAFS] Re: DB servers quorum and OpenAFS tools

2014-01-24 Thread Jeffrey Altman
On 1/24/2014 11:45 AM, Brandon Allbery wrote: On Fri, 2014-01-24 at 11:41 -0500, Jeffrey Hutzelman wrote: The problem is the one-off clients that make _one RPC_ and then exit. They have no opportunity to remember what didn't work last time. It Has it been considered to write a cache file

[OpenAFS] Re: DB servers quorum and OpenAFS tools

2014-01-24 Thread Andrew Deason
On Fri, 24 Jan 2014 11:41:35 -0500 Jeffrey Hutzelman jh...@cmu.edu wrote: The problem is the one-off clients that make _one RPC_ and then exit. They have no opportunity to remember what didn't work last time. It might help some for these sorts of clients to use multi, if they're doing

Re: [OpenAFS] Re: 'afs/' principal rekeying instructions may be incomplete

2014-01-24 Thread Benjamin Kaduk
Sorry for the delayed response. It looks like Jeffrey's and Andrew's responses should have addressed the major issues. It would also be a little easier for me if the attribution of who wrote the quoted text was retained. On Thu, 23 Jan 2014, Peter Grandi wrote: ** Crucial details for