Re: [OpenAFS] fs newcell / clients CellServDB / adding new db server

2018-06-26 Thread Benjamin Kaduk
On Tue, Jun 26, 2018 at 10:33:24AM +0200, Andreas Ladanyi wrote:
> 
> Is there a function / service in AFS to manage the clients' CellServDB? I
> understand upclient/upserver are for servers only.

The most scalable way seems to be to not use hardcoded CellServDB entries
and instead use DNS SRV records to locate the database servers.
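
As a rough illustration of what such records look like, here is a minimal
sketch that builds the SRV owner names and zone-file lines for a cell. The
service labels and ports are the ones I understand RFC 5864 to define for
AFS; the cell name, host names, TTL, priority and weight are placeholders,
and the helper names are hypothetical.

```python
# Sketch only: render DNS SRV records for an AFS cell's database servers.
# Service labels/ports are taken from RFC 5864 as I understand it; the cell
# and host names used below are placeholders, not a real configuration.

AFS_SRV_SERVICES = {
    "vlserver": ("_afs3-vlserver._udp", 7003),  # volume location database
    "prserver": ("_afs3-prserver._udp", 7002),  # protection database
}


def srv_owner_name(cell, service):
    """Return the fully qualified SRV owner name for a cell and service."""
    label, _port = AFS_SRV_SERVICES[service]
    return f"{label}.{cell.lower().rstrip('.')}."


def srv_zone_lines(cell, hosts, ttl=3600, priority=0, weight=0):
    """Return zone-file style SRV lines, one per (service, host) pair."""
    lines = []
    for service, (_label, port) in AFS_SRV_SERVICES.items():
        owner = srv_owner_name(cell, service)
        for host in hosts:
            target = host.rstrip(".") + "."
            lines.append(
                f"{owner} {ttl} IN SRV {priority} {weight} {port} {target}"
            )
    return lines


if __name__ == "__main__":
    for line in srv_zone_lines(
        "foo.example.com",
        ["afsdb1.bar.example.com", "afsdb2.bar.example.com"],
    ):
        print(line)
```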

-Ben
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] fs newcell / clients CellServDB / adding new db server

2018-06-18 Thread Jeffrey Altman
On 6/18/2018 9:07 AM, Andreas Ladanyi wrote:
>>
>> The ubik clients do not rank servers based upon IP address.  What they
>> do is:
> OK. Then maybe I misunderstood the documentation
> (http://docs.openafs.org/QuickStartUnix/HDRWQ114.html), which tells me
> the machine with the lowest IP is "usually" elected as the ubik coordinator.

The algorithm used to elect the coordinator is specific to the ubik
servers that maintain a synchronized database.  The clients (vos, pts,
cache managers, backup, aklog, pam_afs_session, etc) do not speak ubik;
they speak the application specific protocols (VL, PR, BUDB, etc.).  The
clients do not have any visibility into which ubik instances are
electable, which instances have network connectivity to elicit
sufficient votes, nor what algorithm is used to rank (order) the ubik
instances for election purposes.

AuriStorFS ubik for example permits arbitrary ranking of servers based
upon configuration.  Just because a server has a smaller numeric IPv4
address doesn't mean that it is the best server to be the read/write
copy of the database.

> I followed the instructions in that document to add a new db server
> machine with the lowest IP.
>>
>> 1. compute the length of the ordered server list
>>
>>   A B C D
>>
>> 2. then generate a random number from 0..
>>
>> 3. use that number as an index into the list to decide which is first
>>
>> 4. and reorder the list as if it were a circular queue.  So if the
>> random number selected was 2, then the list would become
>>
>>   C D A B
>>
>> The only time the coordinator must be contacted is for a write
>> transaction.  All read transactions are processed by the first server
>> contacted.
> OK, thanks for the explanation.
>>
>> My conclusion is that there is something about your cell configuration
>> that results in a write transaction for each token requested.  For example:
> I straced aklog for some tests and could see that aklog sometimes asks
> the new db server (which is offline) and then waits for a timeout (hangs
> for about 15 sec), and sometimes asks the old, online db servers from
> CellServDB, in which case there is no timeout (no hang).
> 
> This seems to cause the ssh login hang, because the PAM debug output
> shows a hang of about 15 sec when pam_afs calls aklog.
> 
> So in summary it seems better to first add the new db server to the
> CellServDB of all db servers (bos addhost), then bos restart the pt/vl
> instances so that the servers elect a ubik coordinator, and only then
> update the clients' CellServDB.

That depends on whether the clients need to be able to find a writable
copy of the database.  If the clients must be able to find the
coordinator and the coordinator is a server that is not present in the
client's configuration, then the client won't simply experience a
random timeout but a failure.

> The documentation says to first update the clients' CellServDB (when the
> new db server has the lowest IP) and then bring up the new db server.
>>
>>  1. cell name:   example.com
> no, cellname a.b.c
>>
>>  2. One of the following is true:
>>
>> a. realm name:   AD.EXAMPLE.COM
> no AD
> 
> REALM = A.B.C, MIT Kerberos
>>
>> b. CellServDB's zeroth ubik server host domain:
>>
>>  subnet.example.com
> I don't understand this example.


If the cell name is

   foo.example.com

and the Kerberos realm is

   FOO.EXAMPLE.COM

and the host names of the ubik servers are

   afsdb1.bar.example.com
   afsdb2.bar.example.com
   afsdb3.bar.example.com

then the default host to realm mapping of afsdb1.bar.example.com will be
to realm BAR.EXAMPLE.COM not FOO.EXAMPLE.COM.  Since BAR.EXAMPLE.COM !=
FOO.EXAMPLE.COM a foreign cell registration will be attempted.  However,
that doesn't appear to be the source of the delay.  If it were, the
tracing would show aklog attempting to access every protection server
until the coordinator was discovered.
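
To make the comparison concrete, here is a minimal sketch of the default
mapping described above: drop the host label, upper-case the remaining
domain, and compare it with the realm expected for the cell. This is a
simplification for illustration, not the actual aklog or Kerberos library
logic, and both function names are hypothetical.

```python
# Sketch only: a simplified "default host to realm" guess, as described in
# the message above. Real Kerberos libraries also consult configuration
# (e.g. domain-to-realm mappings); that is deliberately left out here.


def guess_realm_from_host(hostname):
    """Guess a realm by dropping the first label and upper-casing the rest."""
    labels = hostname.rstrip(".").split(".")
    if len(labels) < 2:
        raise ValueError(f"need a fully qualified host name, got {hostname!r}")
    return ".".join(labels[1:]).upper()


def looks_like_foreign_cell(cell_realm, db_server_host):
    """True when the guessed realm of the db server differs from the cell's realm."""
    return guess_realm_from_host(db_server_host) != cell_realm.upper()


if __name__ == "__main__":
    host = "afsdb1.bar.example.com"
    print(guess_realm_from_host(host))
    print(looks_like_foreign_cell("FOO.EXAMPLE.COM", host))
```

With the names from the example, the guessed realm is BAR.EXAMPLE.COM, which
does not match FOO.EXAMPLE.COM, so the sketch reports a likely foreign cell.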

>>  3. auto-registration of foreign PTS IDs enabled:
>>
>> a. pam_afs_session configuration doesn't disable it
>>
>> b. aklog executed without -noprdb
> yes, pam_afs_session calls aklog without -noprdb





Re: [OpenAFS] fs newcell / clients CellServDB / adding new db server

2018-06-18 Thread Andreas Ladanyi
>
> The ubik clients do not rank servers based upon IP address.  What they
> do is:
OK. Then maybe I misunderstood the documentation
(http://docs.openafs.org/QuickStartUnix/HDRWQ114.html), which tells me
the machine with the lowest IP is "usually" elected as the ubik coordinator.

I followed the instructions in that document to add a new db server
machine with the lowest IP.
>
> 1. compute the length of the ordered server list
>
>   A B C D
>
> 2. then generate a random number from 0..
>
> 3. use that number as an index into the list to decide which is first
>
> 4. and reorder the list as if it were a circular queue.  So if the
> random number selected was 2, then the list would become
>
>   C D A B
>
> The only time the coordinator must be contacted is for a write
> transaction.  All read transactions are processed by the first server
> contacted.
OK, thanks for the explanation.
>
> My conclusion is that there is something about your cell configuration
> that results in a write transaction for each token requested.  For example:
I straced aklog for some tests and could see that aklog sometimes asks
the new db server (which is offline) and then waits for a timeout (hangs
for about 15 sec), and sometimes asks the old, online db servers from
CellServDB, in which case there is no timeout (no hang).

This seems to cause the ssh login hang, because the PAM debug output
shows a hang of about 15 sec when pam_afs calls aklog.

So in summary it seems better to first add the new db server to the
CellServDB of all db servers (bos addhost), then bos restart the pt/vl
instances so that the servers elect a ubik coordinator, and only then
update the clients' CellServDB.

The documentation says to first update the clients' CellServDB (when the
new db server has the lowest IP) and then bring up the new db server.
>
>  1. cell name:   example.com
no, cellname a.b.c
>
>  2. One of the following is true:
>
> a. realm name:   AD.EXAMPLE.COM
no AD

REALM = A.B.C, MIT Kerberos
>
> b. CellServDB's zeroth ubik server host domain:
>
>   subnet.example.com
I don't understand this example.
>
>  3. auto-registration of foreign PTS IDs enabled:
>
> a. pam_afs_session configuration doesn't disable it
>
> b. aklog executed without -noprdb
yes, pam_afs_session calls aklog without -noprdb
>
> If the "realm of cell" guessing algorithm decides that the current login
> is likely to be a foreign cell login, then an attempt to allocate a PTS
> ID for the authentication name will be performed.  This request is a
> write transaction and the ubik client will attempt to contact every ubik
> server in order until the coordinator is determined.
>
> Jeffrey Altman
>
Andi


Re: [OpenAFS] fs newcell / clients CellServDB / adding new db server

2018-06-15 Thread Jeffrey Altman
On 6/15/2018 9:52 AM, Andreas Ladanyi wrote:
> OK. So changing the CellServDB on the db servers, running bos restart,
> AND copying the new CellServDB to the clients all has to be done within
> a very short time to minimize timeouts for users, because the db servers
> have to be in sync, a ubik coordinator has to be elected, and the AFS
> clients that have the new CellServDB will ask the new db server (lowest
> IP, ubik coordinator) first.

The ubik clients do not rank servers based upon IP address.  What they
do is:

1. compute the length of the ordered server list

  A B C D

2. then generate a random number from 0..

3. use that number as an index into the list to decide which is first

4. and reorder the list as if it were a circular queue.  So if the
random number selected was 2, then the list would become

  C D A B
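
The reordering in steps 1-4 can be sketched as follows. This is only an
illustration of the circular-queue idea, not the OpenAFS implementation;
the function name is hypothetical, and I am assuming the random index is
drawn from 0 to (list length - 1).

```python
# Sketch only: rotate an ordered server list to start at a random index.
# Assumption: the index is chosen uniformly from 0 .. len(servers) - 1.
import random


def rotate_server_list(servers, start=None, rng=random):
    """Return the servers reordered as a circular queue starting at `start`.

    If `start` is None, a random index is drawn from `rng`.
    """
    servers = list(servers)
    if not servers:
        return []
    if start is None:
        start = rng.randrange(len(servers))
    return servers[start:] + servers[:start]


if __name__ == "__main__":
    # With index 2 the list A B C D becomes C D A B, as in the example above.
    print(rotate_server_list(["A", "B", "C", "D"], start=2))
```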

The only time the coordinator must be contacted is for a write
transaction.  All read transactions are processed by the first server
contacted.

My conclusion is that there is something about your cell configuration
that results in a write transaction for each token requested.  For example:

 1. cell name:  example.com

 2. One of the following is true:

a. realm name:  AD.EXAMPLE.COM

b. CellServDB's zeroth ubik server host domain:

subnet.example.com

 3. auto-registration of foreign PTS IDs enabled:

a. pam_afs_session configuration doesn't disable it

b. aklog executed without -noprdb

If the "realm of cell" guessing algorithm decides that the current login
is likely to be a foreign cell login, then an attempt to allocate a PTS
ID for the authentication name will be performed.  This request is a
write transaction and the ubik client will attempt to contact every ubik
server in order until the coordinator is determined.
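
A toy model of that search, to show why an unreachable server in the list
can add a delay to every write transaction. It performs no real RPCs; the
function name is hypothetical, and the 15 second timeout is only the
figure reported earlier in this thread, not a documented constant.

```python
# Sketch only: walk a server list in order until the coordinator answers.
# No network traffic is involved; offline servers simply "cost" one timeout.


def find_coordinator(order, coordinator, offline=(), timeout=15.0):
    """Return (found, contacted, seconds_waited) for a simulated search.

    `order` is the client's (already rotated) server list, `coordinator`
    is the server currently holding the writable database, and `offline`
    lists servers that do not answer.
    """
    contacted = []
    waited = 0.0
    for server in order:
        contacted.append(server)
        if server in offline:
            waited += timeout  # no answer: the client waits, then moves on
            continue
        if server == coordinator:
            return True, contacted, waited
    # The coordinator is missing from the list or unreachable: a failure,
    # not merely a delay.
    return False, contacted, waited


if __name__ == "__main__":
    print(find_coordinator(["C", "D", "A", "B"], coordinator="A", offline={"D"}))
```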

Jeffrey Altman



Re: [OpenAFS] fs newcell / clients CellServDB / adding new db server

2018-06-15 Thread Andreas Ladanyi
Hi Jeffrey,
>>> I understand that a change to CellServDB on the client has no effect
>>> until reboot.
>> The OpenAFS unix cache manager populates the list of location servers
>> (vlservers) at startup.  The loaded server list can be adjusted via the
>> "fs newcell" command at runtime.
>>
>> This behavior is specific to the OpenAFS unix cache manager.
>>
>> It does not apply to other cache managers nor does it apply to command
>> line tools such as aklog, vos, pts, etc.  Nor does it apply to PAM modules.
OK. So changing the CellServDB on the db servers, running bos restart,
AND copying the new CellServDB to the clients all has to be done within
a very short time to minimize timeouts for users, because the db servers
have to be in sync, a ubik coordinator has to be elected, and the AFS
clients that have the new CellServDB will ask the new db server (lowest
IP, ubik coordinator) first.

>>
regards,
Andi


Re: [OpenAFS] fs newcell / clients CellServDB / adding new db server

2018-06-13 Thread Jeffrey Altman
On 6/13/2018 11:35 AM, Dirk Heinrichs wrote:
> On 13.06.2018 at 14:06, Andreas Ladanyi wrote:
> 
>> I understand that a change to CellServDB on the client has no effect
>> until reboot.
> 
> Hmm, is this also true when using DNS SRV records instead of CellServDB?

For OpenAFS, any server lists provided by the client's CellServDB file
take precedence over DNS SRV records.

If DNS SRV records are in use, the client CellServDB file should not
list any servers.

One of the benefits of DNS SRV records is that the DNS SRV record TTL
value is used to determine the validity period for the server list by
the cache manager.

In this way, clients automatically update their server list information
and administrators can control how frequently the server lists are updated.
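
The precedence and TTL behaviour described above can be sketched as below.
This is an illustration only, not the cache manager's code: the class name
is hypothetical, and the SRV lookup is passed in as a plain callable so
the example does not depend on any particular DNS library.

```python
# Sketch only: static CellServDB entries win; otherwise SRV results are
# cached and refreshed once their TTL has expired.
import time


class CellServerList:
    def __init__(self, cellservdb_hosts, srv_lookup, clock=time.monotonic):
        self._static = list(cellservdb_hosts)
        self._srv_lookup = srv_lookup  # callable returning (hosts, ttl_seconds)
        self._clock = clock
        self._cached = []
        self._expires = None  # None means nothing has been looked up yet

    def servers(self):
        """Return the current server list for the cell."""
        if self._static:
            # Servers listed in CellServDB take precedence over DNS.
            return list(self._static)
        now = self._clock()
        if self._expires is None or now >= self._expires:
            hosts, ttl = self._srv_lookup()
            self._cached = list(hosts)
            self._expires = now + ttl
        return list(self._cached)


if __name__ == "__main__":
    def fake_lookup():
        # Placeholder for a real DNS SRV query.
        return ["afsdb1.example.com", "afsdb2.example.com"], 300

    print(CellServerList([], fake_lookup).servers())
```

A longer TTL means fewer lookups but slower pickup of a changed server
list; a shorter TTL trades the other way.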

Jeffrey Altman


Re: [OpenAFS] fs newcell / clients CellServDB / adding new db server

2018-06-13 Thread Dirk Heinrichs
On 13.06.2018 at 14:06, Andreas Ladanyi wrote:

> I understand that a change to CellServDB on the client has no effect
> until reboot.

Hmm, is this also true when using DNS SRV records instead of CellServDB?

Bye...

    Dirk

-- 
Dirk Heinrichs 
GPG Public Key: D01B367761B0F7CE6E6D81AAD5A2E54246986015
Secure Internet communication: http://www.retroshare.org
Privacy Handbuch: https://www.privacy-handbuch.de



