Re: [OpenAFS] Changing host- and domainname

2024-01-21 Thread Jeffrey E Altman

Reading the 1.6.24 code more carefully, these messages

>Sat Jan 20 16:47:37 2024 ubik: primary address 192.168.1.43 does not 
exist

>Sat Jan 20 16:47:37 2024 ubik: No network addresses found, aborting..

are produced from the following actions.

1. 192.168.1.43 is the result of evaluating the local hostname and 
resolving that name.


2. The "primary address does not exist" message is the result of 192.168.1.43 not 
being present in the list of addresses returned from parseNetFiles().   
parseNetFiles() filters the list of addresses returned by 
rx_getAllAddr() by the contents of the NetRestrict file and adds the 
addresses included in the NetInfo file.


3. The "No network addresses found" message is the result of rx_getAllAddr() 
returning either an error or an empty address list.  If the address list 
is empty, it means that all of the host's addresses were either 
classified as loopback or as something other than an IPv4 address.


The failure occurs before the contents of the CellServDB file are examined.

What files are present in the /var/lib/openafs/local/ directory?   The 
NetInfo and NetRestrict files will be located there on Debian instead of 
/etc/openafs/server/.
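A quick way to check, as a hedged sketch (paths follow the Debian layout 
mentioned above; adjust if your install differs):

   ls -l /var/lib/openafs/local/
   # If present, NetInfo and NetRestrict list one IPv4 address per line; a
   # stale entry here can filter out every usable address and trigger the
   # "No network addresses" abort described above.
   cat /var/lib/openafs/local/NetInfo /var/lib/openafs/local/NetRestrict 2>/dev/null
   # Compare against the addresses the host actually owns:
   ip -4 addr show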







Re: [OpenAFS] Changing host- and domainname

2024-01-20 Thread Jeffrey E Altman

On 1/20/2024 3:49 PM, Sebix wrote:

Hi,

On 1/20/24 21:46, Jeffrey E Altman wrote:


On 1/20/2024 3:32 PM, Sebix wrote:
We already replaced the IP address in /etc/openafs/CellServDB and 
restarted the server.



Did you update /etc/openafs/server/CellServDB as well?


yes, the two files are identical.


Do you have NetInfo and/or NetRestrict files in /etc/openafs/server/?

Does the output of "ip addr" or "ifconfig -a" list the address 192.168.1.43?

The error is being generated from verifyInterfaceAddress() in 
src/ubik/beacon.c.








Re: [OpenAFS] Changing host- and domainname

2024-01-20 Thread Jeffrey E Altman

On 1/20/2024 3:49 PM, Sebix wrote:

Hi,

On 1/20/24 21:46, Jeffrey E Altman wrote:


On 1/20/2024 3:32 PM, Sebix wrote:
We already replaced the IP address in /etc/openafs/CellServDB and 
restarted the server.



Did you update /etc/openafs/server/CellServDB as well?


yes, the two files are identical.


What version of OpenAFS?




Re: [OpenAFS] Changing host- and domainname

2024-01-20 Thread Jeffrey E Altman

On 1/20/2024 3:32 PM, Sebix wrote:
We already replaced the IP address in /etc/openafs/CellServDB and 
restarted the server.



Did you update /etc/openafs/server/CellServDB as well?






Re: [OpenAFS] 1.8.10 in ppa:openafs/stable for Ubuntu 22.04 (kernel 6.2)?

2023-08-03 Thread Jeffrey E Altman

On 8/3/2023 9:04 AM, Jan Henrik Sylvester wrote:

... there are now Ubuntu LTS systems without AFS.


Jan,

As a reminder, Ubuntu 22.04 LTS systems include the Linux kernel afs 
file system (kafs).  As kafs is built as part of the kernel it is always 
up-to-date.


To use kafs:

1. apt-get install kafs-client
2. systemctl start afs.mount
3. acquire tokens using aklog-kafs (or install kafs-compat to rename 
   aklog-kafs to aklog)
4. To enable afs.mount at boot, systemctl enable afs.mount
5. Read "man kafs"
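A minimal command sketch of the steps above, assuming Ubuntu 22.04 and an 
example cell and realm name:

   sudo apt-get install kafs-client     # add kafs-compat if you want an "aklog" alias
   sudo systemctl start afs.mount       # mount /afs via the in-kernel client
   sudo systemctl enable afs.mount      # start at boot
   kinit user@EXAMPLE.NET               # obtain a Kerberos TGT
   aklog-kafs example.net               # acquire an AFS token for the cell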

Even if you prefer OpenAFS, kafs is available to access /afs until 
updated OpenAFS packages are available.


Jeffrey Altman





Re: [OpenAFS] Re: openafs versus systemd

2023-06-28 Thread Jeffrey E Altman

On 6/28/2023 10:18 AM, Jan Henrik Sylvester wrote:

On 6/28/23 15:02, Jeffrey E Altman wrote:

On 6/28/2023 3:54 AM, Jan Henrik Sylvester wrote:

On 6/9/23 13:38, Jan Henrik Sylvester wrote:
- you cannot use snap packages with a home directory outside /home: 
use ppa:mozillateam/ppa for Firefox and Google Chrome instead of 
Chromium


Correction: This does not seem to be true anymore.

snap set system homedirs=/afs/math.uni-hamburg.de/users

works for Ubuntu 22.04.

The Firefox snap does start with this setting. We have very limited 
experience with this setting. Kerberos authentication does not work 
in Firefox snap, which is a known problem (independent of AFS).



What credential cache type is in use?

The underlying issues are the same as for PAGs.  The assumption is 
that a 'uid' represents all of the authorization credentials 
associated with the user.   If the Kerberos credential cache is using 
a session keyring or something that is not global to the 'uid', then 
there will be no Kerberos TGT available to snap.


Maybe I was not clear enough. Accessing the home directories from 
Firefox is not the issue. Kerberized http is the issue:


You were clear.   I am suggesting that you use a Kerberos credential 
cache that is tied to the uid, for example a keyring with user scope 
instead of session scope.
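As an illustrative sketch only (assuming MIT Kerberos; the exact cache 
type is a site choice), a uid-scoped cache can be selected in krb5.conf:

[libdefaults]
        # persistent keyring scoped to the uid rather than the login session
        default_ccache_name = KEYRING:persistent:%{uid}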


Jeffrey Altman






Re: [OpenAFS] Re: openafs versus systemd

2023-06-28 Thread Jeffrey E Altman

On 6/28/2023 3:54 AM, Jan Henrik Sylvester wrote:

On 6/9/23 13:38, Jan Henrik Sylvester wrote:
- you cannot use snap packages with a home directory outside /home: 
use ppa:mozillateam/ppa for Firefox and Google Chrome instead of 
Chromium


Correction: This does not seem to be true anymore.

snap set system homedirs=/afs/math.uni-hamburg.de/users

works for Ubuntu 22.04.

The Firefox snap does start with this setting. We have very limited 
experience with this setting. Kerberos authentication does not work in 
Firefox snap, which is a known problem (independent of AFS).



What credential cache type is in use?

The underlying issues are the same as for PAGs.  The assumption is that 
a 'uid' represents all of the authorization credentials associated with 
the user.   If the Kerberos credential cache is using a session keyring 
or something that is not global to the 'uid', then there will be no 
Kerberos TGT available to snap.


Jeffrey Altman






Re: [OpenAFS] 2023 AFS Technologies Workshop - virtual

2023-06-16 Thread Jeffrey E Altman

On 6/16/2023 6:40 AM, Giovanni Bracco wrote:
Dear Tracy, thank you for all the work you have done for this very 
interesting workshop!


What about slides and recordings?


As announced at the end of the workshop, the slides and recordings are 
available via the Zoom Event Lobby to all attendees until 30 Sept 2023.


I am unaware of the plans for them after that date.

Speakers that have not yet uploaded their slides to their session can 
continue to do so by using the link in the "Edit Session" invitation 
email received before the conference began.


Jeffrey Altman






[OpenAFS] Re: openafs versus systemd

2023-06-07 Thread Jeffrey E Altman

On 6/7/2023 5:48 PM, Chad William Seys wrote:

Hi all,
  I've been trying to find out how to disable PAG, but am having a google 
fail.  Anyone have pointers?


Thanks!
Chad.

A PAG is something that must be created using pagsh or via a side effect 
of a pam module.  If you are using pam_afs_session, it defaults to 
creating a PAG.  To disable that behavior, use the "nopag" option.
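For instance, a hedged sketch assuming pam_afs_session and a Debian-style 
PAM layout (the file name varies by distribution):

   # /etc/pam.d/common-session -- obtain tokens at login without creating a PAG
   session  optional  pam_afs_session.so  nopag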


Jeffrey Altman






Re: [OpenAFS] OpenAFS access at login time on MacOS

2023-05-13 Thread Jeffrey E Altman

On 5/13/2023 11:44 AM, Jeffrey E Altman (jalt...@auristor.com) wrote:

On 5/11/2023 6:20 AM, Richard Feltstykket (rich...@unixboxen.net) wrote:

Hello Everyone,

Perhaps it is widely known already, but I just wanted to share a 
process that I have worked out to get a kerberos ticket and an afs 
token at login time on MacOS.  It seems to work fine for MacOS 
Ventura and Monterey;  I have not tested on other versions.

Thanks for posting.


My cell takes FOREVER to log in for some reason, but after aklog 
completes in the background, I have a token and can access volumes in 
the cell.


Negative DNS lookups impose an unnecessary time delay.

Assuming the name of your domain example.net is also the name of your 
cell and Kerberos realm (in upper case), and assuming the following 
hostnames for your kdc and afsdb servers


kdc1.example.net

afsdb1.example.net

create the following DNS entries

_kerberos.example.net.  IN TXT   "EXAMPLE.NET"

_kerberos._afs.example.net. IN TXT "EXAMPLE.NET"

_kerberos._tcp.example.net. IN SRV 10 0 88 kdc1.example.net.

_kerberos._udp.example.net. IN SRV 10 0 88 kdc1.example.net.

_kerberos._http.example.net. IN SRV 0  0  0 .

_kerberos._kkdcp.example.net. IN SRV 0 0 0 .

_afs3-vlserver._udp.example.net. IN SRV 10 0 7003 afsdb1.example.net.

_afs3-prserver._udp.example.net. IN SRV 10 0 7002 afsdb1.example.net.

If you are using the AFS backup service:

_afs3-budbserver._udp.example.net. IN SRV 10 0 7021
afsdb1.example.net.

If you are not using the AFS backup service:

_afs3-budbserver._udp.example.net. IN SRV 0 0 0 .

If there is more than one KDC or AFSDB server, then create one 
_kerberos* SRV record for each KDC and one _afs3-* entry for each 
AFSDB server.


Note that the hostname specified in a SRV record must not be a CNAME; 
it must resolve to A or AAAA records.    For the _afs3-* SRV records for an 
OpenAFS cell which does not support IPv6 the specified hostname should 
not have an AAAA record.   The AuriStorFS cache managers and Linux 
kernel afs (kafs) clients will attempt to contact the location servers 
via IPv6 if there is an AAAA record specified.


A SRV record whose hostname is "." indicates that the service is 
unavailable.


The AuriStorFS aklog will attempt to acquire both yfs-rxgk tokens and 
rxkad_k5 tokens.   An OpenAFS cell does not support yfs-rxgk but aklog 
doesn't know that until it is explicitly told by the Kerberos realm 
that there is no yfs-rxgk/_afs.unixboxen@unixboxen.net service 
principal.   This requires that GSS-KRB5 be able to quickly resolve 
the Kerberos realm for the name "_afs.unixboxen.net".   The SRV record 
specified above for _kerberos._afs.unixboxen.net is intended to speed 
up the resolution of the hostname to realm mapping if the client is 
configured to do so.



One thing I forgot to mention.

The service principal for yfs-rxgk is 
yfs-rxgk/_afs.example@example.net instead of 
afs/example@example.net as is used for rxkad_k5.   The reason that 
_afs.example.net is used is because of how GSS-API Kerberos v5 
implementations resolve the Kerberos realm of a service where the second 
component is a hostname.   GSS-API will fallback to using the DNS domain 
of the hostname as the realm if there is no other information 
available.   However, many implementations including macOS and MIT will 
try to validate the second component as a valid DNS hostname as part of 
the lookup process.   Therefore it issues DNS A and AAAA queries for 
"_afs.example.net" even though a DNS hostname is not permitted to begin 
with an underscore.  In hindsight specifying the service principal in 
https://datatracker.ietf.org/doc/html/draft-wilkinson-afs3-rxgk-afs with 
an underscore based hostname was a poor idea.   That said, DNS resolvers 
and most Kerberos libraries do not perform validation on the query 
string and most DNS servers will happily respond to the out of 
specification request if there is an entry present.  I therefore suggest 
creating DNS A and AAAA records for _afs.example.net to avoid the 
negative lookup.   The address doesn't matter since the DNS response 
will not be used to contact any host.   Specifying one of the location 
servers is reasonable.
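For example, hedged entries along these lines (the addresses are 
placeholders; any address works since it is never used to contact a host):

   _afs.example.net.  IN A     192.0.2.23
   _afs.example.net.  IN AAAA  2001:db8::23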


For rxkad_k5 tokens the resolution of which Kerberos realm to use is 
performed by enumerating the hostnames of the location servers, 
performing an A/AAAA DNS query to obtain the IP addresses, then 
performing a PTR record lookup on the IP addresses.  For example


afsdb1.example.net  A  ->  192.0.2.23

192.0.2.23 PTR -> host.example.net

_kerberos.host.example.net TXT -> "EXAMPLE.NET"

_kerberos.example.net TXT -> "EXAMPLE.NET"  (queried if the
_kerberos.host.example.net entry is not present)

If there is more than one location service address, then the one that 
is used for resolution of the Kerberos realm can appear to be random 
because whichever is first in the list will be used.

Re: [OpenAFS] OpenAFS access at login time on MacOS

2023-05-13 Thread Jeffrey E Altman

On 5/11/2023 6:20 AM, Richard Feltstykket (rich...@unixboxen.net) wrote:

Hello Everyone,

Perhaps it is widely known already, but I just wanted to share a 
process that I have worked out to get a kerberos ticket and an afs 
token at login time on MacOS.  It seems to work fine for MacOS Ventura 
and Monterey;  I have not tested on other versions.

Thanks for posting.


My cell takes FOREVER to log in for some reason, but after aklog 
completes in the background, I have a token and can access volumes in 
the cell.


Negative DNS lookups impose an unnecessary time delay.

Assuming the name of your domain example.net is also the name of your 
cell and Kerberos realm (in upper case), and assuming the following 
hostnames for your kdc and afsdb servers


   kdc1.example.net

   afsdb1.example.net

create the following DNS entries

   _kerberos.example.net.  IN TXT   "EXAMPLE.NET"

   _kerberos._afs.example.net. IN TXT "EXAMPLE.NET"

   _kerberos._tcp.example.net. IN SRV 10 0 88 kdc1.example.net.

   _kerberos._udp.example.net. IN SRV 10 0 88 kdc1.example.net.

   _kerberos._http.example.net. IN SRV 0  0  0 .

   _kerberos._kkdcp.example.net. IN SRV 0 0 0 .

   _afs3-vlserver._udp.example.net. IN SRV 10 0 7003 afsdb1.example.net.

   _afs3-prserver._udp.example.net. IN SRV 10 0 7002 afsdb1.example.net.

If you are using the AFS backup service:

   _afs3-budbserver._udp.example.net. IN SRV 10 0 7021 afsdb1.example.net.

If you are not using the AFS backup service:

   _afs3-budbserver._udp.example.net. IN SRV 0 0 0 .

If there is more than one KDC or AFSDB server, then create one 
_kerberos* SRV record for each KDC and one _afs3-* entry for each AFSDB 
server.


Note that the hostname specified in a SRV record must not be a CNAME; it 
must resolve to A or AAAA records.    For the _afs3-* SRV records for an OpenAFS 
cell which does not support IPv6 the specified hostname should not have 
an AAAA record.   The AuriStorFS cache managers and Linux kernel afs 
(kafs) clients will attempt to contact the location servers via IPv6 if 
there is an AAAA record specified.


A SRV record whose hostname is "." indicates that the service is 
unavailable.


The AuriStorFS aklog will attempt to acquire both yfs-rxgk tokens and 
rxkad_k5 tokens.   An OpenAFS cell does not support yfs-rxgk but aklog 
doesn't know that until it is explicitly told by the Kerberos realm that 
there is no yfs-rxgk/_afs.unixboxen@unixboxen.net service principal. 
This requires that GSS-KRB5 be able to quickly resolve the Kerberos 
realm for the name "_afs.unixboxen.net".   The SRV record specified 
above for _kerberos._afs.unixboxen.net is intended to speed up the 
resolution of the hostname to realm mapping if the client is configured 
to do so.


For rxkad_k5 tokens the resolution of which Kerberos realm to use is 
performed by enumerating the hostnames of the location servers, 
performing an A/AAAA DNS query to obtain the IP addresses, then 
performing a PTR record lookup on the IP addresses.  For example


   afsdb1.example.net  A  ->  192.0.2.23

   192.0.2.23 PTR -> host.example.net

   _kerberos.host.example.net TXT -> "EXAMPLE.NET"

   _kerberos.example.net TXT -> "EXAMPLE.NET"  (queried if the
   _kerberos.host.example.net entry is not present)

If there is more than one location service address, then the one that 
is used for resolution of the Kerberos realm can appear to be random 
because whichever is first in the list will be used.


Issuing a "kinit u...@example.net" against your realm took a little more 
than six seconds to perform the DNS lookups for the kdc on a macOS 
Ventura 13.4 system.  It then took approximately 180ms to receive the 
expected principal unknown response to the AS-REQ.    I cannot measure 
the time to perform the aklog operations because I cannot obtain a TGT 
to test with.


The time for the AuriStorFS v2021.05-28 cache manager on macOS 13.4 to 
"ls -l /afs/example.net" anonymously was


 * 470ms to resolve the location service via DNS (3 RPCs)
 * 330ms to resolve the location of the "root.cell" volume  (2 RPCs +
   reachability test)
 * 850ms for the fileserver response to the first RPC including the
   fileserver->client callback service TellMeAboutYourself queries (3
   RPCs + reachability tests)
 * 600ms to read the contents of the root directory and obtain status
   info for each entry (3 RPCs)

The ICMP ping rtt from my test system to the location server averages 
115ms.


If the vlserver and fileserver connections were authenticated using 
rxkad or yfs-rxgk the PING|PING_RESPONSE reachability test for each RX 
connection would be replaced by a CHALLENGE|RESPONSE exchange.   If the 
cache manager to fileserver connection was authenticated using yfs-rxgk, 
then the fileserver TellMeAboutYourself query to the cache manager would 
not be performed.


I suspect you can reduce some of the time by adding the DNS records that 
are not present in your domain.   You can observe the DNS, Kerberos and 
AFS queries using 

Re: [OpenAFS] More Kerberos + Windows issues

2023-05-03 Thread Jeffrey E Altman

On 5/3/2023 11:45 AM, Ben Huntsman (b...@huntsmans.net) wrote:

Setting tokens. adUser @ mydomain.com
aklog: a pioctl failed while setting tokens for cell mydomain.com



A pioctl failure usually means that no cache manager is running.





Re: [OpenAFS] NoAuth not working?

2023-05-02 Thread Jeffrey E Altman

On 5/2/2023 4:42 PM, Ben Huntsman (b...@huntsmans.net) wrote:

Hi Jeffrey-
   Thank you for the quick reply!  If I understand you correctly, that 
essentially means that there's no way to access the /afs filespace 
without setting up some sort of authentication infrastructure, even on 
an "emergency" basis.


   Thank you!

-Ben



Did you try adding "anonymous" to the "system:administrators" group?







Re: [OpenAFS] NoAuth not working?

2023-05-02 Thread Jeffrey E Altman

On 5/2/2023 12:32 PM, Ben Huntsman (b...@huntsmans.net) wrote:

Hi there!
   I'm trying to test a few things without having all the kerberos and 
auth stuff in place.  I run the following command:


bos setauth  off

   I'm using Transarc paths, so this creates the NoAuth file in 
/usr/afs/local.  bosserver is running with -noauth.  I am logged in as 
a user who is listed in UserList.


The NoAuth file only applies to services that rely upon the UserList for 
authorization (bosserver, vlserver and volserver) or that have an 
explicit check (ptserver).  It does not include services that have an 
ACL-based model such as the fileserver.   The ptserver only checks 
at startup, so the service needs to be restarted after the NoAuth file is 
created.



However, I still can't run fs setacl commands, nor even do an ls of 
/afs.  I get various messages such as:


fs: You don't have the required access rights on '/afs'
ls: /afs: The file access permissions do not allow the specified action.


Correct because the authorization decisions are made based upon the 
authenticated identity and the contents of the applicable ACL.



The NoAuth(5) man page is incorrect when it implies that all AFS server 
processes running on the machine look for it.




   Do I have to do something else to get afsd to skip permissions checks?

I have not tried it, but after restarting the ptserver with NoAuth in 
place you might try adding "anonymous" to the "system:administrators" group.


   Again, this is just for testing.  But it appears that the NoAuth 
file is not honored.


Thank you!

-Ben


Anytime.


Jeffrey Altman






[OpenAFS] AFS Tech Workshop June 12-14 - Site Report Submission Deadline tomorrow April 18

2023-04-17 Thread Jeffrey E Altman

Dear Community,

This year's AFS Technologies Workshop is scheduled for Monday June 12th 
to Wednesday June 14th and will be held as a virtual conference 
beginning each day at 9:30am EDT (UTC-4) and ending at 3pm EDT (UTC-4).


The deadline for the call for presentations which includes Site Reports 
is tomorrow April 18th.


https://workshop.openafs.org/afsbpw23/cfp/

If you are considering attending the workshop please submit a request to 
present a Site Report. Not only will your active participation via a 
site report increase the value of the workshop for all attendees, but, 
as noted on the conference web site:


   The workshop registration fee will be refunded for speakers.

   Areas of interest include, but are not limited to:

 * Site Reports
 * Best Practices and Lessons Learned
 * Novel use cases for AFS
 * Tools and technology integration
 * Performance, networking, security
 * Testing and test automation
 * Or anything else AFS related you would like to discuss or showcase

Come share with the community how your organization uses AuriStorFS, 
OpenAFS or any other AFS implementation with your peers.  Share what 
excites you, what bores you, and even what worries you.


A site report does not need to be complicated or even particularly 
long.   Consider it as an ice breaker for conversations that might take 
place in the virtual hallway between sessions.  Or to educate other 
speakers about what questions you would like to have answered during the 
rest of the workshop.


A ten minute site report can be as simple as answering the following 
questions about yourself, your organization and your deployments:


 * who are you?
 * how long have you administered AuriStorFS or AFS cells?
 * who is your organization?
 * where is your organization located?
 * what is the core mission of your organization?
 * how does /afs contribute to satisfying that mission?
 * how many cells?
 * how many fileservers and database servers in each?
 * is the cell's infrastructure in one location or spread across
   multiple data centers and/or clouds?
 * how old is the cell?  (Creation date of "root.afs" and/or "root.cell")
 * how many volumes?
 * what is the largest volume by number of vnodes or total storage?
 * how many clients?
 * what kind of clients?  personal laptops, compute nodes, web
   services, IoT, other?
 * what are the primary use cases?
 * how are servers deployed?  bare metal?  virtual machines?  containers?
 * how are clients deployed? bare metal? virtual machines? kubernetes /
   openshift? other?
 * what do you use for configuration management?
 * what do you use for disaster recovery?
 * what Kerberos implementations are used for authentication?
 * are the cell(s) open to the public Internet or locked behind firewalls?
 * what are the greatest benefits of /afs to your organization?
 * what are the greatest fears?
 * what do you wish to learn by attending this workshop?

I realize the deadline is fast approaching but please consider 
submitting a Site Report request.  All that is required by the deadline 
is an abstract, not the fully written presentation.


If you are interested in presenting something more substantial than a 
site report either alone or in conjunction with someone else, please 
draft an abstract and submit the idea.


Thank you for your consideration of my request.

Jeffrey Altman

P.S. I am not a member of the workshop organizing committee.






Re: [OpenAFS] RFC: Altering the processing of IPv4 (aka Host) ACLs to enforce negative rights

2023-03-29 Thread Jeffrey E Altman

On 3/20/2023 4:21 PM, Jeffrey E Altman (jalt...@auristor.com) wrote:


Proposal:

I propose that OpenAFS treat the current behavior as a bug.  The use 
of negative rights is discouraged because they are hard to analyze.  
It is hoped that their use is rare.  If negative rights are not in 
use, then changing the behavior when IP ACLs exist will not alter the 
computed outcome.  However, if negative rights are in use, they are 
likely being used because it wasn't easy to limit the access any other 
way.  In which case, granting more access then was specified is 
problematic.   A CVE can be published to document the existing 
behavior and the behavior as it will appear beginning with a specific 
version of the fileserver.


If required, a configuration option can be provided to enable the AFS 
3.2 behavior until all of the fileservers within a cell have been 
updated.   I discourage using a configuration option to enable the 
stricter interpretation of ACLs as that will result in some sites 
being vulnerable when they did not intend to be.



Just this week a site privately discussed a desire to enforce readonly 
access to all machines in a particular subnet. Likewise, a site might 
want to enforce readonly access on all machines outside of approved 
subnets.  This is simply not possible to do without altering the current 
behavior to enforce negative ACLs.


OpenAFS Gerrit 13926 https://gerrit.openafs.org/#/c/13926/ provides for 
an "afsd readonly" option to enforce readonly behavior but use of client 
side configuration cannot be enforced by the fileserver.


Jeffrey Altman





Re: [OpenAFS] Advice on using BTRFS for vicep partitions on Linux

2023-03-22 Thread Jeffrey E Altman

On 3/22/2023 3:47 PM, spacefrogg-open...@spacefrogg.net wrote:

OpenAFS does not maintain checksums.  Checksums are neither transmitted in
the RXAFS_FetchData and RXAFS_StoreData RPC messages nor are checksums
stored and compared when reading and writing to the vice partition.

Thanks for clearing this up. So, volume inconsistencies are just detected on 
the metadata level?


The salvager can update Volume metadata to be consistent with what is 
stored in the vice partition, it can attach orphaned vnodes, it can 
remove directory entries for missing vnodes, it can correct vnode size 
information,  it can rebuild directories, and a few other things.   
However, there is no ability to repair a damaged block or restore a 
missing vnode.


btrfs and zfs have benefits in being able to recover from damaged disk 
sectors by maintaining multiple copies of each data block (if that 
functionality is configured).   However, they each have downsides as 
well.   Both require vast amounts of memory compared to other file 
systems.   Both have very inconsistent performance characteristics, 
especially as free space falls below 40% and as memory pressure 
increases.   Both can fail under memory pressure or when their storage 
volumes are full.


AuriStor recommends that AuriStorFS vice partitions be deployed using 
xfs.   There are several AuriStorFS end users that build xfs filesystems 
on zfs-exported block devices when zfs is desired for additional 
reliability.
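As a hedged sketch of that layout (the device name and partition letter 
are placeholders):

   mkfs.xfs /dev/sdb1
   mkdir /vicepa
   mount /dev/sdb1 /vicepa      # plus a matching /etc/fstab entry for reboots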


If zfs is going to be installed on Linux I recommend using a Linux 
distribution that packages zfs to ensure that an incompatible kernel 
update is never issued.  I have observed sites lose vice partitions 
hosted using zfs on RHEL because of subtle kernel incompatibilities.  This 
is a risk with all out-of-tree filesystems that are not tested against 
every released kernel version, but especially with out-of-tree non-GPL 
filesystems.


Jeffrey Altman







Re: [OpenAFS] Advice on using BTRFS for vicep partitions on Linux

2023-03-22 Thread Jeffrey E Altman

On 3/22/2023 9:34 AM, Ciprian Craciun (ciprian.crac...@gmail.com) wrote:

On Wed, Mar 22, 2023 at 10:30 AM  wrote:

OpenAFS implements its own CoW and using CoW below that again has no benefits and 
disturbs the fileservers "free-space" assumptions. It knows when it makes 
in-place updates and does not expect to run out of space in that situation.

At what level does OpenAFS implement CoW?  Is it implemented at
whole-file-level, i.e. changing a file that is part of a replicated /
backup volume it will copy the entire file, or is it implemented at
some range or smaller granularity level (i.e. it will change only that
range, but share the rest)?

OpenAFS performs CoW on whole files within a Volume on the first update 
after a Volume clone is created.  The clone can be a ROVOL, BACKVOL or 
untyped clone.


OpenAFS CoW is limited to sharing a vnode between multiple Volume instances.
Once a vnode can no longer be shared between two or more Volumes, the vnode
is copied.

OpenAFS does not perform CoW at a byte range level.

On Linux btrfs and xfs both support CoW at the block level.   A one-byte 
change to a 1GB file on btrfs will result in one block being copied and 
modified, whereas OpenAFS will copy the entire 1GB.


Unfortunately (at least for my use-case) losing the checksumming and
compression is a no-go, because these were exactly the features that
made BTRFS appealing versus Ext4.

If you say so...
AFS does its own data checksumming.


OpenAFS does not maintain checksums.  Checksums are neither transmitted in
the RXAFS_FetchData and RXAFS_StoreData RPC messages nor are checksums
stored and compared when reading and writing to the vice partition.

Granted, RAID is not a backup solution, but it should instead protect
one from faulty hardware.  Which is exactly what it doesn't do 100%,
because if one of the drive in the array returns corrupted data, the
RAID system can't say which one is it (based purely on the returned
data).  Granted, disks don't just return random data without any other
failure or symptom.

Bit flips occur more frequently than we would like, which was the 
rationale behind adding checksums, multiple copies, and self-healing to ZFS.

With regard to file-system scrubbing, to my knowledge, only those that
actually have checksumming can do this, which currently is either
BTRFS or ZFS.

There are some other examples, but these are the only two regularly 
available as a local Linux filesystem.

Jeffrey Altman






[OpenAFS] RFC: Altering the processing of IPv4 (aka Host) ACLs to enforce negative rights

2023-03-20 Thread Jeffrey E Altman
On 7 March Andrew Deason submitted a patch to OpenAFS documenting the 
existing behavior of the OpenAFS fileserver when computing Anonymous and 
Caller Access Rights if the IPv4 address from which the RXAFS RPC was 
received matches a PTS host entry and that PTS entry matches an Access 
Control Entry (ACE).


https://gerrit.openafs.org/#/c/15340/

Quoting Andrew's submission to the fs_setacl man page:

   "Combining _Negative rights_ granted from machine entries (IP
   addresses) and _Normal rights_ granted from non-machine entries (or
   vice versa) will generally not work as expected. Permissions granted
   by machine entries and by non-machine entries are calculated
   separately, and both sets of permissions are given to an accessing
   user. For example, if permissions are granted to an authenticated
   user or group (or _system:anyuser_), you cannot remove those
   permissions from specific hosts by adding machine entries to a group
   in an ACL in the _Negative rights_ section."

The IBM AFS Administrator's Guide "Protecting Data in AFS" section states:

   "When determining what type of access to grant to a user, theFile
   Server first compiles a set of permissions by examiningall of the
   entries in the Normal rights section of the ACL. Itthen subtracts
   any permissions associated with the user (orwith groups to which the
   user belongs) on the Negative rightssection of the ACL. Therefore,
   negative permissions alwayscancel out normal permissions."

IBM/Transarc AFS 3.2 introduced the granting of permissions based upon 
the host's IPv4 address in addition to those granted to the caller. The 
implementation evaluates the caller's rights independently of the host's 
rights and then ORs the results. This approach violates the statement 
that negative permissions always cancel out normal (aka positive) 
permissions. If a caller is granted "read" but there is a matching 
negative "read" ACE (aka permission) for the host, the negative "read" 
ACE is ignored. Likewise if "lookup" is granted to the host but the 
caller matches a negative "lookup" ACE, then the caller's negative 
"lookup" ACE is ignored.


The problem can be demonstrated with a couple of examples.   First, let's 
define some PTS entities and membership relations (a command sketch for 
creating them follows the list):


 * user: jane = 1000
 o member: system:authuser
 o member: system:anyuser
 o member: no-admin
 * user: 128.66.0.130 = 2000
 o member: local-hosts
 * group: no-admin = -100
 o member: jane
 * group: local-hosts = -500
 o member: 128.66.0.130
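As an illustrative sketch only (the ids come from the example above; the 
directory path is a placeholder), the entities and the ACL used in 
Examples 1 and 2 could be created with something like:

   pts createuser jane -id 1000
   pts createuser 128.66.0.130 -id 2000
   pts creategroup no-admin -id -100
   pts creategroup local-hosts -id -500
   pts adduser jane no-admin
   pts adduser 128.66.0.130 local-hosts
   # Normal rights section
   fs setacl /afs/example.com/dir -acl system:anyuser l system:authuser lrk local-hosts r
   # Negative rights section
   fs setacl /afs/example.com/dir -acl jane r -negative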

Example 1:  RXAFS RPC received from a host that is not a member of 
local-hosts


ACL

 * system:anyuser: l; -none
 * system:authuser: lrk; -none
 * jane: none; -r
 * local-hosts: r; -none

Rights:

 * system:anyuser: lookup
 * system:authuser: lookup, read, lock
 * jane: lookup, lock

When "jane" accesses a file with this ACL the granted rights will be 
"lk" because the negative read permission cancels the positive read 
permission granted by the membership in the system:authuser group.



Example 2: RXAFS RPC received from a host that is a member of local-hosts

ACL

 * system:anyuser: l; -none
 * system:authuser: lrk; -none
 * jane: none; -r
 * local-hosts: r; -none

Rights:

 * system:anyuser: lookup, read
 * system:authuser: lookup, read, lock
 * jane: lookup, read, lock

In this case, even though "jane" is denied the "read" permission granted 
to members of "system:authuser" because of the negative "read" in the 
"jane" ACE, she is granted the permission anyway because of the positive 
read permission granted to "local-hosts" members. The granting of "read" 
permission to "jane" is an unexpected result!



Example 3: RXAFS RPC received from a host that is not a member of 
local-hosts


ACL

 * system:anyuser: l; -none
 * system:authuser: lrk; -none
 * jane: lrkwid; -none
 * local-hosts: none; -wida

Rights:

 * system:anyuser: lookup
 * system:authuser: lookup, read, lock
 * jane: lookup, read, lock, write, insert, delete

In this case, "jane" is granted all of the permissions other than "admin".


Example 4: RXAFS RPC received from a host that is a member of local-hosts

ACL

 * system:anyuser: l; -none
 * system:authuser: lrk; -none
 * jane: lrkwid; -none
 * local-hosts: none; -wida

Rights:

 * system:anyuser: lookup
 * system:authuser: lookup, read, lock
 * jane: lookup, read, lock, write, insert, delete

In this case, "jane" is granted all of the permissions other than 
"admin".   However, because the RPC was issued from a host that is a 
member of "local-hosts" the expected result would be "jane" receiving 
only the "lookup, read, lock" rights.    The granting of "write, insert 
and delete" permission is an unexpected outcome!



In examples 2 and 4 rights are granted to the caller that would appear 
to be contrary to the explicit use of negative rights in the access 
control entries.  The example 4 use case might represent client systems 
which are intended to be read-only consumers of the /afs content 

Re: [OpenAFS] Potential connection loss to CERN AFS cell (retirement of old VLDB servers)

2023-01-26 Thread Jeffrey E Altman

On 1/26/2023 10:18 AM, Diogo Castro (diogo.cas...@cern.ch) wrote:


In the next week, CERN will turn off the last two original AFS CERN 
VLDB servers (or rather, the machines using their IP addresses). For 
reasons related to our network structure and IP allocation, we could 
not keep the old IPs when retiring the current server generation.


AFS clients still using (only) these IPs will no longer be able to 
connect to the CERN AFS cell.


We have attempted to get the central CellServDB updated ahead of this 
change (first with new IPs, then to use (only) DNS for "cern.ch"). 
However, CellServDB is shipped by various distributions, and anyway 
only considered at (Linux) client start.


CERN sent the requests to update the GRAND.CENTRAL.ORG Public Cell 
Service Database [https://grand.central.org/csdb.html] and OpenAFS 
[https://gerrit.openafs.org/#/c/14842/] in November 2021.   I do not 
believe there is anything more that CERN could have done to prevent end 
user inconvenience.


Thank you for trying.


How to check whether a particular AFS client is affected:

$ fs getserverprefs -vlservers | grep -E 'afsdb[0-9]+.cern.ch'

- if the output only mentions afsdb1{1,2,3,4}.cern.ch, the 
configuration is DNS-based and correct - no issues expected.


- if the output mentions a mix of afsdb{1,2}.cern.ch and 
afsdb1{1,2,3}.cern.ch (the current central CellServDB config), this 
client will switch automatically to our new servers, possibly after a 
short hiccup - no major issues expected.


- if the output only has afsdb{1,2}.cern.ch, this client will not be 
able to connect to CERN.CH in the future.


Our recommendation is to use DNS - CellServDB should have an entry for 
the cell but no IPs:


>cern.ch    #European Laboratory for Particle Physics, Geneva

>(next cell info here)


For those that have AuriStorFS clients deployed, graceful transition to 
the afsdb1{1,2,3,4}.cern.ch servers occurred when the afsdb{1,2}.cern.ch 
sites were removed from the published _afs3-vlserver._udp.cern.ch DNS 
SRV record.


Jeffrey Altman





Re: [OpenAFS] Zabbix monitoring AFS health

2022-09-20 Thread Jeffrey E Altman

On 9/20/2022 2:45 PM, Christopher D. Clausen (cclau...@acm.org) wrote:
Back when I ran a cell that people other than me cared about, I had 
implemented various checks from:

https://www.eyrie.org/~eagle/software/afs-monitor/

I do not know anything about Zabbix, but I assume it is possible to 
take these nagios checks and make them work?



Russ' afs-monitor repository has been orphaned.

AuriStor maintains a fork with some changes to existing functionality 
and a new "afs_check_fs_vldb" check which verifies that a fileserver is 
registered with the location service using the correct UUID.


https://github.com/auristor/afs-monitor

Enjoy!

Jeffrey Altman






Re: [OpenAFS] aklog: unknown RPC error (-1765328370) while getting AFS tickets

2022-09-14 Thread Jeffrey E Altman
On 9/14/2022 2:17 PM, Jose M Calhariz (jose.calha...@tecnico.ulisboa.pt) 
wrote:

On Wed, Sep 14, 2022 at 02:00:02PM -0400, Jeffrey E Altman wrote:


If your cell name is "your-cell-name.com" then these need to be

addprinc -randkey -e aes256-cts-hmac-sha1-96 afs/your-cell-name.com
ktadd -k /root/rxkad.keytab afs/your-cell-name.com

The use of "afs@REALM" is ambiguous in environment where there are multiple 
cells authenticated by a single REALM.


Good to know, in my case I am setting up a new Kerberos realm and new
OpenAFS cells just for testing.  This ambiguous afs principal is good
for me, but maybe not enough for other people.

When searching for a service principal, aklog will search for principals 
in this order


1. afs/your-cell-name.com@   referral request sent to the client
   principal's REALM
2. afs/your-cell-name.com@REALM
3. afs@REALM

If afs/your-cell-name.com@REALM does not exist, there will be a negative 
lookup and the cost of the extra round trips.


"afs@REALM" should not be used for a new cell.  That name made sense 
when there was a one-to-one mapping between cell and realm due to the 
existence of "kaserver".


The preference for afs/your-cell-name.com@REALM over afs@REALM has been 
present in OpenAFS since the MIT AFS-Kerberos 5 Migration Kit was merged 
in November 2004.


OpenAFS 1.4.0 was the first release which integrated Kerberos v5 support.

Jeffrey Altman





Re: [OpenAFS] aklog: unknown RPC error (-1765328370) while getting AFS tickets

2022-09-14 Thread Jeffrey E Altman
On 9/14/2022 12:57 PM, Jose M Calhariz 
(jose.calha...@tecnico.ulisboa.pt) wrote:


My updated instructions are:

kadmin.local
addprinc -randkey -e aes256-cts-hmac-sha1-96 afs
ktadd -k /root/rxkad.keytab afs
getprinc afs
quit


If your cell name is "your-cell-name.com" then these need to be

addprinc -randkey -e aes256-cts-hmac-sha1-96 afs/your-cell-name.com
ktadd -k /root/rxkad.keytab afs/your-cell-name.com

The use of "afs@REALM" is ambiguous in environment where there are multiple 
cells authenticated by a single REALM.

Jeffrey Altman






Re: [OpenAFS] aklog: unknown RPC error (-1765328370) while getting AFS tickets

2022-09-12 Thread Jeffrey E Altman
On 9/12/2022 11:49 AM, Jose M Calhariz 
(jose.calha...@tecnico.ulisboa.pt) wrote:
To do the setup of the cell I was following the instructions from 
Debian 9. So I have done:

kadmin.local
addprinc -randkey -e des-cbc-crc:v4 afs
ktadd -k /root/afs.keytab -e des-cbc-crc:v4 afs
getprinc afs
quit


There are a couple of things wrong with these directions.

1. The service principal that should be created is "afs/" not
   "afs".
2. The encryption types that must be added are aes256-cts-hmac-sha1-96
   and rc4-hmac (if you wish to support Windows clients)





Re: [OpenAFS] aklog: unknown RPC error (-1765328370) while getting AFS tickets

2022-09-12 Thread Jeffrey E Altman
On 9/12/2022 10:10 AM, Jose M Calhariz 
(jose.calha...@tecnico.ulisboa.pt) wrote:

Hi,

I have setup a test cell of OpenAFS 1.6.x, Debian 9.  For testing the
upgrade to Debian 11.  When I do the initial setup of the cell and do
the first aklog I get the following error:

aklog: unknown RPC error (-1765328370) while getting AFS tickets

How do I get the meaning of this error?  This error number is not on
Google.  I have just tested the aklog command on the client against
another cell and it worked.  So my problem is the new cell.


The error is Kerberos v5 error KRB5KDC_ERR_ETYPE_NOSUPP, "KDC has no 
support for encryption type".


Is the OpenAFS client version older than 1.6.5?

Prior to 1.6.5 aklog explicitly requested AFS service tickets with a 
DES-CBC-CRC session key.


Alternatively, the AFS service principal for the test cell might have 
been created without an AES key.


Jeffrey Altman








Re: [OpenAFS] OpenAFS with GDM in Ubuntu 22.04 (or 20.04)?

2022-08-28 Thread Jeffrey E Altman

On 8/28/2022 3:14 AM, jukka.tuomi...@finndesign.fi wrote:

Hi all,

I wonder if anybody has OpenAFS client working with GDM in Ubuntu 
22.04 (or 20.04)? That is, allowing users to log into their homedirs 
graphically.


The underlying problem is that GDM heavily relies upon processes 
launched as children of "systemd --user" services.  As a result they do 
not share the same session keyring as the child processes of login.   
The "systemd --user" expectation is that all processes executing as a 
"uid" have access to the same authentication credentials whether they be 
local or remote.  In such an environment, AFS Process Authentication 
Groups (PAGs) cannot be created as a side-effect of login.


Modify the pam configuration to disable PAG creation for GDM logins.

If the expectation is that "sshd" logins should be separate from the 
desktop, then "sshd" logins can continue to create a PAG.
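A hedged sketch, assuming pam_afs_session and Debian/Ubuntu-style PAM file 
names (adjust for the distribution in use):

   # /etc/pam.d/gdm-password -- desktop logins: obtain tokens but do not create a PAG
   session  optional  pam_afs_session.so  nopag

   # /etc/pam.d/sshd -- ssh logins keep the default behavior and get their own PAG
   session  optional  pam_afs_session.so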


Sincerely,

Jeffrey Altman






Re: [OpenAFS] Limiting mount point to known cells

2022-08-27 Thread Jeffrey E Altman

On 8/27/2022 4:34 AM, Harald Barth (h...@kth.se) wrote:


But wait a moment... Can't we assume that all cell names that we
ask in DNS contain at least one dot "." in the middle? I doubt
that there are AFS cells named without dot that we need to
resolve with DNS. What do you think about that?


Please keep in mind that /afs/.git might be a cell whose alias is "git" 
or that "git" is to be combined with a domain in the DNS search list.


I seem to remember seeing many paths of the form /afs/cs/ or /afs/ece/ 
where the full cell names were cs.cmu.edu or ece.cmu.edu.


A question for the original poster is "what are the DNS queries that are 
being issued to the DNS resolver at 127.0.0.53?"


Jeffrey Altman






Re: [OpenAFS] Limiting mount point to known cells

2022-08-26 Thread Jeffrey E Altman

On 8/26/2022 5:13 PM, Ingo van Lil (ing...@gmx.de) wrote:

Hello OpenAFS experts,

is there any way to run an AFS client with both the -dynroot and -afsdb
options, but still limit the /afs mount point to known cells
(specifically: only my home cell)?


There is no explicit support for this behavior in OpenAFS but you might be
able to approximate it by

 * enabling -dynroot
 * disabling -afsdb
 * removing the OpenAFS distributed CellServDB file
 * creating a CellServDB file containing only one line for the cell and no
   servers
>my.cell # My personal cell

A cell entry with no servers is an implicit request to lookup the 
servers via DNS.

I do not remember if this works with -afsdb disabled but it might.



Longer explanation of my problem:

When I run "git status" somewhere inside the AFS hierarchy it freezes
for a minute or two. git tries to access the directory /afs/.git, and I
see that afsd sends multiple DNS requests to the loopback address
127.0.0.53. Not sure why it does that, it seems to be somehow related to
systemd-resolved in Fedora Linux.

Running without -dynroot solves the issue, but according to the manual
it will keep my machine from booting in case my home cell can't be
contacted. Not very attractive.

Running without -afsdb solves the issue. That's what I do now, but it
requires to manually specify the servers for my home cell in CellServDB.
Ideally I'd like to get that info from DNS.

Thanks in advance for any advice you can give!

Regards,
Ingo




Re: [OpenAFS] Kerberos + Windows

2022-08-24 Thread Jeffrey E Altman

On 8/24/2022 12:53 PM, Ben Huntsman (b...@huntsmans.net) wrote:


   Here's some configuration info:

   Let's say my cell is going to be mydomain.com.  My Active Directory 
is ad.mydomain.com, and my AFS service account is srvAFS.


When installing Active Directory for a domain "mydomain.com" it is best 
if the Active Directory domain is "MYDOMAIN.COM" instead of 
"AD.MYDOMAIN.COM".  This is because Kerberos clients will attempt to use 
the DNS name of the host as the Kerberos realm name.   The use of 
"AD.MYDOMAIN.COM" or "WIN.MYDOMAIN.COM" naming is common only in cases 
where there is a pre-existing Kerberos realm for "MYDOMAIN.COM".




Here's my krb5.conf:

[libdefaults]
        default_realm = AD.MYDOMAIN.COM
        default_keytab_name = FILE:/etc/krb5/krb5.keytab
        dns_lookup_realm = true
        dns_lookup_kdc = true
        forwardable = true

[realms]
        AD.MYDOMAIN.COM = {
                kdc = ad.mydomain.com:88
                admin_server = ad.mydomain.com:749
                default_domain = ad.mydomain.com
        }

[domain_realm]
        .ad.mydomain.com = AD.MYDOMAIN.COM
        ad.mydomain.com = AD.MYDOMAIN.COM



You also need to add


    .mydomain.com = AD.MYDOMAIN.COM

    mydomain.com = AD.MYDOMAIN.COM


since the Kerberos realm is not the same as the DNS domain name used for 
the AFS service principal.




[logging]
        kdc = FILE:/var/krb5/log/krb5kdc.log
        admin_server = FILE:/var/krb5/log/kadmin.log
        kadmin_local = FILE:/var/krb5/log/kadmin_local.log
        default = FILE:/var/krb5/log/krb5lib.log


I then created the service account srvAFS, and extracted a keytab on 
the Domain Controller using the following command:


ktpass /princ afs/mydomain@ad.mydomain.com /mapuser srvAFS /mapop 
add /out rxkad.keytab +rndpass /crypto all /ptype KRB5_NT_PRINCIPAL 
+dumpsalt


The use of "afs/mydomain@ad.mydomain.com" is correct. The 
"a...@ad.mydomain.com" service principal name should no longer be used 
and must never be used with Active Directory.





I verified that the account did not have the "Use only Kerberos DES 
encryption types for this account" box checked. I then copied the 
rxkad.keytab over to the UNIX host.  I built OpenAFS with a prefix of 
/opt/openafs, so I put the keytab in /opt/openafs/etc/openafs/server


I used ktutil to delete the two des entries in the keytab. ktutil 
indicates that the KVNO is 5.


I then added the keys to OpenAFS using the command:

asetkey add rxkad_krb5 5 17 
/opt/openafs/etc/openafs/server/rxkad.keytab afs/mydomain.com
asetkey add rxkad_krb5 5 18 
/opt/openafs/etc/openafs/server/rxkad.keytab afs/mydomain.com


For an Active Directory realm you most likely also need to add rc4-hmac, 
enctype 23.
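For example, following the same pattern as the commands above (and subject 
to the principal-name caveat discussed below), a hedged sketch would be:

asetkey add rxkad_krb5 5 23 /opt/openafs/etc/openafs/server/rxkad.keytab afs/mydomain.com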



Did the above "asetkey" commands succeed?   Since the cell is named 
"mydomain.com" I would expect asetkey to expand "afs/mydomain.com" to 
"afs/mydomain@mydomain.com" which is not going to be present in the 
rxkad.keytab file.



What is the output of "asetkey list" after the above commands were executed?



But things aren't quite working:

# ls /afs
afs: Tokens for user of AFS id 204 for cell mydomain.com are discarded 
(rxkad error=19270408, server 192.168.0.114)

ls: /afs: The file access permissions do not allow the specified action.

# kvno adu...@ad.mydomain.com
kvno: Server not found in Kerberos database while getting credentials 
for adu...@ad.mydomain.com

This is not expected to work.


# vos listvol myserver
Could not fetch the list of partitions from the server
rxk: ticket contained unknown key version number
Error in vos listvol command.
rxk: ticket contained unknown key version number


19270408 = rxk: ticket contained unknown key version number


It means the OpenAFS servers are not finding the expected key entry.   
There is not a match for the combination of enctype and key version 
number and name.



# kinit -kt /opt/openafs/etc/openafs/server/rxkad.keytab 
afs/mydomain@ad.mydomain.com


The above command is using the afs/mydomain@ad.mydomain.com keytab 
entry to obtain a client Ticket Granting Ticket.    I doubt that is what 
you intended.



Instead you wanted to "kinit" using a client principal and then execute 
the kvno command below.



# kvno afs/mydomain@ad.mydomain.com
afs/mydomain@ad.mydomain.com: kvno = 5


In addition to the key version number you also need to know the 
encryption type used to encrypt the service private portion of the 
afs/mydomain@ad.mydomain.com service ticket.  It is that encryption 
type (which need not match either the encryption type used to encrypt 
the client private portion of the ticket or the session key) that needs 
to match the keys added via asetkey.



After adding the keys via "asetkey" did you install KeyFileExt on every 
server in the cell?



Did you restart the services or touch the server instance of the 
CellServDB file to force the new keys to be loaded?
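A hedged sketch of those two steps, using the install prefix from earlier 
in this thread; <server> is a placeholder for each server in the cell:

   scp /opt/openafs/etc/openafs/server/KeyFileExt <server>:/opt/openafs/etc/openafs/server/
   ssh <server> touch /opt/openafs/etc/openafs/server/CellServDB   # prompt the new keys to be loaded
   bos restart <server> -all -localauth                            # or restart the services outright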





Did I miss something, or make a mistake 

Re: [OpenAFS] Kerberos + Windows

2022-08-24 Thread Jeffrey E Altman
On 8/23/2022 9:24 PM, Ben Huntsman (b...@huntsmans.net) wrote:
> Hi guys-
>    Does anyone have a working krb5.conf that works with Windows 2012
> R2 or newer?
>
>    The docs do show how to set up using the new scheme but assume
> Kerberos, not AD.  I've tried a few different things but I can't seem
> to get default_tkt_enctypes and default_tks_enctypes set correctly.
>
Ben,


A krb5.conf is configuration for an MIT or Heimdal Kerberos client but
not for a Microsoft Windows Kerberos client.

Please clarify which Kerberos client implementation you are configuring.


I agree with Ken that default_tkt_enctypes and default_tgs_enctypes
should never be configured on clients.


Jeffrey Altman





Re: [OpenAFS] linking afs.ext.64 on AIX fails with missing symbol vprintf

2022-08-13 Thread Jeffrey E Altman

On 8/13/2022 12:20 PM, Ben Huntsman (b...@huntsmans.net) wrote:
Ah, yes, that is what I thought.  The problem is that AIX's kernel 
doesn't have vprintf.  Only printf.  However, the change set you 
linked indicates that previously, osi_Msg used fprintf, and indeed 
that goes all the way back to the beginning.  That's why I wonder how 
it worked on AIX in the past.  With no vprintf in the kernel, what 
alternative should we use here?


The prior change had osi_Msg as a macro not a function.  That means that 
the fprintf reference would have been substituted wherever osi_Msg 
was used, assuming that osi_Msg was defined for AIX at all.


The referenced change was intended to fix a problem specific to Linux.  
You might want to revert the change and see whether or not you make 
further progress.



As an aside, development discussions such as this are not of general 
interest to the end user community and are more appropriate to be held 
on openafs-de...@openafs.org instead.






Re: [OpenAFS] linking afs.ext.64 on AIX fails with missing symbol vprintf

2022-08-13 Thread Jeffrey E Altman
On 8/13/2022 1:57 AM, Ben Huntsman (b...@huntsmans.net) wrote:
> After a few tweaks to some of the source files (which I will submit
> later), I have all the code for afs.ext.64 compiling, but it fails to
> link due to a missing symbol .vprintf.  The AIX man pages show that
> this is included in /lib/libc.a, and nm confirms it.  
>
libc is a userspace library.   The failure is when linking the kernel
module and there is no vprintf in the kernel.


> The reference is from src/rx/rx_kcommon.c:
>
> void
> osi_Msg(const char *fmt, ...)
> {
>     va_list ap;
>     va_start(ap, fmt);
> #if defined(AFS_LINUX_ENV)
>     vprintk(fmt, ap);
> #else
>     vprintf(fmt, ap);
> #endif
>     va_end(ap);
> }
>
>
> Just as another sloppy fix I tried several variants of print functions
> that could substitute on AIX, but they all fail with a missing
> symbol.  How did this work on AIX in the past?
>
The vprintf usage in kernel on AIX was introduced by


  https://gerrit.openafs.org/14791







Re: [OpenAFS] OpenAFS vs IBM AFS

2022-08-12 Thread Jeffrey E Altman

On 8/12/2022 2:01 PM, Ben Huntsman (b...@huntsmans.net) wrote:
   That is about what I thought.  I guess I ask because for those of 
us who work more with AIX than the other platforms, it would be 
interesting and valuable to be able to track the IBM code base as 
well, even if that were kept in a separate repository from OpenAFS.
I'm not sure how tracking the IBM code base is all that helpful. The IBM 
AFS client does not support new RPCs added to OpenAFS nor does it 
include most of the other changes.  In the end it's the developer 
time and access to AIX systems and development tools that are required 
to support an AFS client and server.   If there is an end user community 
for an AIX AFS client, then it would be helpful if that community would 
provide resources to OpenAFS to make it happen.


   I'm also very interested in what it took to clean the code base to 
achieve the 1.0 release.  I know some things were removed such as that 
washtool thing, and the special version of AIX's fsck that is AFS-aware.
The primary changes were to remove code that IBM didn't have permission 
to re-license, comments that referenced customers, or functionality that 
was specific to certain private builds, and any references to 
individuals by name.    There was more but that was the practical work.  
Reviewing a million line code base so legal can sign off on things is a 
lot of work.
But that was a long time ago.  I wonder if times have changed and if 
there would be fewer legal and technical hurdles to releasing some of 
those things?


I doubt it.  All of the original work would need to be repeated.


The AIX AFS-aware fsck would be worthwhile even now.

I disagree.  In OpenAFS any validation of the contents of the Volume 
Group object stores located in the vice partition file systems should be 
performed by the on-demand salvager.   There should be no need to run an 
external tool while the services are shut down.



Jeffrey Altman





Re: [OpenAFS] OpenAFS vs IBM AFS

2022-08-12 Thread Jeffrey E Altman

On 8/12/2022 12:50 PM, Ben Huntsman (b...@huntsmans.net) wrote:

Hi guys-

   So I know IBM released the AFS code to the community at the 
beginning and that is what became OpenAFS.  But from various release 
notes on the IBM site, it would seem that IBM continued (and 
continues) to develop its own AFS internally as well.


   Does anyone know how far the IBM vs OpenAFS code bases have 
diverged?  I know they at least have more AIX ports than the OpenAFS 
code currently does...


IBM released OpenAFS 1.0 on 31 Oct 2000.   That release was a fork from 
IBM AFS 3.6.  The fork itself at this point was substantial.  IBM had to 
clean the code base before it could be released.   The diff stat between 
these releases was not inconsequential.



IBM has continued development of IBM AFS 3.6.  There has been no effort 
to synchronize with OpenAFS.  They are very much independent creatures 
at this point.  Since the openafs-ibm-1_0 release OpenAFS has undergone 
substantial change:



  6127 files changed, 1308387 insertions(+), 567306 deletions(-)



   Does anyone know anyone at IBM that could be asked if IBM would be 
willing to re-contribute it's current codebase?


Yes we know people and they know us.  It wouldn't be worth asking.    
There is simply too much churn to merge code changes.


At best, concepts and features added to IBM AFS 3.6pXXX could be 
re-implemented in OpenAFS.



Jeffrey Altman





Re: [OpenAFS] Question for admins regarding pts membership output

2022-07-15 Thread Jeffrey E Altman
On 7/15/2022 6:18 PM, Richard Brittain (richard.britt...@dartmouth.edu) 
wrote:

On 2022-07-15, 09:04, "Jeffrey E Altman"  wrote:

 On 7/13/2022 6:07 PM, Richard Brittain (richard.britt...@dartmouth.edu)
 wrote:
 > I hope that doesn't lead people to expect 'pts membership 
system:authuser' to show all users.
 >
 > Richard

 I'm curious.  Why would it be wrong for users to expect 'pts membership
 system:authuser' and 'pts membership system:anyuser' to list their
 membership assuming the caller had the necessary access rights?



Only that the output of system:authuser would be confusingly long, and what 
would system:anyuser generate anyway ?.  We also have scripts for 'show me 
everyone who has access to this entity', which gets complicated with nested 
groups, and I couldn't figure out what to display for 'everyone'.  It would be 
valid to ignore named users in the ACL and just say 'everyone' in that case.


What to display for "everyone" is easy: it's "system:anyuser".

The output of system:authuser in OpenAFS would be close to the output of

    pts listentries -user | grep -v '@' | grep -v 'anonymous' | gawk 
'{print $1}'


In other words, the list of all user entries that are not foreign and 
are not "anonymous".   it would also exclude any IP address entries.


The output of system:anyuser would be

    pts listentries -user | gawk '{print $1}'

again with the exception of all IP address entries.   The difference is 
that system:anyuser output includes "anonymous" and the foreign entities.


In an AuriStorFS world the system:authuser and system:anyuser lists 
would also exclude "machine" and "network" entities.


Enumerating the membership of system:anyuser and system:authuser would 
by default be restricted to "-showmembers self" which means that only 
members of the system:administrators group would be able to enumerate 
the membership.


A cell that wished to offer broader access might set "-showmembers 
members" on system:authuser but that would be the same as "-showmembers 
anyone" for "system:anyuser".   I think the default is appropriate for 
all cells.



Tangentially related, we use a wrapper to list AFS groups, which looks up a few 
bits of useful information about each member besides their AFS username.  This 
is very user-friendly, but means lots of LDAP lookups and would take forever on 
the full output of system:authuser.


Makes sense.   That would take a while for a cell with several hundred 
thousand users.


I can imagine a plugin for both the protection service and the pts 
client that would allow the protection service to query LDAP or some 
other service and return an opaque blob to the pts client to be unpacked 
and displayed by the pts plugin.


Jeffrey Altman






Re: [OpenAFS] Question for admins regarding pts membership output

2022-07-15 Thread Jeffrey E Altman
On 7/13/2022 6:07 PM, Richard Brittain (richard.britt...@dartmouth.edu) 
wrote:

I hope that doesn't lead people to expect 'pts membership system:authuser' to 
show all users.

Richard


I'm curious.  Why would it be wrong for users to expect 'pts membership 
system:authuser' and 'pts membership system:anyuser' to list their 
membership assuming the caller had the necessary access rights?  My 
primary objection to the existing behavior is that these groups are 
special and end users / administrators must understand that they are 
special.   If an authorized user can obtain the membership list from 
'pts membership system:authuser@foreign' why shouldn't the same be true 
for 'system:authuser'?   If the concern is the cost of generating the 
result set, it's no more expensive than executing 'pts listentries'.


In a private response to my original message someone wrote that their 
cell uses the output of 'pts membership' to generate the list of 
entities that have access to a file object given the assigned ACL.  This 
is a perfectly reasonable action to expect to work.  However, the 
generated list will be incomplete when 'pts membership system:anyuser' 
and 'pts membership system:authuser' succeed while at the same time 
generating empty output.


Jeffrey Altman





Re: [OpenAFS] Question for admins regarding pts membership output

2022-07-13 Thread Jeffrey E Altman
> switch.


Switches such as -no-system-anyuser and -no-system-authuser could be 
added and, when set, the "pts" command could filter out the existence of 
groups -101 and -102, although I find such options ugly compared to 
ensuring that there is no failure when attempting to remove an explicit 
user-group membership that is not present.



thanks.


Thank you all for the feedback.

Jeffrey Altman

P.S. - I really dislike top posting on mailing lists.



On Wed, Jul 13, 2022 at 09:49:29AM -0400, Jeffrey E Altman wrote:

The Protection Service groups fall into two categories.   Those with
explicit membership lists and those with implicit membership lists.   For
example, the "system:anyuser" and "system:authuser" groups are implicit
whereas "system:administrators", "system:ptsviewers", and
"system:authuser@foreign-realm" groups are explicit.

The output of "pts membership" only includes memberships in explicit
membership groups.   This has a negative impact on inexperienced end users
who might be unaware that they are members of the "system:anyuser" and
"system:authuser" groups. This behavior also leads to an inconsistency
between the behavior for foreign and local users because foreign users are
not members of "system:authuser" and are members of
"system:authuser@foreign" which is included in the membership list because
that group has an explicit membership list.

The AuriStorFS  Protection service also makes a distinction between "user"
and "machine" or "network" entities where "machine" and "network" entities
are not members of the "system:authuser" or "system:authuser@foreign"
groups.   This distinction is not apparent from the output of "pts
membership" because of the exclusion of implicit groups.

AuriStor is considering a change to "pts membership" output to include
implicit memberships in the output of "pts membership". With this change the
output of these commands

   $ pts membership anonymous
   Groups anonymous (id: 32766) is a member of:

   $ pts membership testuser
   Groups testuser (id: 112) is a member of:

   $ pts membership testuser@foreign
   Groups testuser@foreign (id: 43282) is a member of:
     system:authuser@foreign

becomes

   $ pts membership anonymous
   Groups anonymous (id: 32766) is a member of:
     system:anyuser

   $ pts membership testuser
   Groups testuser (id: 112) is a member of:
     system:anyuser
     system:authuser

   $ pts membership testuser@foreign
   Groups testuser@foreign (id: 43282) is a member of:
     system:authuser@foreign
     system:anyuser

The question for cell admins is whether anyone is aware of any internal
scripts which process the output of "pts membership" which will break as a
result of the inclusion of the implicit groups "system:anyuser" and
"system:authuser" in output.

Your assistance is appreciated.

Jeffrey Altman
AuriStor, Inc.








[OpenAFS] Question for admins regarding pts membership output

2022-07-13 Thread Jeffrey E Altman
The Protection Service groups fall into two categories.   Those with 
explicit membership lists and those with implicit membership lists.   
For example, the "system:anyuser" and "system:authuser" groups are 
implicit whereas "system:administrators", "system:ptsviewers", and 
"system:authuser@foreign-realm" groups are explicit.


The output of "pts membership" only includes memberships in explicit 
membership groups.   This has a negative impact on inexperienced end users 
who might be unaware that they are members of the "system:anyuser" and 
"system:authuser" groups. This behavior also leads to an inconsistency 
between the behavior for foreign and local users because foreign users 
are not members of "system:authuser" and are members of 
"system:authuser@foreign" which is included in the membership list 
because that group has an explicit membership list.


The AuriStorFS  Protection service also makes a distinction between 
"user" and "machine" or "network" entities where "machine" and "network" 
entities are not members of the "system:authuser" or 
"system:authuser@foreign" groups.   This distinction is not apparent 
from the output of "pts membership" because of the exclusion of implicit 
groups.


AuriStor is considering a change to "pts membership" output to include 
implicit memberships in the output of "pts membership". With this change 
the output of these commands


  $ pts membership anonymous
  Groups anonymous (id: 32766) is a member of:

  $ pts membership testuser
  Groups testuser (id: 112) is a member of:

  $ pts membership testuser@foreign
  Groups testuser@foreign (id: 43282) is a member of:
    system:authuser@foreign

becomes

  $ pts membership anonymous
  Groups anonymous (id: 32766) is a member of:
    system:anyuser

  $ pts membership testuser
  Groups testuser (id: 112) is a member of:
    system:anyuser
    system:authuser

  $ pts membership testuser@foreign
  Groups testuser@foreign (id: 43282) is a member of:
    system:authuser@foreign
    system:anyuser

The question for cell admins is whether anyone is aware of any internal 
scripts which process the output of "pts membership" which will break as 
a result of the inclusion of the implicit groups "system:anyuser" and 
"system:authuser" in output.


Your assistance is appreciated.

Jeffrey Altman
AuriStor, Inc.





Re: [OpenAFS] How to replace pam_krb5 on RHEL 8 systems

2022-07-11 Thread Jeffrey E Altman

reply inline

On 7/11/2022 4:30 AM, Stephan Wonczak (a0...@rrz.uni-koeln.de) wrote:

Hi Jeffrey,
  Thanks for having a look at the problem.
  However, I obviously did not do a very good job detailing exactly 
what we did ... so here's my next try. Warning: It is going to be 
lengthy :-)


  First off: We do not use SSSD. And we would like to keep it that 
way, since it caused various massive problems in the past.


  On RHEL-7, everything works perfectly. We are using the 
RedHat-supplied RPM of pam_krb5: pam_krb5-2.4.8-6.el7.x86_64


The version of pam_krb5 is not the only variable that matters. As I 
mentioned in my earlier replies pam_krb5-2.4.8-6.el7 does not include 
support for rxkad-kdf, which is required in order to make use of Kerberos 
encryption types other than des-cbc-crc, for example 
aes256-cts-hmac-sha1-96.   Without that functionality pam_krb5 only works 
with Kerberos v5 service tickets whose session keys are des-cbc-crc.






We then took the source RPM: pam_krb5-2.4.8-6.el7.src.rpm and did a 
rebuild on a RHEL-8-Machine. This worked without any errors.

  However, when we try to use this to get a token, this happens:

...
Jul  8 15:14:57 kicktest.rrz.uni-koeln.de sshd[2204130]: 
pam_krb5[2204130]: error obtaining credentials for 
'afs/rrz.uni-koeln...@rrz.uni-koeln.de' (enctype=1) on behalf of 
'a0...@rrz.uni-koeln.de': No credentials found with supported 
encryption types
Jul  8 15:14:57 kicktest.rrz.uni-koeln.de sshd[2204130]: 
pam_krb5[2204130]: error obtaining credentials for 
'afs/rrz.uni-koeln...@rrz.uni-koeln.de' (enctype=2) on behalf of 
'a0...@rrz.uni-koeln.de': No credentials found with supported 
encryption types
Jul  8 15:14:57 kicktest.rrz.uni-koeln.de sshd[2204130]: 
pam_krb5[2204130]: error obtaining credentials for 
'afs/rrz.uni-koeln...@rrz.uni-koeln.de' (enctype=3) on behalf of 
'a0...@rrz.uni-koeln.de': No credentials found with supported 
encryption types
Jul  8 15:14:57 kicktest.rrz.uni-koeln.de sshd[2204130]: 
pam_krb5[2204130]: attempting to obtain tokens for "rrz.uni-koeln.de" 
("a...@rrz.uni-koeln.de")
Jul  8 15:14:57 kicktest.rrz.uni-koeln.de sshd[2204130]: 
pam_krb5[2204130]: error obtaining credentials for 
'a...@rrz.uni-koeln.de' (enctype=1) on behalf of 
'a0...@rrz.uni-koeln.de': No credentials found with supported 
encryption types
Jul  8 15:14:57 kicktest.rrz.uni-koeln.de sshd[2204130]: 
pam_krb5[2204130]: error obtaining credentials for 
'a...@rrz.uni-koeln.de' (enctype=2) on behalf of 
'a0...@rrz.uni-koeln.de': No credentials found with supported 
encryption types
Jul  8 15:14:57 kicktest.rrz.uni-koeln.de sshd[2204130]: 
pam_krb5[2204130]: error obtaining credentials for 
'a...@rrz.uni-koeln.de' (enctype=3) on behalf of 
'a0...@rrz.uni-koeln.de': No credentials found with supported 
encryption types

...


   ETYPE_DES_CBC_CRC(1)
   ETYPE_DES_CBC_MD4(2)
   ETYPE_DES_CBC_MD5(3)

The pam_krb5 from rhel7 only knows how to request tickets with DES 
encryption types.  It assumes that OpenAFS cannot support anything else 
because it does not have the rxkad-kdf functionality that was added to 
pam_krb5 post-rhel7 (Jan 4, 2016):


https://github.com/frozencemetery/pam_krb5/commit/3be27655bf9d2520e776ef22ba6bb9486005fff1

To reiterate: We get both kerberos ticket and AFS-Token on RHEL-7. On 
RHEL-8, we still get a valid kerberos ticket, but getting the 
AFS-Token fails. It -is- possible, however, to get a valid AFS-Token 
by klog.krb5. So -in principle- everything is in place to have this 
done by pam_afs.
  The problem is: I have no way to determine why it is complaining 
about "no supported encryption types" when other tools have no 
problems at all!


The answer to this is simple.  The krb5 libraries included in rhel7 
support DES encryption types.   The krb5 libraries included with rhel8 
do not.   As a result, a pam_krb5 that supports rxkad-kdf is required.
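
A quick way to see which session key the KDC is actually issuing (a
sketch; the principal names are placeholders) is to fetch the afs
service ticket and list the enctypes:

  kinit user@EXAMPLE.COM
  kvno afs/cell.example.com@EXAMPLE.COM
  klist -e

If the session key reported for the afs/ principal is
aes256-cts-hmac-sha1-96, only an rxkad-kdf aware pam_krb5 (or aklog)
can turn that ticket into a token.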




  Additional infO. Yes, we did rekey our AFS-cell quite a while ago, 
and our afs-Principal has two keys:


kadmin.local:  getprinc afs/rrz.uni-koeln.de
Principal: afs/rrz.uni-koeln...@rrz.uni-koeln.de

Anzahl der Schlüssel: 2
Key: vno 5, aes256-cts-hmac-sha1-96
Key: vno 4, des-cbc-crc
MKey: vno 1
Attribute: REQUIRES_PRE_AUTH
Richtlinie: [keins]

I hope the vno 4 des-cbc-crc key is not present on any of the 
rrz.uni-koeln.de servers.   If it is, the servers are still vulnerable to


  OPENAFS-SA-2013-003 - Brute force DES attack permits compromise of 
AFS cell

  http://www.openafs.org/pages/security/#OPENAFS-SA-2013-003
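
The keys actually present on a server can be inspected, and the old DES
key removed, with asetkey on each server machine; a sketch:

  asetkey list       # show the kvno (and enctype) of every server key
  asetkey delete 4   # remove the old kvno 4 KeyFile entry once unused

Removing the kvno 4 des-cbc-crc key from the servers and from the KDC
closes that exposure.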


Like I said before, I looked at the sources of our version of 
pam_krb5, and the part where it is failing starts at line 775 inside 
the function "minikafs_5log_with_principal" (I'll attach the 
minikafs.c to this mail for reference)


This version of minikafs.c does not support rxkad-kdf.



  If you or anyone else has any ideas how to tackle the problem, any 
help would be greatly appreciated.


Deploy a version of pam_krb5 which contains the required rxkad-kdf 
functionality.   

Re: [OpenAFS] How to replace pam_krb5 on RHEL 8 systems

2022-07-08 Thread Jeffrey E Altman
Sounds like the version of pam_krb5 you are attempting to build does not 
include support for rxkad-kdf.


https://lists.openafs.org/pipermail/afs3-standardization/2013-July/002738.html

The version of pam_krb5 that supports rxkad-kdf contains a 
minikafs_kd_derive() function at minikafs.c line 775.


See https://github.com/frozencemetery/pam_krb5.

As mentioned in my prior reply pam_krb5 should not be used in 
conjunction with sssd.


Jeffrey Altman

On 7/8/2022 8:35 AM, Stephan Wonczak (a0...@rrz.uni-koeln.de) wrote:

Hi everyone!
  (Berthold's colleague here)

  We dug a little deeper and found the part in the pam_krb5-sources 
where it fails. It is in the file "minikafs.c" starting in line 775. 
It looks like the call to krb5_get_credentials() gets a non-zero 
return value, thus making it bail out.
  The problem is that we (well, at least me!) have no idea which 
enctype is expected, and which enctypes are actually tried. Debug 
output is not too helpful here. Any ideas on how to get useful 
information?
  (I should mention I am waaay out of depth here with my knowledge of 
Kerberos, and my C-fu is severely lacking, too ;-) )


  To be absolutely clear: We can ssh-login to the machine running this 
pam_krb.so-module, and get a valid krb5-ticket. No AFS-token after 
login, thus no access to AFS. If I do "klog.krb5", I -do- get an 
AFS-Token without any issues, and AFS-access starts working as it should.
  It's maddening that only pam_krb5 complains, while other tools work 
out of the box.


  Any advice would be greatly appreciated!

  Stephan

On Fri, 8 Jul 2022, Berthold Cogel wrote:


Am 07.07.22 um 19:04 schrieb Dirk Heinrichs:

 Benjamin Kaduk:


 Are you aware of pam_afs_session
 (https://github.com/rra/pam-afs-session)? Without knowing more about
 what you're using pam_krb5 for it's hard to make specific suggestions
 about what alternatives might exist.


 BTW: pam_krb5 != pam_krb5. There are two different modules with the 
same

 name out there. The one shipped with RedHat family distributions comes
 with integrated AFS support, while the one shipped with Debian family
 distributions doesn't. That's the reason why Debian also ships
 pam_afs_session and RH does not.

 Bye...

      Dirk



We're using the pam_krb5 shipped with Red Hat.

I've rebuilt the module from the RHEL 7 source rpm on RHEL 8. And it 
seems to work for some value of working


Supported enctypes in our kdc:
aes256-cts-hmac-sha1-96:normal des-cbc-crc:normal des:afs3

We 'rekeyed' our AFS environment with aes256-cts-hmac-sha1-96:normal 
to get connections from newer Ubuntu/Debian and Fedora 35 working.


We get a krb5 ticket and a login, but getting the AFS token gives 
errors:


"error obtaining credentials for 
'afs/rrz.uni-koeln...@rrz.uni-koeln.de' (enctype=1) on behalf of 
: No credentials found with supported encryption types"


Same for two other enctypes.

So something else changed in RHEL 8, which we haven't found yet.


Regards
Berthold
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info



Dipl. Chem. Dr. Stephan Wonczak

    Regionales Rechenzentrum der Universitaet zu Koeln (RRZK)
    Universitaet zu Koeln, Weyertal 121, 50931 Koeln
    Tel: +49/(0)221/470-89583, Fax: +49/(0)221/470-89625



Re: [OpenAFS] How to replace pam_krb5 on RHEL 8 systems

2022-07-08 Thread Jeffrey E Altman

On 7/7/2022 1:04 PM, Dirk Heinrichs (dirk.heinri...@altum.de) wrote:

Benjamin Kaduk:


Are you aware of pam_afs_session
(https://github.com/rra/pam-afs-session)? Without knowing more about
what you're using pam_krb5 for it's hard to make specific suggestions
about what alternatives might exist.

BTW: pam_krb5 != pam_krb5. There are two different modules with the same
name out there. The one shipped with RedHat family distributions comes
with integrated AFS support, while the one shipped with Debian family
distributions doesn't. That's the reason why Debian also ships
pam_afs_session and RH does not.

Bye...

     Dirk


Red Hat's pam_krb5 is neither shipped nor supported for RHEL8 (or later).   
The replacement is sssd which supports Kerberos ticket acquisition but 
not AFS token acquisition.   The recommendation for acquiring AFS tokens 
on sssd enabled systems is to use pam_afs_session


  https://github.com/SSSD/sssd/issues/1505 "Support/Cache OpenAFS 
Authentication"


Use of the RHEL7 pam_krb5 on an sssd enabled system will do the wrong 
thing since it's going to step on the toes of sssd's Kerberos ticket 
processing.


pam-afs-session is the correct tool to use on RHEL8 and later. The 
pam-afs-session bundled with AuriStorFS clients is known to acquire 
tokens in conjunction with sssd.   The primary differences between 
AuriStorFS pam_afs_session and Russ' are code quality improvements and 
use of external aklog and unlog instead of built-ins.
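
A minimal sketch of the pam_afs_session lines on an sssd-enabled host
(control flags vary by site; these go after the existing pam_sss
entries in the relevant /etc/pam.d service files):

  auth     optional   pam_afs_session.so
  session  optional   pam_afs_session.so program=/usr/bin/aklog

The auth line only hooks pam_setcred so that a PAG and token are set up
whenever credentials are (re)established.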


Jeffrey Altman






Re: [OpenAFS] vos release stops at 2^64 packets sent.

2022-06-27 Thread Jeffrey E Altman
On 6/27/2022 3:18 PM, Richard Brittain (richard.britt...@dartmouth.edu)
wrote:
> I know this is a long shot, but I've got a no-quota volume of approx
> 6TB, and I'm trying to replicate it.  It appears to be going fine
> until the packetRead counter reaches 2^64 and then it stops (doesn't
> abort).

I believe you meant 2^32 packets not 2^64.

> Servers are 1.6.22 (I thought I'd retire then before now, so didn't
> bother upgrading to 1.8.x).  If 1.8 might change this limit, I can
> upgrade, but I didn't find any hints in the release notes.

There are going to be two issues with moving a volume larger than 2TB.

First, the volume header diskused counter is going to wrap negative
above 2TB which will throw off all of the estimated counts for how much
space is required on the destination and how long the transfer will require.

Second, there is a hard limit in OpenAFS RX on the number of packets
that can be transmitted in a single call.   2^32 - 2.

>
> Based on how long it ran, my guess is > 5TB was transferred.

Those 2^32 - 2 packets can transfer approximately 5.6TB of data.    After
that the call's data flow will stall and the call will be kept alive
forever by PING and PING_RESPONSE exchanges.

If rxkad is used to protect the call, then the maximum number of packets
that can be transferred in a single call without risk of replay attacks
is 2 ^ 30 - 2.   Approximately 1.4TB of data.   There are no checks in
the rxkad code to prevent sending more than 2 ^ 30 packets.
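
As a back-of-the-envelope check (a sketch, assuming the usual 1444
bytes of rx payload per packet on a 1500 byte MTU path):

  $ awk 'BEGIN { printf "rx call limit:    %.1f TiB\n", (2^32 - 2) * 1444 / 2^40;
                 printf "rxkad safe limit: %.1f TiB\n", (2^30 - 2) * 1444 / 2^40 }'
  rx call limit:    5.6 TiB
  rxkad safe limit: 1.4 TiB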

> Is this affected by volser buffer sizes ?

No.

Not all RX implementations have this limitation.   AuriStor RX will
support single calls up to 2^64 - 1 packets.   Approximately 22ZB.  
yfs-rxgk is not susceptible to replay attacks during a call consisting
of 2^64 - 1 packets.

Jeffrey Altman






[OpenAFS] Please attend virtual AFS Tech Workshop 2022 - June 14 to 16 (10a -> 4p EDT [UTC-4])

2022-06-06 Thread Jeffrey E Altman
The virtual 2022 AFS Tech Workshop will take place Tuesday 14 June 2022,
Wednesday 15 June 2022 and Thursday 16 June 2022 from 10am EDT (UTC-4)
until 4pm EDT (UTC-4) each day.  Registration is free for speakers and a
nominal US$50.00 otherwise.  Proceeds support the OpenAFS Foundation.

This year's workshop includes discussions of four AFS-family filesystem
implementations:

  * the original Transarc/IBM AFS 3.x
  * OpenAFS
  * Linux Kernel AFS and AF_RXRPC
  * AuriStor File System

 
The talks loosely fall into the following categories:

  * three community status reports (OpenAFS, OpenAFS Foundation and
AuriStor)
  * two end user organization site reports (Goldman Sachs and CERN)
  * three talks that together describe how the /afs file namespace lifts
and shifts organization infrastructures to multiple clouds, and will
accelerate the transition to modern container deployment strategies.   
  * four talks on enhancements to /afs file namespace capabilities
covering proprietary extensions to AFS3, features already in
production as part of AuriStorFS, and others which (if implemented)
open the door to new application use cases.
  * three talks on the past, present and future of the RX RPC protocol. 
RX RPC and its extensible security frameworks have been critical to
the longevity of the /afs file namespace.  Its performance
limitations on long pipes and fat pipes have adversely impacted wide
area network and high performance network use cases.   Understanding
the design decisions of the past is necessary to understanding the
road to a brighter future.
  * five talks on tools and techniques that can be leveraged by OpenAFS
developers and system administrators to root cause undesirable
behaviors; the first step to improving code quality and performance.
  * two Birds of a Feather meetings
  * and more

The detailed schedule can be found at:

  AFS Tech Workshop 2022 Schedule


The best AFS workshops of the past twenty years have included a solid
mix of attendees from a broad range of sites which deploy AFS-family
file systems as well as those that develop and support the products. 
This year's speaker list is heavily weighted towards developers and
vendors.  The timetable for the workshop (10am - 4pm EDT [UTC-4]) is not
particularly friendly to those far from the East Coast of the Americas. 
However, if you can join us I encourage you to do so.  The talks will
not be recorded, so the only opportunity to hear them and participate
is to attend live.

The two Birds of a Feather slots at the end of the Tuesday and Wednesday
sessions do not yet have topics assigned to them.  Perhaps one or both
can be turned into "lightning talk sessions" for general attendees. 
Possible topics might include:

  * mini site reports
  * technical difficulties
  * troubleshooting request
  * live public cell reviews
  * cloud migration experiences
  * anything that might be interesting to others

Please Register for the workshop via EventBrite at

  AFS Tech Workshop 2022 Registration

 
As a reminder, speakers must register as well to obtain access to the
Zoom sessions.

I hope to see you next week.  Please join us.

Jeffrey Altman
AuriStor, Inc.




Re: [OpenAFS] a question about CellServDB and DNS alias

2022-03-27 Thread Jeffrey E Altman
On 3/23/2022 11:15 AM, Giovanni Bracco (giovanni.bra...@enea.it) wrote:
> In the documentation for the CellServDB file (both client & server)
> https://docs.openafs.org/Reference/5/CellServDB.html
>
> it is declared that is the "fully qualified hostname"
> that must be provided in the line defining a dbserver.

The specified DNS hostname can either be an A record or a CNAME record.

The hostname is only used by the Windows Cache Manager, Windows
authentication provider, and Windows network provider.

All of the command line tools, the UNIX cache manager, and the servers
only use the specified IP address.

Jeffrey Altman




Re: [OpenAFS] Networking AFS related problem

2022-02-03 Thread Jeffrey E Altman
On 2/3/2022 2:42 AM, Harald Barth (h...@kth.se) wrote:
> Hi Jeff!
>
>> It is unlikely that an ISP is blocking UDP traffic.
> For some value of "ISP". I have been to Karolinska Institutet who did
> supply Internet through the same "eduroam" cooperation as my home
> university. However, the "AFS experience" was totally different
> as in "non existent" on that "eduroam" as they had implemented ...

I do not consider "eduroam" networks provided by member institutions 
to be an ISP.

Many academic institutions which provide public or guest wifi access 
block a broad range of services including RDP, VNC, SSH, SMTP, 
SUBMISSION (e-mail), CIFS, Kerberos, and more.  Several "eduroam" sites 
block AFS ports because they have internal AFS cells that they 
consider insecure and want to ensure cannot be accessed by the 
public.  Such sites expect end users of these services to VPN to their 
home institutions once they are on the network.

rx/tcp would not help in such situations although an rx/https alternative
would.

Jeffrey Altman





Re: [OpenAFS] Networking AFS related problem

2022-02-02 Thread Jeffrey E Altman
On 2/2/2022 6:38 AM, Harald Barth (h...@kth.se) wrote:
> I guess your IP provider lives in the IT world of 2022 where "Internet
> service" consists of mostly TCP/HTTPS and definitely not UDP ;-)

It is unlikely that an ISP is blocking UDP traffic.   The most likely
causes are a poorly implemented firewall in a home router or an
organizational firewall that is blocking a range of IP addresses which
have been used in prior attacks.
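
Whether AFS UDP traffic is actually passing can be checked from the
affected network with rxdebug against the cell's servers; a sketch (the
host names are placeholders):

  rxdebug afsdb1.example.org 7003 -version   # vlserver port
  rxdebug fs1.example.org 7000 -version      # fileserver port

If these probes time out from one network but answer from another, then
something on the first path is dropping the UDP packets.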

UDP is not only used for DNS and Kerberos but also for QUIC.  It has
been reported[1] that more than half of the connections from Chrome
browsers to Google's servers are performed using the UDP-based QUIC
protocol.

Jeffrey Altman

[1]
https://techcrunch.com/2015/04/18/google-wants-to-speed-up-the-web-with-its-quic-protocol/





Re: [OpenAFS] Fwd: CRITICAL: RHEL7/CentOS7/SL7 client systems - AuriStorFS v2021.05-10 released > OpenAFS versions?

2022-01-11 Thread Jeffrey E Altman
On 11/24/2021 10:41 PM, Jeffrey E Altman (jalt...@auristor.com) wrote:
> On 11/11/2021 9:01 AM, Jeffrey E Altman (jalt...@auristor.com) wrote:
>> Any version of OpenAFS cache manager configured with a disk cache
>> running on an impacted el7 kernel is affected.   All kernels from
>> 3.10.0_861.el7 through 3.10.0_1160.42.2.el7 are impacted.   When a new
>> el7 kernel containing the AuriStor provided fix is available and
>> deployed, then OpenAFS will no longer be vulnerable.
> AuriStor expects the bug fix to be included in
> kernel-3.10.0-1160.51.1.el7 which we hope will ship by the middle of
> December.  The kernel released yesterday kernel-3.10.0-1160.49.2.el7
> does not include the fix.

The kernel containing the fix is kernel-3.10.0-1160.53.1.el7

Red Hat customers can read the announcement at
https://access.redhat.com/errata/RHSA-2022:0063

Jeffrey Altman






Re: [OpenAFS] Slow loading of virtually hosted web content

2021-12-14 Thread Jeffrey E Altman
On 12/14/2021 12:51 PM, Kendrick Hernandez (kendrick.hernan...@umbc.edu)
wrote:
>
> On Fri, Dec 10, 2021 at 6:25 PM Jeffrey E Altman
>  wrote:
>
> Do you know what the issued DNS queries were for?
>
> We believe they were triggered by requests for /afs/.htaccess, as
> these web servers have it enabled.

If an AuriStorFS client were deployed these lookups could be disabled
using the following yfs-client.conf configuration file entry.

[afsd]

    ignorelist-afsmountdir = .htaccess

Instead of attempting to resolve a cellular mountpoint for
/afs/.htaccess Apache would be provided a zero-length file.
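
An alternative on the Apache side (a sketch) is to stop httpd from
looking for .htaccess files under /afs at all, which avoids the lookups
with any client:

  <Directory "/afs">
      AllowOverride None
  </Directory>

This of course only helps for sites that do not depend on per-directory
.htaccess overrides.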

Happy Holidays.

Jeffrey Altman




Re: [OpenAFS] Slow loading of virtually hosted web content

2021-12-10 Thread Jeffrey E Altman
On 11/29/2021 1:11 PM, Kendrick Hernandez (kendrick.hernan...@umbc.edu)
wrote:
> We were able to narrow the problem down to DNS timeouts from an
> internal DNS server that had reached its limit for NF connection
> tracking. Once that limit was increased, the issue went away.
> Along with some forwarded insights from the folks at CMU and some
> isolated testing, we were able to confirm that disabling dynamic root
> and DNS-based server discovery on the cache manager also worked around
> the issue.
>
I'm glad you identified a solution.

Do you know what the issued DNS queries were for?

The primary reason to avoid disabling dynamic root is that if the machine
restarts and the OpenAFS cache manager cannot read the "root.afs" volume
from the cell, the system will panic.  This could be due to the machine
booting without a network interface or a failure of the cell similar to
what occurred on January 14th of this year.

The afsd -afsdb option is not required for a web server that will only
be serving content from the local cell if the cell's location service
list of servers is present in the local CellServDB file.   Sites that
want the option of being able to dynamically relocate their location
service instances will want to avoid local CellServDB entries. 

AuriStorFS clients implement configurable ignorelists [1] to permit use
of dynroot and DNS SRV/AFSDB lookups while blocking lookups for specific
names either in the dynroot directory or any volume root directory.

Jeffrey Altman

[1] fs_ignorelist (auristor.com)






Re: [OpenAFS] Fwd: CRITICAL: RHEL7/CentOS7/SL7 client systems - AuriStorFS v2021.05-10 released > OpenAFS versions?

2021-11-24 Thread Jeffrey E Altman
On 11/11/2021 9:01 AM, Jeffrey E Altman (jalt...@auristor.com) wrote:
> Any version of OpenAFS cache manager configured with a disk cache
> running on an impacted el7 kernel is affected.   All kernels from
> 3.10.0_861.el7 through 3.10.0_1160.42.2.el7 are impacted.   When a new
> el7 kernel containing the AuriStor provided fix is available and
> deployed, then OpenAFS will no longer be vulnerable.

AuriStor expects the bug fix to be included in
kernel-3.10.0-1160.51.1.el7 which we hope will ship by the middle of
December.  The kernel released yesterday kernel-3.10.0-1160.49.2.el7
does not include the fix.

Jeffrey Altman





Re: [OpenAFS] Slow loading of virtually hosted web content

2021-11-19 Thread Jeffrey E Altman
On 11/10/2021 3:27 PM, Kendrick Hernandez (kendrick.hernan...@umbc.edu)
wrote:
> Hi all,
>
> We host around 240 departmental and campus web sites (individual afs
> volumes) across 6 virtual web servers on AFS storage. The web servers
> are 4 core, 16G VMs, and the 4 file servers are 4 core 32G VMs. All
> CentOS 7 systems.
>
> In the past week or so, we've encountered high-load on the web servers
> (primary consumers being apache and afsd) during periods of increased
> traffic, and we're trying to identify ways to tune performance.
"In the past week or so" appears to imply that the high-load was not
observed previously.  If that is the case, one question to ask is "what
changed?"  Analysis of the Apache access and error logs compared to the
prior period might provide some important clues.
> After seeing the following in the logs:
>
> 2021 11 08 08:52:03 -05:00 virthost4 [kern.warning] kernel: afs:
> Warning: We are having trouble keeping the AFS stat cache trimmed
> down under the configured limit (current -stat setting: 3000,
> current vcache usage: 18116).
> 2021 11 08 08:52:03 -05:00 virthost4 [kern.warning] kernel: afs:
> If AFS access seems slow, consider raising the -stat setting for afsd.
>
There is a one-to-one mapping between AFS vnodes and Linux inodes. 
Unlike some other platforms with OpenAFS kernel modules, the Linux
kernel module does not strictly enforce the vnode cache (aka vcache)
limit.  When the limit is reached, instead of finding a vnode to recycle,
new vnodes are created and a background task attempts to prune excess
vnodes.   It's that background task which is logging the text quoted above. 

> I increased the disk cache to 10g and the -stat parameter to 100000,
> which has improved things somewhat, but we're not quite there yet.

As Ben Kaduk mentioned in his reply, callback promises must be tracked
by both the fileserver and the client.  Increasing the vcache (-stat)
limit increases the number of vnodes for which callbacks must be
tracked.  The umbc.edu cell is behind a firewall so its not possible for
me to probe the fileserver statistics to determine if increasing to
100,000 on the clients also requires an increase on the fileservers.  If
the fileserver callback table is full, then it might have to prematurely
break callback promises to satisfy the new allocation.  A callback break
requires issuing an RPC to the client whose promise is being broken.

> This is the current client cache configuration from one of the web
> servers:
>
> Chunk files:   281250
> Stat caches:   10
> Data caches:   1
>
The data cache might need to be increased if the web servers are serving
content from more than 18,000 files.
>
> Volume caches: 200
>
If the web servers are serving data from 240 volumes, then 200 volumes
is too small.
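
On the client side the relevant afsd options can be raised together; a
sketch with illustrative values (they normally live in the
distribution's afsd configuration, e.g. /etc/sysconfig/openafs):

  afsd -stat 100000 -volumes 512 -daemons 8

-volumes sizes the volume cache mentioned above and -daemons adds
background handlers for the additional rx traffic.
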
>
> Chunk size:    1048576
> Cache size:    900 kB
> Set time:      no
> Cache type:    disk
>
>
> Has anyone else experienced this? I think the bottleneck is with the
> cache manager and not the file servers themselves, because they don't
> seem to be impacted much during those periods of high load, and I can
> access files in those web volumes from my local client without any
> noticable lag.

Apart from the cache settings, how the web server is configured and how
it accesses content from /afs matters. 

* Are the web servers configured with mod_waklog to obtain tokens for
authenticated users?

* Are PAGs in use?

* How many active RX connections are there from the cache manager to the
fileservers?

* Are the volumes being served primarily RW volumes or RO volumes?

* Are the contents of the volumes frequently changing?

Finally, compared to the AuriStorFS and kafs clients, the OpenAFS cache
manager suffers from a number of bottlenecks on multiprocessor systems
due to reliance on a global lock to protect internal data structures. 
The cache manager's callback service is another potential bottleneck
because only one incoming RPC can be processed at a time and each
incoming RPC must acquire the aforementioned global lock for the life of
the call.  

Good luck,

Jeffrey Altman





Re: [OpenAFS] Fwd: CRITICAL: RHEL7/CentOS7/SL7 client systems - AuriStorFS v2021.05-10 released > OpenAFS versions?

2021-11-11 Thread Jeffrey E Altman
On 11/11/2021 7:12 AM, Giovanni Bracco (giovanni.bra...@enea.it) wrote:
> Are all OpenAFS versions 1.6.x and 1.8.x affected by the bug described
> in the enclosed mail?
>
Any version of OpenAFS cache manager configured with a disk cache
running on an impacted el7 kernel is affected.   All kernels from
3.10.0_861.el7 through 3.10.0_1160.42.2.el7 are impacted.   When a new
el7 kernel containing the AuriStor provided fix is available and
deployed, then OpenAFS will no longer be vulnerable.

Jeffrey Altman






Re: [OpenAFS] Problem building a Debian package in AFS != RX CID bug

2021-09-13 Thread Jeffrey E Altman
On 9/13/2021 11:35 AM, deb...@lewenberg.com wrote:
> On 9/11/2021 8:44 PM, Jeffrey E Altman wrote:
>> On 9/11/2021 10:57 PM, deb...@lewenberg.com wrote:
>>> buster:
>>> Trying 192.168.225.188 (port 7001):
>>> AFS version: OpenAFS 1.8.2-1+deb10u1-debian 2021-07-21
>>> root@buster-server
>>
> While you are undoubtedly correct about the buster version of the
> OpenAFS client having that bug, it does not seem to affect the
> building of the Debian package using pbuilder (chroot): the pbuild
> _works_ on my buster server and _fails_ on my bullseye server.
>
As Sergio pointed out, the buster 1.8.2-1+deb10u1-debian 2021-07-21
package is patched for the CID bug and the bullseye version doesn't
contain the RX CID bug.

Have you run strace against your build process?

Or configured the OpenAFS fstrace to collect OpenAFS kernel module
tracing messages?

Or examined the FileAuditLog output from the fileserver on which the
build volume is hosted?
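
For the fstrace option, a minimal sketch (run as root on the client
while reproducing the failing build):

  fstrace setset -set cm -active       # enable cache manager tracing
  fstrace dump -file /tmp/fstrace.out  # capture the trace after the failure
  fstrace setset -set cm -inactive     # disable tracing again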







Re: [OpenAFS] Problem building a Debian package in AFS == RX CID bug

2021-09-11 Thread Jeffrey E Altman
On 9/11/2021 10:57 PM, deb...@lewenberg.com wrote:
> buster:
> Trying 192.168.225.188 (port 7001):
> AFS version: OpenAFS 1.8.2-1+deb10u1-debian 2021-07-21 root@buster-server

This is a totally broken client because of the RX CID bug and it cannot
successfully communicate with any AFS location server or file server. 
Executing

  rxdebug localhost 7001 -allconn -onlyclient

after attempting to access /afs should show a number of connections
whose CID is 8000.  Every connection between a client and a server
is required to have a unique CID.  If more than one connection has the
8000 CID then the communication between the peers will be unsuccessful.

The Debian build of OpenAFS labeled version 1.8.6-5 for bullseye
contains a fix for the RX CID bug as does any OpenAFS tagged release
1.8.7 or later.   As indicated by 

  https://packages.debian.org/search?searchon=sourcenames=openafs

the latest buster package of OpenAFS is 1.8.2-1+deb10u1 which does not
contain the RX CID fix. 

You can build and deploy a later OpenAFS or you can use the AuriStorFS
client that your employer licenses.

Jeffrey Altman






[OpenAFS] Re: [OpenAFS-devel] OpenAFS Licensing Update Discussion

2021-06-15 Thread Jeffrey E Altman
Following today's AFS Technology Workshop session many participants met
via Zoom to discuss the proposal to dual-license portions of the OpenAFS
source tree required to build the Linux kernel module under both the IBM
Public License 1.0 and GPLv2.   The following preliminary conclusions
were reached:

  * Follow up discussions on re-licensing will occur on the
openafs-de...@openafs.org mailing list.  The Reply-To on this e-mail
has been set accordingly.  Please join the openafs-de...@openafs.org
mailing list to participate in the discussions. 
https://lists.openafs.org/mailman/listinfo/openafs-devel
  * The target will be re-licensing of the minimal source code set
necessary to produce a GPLv2 kernel module.  There are members of
the community (myself included) that will object to re-licensing BSD
or MIT licensed contributions as GPLv2 unless doing so is required.
  * Cheyenne Willis pointed the group at the LWN article "Relicensing:
what's legal and what's right"
https://lwn.net/Articles/247872/
  * Simon Wilkinson referenced
https://www.kernel.org/doc/html/latest/process/license-rules.html
  * We discussed the re-licensing of the Sun RPC code which was included
by IBM in OpenAFS 1.0 but is not covered by the IBM Public License 1.0.
https://lwn.net/Articles/319648/
https://spot.livejournal.com/315383.html
https://github.com/krb5/krb5/commit/b61c02cc3fb9f5309dd43b4c61ec76465d8b2263
Based upon the discussions I have pushed to openafs gerrit a change
to apply the new Oracle 3-clause BSD license to OpenAFS on the
condition that IBM and all contributors that modified the affected
files agree.
https://gerrit.openafs.org/#/c/14640/
  * Subsequent to IBM publishing the revised IBM DeveloperWorks OpenAFS
1.0 dual-licensed version, I have agreed to seek the assistance of
the Software Freedom Law Center in drafting appropriate Contributor
License Agreements (CLAs) for individuals and organizations to
execute.   The CLAs will be scoped to the set of files necessary to
build a "Dual IPL20 and GPLv2" kernel module.
  * It is proposed that the CLAs be maintained by the OpenAFS Foundation
or the Software Freedom Law Center.

In order to narrow the scope of work the contributors that are known to
have contributed tens or hundreds of commits to the necessary source
files will be approached to execute CLAs first.  

The re-licensing of the Sun RPC sources will be performed in
parallel with the relicensing of the Linux kernel sources.  As soon as
the necessary CLAs are obtained for the Sun RPC sources, the relicensing
of those files to 3-clause BSD can be merged.  

Gerrit will be modified to contain an additional column to record
acceptance of GPLv2 and IPL10 dual licensing for new submissions.   A
list of source files that must remain dual licensed will be maintained
in the source tree.   The addition and removal of source files from the
Linux kernel build will require modification of this list.

Although the initial work will be performed on the "master" branch, it
is the hope of many that dual licensing can be back-ported to the
existing stable releases.

To the other participants, if I made any mistakes or omissions in this
summary, please follow up with corrections.

Thank you for all that have expressed interest in a GPL licensed OpenAFS
kernel module.

Jeffrey Altman

On 6/10/2021 7:23 PM, Todd DeSantis (a...@us.ibm.com) wrote:
>
> Greetings OpenAFS Community
>
> I would like to introduce an OpenAFS community proposal to change the
> licensing terms for future releases of the kernel components of
> OpenAFS. Today OpenAFS is exclusively available under the IBM Public
> License (IPL-1.0). Due to OpenAFS having some Linux kernel modules,
> the IPL is not optimal for development and consumption of the kernel
> code on Linux. We're proposing that, going forward, OpenAFS kernel
> components should be available under a dual licensing model - the GNU
> GPL Version 2 and the existing IPL-1.0. Based on a recent request from
> the community, IBM is already in support of and working toward the
> dual licensing change for the OpenAFS kernel components.
>
> This change has many benefits for the OpenAFS community as well as
> users of OpenAFS in Linux environments. Having the OpenAFS kernel code
> available under the GNU GPLv2 will provide the appropriate licensing
> model for the OpenAFS Linux kernel code that meets current Linux
> kernel licensing standards. The time has come to institute this change
> and your agreement and support is needed.
>
> *Among the many benefits of OpenAFS kernel components under the GPLv2
> include: *
>
>   * Avoidance of tainting Linux kernels when OpenAFS kernel components
> are installed
>   * Ability to leverage modern Linux kernel features
>   * Opportunity to distribute OpenAFS kernel modules in Linux
> distributions
>   * Hosting of OpenAFS kernel support on POWER architecture
>
>
> 

Re: [OpenAFS] openafs 1.8.7 clients and server 1.6.24: rx ping burst?

2021-06-01 Thread Jeffrey E Altman

On 6/1/2021 10:24 AM, Giovanni Bracco (giovanni.bra...@enea.it) wrote:
But the real strange thing is that there are 1.8.7 clients that are 
sending hundreds of rx ping to this server in less that 30s, messages 
like this:


15:50:37.414106 IP cresco4cx021.casaccia.enea.it.afs3-callback > 
linafs3.frascati.enea.it.afs3-fileserver:  rx version (29)


384 similar lines up to:

15:51:17.494530 IP cresco4cx021.casaccia.enea.it.afs3-callback > 
linafs3.frascati.enea.it.afs3-fileserver:  rx version (29)


This bug is known and a fix is scheduled for inclusion in 1.8.8.

  https://gerrit.openafs.org/14364

The bug is most likely to impact multiuser systems.

Jeffrey Altman



Re: [OpenAFS] Migration and slow AFS performance

2021-05-29 Thread Jeffrey E Altman

Hi Dan,

Since no one from the OpenAFS community has replied I will chime in.

On 5/25/2021 10:21 AM, Daniel Mezhiborsky 
(daniel.mezhibor...@cooper.edu) wrote:

Hello all,

We currently have a relatively small (~250 users, 2TB) AFS cell that I 
am planning on retiring soon. 


If you are willing to explain, I'm sure the community would appreciate 
hearing the reasons behind the migration away from OpenAFS and how the 
replacement was selected.


I'd like to get our data out of AFS-space, 
but this is proving problematic because of performance issues. Our setup 
is one AFS server VM with iSCSI-attached /vicepa. 


It would be helpful if you could provide more details of the server 
configuration.


1. Is the client-fileserver traffic sharing the same network adapter as 
the iSCSI attached filesystem backing the vice partition?


2. How much network bandwidth is available to the client-fileserver path?

3. How much network bandwidth is available to the fileserver-iSCSI path?

4. What is the network latency (Round Trip Time) between the client and 
fileserver?


5. What version is the OpenAFS fileserver?

6. What command line options has the fileserver been started with?

7. What AFS client version and operating system (and OS version) is the 
client?


8. What are the configuration options in use by the client?

9. What is the destination file system that the rsync client is writing to?

10. Is the destination filesystem also being accessed via the network?

11. What is the OpenAFS fileserver VM configuration?  number of cores, 
amount of RAM, clock speed, etc.


For single large-file 
transfers with encryption off, I can get around 30MB/s read/write, but 
it seems like metadata operations are very slow, so copying our data 
(mostly home directories) directly from AFS with cp/rsync is taking a 
prohibitively long time. 


Which metadata operations are slow?

Are you referring to operations reading the metadata from /afs or 
writing it to the destination?


I see similar slowness with vos dump. We do 
take regular backups with vos backupsys and backup dump that take about 
36 hours for full backups.


Does anyone have any recommendations on problem areas to try and tune to 
get better performance or advice on a better way copy our data out of AFS?


You will need to provide more details about the environment before 
specific recommendations can be provided.  However, I will mention a few 
things to consider about the extraction methodology.


First, each new AFS RPC behaves similarly to a new TCP connection.  It 
starts with a minimal congestion window size and grows the window via a 
slow start algorithm until either the maximum window size of 32 packets 
(1412 bytes of payload each) has been reached or packet loss occurs.   
Combined with the RTT for the link you can compute the bandwidth delay 
product for a single call.
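
For example (a sketch; the 10 ms round trip time is illustrative), the
steady-state throughput of a single call is bounded by the window size
divided by the RTT:

  $ awk 'BEGIN { printf "%.1f MB/s\n", 32 * 1412 / 0.010 / 1e6 }'
  4.5 MB/s

which is consistent with the ~30MB/s observed above if the client and
fileserver are only a millisecond or two apart.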


In general RPCs issued by an AFS client to read from a fileserver are 
going to be one disk cache chunk size at a time.  Smaller if the average 
file size is less than the chunk size.


If authentication is required to access the directories and files, a 
separate FetchStatus RPC will be issued for most files prior to the 
first FetchData RPC.  At least in an OpenAFS client.


One of the strengths of AFS is the caching client, but in this case 
caching is not beneficial because the data will be read once and 
discarded during this workflow.   The workflow will also be painful for 
an OpenAFS cache manager because of the callback tracking.   If the 
cache recycles before the callback expires, the cache manager will 
cancel the outstanding callbacks.


Likewise, if the fileserver callback table fills then it will need to 
expend considerable effort searching for unexpired callbacks to discard 
early.  Discarding callbacks requires issuing RPCs to the cache manager. 
 So an insufficiently large cache and callback pool can result in the 
callback lifecycle dominating the workflow.


Instead of performing one rsync at a time, consider executing multiple 
rsyncs in parallel, potentially from multiple client systems, for 
example as sketched below.
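
A sketch using xargs to run eight rsyncs at once across per-user home
directories (the cell name and destination path are placeholders):

  cd /afs/example.edu/home && \
  ls | xargs -P 8 -I{} rsync -a {}/ /backup/home/{}/

Keeping several calls in flight at once works around the per-call
window limit described above.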


Provide details of the environment and more specific recommendations can 
be provided.


Jeffrey Altman
AuriStor, Inc.



Re: [OpenAFS] error in compiling openafs 1.6.24 on CentOS 8.3

2021-05-09 Thread Jeffrey E Altman

On 5/9/2021 12:35 PM, Giovanni Bracco (giovanni.bra...@enea.it) wrote:
I have tried to compile openafs-1.6.24 on CentOS 8.3, kernel 
4.18.0-240.22.1.el8_3.x86_64.


The build terminates with

fatal error: rpc/types.h: No such file or directory
  #include "rpc/types.h"
   ^


rpc/types.h is no longer available on systems with glibc 2.26 or newer.

There are patches cherry-picked to openafs-stable-1_6_x which appear to 
address this change.  However, there have been no releases from the 
openafs-stable-1_6_x branch beyond 1.6.24.  There is no mention of 
support for RHEL/CentOS beyond the 7.5 release for the 
openafs-stable-1_6_x branch in the NEWS file.


Your options appear to be to switch to the 1.8.x series for CentOS 8.3 or 
to try the tip of the unreleased openafs-stable-1_6_x branch.


Jeffrey Altman









Re: [OpenAFS] Occasional "VLDB: no permission access for call"

2021-03-29 Thread Jeffrey E Altman

On 3/29/2021 12:23 AM, Ian Wienand (iwien...@redhat.com) wrote:

A new thing I've noticed after we have upgraded everything to 1.8.6


openstack.org also deployed a new database server and this problem is 
most likely due to a failure to synchronize the super-user list onto the 
new vlserver.  If the vos command randomly chooses to speak with the new 
vlserver instance first, it receives a permission denied error and does 
not contact any other servers.
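
The super-user lists can be compared and corrected with bos; a sketch
(the host and user names are placeholders; run with -localauth on the
server or with administrator tokens):

  bos listusers newdb.example.org -localauth
  bos adduser newdb.example.org admin -localauth

Once the lists match on every database server the intermittent
permission errors should stop.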


Jeffrey Altman


Re: [OpenAFS] OpenAFS 1.8.7 on Linux systems running Crowdstrike falcon-sensor

2021-03-08 Thread Jeffrey E Altman

On 3/8/2021 7:20 PM, Benjamin Kaduk (ka...@mit.edu) wrote:

On Mon, Mar 08, 2021 at 07:35:19PM +, Martin Kelly wrote:

Below is the LKML LSM thread regarding this. Please let me know if you have any 
other questions:

https://www.spinics.net/lists/linux-security-module/msg39081.html
https://www.spinics.net/lists/linux-security-module/msg39083.html


Thanks for spotting this thread and the quick follow-up.


This is the same thread that Yadav discussed with the openafs-release 
team on 11 Dec 2020.



I suspect that the changes at https://gerrit.openafs.org/#/c/13751/ are
going to be relevant in this space, but without seeing the stack trace of
the crash in question it's hard to be sure.  Can you speak to whether this
is touching the "right" part of the code with respect to the crashes you
were investigating?


The suggested change was cherry-picked to openafs-stable-1_8_x as
https://gerrit.openafs.org/14082 and merged as 
ee578e92d9f810d93659a9805d0c12084fe2bb95.


As Jonathan wrote to IRC OpenAFS:

> (4:53:15 PM) billings: I built openafs from the latest commit in
> 1_8_x and crowdstrike still panics, so it doesnt look like any
> merged commits there fix my issue.

Martin's e-mail describes the call pattern:

> - A process exits, calling task_exit().

I think Martin meant do_exit().

> - exit_fs() is called, setting current->fs = NULL.

task_struct field struct fs_struct *fs;

> - Next, exit_task_work() is called,

exit_task_work() calls task_work_run() which flushes any pending work items.

> which calls fput().

which must have been called by a pending work.

> - In response to the fput(), the filesystem opens a file

disk cache

>   to update some metadata, calling dentry_open().

dentry_open() in turn will trigger a call to any configured LSM.
If task_struct.fs is NULL, Kaboom!!!

Jeffrey Altman


Re: [OpenAFS] aklog and AFS DB server timeouts

2021-01-29 Thread Jeffrey E Altman
Rainer,

OpenAFS UNIX/Linux clients and servers only use the IP addresses in the
CellServDB file.  The fully qualified domain names are only used by
OpenAFS Windows clients.
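
Independent of name resolution, each database server listed in the file
can be probed directly by IP address; a sketch (aklog consults the
ptserver on port 7002):

  rxdebug 192.168.1.104 7002 -version   # does the ptserver answer at all?
  udebug  192.168.1.104 7002            # is it part of the ubik quorum?

A server that answers rxdebug but does not show a synchronized quorum in
udebug output would explain aklog hanging until it fails over.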

Jeffrey Altman

On 1/29/2021 2:38 PM, RL (rainer.laat...@t-online.de) wrote:
> On the relevant clients, are all three with full name in /etc/hosts ?
> Else failure is standard as
> 
>   192.168.*.*
> is a private thingie that never gets resolved with DNS
> Regards, R.
> 
> 
> 
> 
> On 1/29/21 7:32 PM, A. Lewenberg wrote:
>> On our buster servers the OpenAFS client (1.8.2) has an issue with
>> provisioning an AFS token. When I attempt to get an AFS token it very
>> often takes a long time.
>>
>> $ aklog (this can up to 30 seconds or more)
>>
>> After some investigation it looks like aklog is trying the AFS DB
>> servers listed in /etc/openafs/CellSrvDB and timing out on some of the
>> DB servers. Here is the relevant contents of that file:
>>
>> >example.com   # My Company
>> 192.168.1.102    #afsdb1.example.com
>> 192.168.1.104    #afsdb2.example.com
>> 192.168.1.106    #afsdb3.example.com
>>
>> Running aklog and sniffing the network I see that the client attempts
>> to contact one of the three afsdb servers. If the one it chooses to
>> contact first is afsdb2 or afsdb3 the connection does not succeed
>> until it finally gives up and tries anther one. If the second one it
>> tries is afsdb2 or afsdb3 it gives up and tries the only remaining
>> one: afsdb1. In other words:
>>
>> afsdb3 (fail), afsdb2 (fail), afsdb1 (succeeds)
>> afsdb2 (fail), afsdb3 (fail), afsdb1 (succeeds)
>> afsdb3 (fail), afsdb1 (succeeds)
>> afsdb2 (fail), afsdb1 (succeeds)
>> afsdb1 (succeeds)
>>
>> This sounds like both afsdb2 and afsdb3 are simply not working.
>> However...
>>
>> If I remove afsdb1 and afsdb2 from the CellSrvDB leaving only afsdb3
>> it works instantly every time! That is, the following CellSrvDB works
>> without delay:
>>
>> >ir.example.com   # My Company
>> 192.168.1.106    #afsdb3.example.com
>>
>> Similarly, if afsdb2 is the only entry in CellSrvDB running aklog
>> works without delay. So it cannot be that afsdb2 and afsdb3 are
>> completely broken.
>>
>> The AFS DB servers are running OpenAFS version 1.6.9.
>>
>> What the heck is going on?
>> ___
>> OpenAFS-info mailing list
>> OpenAFS-info@openafs.org
>> https://lists.openafs.org/mailman/listinfo/openafs-info
> ___
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] 14 Jan 2021 08:25:36 GMT Breakage in RX Connection ID calculation

2021-01-18 Thread Jeffrey E Altman
On 1/18/2021 11:46 AM, Richard Brittain (richard.britt...@dartmouth.edu)
wrote:
> I'm  a bit confused about what versions are affected by this bug.  I've got 
> mostly 1.8.[56] clients, which I'm upgrading now.  My servers are still 
> running 1.6.22 and appear to be fine for vos operations between themselves, 
> and the DB servers have been restarted since this happened (but not the 
> fileservers).  Will I need to upgrade the servers too ?

The bug was introduced in OpenAFS 1.8.0 and is present in OpenAFS 1.9.0
as well.

All clients and servers running OpenAFS 1.8.[0123456] and OpenAFS 1.9.0
will fail to communicate with OpenAFS peers (clients or servers) and
MUST be updated.

OpenAFS 1.2.x, 1.4.x and 1.6.x clients and servers are unaffected.  They
might require updates for other reasons depending upon the version but
they are not impacted by this issue.

All AuriStorFS clients/servers and Linux rxrpc clients are unaffected by
this issue.

Jeffrey Altman



Re: [OpenAFS] 14 Jan 2021 08:25:36 GMT Breakage in RX Connection ID calculation

2021-01-14 Thread Jeffrey E Altman
On 1/14/2021 1:20 PM, Jeffrey E Altman (jalt...@auristor.com) wrote:
> On 1/14/2021 10:55 AM, Jeffrey E Altman (jalt...@auristor.com) wrote:
>> This morning at 14 Jan 2021 08:25:36 GMT all restarted or newly started
>> OpenAFS 1.8 clients and servers began to experience RX communication
>> failures.  The RX Connection ID of all calls initiated by the peer are
>> the same:
>>
>>   0x8002
>>
>> Patches to correct the flaw are available from OpenAFS Gerrit
>>
>>   https://gerrit.openafs.org/14491
>>   rx: rx_InitHost do not overwrite RAND_bytes rx_nextCid
>>
>>   https://gerrit.openafs.org/14492
>>   rx: update_nextCid overflow handling is broken
> 
> One more patch
> 
>   https://gerrit.openafs.org/14495
>   rx: modify RX_CIDMASK to match update_nextCid()

which has been abandoned in favor of

https://gerrit.openafs.org/14496
Remove overflow check from update_nextCid

Jeffrey Altman
AuriStor, Inc.



Re: [OpenAFS] 14 Jan 2021 08:25:36 GMT Breakage in RX Connection ID calculation

2021-01-14 Thread Jeffrey E Altman
On 1/14/2021 10:55 AM, Jeffrey E Altman (jalt...@auristor.com) wrote:
> This morning at 14 Jan 2021 08:25:36 GMT all restarted or newly started
> OpenAFS 1.8 clients and servers began to experience RX communication
> failures.  The RX Connection ID of all calls initiated by the peer are
> the same:
> 
>   0x8002
> 
> Patches to correct the flaw are available from OpenAFS Gerrit
> 
>   https://gerrit.openafs.org/14491
>   rx: rx_InitHost do not overwrite RAND_bytes rx_nextCid
> 
>   https://gerrit.openafs.org/14492
>   rx: update_nextCid overflow handling is broken

One more patch

  https://gerrit.openafs.org/14495
  rx: modify RX_CIDMASK to match update_nextCid()

> Jeffrey Altman
> AuriStor, Inc.



[OpenAFS] 14 Jan 2021 08:25:36 GMT Breakage in RX Connection ID calculation

2021-01-14 Thread Jeffrey E Altman
This morning at 14 Jan 2021 08:25:36 GMT all restarted or newly started
OpenAFS 1.8 clients and servers began to experience RX communication
failures.  The RX Connection ID of all calls initiated by the peer are
the same:

  0x8002

Patches to correct the flaw are available from OpenAFS Gerrit

  https://gerrit.openafs.org/14491
  rx: rx_InitHost do not overwrite RAND_bytes rx_nextCid

  https://gerrit.openafs.org/14492
  rx: update_nextCid overflow handling is broken

IBM AFS 3.x and OpenAFS clients and servers prior to 1.8 that perform
unauthenticated connections will suffer from a lack of randomness when
selecting the initial CID value.  As a result, communication failures might
occur depending upon the selected CID value.   Further research to
determine the impact is required.

Please note that all versions of AuriStor RX and Linux rxrpc as used by
clients, servers and administrative tooling are unaffected.

Jeffrey Altman
AuriStor, Inc.





Re: [OpenAFS] low write/read performance with 1.8.x 1.9.0 client

2020-10-13 Thread Jeffrey E Altman
On 10/13/2020 3:05 PM, Giovanni Bracco (giovanni.bra...@enea.it) wrote:
> Thank you for the suggestion, but I have tried to use the command
> 
> fs setcrypt -crypt off
> 
> on 1.8.x clients
> 
> and
> 
> fs setcrypt -crypt
> 
> on 1.6.x clients
> 
> without any effect on performance in both cases, no increase for  1.8.x
> and no decrease for 1.6.x
> 
> Is that the way to control the encryption, isn't it?
> 
> In both cases I have checked the status with fs getcrypt
> 
> Giovanni

"fs setcrypt" only applies to new connections on the UNIX cache manager.
 You need to follow it with "aklog -force" to create new connections
with the altered mode.
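
For example, on a 1.8.x client (a sketch):

  fs setcrypt -crypt off
  aklog -force        # replace existing connections so the new mode is used
  fs getcrypt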




Re: [OpenAFS] low write/read performance with 1.8.x 1.9.0 client

2020-10-13 Thread Jeffrey E Altman
On 10/13/2020 9:28 AM, Giovanni Bracco (giovanni.bra...@enea.it) wrote:
> I have seen that the first release of OpenAFS 1.9.0 is out and so I
> thought that it was time to try at least 1.8.x and also 1.9 on our
> production Linux x86-64 nodes, where we have used 1.6.x up to now.
> 
> Our AFS cell has file servers with OpenAFS 1.6.x and while over WAN the
> performance can be rather poor, due to the well known rx latency
> problem, in the LAN we have values between 70 e 80 MB/s both for read
> and write. The clients are CentOS 6.x and 7.x all with OpenAFS 1.6.x.
> 
> So we have tried OpenAFS 1.8.6 clients on some production nodes (from
> CentOS 7.3 to 7.8) and the performance on LAN have been poor, 15 MB/s
> while the same nodes with 1.6.x clients has the normal performance.
> 
> Then we have installed a test AFS cell with OpenAFS 1.8.6 but no change
> and at that point I we have checked also on user desktops using ubuntu.
> 1.8.4 and the performance are as low as for production nodes with 1.8.x
> release.
> No improvement going to 1.9.0 (both server and clients) either.
> 
> Any suggestion?
> Are we missing something important?
> No special tuning in our installation.
> 
> Giovanni

I suspect the observed performance difference is due to:

commit 6d59b7c4b4b712160a6d60491c95c111bb831fbb
Author: Benjamin Kaduk 
Date:   Sun Jul 30 20:57:05 2017 -0500

  Default to crypt mode for unix clients

  Though the protection offered by rxkad, even with rxkad-k5 and
  rxkad-kdf, is insufficient to protect traffic from a determined
  attacker, it remains the case that the internet is not a safe place
  for user data to travel in the clear, and has not been for a long
  time.  The Windows client encrypts by default, and all or nearly all
  the Unix client packaging scripts set crypt mode by default.  Catch
  up to reality and default to crypt mode in the Unix cache manager.

  Change-Id: If0061ddca3bedf0df1ade8cb61ccb710ec1181d4
  Reviewed-on: https://gerrit.openafs.org/12668
  Reviewed-by: Benjamin Kaduk 
  Tested-by: BuildBot 

Jeffrey Altman



Re: [OpenAFS] Weird issue with AFS hang on a couple of legacy RHEL 6 servers

2020-09-19 Thread Jeffrey E Altman
On 9/8/2020 1:31 PM, Sebby, Brian A. (se...@anl.gov) wrote:
> Hi,
> 
> I have a few legacy RHEL 6 servers that are still running an older 1.6.x
> series DKMS client, which were recently patched and rebooted.  On a
> couple of them, access to AFS is now just hanging – and I cannot figure
> out why.  They have the same kernel release and kernel module as some
> other systems on the same network that are working, so it doesn’t look
> like it would be any sort of firewall issue.  Would anyone have any
> suggestions on how to debug this?  I don’t know if I can upgrade them to
> 1.8 since we are trying to keep our legacy systems fairly static.

I debugged one of the problem systems.  The issue would not be fixed by
any newer version of OpenAFS.  The underlying problem is related to the
computation of the rx interface MTU when the host has a network
interface with an MTU smaller than the minimum jumbo packet size.  The
work around is "afsd -rxmaxmtu <N>" where

  N = <the smallest interface MTU> - 56
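
For example, if the smallest interface MTU were 1400 (a hypothetical
value), the corresponding setting would be:

  afsd -rxmaxmtu 1344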

It should be noted that unlike the OpenAFS servers and the Windows
client, the UNIX cache manager does not disable jumbo by default.

Jeffrey Altman
AuriStor, Inc.


Re: [OpenAFS] Migrating away from single DES

2020-09-15 Thread Jeffrey E Altman
Hi Rainer,

The DES-only limitation of the afs/cell@REALM service principal was
removed in the 2013 releases of OpenAFS 1.4.15 and 1.6.5.  Since those
releases, neither the server ticket key nor the session key is
restricted to the des-cbc-crc encryption type.  All cells should be
upgraded to current versions of OpenAFS on the servers and should rekey
the afs/cell@REALM service principal with the aes256-cts-hmac-sha1-96
encryption type.
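
As a rough sketch of one common rekeying flow with MIT Kerberos and
servers new enough to read an rxkad.keytab (the principal, realm and
keytab path below are placeholders; follow the rekeying instructions in
the advisory referenced below for your exact versions):

  kadmin -q "ktadd -k /etc/openafs/server/rxkad.keytab \
      -e aes256-cts-hmac-sha1-96:normal afs/example.com@EXAMPLE.COM"

ktadd randomizes the key as it extracts it; distribute the keytab to all
servers and restart the server processes so the new key takes effect.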

This includes cells that have deployed gssklogd.  If the KeyFile
contains a des-cbc-crc key, the cell is vulnerable to the Brute Force
Attacks described by

  http://www.openafs.org/pages/security/OPENAFS-SA-2013-003.txt

Changing the service principal encryption type protects against this
brute force attack.  However, it is important to note that even when an
aes256-cts-hmac-sha1-96 session key is negotiated, the OpenAFS client
and server will derive from that key a 56-bit key to use for the fcrypt
encryption type used by rxkad for wire security.

Jeffrey Altman

On 9/15/2020 12:32 PM, r. l. (rainer.laat...@t-online.de) wrote:
> The simplest solution: use  gssklog  of D.E.Engert.  The token then
> comes from an AFS vlservers KeyFile
> 
> and not from an entry afs/**@*** in some krb5kdc. Just run some gssklogd
> and switch from aklog to
> 
> gssklog in your profiles. Some times ago, even CERN.ch used it.
> 
> The original tarfile can still be found at
> 
>   http://www.hep.man.ac.uk/u/masj/gssklog/
> 
> or try my updated version at
> 
>   http://95.217.219.185/ContribAFS/Gssklog-0.11.tar
> 
> The binaries were done on ScientificLinux-6.10 with a newer KRB5 in
> /opt/krb5/
> 
> and a static compilation of openafs (had to fix hcrypto and roken libs
> there)
> 
> 
> Best regards
> 
> R. Laatsch
> 
> 
> 
> 
> 
> 
> 
> 
> =
> 
> On 2020-09-14 10:32, ProbaNet SRLS wrote:
>> Hello!
>>
>>  Recent releases of krb5 (> 1.18) no longer support single des
>> encryption (the "allow_weak_crypto = yes" option in krb5.conf client
>> side has no longer effect), so now we get this error with "aklog -d":
>>
>> ---
>>
>> Kerberos error code returned by get_cred : -1765328370
>> aklog: Couldn't get X AFS tickets:
>> aklog: KDC has no support for encryption type while getting AFS tickets
>>
>> ---
>>
>> How should we proceed?
>>
>>
>> Stefano
>>
>> ___
>> OpenAFS-info mailing list
>> OpenAFS-info@openafs.org
>> https://lists.openafs.org/mailman/listinfo/openafs-info
> ___
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] a question about user capability for a given a directory with its ACL.

2020-05-17 Thread Jeffrey E Altman
Hi Giovanni,

The cache manager doesn't know either the contents of the ACL or the PTS
group memberships.  The computation of a caller's access rights are
performed entirely by the fileserver.  The cache manager makes access
decisions based upon the access rights obtained from the fileserver in
the AFSFetchStatus structure.

If you have a token for the user you can obtain a good approximation of
the user's access rights by issuing the "fs getcalleraccess" (aka "fs
gca") command.  This command will return the access rights returned from
the fileserver for the requested path.  However, this is an
approximation because the IBM AFS/OpenAFS fileservers only report the
explicit access rights in the AFSFetchStatus structure returned to the
cache manager.  There are also implicit rights granted to the file
owner, volume owner and members of the system:administrators group.
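
For example (the path is hypothetical and the output format approximate):

  $ fs getcalleraccess /afs/example.com/user/alice
  Callers access to /afs/example.com/user/alice is rlidwka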

One difference in the AuriStorFS fileserver is that the AFSFetchStatus
structure reports the computed access rights including the implicit
rights.  This is important because if a cache manager makes a decision
about whether or not to issue an RPC based upon the cached access rights
for the user, the cache manager might deny a request that the fileserver
would in fact perform.

Operations that are permitted based upon implicit rights include
fetching and storing access control lists, listing the contents of
directories, fetching and storing status information.  Many of the
implicitly permitted operations are blocked when a UNIX cache manager
communicates with an OpenAFS fileserver because the permissions are not
advertised in the AFSFetchStatus structure.

To satisfy your request would require a new RXAFS RPC, something like

  RXAFS_FetchStatusAsUser(
  IN  AFSFid *Fid,
  IN  UserId  User,
  OUT AFSFetchStatus *OutStatus,
  OUT AFSCallBack *CallBack,
  OUT AFSVolSync *Sync)

which could be issued only by the file owner, volume owner or members of
the system:administrators group and then extend the

  fs getcalleraccess [-path <dir/file path>+]

command with a

  -nameorid <user name or id>

optional parameter.

I believe that the addition of this functionality is a good idea and
AuriStor will consider adding it to our August release.

Jeffrey Altman


On 5/17/2020 9:11 AM, Giovanni Bracco wrote:
> Given an AFS directory and a userid, is there a direct way to understand
> what are the user capabilities, according to the directory ACL?
> 
> Of course one can prepare a script which reads the directory ACL and the
> user membership to PTS groups and make a combined analysis to discover
> if  the user can, let's say, read the files in the directory, if any ,
> but I wonder if there is  some OpenAFS command that provides directly
> the answer, as of course the client has to know all that..
> 
> Giovanni
> 


Re: [OpenAFS] Borderline offtopic: OpenAFS as ~ for Samba AD?

2020-02-15 Thread Jeffrey E Altman
On 2/15/2020 7:55 AM, Måns Nilsson wrote:
> Subject: Re: [OpenAFS] Borderline offtopic: OpenAFS as ~ for Samba AD? Date: 
> Mon, Jan 20, 2020 at 04:42:24PM -0500 Quoting Jeffrey E Altman 
> (jalt...@auristor.com):
>> No need for cross-realm.  Create an afs/cell@SAMBA4.REALM service principal
>> with a kvno
>> that differs from the afs/cell@HEIMDAL.REALM service principal and add the
>> key to your
>> AFS servers as well as adding both realm names to the AFS servers' krb.conf.
> 
> Thanks! 
> 
> I've finally mustered enough bravery to tackle this.  Would proper DNS
> find-a-bility for Kerberos serve as complete substitute for "as adding
> both realm names to the AFS servers' krb.conf" ?

No!  The list of realms in krb.conf is used to specify which realm names
are stripped from the authenticated principal name so that the result
matches protection service user or group entries.

Kerberos DNS SRV records are used by clients to find the Kerberos KDCs
for the realm.  The AFS servers never contact the KDCs themselves.

> I've added the afs/cell@SAMBA4.REALM principals, with identical keytypes
> and different kvno to the rxkad.keytab on all my servers, restarted
> processes on them.
> 
> After having fixed the krb5.conf for Heimdal on the Windows clients to
> point to the right domain, I can login without delay.
> 
> I've mapped my home directory in AFS to H:\ and that's where I end up
> when logging in, and I have a token issued for user@SAMBA4.REALM in my
> cell. But it is not giving me any rights.  
> 
> I suspect I must map my SAMBA4.REALM user to rights management in my cell,
> some way. Or is there some magic I'm missing?

That is what the AFS krb.conf is for.  All of the local authentication
realms must be listed there and the servers restarted for the change to
take effect.
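
For example, for the setup described in this thread the server krb.conf
(typically /etc/openafs/server/krb.conf on a Debian-packaged server)
would list both realms (a sketch):

  HEIMDAL.REALM
  SAMBA4.REALM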

> I've tried adding user@samba4.realm to various pts entities like groups
> and the list of users, but no such luck; I'get error messages
> (no such user for group or acl membership, "badly formed name" for
> user creation). I'm on way too old software versions in my cell, of
> course. Would upgrading help?
You would only create a system:authuser@samba4.realm group and then
create @samba4.realm entries if you were treating the two sets of
identities as unique.

Jeffrey Altman



Re: [OpenAFS] Borderline offtopic: OpenAFS as ~ for Samba AD?

2020-01-20 Thread Jeffrey E Altman
No need for cross-realm.  Create an afs/cell@SAMBA4.REALM service
principal with a kvno that differs from the afs/cell@HEIMDAL.REALM
service principal, add the key to your AFS servers, and add both realm
names to the AFS servers' krb.conf.
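
One way to obtain the Samba-side key is a sketch like the following (the
cell name is a placeholder, and you will want to confirm that the kvno of
the exported key differs from the Heimdal key's kvno before loading it on
the servers):

  samba-tool domain exportkeytab /tmp/afs-samba.keytab \
      --principal=afs/your.cell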

On 1/19/2020 4:53 PM, Måns Nilsson wrote:

I'm running a very small site (home) with family members accessing
computing resources. Now, some users are requesting windows clients,
and since I'm not trusting them I decided to make my own life more
complicated by running an Active Directory site, but I'm too cheap to
buy real Windows Server licenses, so have opted for Samba 4.

Being the glutton for punishment I am, I want their home directories to
be their usual home directories in AFS.  This means, that I'd like to
cross-realm ("AD Trust", but not entirely) between my Heimdal realm (where
I run the AFS cell) and the Heimdalish Kerberos that is part of Samba 4.

I've found the windows documentation for setting up trust/cross-realm
to external realms, and I believe I've tried most permutations of those
commands, but no such luck.

It is really not entirely appropriate for this forum, but if anyone has
done this, they probably are here, so I'm asking anyway.  Any pointers?
For instance, is there a Samba-native command for cross-realm? All of my
testing has been from Windows clients using the management tools for AD,
and that won't work for this even if it works for an impressive amount
of other things.

Thanks,






[OpenAFS] Linux Kernel AFS Hackathon, Future of AFS/AuriStorFS BoF, and Vault '20, Santa Clara CA - Feb 24/25

2020-01-13 Thread Jeffrey E Altman
AuriStor is proud to once again sponsor the Linux Kernel AFS Hackathon &
BoF and the USENIX Vault '20 conference (co-located with FAST '20 and
NSDI '20).  Here are a few schedule highlights

Monday Feb 24th 9:00am to 5:00pm PST

Linux Kernel AFS Hackathon.  David Howells, the AuriStor developers,
other Linux kernel filesystem/network developers and Linux distribution
packagers participate in a hackathon to enhance/test:

 1. the functionality of the native Linux afs filesystem module (kafs)
https://www.infradead.org/~dhowells/kafs/

 2. the functionality of the rxrpc network module and AF_RXRPC
socket class

 3. the functionality of FS-Cache, the persistent caching layer
that can be used by kafs, nfs, cifs, plan9, and cephfs.

 4. the kafs-client configuration and systemd integration for
automatically mounting kafs and tools for managing authentication.

Linux Kernel AFS and AF_RXRPC are a standard part of the Fedora 31
Linux distribution.  One explicit goal of this year's hackathon is
packaging kafs-client for use with Debian and Ubuntu.

  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=933326

The hackathon is open to all.  Register at

  https://www.auristor.com/events/kafsvault20


Monday Feb 24th 6:30pm to 8:30pm PST

Birds of a Feather Meeting: The Future of AFS / AuriStorFS

Join us for pizza, status reports on the development of Linux Kernel AFS
and AuriStorFS, and open discussion of the future of the only true global
file namespace.  All existing AuriStorFS and AFS end users and
administrators are encouraged to attend.


Tuesday Feb 25th 3:00pm to 3:30pm PST
"Using kAFS on Linux for Network Home Directories"
Jonathan Billings, University of Michigan, College of Engineering, CAEN
(Vault '20 registration required)

The AFS filesystem has been widely in use at educational and research
institutions since the mid-80s, and continues to be a service that many
universities, including the University of Michigan, provides to
students, staff and faculty. The Linux kernel has recently improved
support for the AFS filesystem, and now some Linux distributions provide
support for AFS out of the box. I will discuss the history of AFS, the
in-kernel AFS client, and its performance compared to the out-of-kernel
OpenAFS client. I will demonstrate some of the benefits and limitations
when using AFS as a home directory in a modern Linux distribution such
as Fedora, including working with systemd and GNOME.


Tuesday Feb 25th 4:00pm to 4:30pm PST
"Understanding Kubernetes Storage: Getting in Deep by Writing a CSI Driver"
Gerry Seidman, AuriStor
(Vault '20 registration required)

Understanding the many Kubernetes storage ‘objects’ along with their
not-always-obvious interaction and life-cycles can be daunting (Volumes,
Persistent Volumes, Persistent Volume Claims, Volume Attachments,
Storage Classes, Volume Snapshots, CSIDriver, CSINode, oh my...)

Perhaps the best way to glean a deep understanding of these storage
objects and of how storage-related scheduling works in Kubernetes is to
write a Container Storage Interface (CSI) driver. While most of us will
never need to write a CSI driver, in this session we will make storage
with Kubernetes more accessible by exploring it from an inside-out
approach learned by writing a CSI Driver.

Gerry Seidman has a long career designing and implementing complex,
secure, high-performance, high-availability, fault-tolerant distributed
systems. He is President at AuriStor, where he is still very
hands-on including the design and implementation of the AuriStor/AFS
Kubernetes/CSI Driver.


As a sponsor, AuriStor is thrilled to offer a 20% discount on Vault '20
registration.  Provide the code "ASFSVLT20" when completing your online
registration to receive the discount.

  https://www.usenix.org/conference/241743/registration/form

We hope to see you there.









Re: [OpenAFS] iperf vs rxperf in high latency network

2019-08-08 Thread Jeffrey E Altman
Hi Simon,

response inline ...

On 8/8/2019 2:54 PM, xg...@reliancememory.com wrote:
> To make sure I captured all the explanations correctly, please allow me to 
> summarize my understandings:
> 
> Flow control over a high-latency, potentially congested link is a fundamental 
> challenge that both TCP and UDP+Rx face. Both protocol and implementation can 
> pose a problem. The reason why I did not see an improvement when enlarging 
> the window size in rxperf is that firstly I chose too few data bytes to 
> transfer and secondly that OpenAFS's Rx has some implementation limitations 
> that become a limiting factor before the window size limit kicks in. They are 
> non-trivial to fix, as demonstrated in the 1.5.x throughput "hiccup". But 
> AuriStor fixed a significant amount of it in its proprietary Rx 
> re-implementation. 
> 
> One can borrow ideas and principals from algorithm research in TCP's flow 
> control to improve Rx throughput. I am not an expert on this topic, but I 
> wonder if the principals in Google's BBR algorithm can help further improve 
> Rx throughput, and I wonder if there is anything that makes TCP fundamentally 
> superior than UDP in implementing flow control. 

There is nothing specific to TCP that makes it better than RX in
implementing flow control other than the fact that TCP has more than
thirty years of active research applied to it and RX does not.

AuriStor continues to invest in RX as we believe that RX can perform as
well as TCP while benefiting from its unique security binding
capabilities.  Reliance Memory's RRAM is targeted at IoT devices.  I
believe that RX should be the network transport of choice for IoT.

One of the requirements for implementing BBR is fine-grained, accurate
measurement of RTT, which is very hard to obtain from within a userland
implementation that relies upon an operating system's UDP sockets.
However, BBR principles can be applied to the Linux kernel's af_rxrpc
implementation and to userland implementations built to use Intel's Data
Plane Development Kit (DPDK).  I would be happy to speak with you
off-list about either.

> When it comes to deployment strategy, there may be workarounds to the 
> high-latency limitation. Each of them, of course, has limitations. I can 
> probably use the technique mentioned below to leverage the TCP throughput in 
> RO volume synchronization, 
> https://lists.openafs.org/pipermail/openafs-info/2018-August/042502.html
> and wait until DPF becomes available in vos operations:
> https://openafs-workshop.org/2019/schedule/faster-wan-volume-operations-with-dpf/

As part of AuriStor's SBIR we were funded to research RX/TCP and
implement it if appropriate.  The accepted theory was that RX/TCP would
permit RX based applications to benefit from all of the research and
implementation improvements that TCP benefited from over the decades.
However, we quickly discovered that an RX application that implemented
both RX/TCP and current day RX/UDP could not ensure fairness for the
RX/UDP connections.  The RX/TCP flows would dominate the network at the
expense of RX/UDP flows because RX/UDP could not properly adjust to
network congestion levels.

Some people argued "good riddance, let RX/UDP die" but the reality is
that RX/UDP is where the existing user base is and it was unacceptable
to me that one class of users should be penalized in favor of another.
In order to permit TCP flows to be mixed with RX/UDP flows fairly,
RX/UDP needed fixing; and once RX/UDP was fixed there was little
justification for RX/TCP.

The same fairness issues apply to Sine Nomine Associates' DPF and prior
Out-of-Band TCP proposals.

> I can also adopt a small home volume, distributed subfolder volume strategy 
> that allows home volumes to move with relocated users across WAN, but keep 
> subdirectory volumes at their respective geographic location. Users can pick 
> a subdirectory that is closest to their current location to work with. When 
> combined with a version control system that uses TCP in syncing, project data 
> synching can be alleviated. 

AuriStor has several ideas that would be beneficial to your deployment
scenarios:

 1. floating master read/write replication.

 2. split horizon volume location service

I would be happy to discuss both topics with you off-list.

> There is a commercial path that we can pursue with AuriStor or other vendors. 
> But I guess that is out of the scope of this mail list. 
> 
> Any other strategies that may help?
> 
> Thank you, Jeff!

You are welcome.

> Simon Guan

Jeffrey Altman



Re: [OpenAFS] Question regarding vos release and volume

2019-08-07 Thread Jeffrey E Altman
On 8/5/2019 4:37 PM, n...@phobos.ws wrote:
> Hello every1,
> 
> a (maybe) minor problem I'm getting with OpenAFS and I'm quite lost, what to 
> do. Given are 2 nodes running OpenAFS 1.8.2 on a Linux system.

> [...]

> Doing a "vos release" for "root.vids", I get:
> 
> --- SNIP ---
> Failed to clone the volume 536870938
> Volume not attached, does not exist, or not on line
> Error in vos release command.
> Volume not attached, does not exist, or not on line
> --- SNIP ---

The OpenAFS volserver does not convey useful error details to the vos
tool, nor does the OpenAFS vos tool indicate which server returned the
error.  You need to examine the VolserLog file on both
servers to identify the actual error.
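
For example (the server names are those used in the posting; -localauth
assumes the commands are run as root on a server machine):

  bos getlog -server server1 -file VolserLog -localauth
  bos getlog -server server2 -file VolserLog -localauth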

> While it works for all other volumes without any problem. I've already 
> removed "server2" and added it again, though without any change.

You are assuming that the problem occurred on server2 when it might have
occurred on server1.  Since there is no RO site on server1 the vos tool
will attempt to create a clone for the purpose of replicating it.

[...]

> I can successfully access the data through the client, though the release 
> process fails for this volume. 

You are most likely accessing the RW volume not the RO volume.

Examine the VolserLog files and report back what you find.  If there are
no useful details, raise the log level to 125 on both volservers and
re-attempt the release.  Then examine the VolserLog files.

Jeffrey Altman



Re: [OpenAFS] iperf vs rxperf in high latency network

2019-08-07 Thread Jeffrey E Altman
On 8/7/2019 9:35 PM, xg...@reliancememory.com wrote:
> Hello,
> 
> Can someone kindly explain again the possible reasons why Rx is so painfully
> slow for a high latency (~230ms) link? 

As Simon Wilkinson said on slide 5 of "RX Performance"

  https://indico.desy.de/indico/event/4756/session/2/contribution/22

  "There's only two things wrong with RX
* The protocol
* The implementation"

This presentation was given at DESY on 5 Oct 2011.  Although there have
been some improvements in the OpenAFS RX implementation since then, the
fundamental issues described in that presentation still remain.

To explain slides 3 and 4.  Prior to the 1.5.53 release the following
commit was merged which increased the default maximum window size from
32 packets to 64 packets.

  commit 3feee9278bc8d0a22630508f3aca10835bf52866
  Date:   Thu May 8 22:24:52 2008 +

rx-retain-windowing-per-peer-20080508

we learned about the peer in a previous connection... retain the
information and keep using it. widen the available window.
makes rx perform better over high latency wans. needs to be present
in both sides for maximal effect.

Then prior to 1.5.66 this commit raised the maximum window size to 128

  commit 310cec9933d1ff3a74bcbe716dba5ade9cc28d15
  Date:   Tue Sep 29 05:34:30 2009 -0400

rx window size increase

window size was previously pushed to 64; push to 128.

and then prior to 1.5.78 which was just before the 1.6 release:

  commit a99e616d445d8b713934194ded2e23fe20777f9a
  Date:   Thu Sep 23 17:41:47 2010 +0100

rx: Big windows make us sad

The commit which took our Window size to 128 caused rxperf to run
40 times slower than before. All of the recent rx improvements have
reduced this to being around 2x slower than before, but we're still
not ready for large window sizes.

As 1.6 is nearing release, reset back to the old, fast, window size
of 32. We can revist this as further performance improvements and
restructuring happen on master.

After 1.6 AuriStor Inc. (then Your File System Inc.) continued to work
on reducing the overhead of RX packet processing.  Some of the results
were presented in Simon Wilkinson's 16 October 2012 talk entitled "AFS
Performance" slides 25 to 30

  http://conferences.inf.ed.ac.uk/eakc2012/

The performance of OpenAFS 1.8 RX is roughly the same as the OpenAFS
master performance from slide 28.  The Experimental RX numbers were the
AuriStor RX stack at the time which was not contributed to OpenAFS.

Since 2012 AuriStor has addressed many of the issues raised in
the "RX Performance" presentation

 0. Per-packet processing expense
 1. Bogus RTT calculations
 2. Bogus RTO implementation
 3. Lack of Congestion avoidance
 4. Incorrect window estimation when retransmitting
 5. Incorrect window handling during loss recovery
 6. Lock contention

The current AuriStor RX state machine implements SACK based loss
recovery as documented in RFC6675, with elements of New Reno from
RFC5682 on top of TCP-style congestion control elements as documented in
RFC5681. The new RX also implements RFC2861 style congestion window
validation.

When sending data the RX peer implementing these changes will be more
likely to sustain the maximum available throughput while at the same
time improving fairness towards competing network data flows. The
improved estimation of available pipe capacity permits an increase in
the default maximum window size from 60 packets (84.6 KB) to 128 packets
(180.5 KB). The larger window size increases the per call theoretical
maximum throughput on a 1ms RTT link from 693 mbit/sec to 1478 mbit/sec
and on a 30ms RTT link from 23.1 mbit/sec to 49.39 mbit/sec.

AuriStor RX also includes experimental support for RX windows larger
than 255 packets (360KB). This release extends the RX flow control state
machine to support windows larger than the Selective Acknowledgment
table. The new maximum of 65535 packets (90MB) could theoretically fill
a 100 gbit/second pipe provided that the packet allocator and packet
queue management strategies could keep up.  Hint: at present, they don't.

To saturate a 60 Mbit/sec link with 230ms latency with rxmaxmtu set to
1344 requires a window size of approximately 1284 packets.
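
(Back-of-the-envelope check: 60 Mbit/sec is 7.5 MB/sec, so a 230 ms round
trip needs roughly 7.5 MB/sec * 0.23 sec, or about 1.7 MB, in flight; at
1344 bytes per packet that is approximately 1283 packets, which is where
the ~1284 figure comes from.)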

> From a user perspective, I wonder if there is any *quick Rx code hacking*
> that could help reduce the throughput gap of (iperf2 = 30Mb/s vs rxperf =
> 800Kb/s) for the following specific case. 

Probably not.  AuriStor's RX is a significant re-implementation of the
protocol with one eye focused on backward compatibility and the other on
the future.

> We are considering the possibility of including two hosts ~230ms RTT apart
> as server and client. I used iperf2 and rxperf to test throughput between
> the two. There is no other connection competing with the test. So this is
> different from a low-latency, thread or udp buffer exhaustion scenario. 
> 
> iperf2's UDP test shows a bandwidth of ~30Mb/s without packet loss, 

Re: [OpenAFS] aklog: a pioctl failed while setting tokens for cell

2019-07-25 Thread Jeffrey E Altman
On 7/25/2019 5:06 PM, Marcio Barbosa wrote:
> 
>> 10.13.6 is the first version of High Sierra to validate notarized kernel
>> extensions.
> 
> I believe the first version with this requirement is 10.14.5.

10.14.5 is the first to require notarization to run.

10.13.6 is the first High Sierra release to validate notarized kernel
extensions but notarization is not required to execute.

Jeffrey Altman



Re: [OpenAFS] aklog: a pioctl failed while setting tokens for cell

2019-07-25 Thread Jeffrey E Altman
On 7/25/2019 3:51 PM, Marcio Barbosa wrote:
> Hello,
> 
> One of my VMs is running macOS 10.13.6 (including this security update) and 
> could not reproduce this problem.
> But I am running the OpenAFS-1.8.2 client with MIT Kerberos.
> 
> Best,
> Marcio Barbosa.

10.13.6 is the first version of High Sierra to validate notarized kernel
extensions.  The AuriStorFS v0.188 kernel extension is notarized even
though notarization is not required.

Is SNA's OpenAFS 1.8.2 kernel extension notarized?

Jeffrey Altman


Re: [OpenAFS] auristor client with AFS servers, timeout at aklog

2019-06-07 Thread Jeffrey E Altman
On 6/7/2019 7:20 AM, Måns Nilsson wrote:
> Hi, 
> 
> I'm a little uncertain how to discuss this, because it is a
> cross-implementation problem, but this problem surely has hit others
> here. I hope.
> 
> I have three db servers in my OpenAFS cell. They all have -- for various
> reasons -- v4 and v6 addresses and corresponding DNS records. When I'm
> trying to use the Auristor-supplied OSX client "aklog" implementation to
> get tokens, the client tries to connect to the IPv6 addresses of the db
> servers. Most likely because it is an Auristor client and it is expecting
> Auristor db servers. Only after some 10 seconds does the client timeout
> and retry over v4, which of course immediately succeeds.
> 
> Is there a fix for this? 
> 
> Or: Am I the only one crazy enough to have AAAA records for my db servers? 

The AuriStorFS aklog relies upon DNS SRV records to find the list of
service endpoints for the protection servers.  If the DNS SRV record
refers to a name with an AAAA record, that entry will be trusted as valid.

Create separate IPv4 A records to refer to your hosts and list those in
the DNS SRV records instead of the hostname that includes both A and
 records.  Note that SRV records reference A and  records and
not CNAME records.  Many sites have

  afsdb1.cell
  afsdb2.cell
  afsdb3.cell

names in DNS.  Only create A records for those names and use them for
the DNS SRV records

  _afs3-vlserver._udp.cell
  _afs3-prserver._udp.cell
  ...
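
For example, a zone fragment along these lines (hypothetical names and
addresses; 7003 and 7002 are the standard vlserver and ptserver UDP
ports):

  afsdb1.example.com.               IN A    192.0.2.10
  _afs3-vlserver._udp.example.com.  IN SRV  0 0 7003 afsdb1.example.com.
  _afs3-prserver._udp.example.com.  IN SRV  0 0 7002 afsdb1.example.com.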

You wouldn't create SRV records indicating that your cell supports TCP
connections

  _afs3-vlserver._tcp.cell
  _afs3-prserver._tcp.cell

so do not create SRV records that indicate that the service supports
IPv6 when it doesn't.

The AuriStorFS rx stack will terminate calls within one second if an
ICMP6 port unreachable response is received.  I wonder if the 10 second
delay is due to ICMP6 packets being firewalled.

Jeffrey Altman
