Hi Jeffrey,

(directory, my error)

Ah, that makes more sense... if I understand correctly, each slot is 32 
characters long; the first 16 characters of the first slot are occupied by 
other file metadata, so only 16 are available for file name data, but in each 
subsequent slot all 32 characters are available for file name data. Allowing 
one character for the terminating NUL, 1 - 15 character names take one slot, 
16 - 47 character names take 2, 48 - 79 character names take 3 slots, 80 - 111 
character names take 4 slots and so on. I presume that if we were working in a 
locale where double-byte characters were in file names the packing wouldn't be 
as good.

Incidentally, isn't the function equivalent to:

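int
afs_dir_NameBlobs(char *name)
{
    int i;
    /* strlen() excludes the NUL, so +16 folds in the +1 and the +15 */
    i = strlen(name);
    return 1 + ((i + 16) >> 5);
}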

Implementing the algorithm below against the real data indicates that we are 
using 63,959 slots, which is even more painfully close to 64k and explains why 
the mirror update sometimes fails with error messages relating to running out 
of slots.

Thank you, that resolves the anomaly... and as a team we'd already discovered 
there are things about OpenAFS that limit what it can be applied to... there 
have been some restrictions and tweaks needed in the mirroring application. 
I'll point my manager at Auristor for the next tech refresh.

Matthew
________________________________________
From: Jeffrey Altman [[email protected]] 
Sent: 04 July 2016 15:27
To: Matthew Lowy; [email protected]
Subject: Re: [OpenAFS] Number of files in an OpenAFS volume...

On 7/4/2016 6:55 AM, Matthew Lowy wrote:
> Hello,
>
> We have a number of OpenAFS volumes that serve as storage for (public)
> mirrors and one of them is misbehaving when updated from upstream - the
> error indicates we've reached the limit of file names allowed in a volume.

In a volume or a directory?

The theoretical limit of directories in a volume is 2^30 and
non-directories in a volume is 2^30.  There have been incomplete efforts
to raise those limits by treating signed values as unsigned values but I
wouldn't count on them.

> The limit I am seeing is not compatible with my understanding of how
> OpenAFS handles file names in a directory. I've seen in the mail list
> archives the statements about how many file names can fit, that there
> are 64k slots and a file name < 16 in length occupies one slot, a file
> name from 16 to 32 characters long occupies two slots and so on. The
> earliest reference I've found is at
> http://lists.openafs.org/pipermail/openafs-info/2002-September/005812.html

Your understanding is roughly correct except that the numbers Todd
specified are wrong.  The actual numbers are determined by this function:

/* Find out how many entries are required to store a name. */
int
afs_dir_NameBlobs(char *name)
{
    int i;
    i = strlen(name) + 1;
    return 1 + ((i + 15) >> 5);
}

A slot contains per-file metadata followed by name data.  When a name
spans multiple slots, all of the space in the 2nd and subsequent slots is
used for file name data.

> However...
>
> The directory concerned has more than 21,000 files in it, almost all of
> them have names exceeding 52 characters... as at today there are
> 1,220,000 characters in filenames in that directory. Even assuming they
> pack down perfectly into directory name slots that's over 76,000
> slots... and working them out using the rule above indicates that the
> directory is using over 87,000 slots. These are both significantly above
> 64k.
>
> I don't know if I'm misinterpreting the information in the OpenAFS
> archive or if the information is out of date - but I've not found
> anything that fundamentally is different from the information in the
> archive and I'm looking at a volume that seems to break the limits.

The AFS3 directory format is part of the wire protocol as it is shared
by both the file server and the clients.

> I'd really benefit from understanding what's going on ... how we appear to
> be getting more file name information into a directory than should be
> possible.
>
> /mirror.ox.ac.uk/sites/archive.ubuntu.com/ubuntu/pool/main/l/linux$ ls
> |wc -l
> 21731

This number is within the existing limits.

> /mirror.ox.ac.uk/sites/archive.ubuntu.com/ubuntu/pool/main/l/linux$ ls
> |wc -c
> 1250894

File names of length 52 through 70 characters require three slots.  If
all of the file names are of length 60 and are perfectly packed they
would require 62545 slots which is very close to the limit.

> This is one directory in a mirror of archive.ubuntu.com so you can see
> the contents from (e.g)
> https://launchpad.net/ubuntu/+mirror/mirror.ox.ac.uk-archive which
> points to the presentation of our mirror. The number of files has
> recently gone up because of upstream changes.

The directory size restrictions are one of the reasons that /afs cannot
be used for a large number of applications.  The AuriStor File System
implements a new directory format which is understood only by AuriStor
clients.  This format permits directories to grow to store an unlimited
number of entries.  However, the AuriStor file servers currently apply
an artificial limit of approximately 20 million entries.

More details on the AuriStor File System can be obtained at

  https://www.auristor.com/openafs/migrate-to-auristor/

Jeffrey Altman


_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info
