Hi all,

I was attempting to take notes about things that we could do as developers to improve the experience for users and administrators, based on the feedback we were getting from the various talks at the conference. I'll post them here so they are better archived and reach a broader audience.

I think a couple of sites indicated that they had tuned the client cache size and chunk size away from the defaults and got noticeably improved performance. I seem to recall that our defaults are mostly unchanged from when they were chosen a long time ago, for hardware several generations older than what is in use today. Revisiting the defaults seems reasonable.

Arne noted that creating a user with id 32766 (ANONYMOUSID) results in great confusion. I have pushed gerrit 10950, which requires a little tweaking, and I will follow up there.

CERN has a "big loop" that will have the client retry the vldb lookup if a volume has been "moved behind its back" (due to the storage and uuid being reconnected to a different file server on a different IP). It looks like this is already in gerrit, as 10858 if I am reading things correctly.
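To make the retry idea concrete, here is a minimal Python sketch. It is not CERN's patch or the code in gerrit 10858; every name in it (VolumeMoved, access_volume, lookup_location, fetch_from, MAX_RETRIES) is hypothetical, and the bound on retries is my own assumption.

```python
class VolumeMoved(Exception):
    """Hypothetical error: the contacted server no longer hosts the volume."""


MAX_RETRIES = 3  # assumed bound so a confused cell cannot loop forever


def access_volume(volume, cache, lookup_location, fetch_from):
    """Try the cached location first; on VolumeMoved, redo the lookup and retry."""
    for _ in range(MAX_RETRIES):
        server = cache.get(volume)
        if server is None:
            server = lookup_location(volume)  # stands in for the vldb query
            cache[volume] = server
        try:
            return fetch_from(server, volume)
        except VolumeMoved:
            cache.pop(volume, None)  # stale entry: force a fresh lookup
    raise VolumeMoved(f"{volume}: still not found after {MAX_RETRIES} attempts")
```

The point is only that a "moved" error invalidates the cached location and triggers a fresh lookup, rather than being returned to the caller right away.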

My notes also say that CERN has a patch that does per-volume throttling. I am less sure I accurately remember what this one was doing, but I think it was limiting the number of threads servicing requests for a given volume, so that other ("normal") volume accesses were not affected by a single user thrashing one volume.
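If I have the idea right, the shape of it is roughly the following Python sketch. This is a guess at the concept, not CERN's code: VolumeThrottle, try_enter, leave, and PER_VOLUME_LIMIT are all made-up names, and what the real patch does when a volume hits its cap (queue, return busy, something else) is not in my notes.

```python
import threading
from collections import defaultdict

PER_VOLUME_LIMIT = 2  # assumed cap; a real value would need tuning


class VolumeThrottle:
    """Cap how many worker threads may service one volume at a time."""

    def __init__(self, limit=PER_VOLUME_LIMIT):
        self._lock = threading.Lock()
        self._active = defaultdict(int)  # volume id -> requests in progress
        self._limit = limit

    def try_enter(self, volume):
        """Return True if the request may proceed, False if the volume is at its cap."""
        with self._lock:
            if self._active[volume] >= self._limit:
                return False
            self._active[volume] += 1
            return True

    def leave(self, volume):
        """Release the slot taken by a successful try_enter."""
        with self._lock:
            self._active[volume] -= 1
            if self._active[volume] == 0:
                del self._active[volume]
```

A worker would call try_enter() before servicing a request and leave() afterwards, so a volume at its cap cannot take over the whole thread pool.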

Simon's talk mentioned that rx hot threads are actively harmful, and Mike pushed gerrit 10957 to disable them in the fileserver during the meeting, which got merged to master (yay!).

Simon also talked about how there were three "classes" of VL_ queries: 'O', 'N', and 'U'; the vos(1) client has not been updated to use the 'U' family. This is probably not terribly urgent, if I understand correctly, but could perhaps go on a "simple jobs" wiki page.

Speaking of that "simple jobs" wiki page (http://wiki.openafs.org/OpenAFSSimpleJobs/), it sounded like most (all?) of the items there have been completed, so we should update it with more things that could help get new contributors familiar with things.

A few of us started looking at some Linux panics due to stack overflow (e.g., RT 131831) during one of the sessions. We have a number of routines that use more stack than they ought to. Chas posted a script, as gerrit 10881, that looks at a build tree and pulls out stack usage for the various functions. I came away with the impression that Mike was going to do some more work and submit patches to remove some more stack usage; Mike, can you confirm this?

During the gatekeepers open session, I pondered whether we could do more to make the process of setting up a new development environment more streamlined for new contributors (I gave an example of having the gerrit change-ID script be in the repo instead of something scp'd from gerrit), but a few people in the room who were in that position said that they didn't mind the process as-is.

Simon's performance talk gave some examples of things that can be done to improve performance in (e.g.) the rx stack. We've known that there are issues here for a long time, but haven't really gotten to do much in the area.

We also heard some ideas about what steps to take towards (at least partial) IPv6 support.

At various points (at least some of which were "hallway conversations") we talked about the testing framework, and how it would be nice to get more things covered. I pondered whether it would be worth having a script that could start up a test cell (servers), if run as root (and bail out early if unprivileged).
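The "bail out early if unprivileged" part could look something like the Python sketch below. The function names and the exit code 77 (which Automake-style harnesses treat as "skipped") are my assumptions; our test harness may want a different way of reporting a skip, and actually starting the servers is not sketched here.

```python
import os
import sys

SKIP_EXIT_CODE = 77  # assumed convention for "test skipped"


def is_privileged(euid=None):
    """Return True when the effective user id is root (0)."""
    if euid is None:
        euid = os.geteuid()  # POSIX only
    return euid == 0


def main(euid=None):
    """Bail out early when unprivileged; otherwise go on to set up the cell."""
    if not is_privileged(euid):
        print("test cell setup needs root; skipping", file=sys.stderr)
        return SKIP_EXIT_CODE
    # Starting the servers for the test cell would go here (not sketched).
    return 0
```

Reporting "skipped" rather than "failed" would let unprivileged developers keep running the rest of the test suite without noise.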

There are probably more things that didn't make it into my notes. If you remember any, please chime in.

Thanks,

Ben
_______________________________________________
OpenAFS-devel mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-devel