[OpenAFS-devel] developer takeaways from EAKC 2014

Benjamin Kaduk Mon, 31 Mar 2014 08:30:15 -0700

Hi all,

I was attempting to take notes about things that we could do as developers to improve the experience for users and administrators, based on the feedback we were getting from the various talks at the conference. I'll post them here so they are better archived and reach a broader audience.

I think there were a couple of sites that indicated they had tuned the client cache size and chunk size away from the defaults, and got noticeably improved performance. I seem to recall that our defaults are mostly unchanged from when they were created a long time ago, for hardware several generations removed from the current state of affairs. Revisiting the defaults seems reasonable.

Arne noted that creating a user with id 32766 (ANONYMOUSID) results in great confusion. I have pushed gerrit 10950, which requires a little tweaking, and I will follow up there.

CERN has a "big loop" that will have the client retry the vldb lookup if a volume has been "moved behind its back" (due to the storage and uuid being reconnected to a different file server on a different IP). It looks like this is already in gerrit, as 10858 if I am reading things correctly.
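
As a rough illustration of the retry idea (not the CERN patch or the gerrit change itself), the loop can be sketched as follows. The names VolumeMoved, lookup, access and max_attempts are all hypothetical stand-ins for this sketch, not OpenAFS interfaces.

```python
class VolumeMoved(Exception):
    """Hypothetical error: the contacted server no longer hosts the volume."""


def access_with_relookup(volume_id, lookup, access, max_attempts=3):
    """Retry a volume access, repeating the location lookup after each miss.

    lookup(volume_id) returns a server; access(server, volume_id) performs
    the request and raises VolumeMoved when the volume is not there.
    Both callables are assumptions made for this sketch.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        # Ask the location database again on every attempt, so a stale
        # answer from the previous round is not reused.
        server = lookup(volume_id)
        try:
            return access(server, volume_id)
        except VolumeMoved as exc:
            last_error = exc
    # Every attempt found the volume missing; report the last failure.
    raise last_error
```

Bounding the number of attempts keeps a client from looping indefinitely if the location data never catches up.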

CERN also has a patch that does per-volume throttling, in my notes. I am less sure I accurately remember what this one was doing, but I think it was limiting the number of threads servicing requests for a given volume, so that other ("normal") volume accesses were not affected by a single user thrashing one volume.
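
To make the idea concrete, here is a minimal sketch of a per-volume cap on in-flight requests. It only illustrates the general technique described above and is not based on the CERN patch; the class name, method names and the limit are hypothetical.

```python
import threading


class PerVolumeThrottle:
    """Cap how many requests may be serviced for one volume at a time."""

    def __init__(self, max_per_volume):
        self._max = max_per_volume
        self._lock = threading.Lock()
        self._active = {}  # volume id -> number of requests in flight

    def try_acquire(self, volume_id):
        """Return True if a thread may start work on volume_id."""
        with self._lock:
            current = self._active.get(volume_id, 0)
            if current >= self._max:
                # This volume is already at its limit; the caller can
                # queue or reject the request and serve other volumes.
                return False
            self._active[volume_id] = current + 1
            return True

    def release(self, volume_id):
        """Record that one request for volume_id has finished."""
        with self._lock:
            current = self._active.get(volume_id, 0)
            if current <= 1:
                # Drop idle volumes so the table does not grow forever.
                self._active.pop(volume_id, None)
            else:
                self._active[volume_id] = current - 1
```

Because the count is kept per volume, a busy volume can hit its limit while requests for other volumes are still admitted.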

Simon's talk mentioned that rx hot threads are actively harmful, and Mike pushed gerrit 10957 to disable them in the fileserver during the meeting, which got merged to master (yay!).

Simon also talked about how there were three "classes" of VL_ queries: 'O', 'N', and 'U'; the vos(1) client has not been updated to use the 'U' family. This is probably not terribly urgent, if I understand correctly, but could perhaps go on a "simple jobs" wiki page.

Speaking of that "simple jobs" wiki page (http://wiki.openafs.org/OpenAFSSimpleJobs/), it sounded like most (all?) of the items there have been completed, so we should update it with more things that could help get new contributors familiar with things.

A few of us started looking at some Linux panics due to stack overflow (e.g., RT 131831) during one of the sessions. We have a number of routines that use more stack than they ought to. Chas posted a script to look at a build tree and pull out stack usage for the various functions, to gerrit 10881. I came away with the impression that Mike was going to do some more work and submit patches to remove some more stack usage; Mike, can you confirm this?

During the gatekeepers open session, I pondered whether we could do more to make the process of setting up a new development environment more streamlined for new contributors (I gave an example of having the gerrit change-ID script be in the repo instead of something scp'd from gerrit), but a few people in the room who were in that position said that they didn't mind the process as-is.

Simon's performance talk gave some examples of things that can be done to improve performance in (e.g.) the rx stack. We've known that there are issues here for a long time, but haven't really gotten to do much in the area.

We also heard some ideas about what steps to take towards (at least partial) IPv6 support.

At various points (at least some of which were "hallway conversations") we talked about the testing framework, and how it would be nice to get more things covered. I pondered whether it would be worth having a script that could start up a test cell (servers), if run as root (and bail out early if unprivileged).
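
The "bail out early if unprivileged" part could look something like the sketch below. It is only an outline under assumptions: the function names are hypothetical, the step that would actually start the servers is left as a placeholder, and exit status 77 is borrowed from the automake convention for a skipped test rather than from anything in the OpenAFS test suite.

```python
import os
import sys

SKIP_STATUS = 77  # assumption: automake-style "test skipped" exit status


def is_root(euid=None):
    """Return True when the effective uid is 0; euid can be injected for tests."""
    if euid is None:
        euid = os.geteuid()
    return euid == 0


def main():
    if not is_root():
        # Unprivileged: report and stop before touching anything.
        print("test cell setup needs root; skipping", file=sys.stderr)
        return SKIP_STATUS
    # Placeholder: starting the test cell's servers would go here.
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Returning a distinct "skipped" status, rather than a failure, would let a test harness tell "could not run" apart from "ran and failed".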

There are probably more things that didn't make it into my notes. If you remember any, please chime in.


Thanks,

Ben
_______________________________________________
OpenAFS-devel mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-devel
