[OpenAFS] Re: MacOS AppleDouble excretions
Derrick Brashear sha...@gmail.com writes:
> On Wed, Oct 13, 2010 at 12:18 AM, Adam Megacz a...@megacz.com wrote:
>> Brandon S Allbery KF8NH allb...@ece.cmu.edu writes:
>>> No, he wants AFS to simply refuse to create resource forks,
>> Almost. I would like the AFS CLIENT to refuse, if the user has explicitly requested this behavior.
> The AFS client does what the Darwin kernel tells it to. If the kernel honors whatever setting, we'd never see the request.

Yes, I agree that it would be preferable if Apple added a setting to do this. But I don't think that's going to happen.

- a
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: MacOS AppleDouble excretions
Derrick Brashear sha...@gmail.com writes:
> Since you suggest your first comments are what we misinterpret:

I do not suggest that.

- a
[OpenAFS] Re: MacOS AppleDouble excretions
omall...@msu.edu writes:
> I can understand where large sites don't want to go this route globally since it could break something.
> ...
> I can understand where AFS Team doesn't want to make it a global default option.
> I can understand where a user would want the ability to just not create the files

I am in complete agreement with these three statements -- and was before I started this thread (assuming an error code is returned to the userspace application when the files are not created).

Booker Bense bbe...@slac.stanford.edu writes:
> I see this as a complete waste of time.

Actually I was going to volunteer to write the patch for the Mac client. It's not a waste of _my_ time if it stands a reasonable chance of being included. But, based on this thread, that appears not to be the case.

Derrick Brashear sha...@gmail.com writes:
> POSIX extended attributes are stored in the files. Until we deal with them natively (which requires new RPCs) deleting them actively loses data.

Look, this fuss about losing data is a real distraction; can we handle it and stick to the important issues?

An error code should ALWAYS be returned to the userspace application by a write operation if the AFS client has declined to perform that action due to end-user configuration choices. I don't think anybody in their right mind (or on this list) is proposing that the resource forks be silently discarded. I think I was pretty clear about this in my original post. Construing my proposal as discarding the forks is not helpful at all, and muddies the issue a lot. If the filesystem reports an error, it has not taken responsibility for the data, so it does not "actively lose[s] data".

Jeffrey Altman jalt...@secure-endpoints.com writes:
> The fix for the "I don't like DoubleFiles" issue is to find the financial or development resources necessary to implement support for EAs

I agree, although that is a fix, not the fix.

I don't have access to those kinds of resources, but I do have sufficient resources to add the client-side option.

- a
[OpenAFS] Re: MacOS AppleDouble excretions
Steve Simmons s...@umich.edu writes:
Is there any chance of a setting being included in the MacOS client that
^^
Doing this at our site would result in a firestorm of complaints from

You and I seem to be talking about different things.

It sounds like you're suggesting we modify afs so it understands resource forks properly and generate an error message if someone attempts to create a file whose name might be mistaken for a resource fork.

No.

- a
[OpenAFS] Re: MacOS AppleDouble excretions
Brandon S Allbery KF8NH allb...@ece.cmu.edu writes:
> No, he wants AFS to simply refuse to create resource forks,

Almost. I would like the AFS CLIENT to refuse, if the user has explicitly requested this behavior.

> because in his world they never have any use whatsoever.

In my world they have no use whatsoever for the kinds of files I personally happen to store in /afs/. So I would like to be able to instruct the AFS client on my personal laptop not to store those files.

> (And apparently his use case is to be considered the common one.)

I'm having trouble parsing this.

- a
[OpenAFS] Re: MacOS AppleDouble excretions
Adam Megacz a...@megacz.com writes:
> There's a MacOS setting to disable the first kind of litter.
^^^
> Is there any chance of a setting being included in the MacOS client that
^^^ ^^

It appears that everybody who replied to this thread somehow got the impression that I was asking for a change in the *default* behavior of the client, or a change in the behavior of the server.

But I guess jumping up and down hollering with indignation isn't quite as much fun unless you first misinterpret the proposal! ;)

- a
[OpenAFS] Re: bos killed fileserver before it was shut down cleanly.
Russ Allbery r...@stanford.edu writes:
> The problem is that it's also not uncommon for the fileserver to completely or nearly completely stall when shutting down,

Just curious, is this stall a bug in the fileserver, or something which happens for a good reason? If so, what is the reason?

In general, I find that these sorts of unexplained stalls (both in the client and on the server components) are the sorts of problems I have the most trouble understanding. It's probably asking too much to hope for a simple FAQ answer to "when AFS goes out to lunch, what is it eating?".

- a
[OpenAFS] MacOS AppleDouble excretions
MacOS seems to litter network shares with two kinds of files:

  .DS_Store   (Finder data)
  ._filename  (AppleDouble resource fork)

There's a MacOS setting to disable the first kind of litter. Unfortunately it seems like there is no way to get MacOS to refrain from writing the second kind of file, and it seems like Apple deliberately doesn't want there to be one.

Is there any chance of a setting being included in the MacOS client that stops this from happening?

The crude way would be to simply refuse to create files whose names start with the prefix "._", reporting permission-denied or something like that.

The more sophisticated approach would probably be to claim to MacOS that /afs/ supports resource forks, and report permission-denied when an attempt to write a resource fork is made. This has the advantage of not being filename-based and not breaking programs which access the filesystem through the POSIX APIs.

- a
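The crude, filename-based variant described above comes down to one test on the last path component. The following is only a user-space sketch of that test, not OpenAFS client code: the function name is_appledouble_name is hypothetical, and a real implementation would sit in the client's create path and hand an error such as permission-denied back to the caller.

```sh
# Hypothetical helper (not part of OpenAFS): succeed (exit 0) when the
# final path component starts with "._", i.e. looks like an AppleDouble
# sidecar file that the proposed setting would refuse to create.
is_appledouble_name() {
    case "${1##*/}" in      # strip everything up to the last "/"
        ._*) return 0 ;;    # "._something": would be refused
        *)   return 1 ;;    # anything else, including .DS_Store
    esac
}

# Example use (the path is a placeholder):
#   is_appledouble_name /afs/example.org/doc/._notes.txt && echo "would refuse"
```

Only the last component is examined, so a regular file inside a directory that happens to be named "._something" is not affected. As the message notes, any name-based test like this also catches ordinary files whose names begin with "._", which is the drawback of the crude approach.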
[OpenAFS] Re: SYNC_connect: temporary failure on circuit 'FSSYNC' (will retry)
Andrew Deason adea...@sinenomine.net writes:
> Refusing to start means... they start and do nothing?

Precisely. Unfortunately I couldn't wait any longer and had to downgrade to 1.4.11. =(

- a
[OpenAFS] Re: OpenAFS on ext4?
Jaap Winius jwin...@umrk.nl writes:
> Yeah, I've been running OpenAFS (v1.4.12) with ext4 on my private server (Debian squeeze) since June. Its workload and specs are nothing compared to the systems that others have described here, but it's seen almost constant use and has so far not given me any problems.

Same here, fairly small/simple installation has been running 1.4.11 server with /vicepX on ext4 for at least six months now. No problems, except this one, which caused total and catastrophic data loss (which was not even remotely close to being OpenAFS's fault):

  http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2ec0ae3a

So glad I run nightly backups.

- a
[OpenAFS] Re: SYNC_connect: temporary failure on circuit 'FSSYNC' (will retry)
Andrew Deason adea...@sinenomine.net writes:
> If you see it a lot, I'd like to know more about what's going on with your server, but otherwise it's not anything to worry about.

Well, the FileServer and VolServer are both refusing to start, and this message is the only thing in their logs that seems to explain the situation...

>> What's a circuit?
> Just a name someone gave for the communication mechanism being used, I guess. I don't associate much meaning with the term.

I humbly suggest that using standard unix terminology like "named pipe" might be a good idea.

- a
[OpenAFS] Re: is this what windows folks call integrated login? (but with local hashed password)
Derrick Brashear sha...@gmail.com writes:
>> My laptop has a local copy of my password in hashed form, so it can let
> Oh. You're not typing a password, so this won't help you.

Er, sorry, I should have been more clear about that. I am typing in my password physically at the keyboard. My laptop has a copy of that password on the disk in hashed format so that it can verify that I typed in the correct password, but if somebody steals my laptop they can't simply read my password off the disk (at least I assume MacOS does this like all good unices do -- it would be a shame if it didn't; this is the only reason I consider it safe to use the same password for both my laptop's local login and my Kerberos principal).

So, anyways, lack of network access will not delay the local operating system's decision about whether or not to let me proceed with my login. But it may delay the acquisition of tickets. But if I'm not on the network, then ending up logged in locally without tickets is no big deal -- especially if there's a daemon sitting around waiting for the network to come back. I guess it would need to be holding my unhashed password in memory, but with encrypted swap and a screensaver password that's still not a huge concern.

- a
[OpenAFS] Re: is this what windows folks call integrated login? (but with local hashed password)
Dale Pontius pont...@btv.ibm.com writes:
> Be very careful of an integrated login on a laptop.

I think this might be where Windows integrated login and what I'm looking for are different. My laptop has a local copy of my password in hashed form, so it can let me log in even when there's no network access. I just want it to spawn a separate background thread that tries to get tickets for me. If I'm not on the network it's no big deal; I won't care that I have no tickets because I won't be able to do anything with them.

- a
[OpenAFS] Re: is this what windows folks call integrated login?
Derrick Brashear sha...@dementia.org writes:
> Did you try the feature that 'obtains tickets at login' in the prefs pane?

Hey, that's pretty nifty; when was it added? Is there documentation anywhere?

It's not quite what I wanted, though... it makes me type in my username (er, principal) and password a second time. I was hoping I could just tell it "hey, assume that my kerberos principal is the same as ThisCell (but upcased), and use the username and password I typed into the MacOSX Login Dialog instead of asking me again".

Anything out there that can do this?

- a
[OpenAFS] is this what windows folks call integrated login?
I have a MacOS laptop. My username and local password on the laptop happen to match my kerberos username and password. My kerberos tickets expire after 10 hours, but are renewable for 10 *days*.

It occurred to me that it would be nifty if my laptop acquired kerberos tickets for me when I logged in (during the brief window when my un-hashed password is present in laptop RAM), and made an attempt to renew them once an hour (if connected to the network). This would save me having to do a separate kinit after logging in, and having to re-kinit every 10 hours. I've got a screensaver lock and encrypt my swapfile, so I'm not too worried about physical theft issues resulting in ticket theft.

Is there a piece of software that does this? It's been a long, long time since I used Windows, but it sounds like this feature is what the Windows client calls integrated login. Or maybe not. Either way, is there a way to get MacOS to do this?

Thanks,

- a
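The hourly renewal half of this idea can be approximated with a small background loop. This is a rough sketch, not an existing tool: it assumes a kinit that accepts -R to renew an existing renewable ticket, the names RENEW_CMD, renew_once and renew_loop are made up for illustration, and it does nothing about acquiring the initial tickets at login.

```sh
# Command used to renew the ticket. "kinit -R" asks the KDC to renew an
# existing renewable ticket; override RENEW_CMD if your setup differs.
RENEW_CMD=${RENEW_CMD:-"kinit -R"}

# Try one renewal. A failure is reported but is not fatal, since being
# off the network is an expected condition on a laptop.
renew_once() {
    if $RENEW_CMD >/dev/null 2>&1; then
        return 0
    fi
    echo "ticket renewal failed (offline, or ticket not renewable?)" >&2
    return 1
}

# Renew forever, sleeping the given number of seconds between attempts
# (default: one hour).
renew_loop() {
    interval=${1:-3600}
    while :; do
        renew_once || true
        sleep "$interval"
    done
}

# Example use, started once after login:
#   renew_loop 3600 &
```

Renewal only works while the ticket is still within its renewable lifetime and has not already expired, so a laptop that sleeps for longer than the ticket lifetime would still need a fresh kinit. AFS tokens are separate from the Kerberos ticket and would have to be refreshed as well (for example with aklog) after each renewal.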
[OpenAFS] ERROR: Cache dir check failed (cannot use tmpfs as cache partition)
Is there a reason why tmpfs isn't supported (1.5.74)? I was really hoping to start using it for my cache partitions with 1.5.x and 1.6. It has all the advantages of memcache plus it can be pushed out to swap when there's memory pressure (which IIRC memcache cannot).

Currently I use dd to make a huge image in a tmpfs partition and then loopback-mount that, but aside from all the overhead of passing through the VFS layer twice this also means that each page is in memory twice (once in tmpfs's memory and once in the buffercache). Indeed, eliminating this double buffering is one of the big selling points of tmpfs.

Thanks,

- a
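For readers unfamiliar with the workaround described above, it looks roughly like the following sketch. The message does not say which filesystem or paths were used, so ext2, the image location, the size and the cache directory are all assumptions, as are the helper names run and setup_loopback_cache. The real commands need root, so the sketch includes a dry-run mode that only prints them.

```sh
# Print commands instead of running them when DRY_RUN is non-empty.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build an image of SIZE_MB megabytes inside an existing tmpfs mount,
# put a filesystem on it (ext2 is an assumption), and loopback-mount it
# where the cache manager expects its cache directory.
setup_loopback_cache() {
    image=$1 size_mb=$2 cache_dir=$3
    run dd if=/dev/zero of="$image" bs=1M count="$size_mb" &&
    run mkfs.ext2 -q -F "$image" &&
    run mount -o loop "$image" "$cache_dir"
}

# Example use (placeholder paths; run as root before afsd starts):
#   setup_loopback_cache /dev/shm/afscache.img 512 /var/cache/openafs
```

This is the arrangement that causes the double buffering complained about above: the image's pages live in tmpfs, and the loop-mounted filesystem caches the same data again.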
[OpenAFS] Re: Starting AFS cache scan...
Andrew Deason adea...@sinenomine.net writes:
> If afsd is being backgrounded, it is not by us. afsd only exits after it tries to mount /afs. From what Russ says and from what I see in the Debian init scripts, the Debian init scripts do not background it either.

Yikes, I tried putting "ls /afs/megacz.com" in the init script right after the line that launches afsd, and I got this:

r...@mute:~#/etc/init.d/openafs-client- start
Starting AFS services: openafs afsd.
afsd: All AFS daemons started.
afs started; ls /afs/megacz.com:
/etc/init.d/openafs-client-: line 156: 2152 Segmentation fault ls /afs/megacz.com/

Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.595464] Oops: [#1] SMP
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] Stack:
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] 0001 c62b19e0 c62b19fc d1ce4007 581b0001
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] 0001 c62b19c0 0001 01cd c716fe64 c62b19c0
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] Call Trace:
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [d1ce4007] afs_GetServer+0x4f4/0x52d [openafs]
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [d1cfafb6] InstallUVolumeEntry+0x334/0x38f [openafs]
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [d1cfb5f0] afs_SetupVolume+0x2cf/0x3a7 [openafs]
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [d1cfbb3c] afs_NewVolumeByName+0x474/0x515 [openafs]
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [d1cf061a] EvalMountData+0x298/0x43d [openafs]
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [d1cf081f] EvalMountPoint+0x60/0x129 [openafs]
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [c01b92a8] request_key+0x28/0x50
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [d1cf09cc] afs_EvalFakeStat_int+0xe4/0x34b [openafs]
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [d1cf0c43] afs_EvalFakeStat+0x7/0x9 [openafs]
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [d1cf3774] afs_open+0x98/0x50c [openafs]
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [d1cf0c3a] afs_TryEvalFakeStat+0x7/0x9 [openafs]
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [d1d097ec] afs_linux_open+0x0/0xd6 [openafs]
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [c01731a3] __dentry_open+0x10d/0x1fc
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [c01732ae] nameidata_to_filp+0x1c/0x2c
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [d1d0984f] afs_linux_open+0x63/0xd6 [openafs]
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] Process ls (pid: 2152, ti=c716e000 task=cc71c900 task.ti=c716e000)
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [c017d9fa] do_filp_open+0x34f/0x684
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [c0172fc0] do_sys_open+0x40/0xb0
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [c0103857] sysenter_past_esp+0x78/0xb1
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] ===
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] Code: 19 01 00 00 89 e0 c7 44 24 24 00 00 00 00 c7 44 24 20 00 00 00 00 e8 5a 70 fd ff 85 c0 0f 85 fa 00 00 00 8b 46 20 b9 04 00 00 00 8b 50 10 8b 04 24 e8 fb 1e 00 00 85 c0 89 c3 0f 84 dd 00 00 00
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [c0173074] sys_open+0x1e/0x23
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] [c02b] quirk_vt8235_acpi+0x10/0x7a
Message from sysl...@mute at Jun 20 21:40:11 ... kernel:[ 157.597013] EIP: [d1ce317f] afs_GetCapabilities+0x45/0x13e [openafs] SS:ESP 0068:c716fc28
[OpenAFS] gerrit now has bugzilla integration
FYI http://code.google.com/p/gerrit/issues/detail?id=124
[OpenAFS] Re: Starting AFS cache scan...
Derrick Brashear sha...@dementia.org writes:
> AFS can't mount any faster than it does with dynroot.

That's okay! I don't need it to mount any faster; I just need it to not background itself until it is done mounting.

> is afsd being backgrounded or is the issue that /afs/(something) isn't up yet?

Er, they are the same issue. Obviously afsd needs to get backgrounded eventually (otherwise the boot process would not continue). The problem is that afsd is being backgrounded too early.

- a
[OpenAFS] Re: Starting AFS cache scan...
Derrick Brashear sha...@dementia.org writes:
> Is this dynroot or not?

It is dynroot.

- a
[OpenAFS] compile error with 1.5.74 on kernel 2.6.28.10
This is using openafs_1.5.74.1-1.dsc (Debian). I can probably kludge my way around it, but I figured in the run-up to 1.6 y'all would like to know about it.

- a

CC [M] /usr/src/modules/openafs/src/libafs/MODLOAD-2.6.28.10gentzen-SP/osi_gcpags.o
/usr/src/modules/openafs/src/libafs/MODLOAD-2.6.28.10gentzen-SP/osi_gcpags.c: In function 'afs_osi_proc2cred':
/usr/src/modules/openafs/src/libafs/MODLOAD-2.6.28.10gentzen-SP/osi_gcpags.c:111: error: implicit declaration of function 'set_cr_group_info'
make[5]: *** [/usr/src/modules/openafs/src/libafs/MODLOAD-2.6.28.10gentzen-SP/osi_gcpags.o] Error 1
make[4]: *** [_module_/usr/src/modules/openafs/src/libafs/MODLOAD-2.6.28.10gentzen-SP] Error 2
make[4]: Leaving directory `/usr/src/linux-2.6.28.10'
make[3]: *** [openafs.ko] Error 2
make[3]: Leaving directory `/usr/src/modules/openafs/src/libafs/MODLOAD-2.6.28.10gentzen-SP'
make[2]: *** [linux_compdirs] Error 2
make[2]: Leaving directory `/usr/src/modules/openafs/src/libafs'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/usr/src/modules/openafs'
make: *** [build-stamp] Error 2
[OpenAFS] Re: Starting AFS cache scan...
Derrick Brashear sha...@dementia.org writes:
> does your rc file include an afs post inst hook?

Yep

> have it run rxdebug localhost 7001 in a while loop until it succeeds

Hrm, okay, I guess that will work.

- a
[OpenAFS] Re: Starting AFS cache scan...
Derrick Brashear sha...@gmail.com writes:
> so set it to a script, and provide a script which just runs rxdebug localhost 7001 in a loop until it succeeds, then exits.

Hrm, actually, this doesn't seem to be working. Perhaps I'm doing it wrong? Apparently the RX server will respond to debug requests before /afs is mounted.

- a

r...@mute:~#rxdebug localhost 7001 && echo x
Trying 127.0.0.1 (port 7001):
Free packets: 215/97, packet reclaims: 0, calls: 0, used FDs: 64
not waiting for packets.
0 calls waiting for a thread
1 threads are idle
0 calls have waited for a thread
Done.
x
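Since rxdebug answers before /afs is mounted, a hook script could poll the mount table instead. The following is a minimal sketch for Linux, assuming that the AFS mount shows up in /proc/mounts with /afs as its mount point; the names MOUNTS_FILE, is_mounted and wait_for_mount are hypothetical and not part of OpenAFS or the Debian package.

```sh
# File listing current mounts; /proc/mounts on Linux. Overridable so the
# functions can be exercised against a sample file.
MOUNTS_FILE=${MOUNTS_FILE:-/proc/mounts}

# Succeed if something is mounted at exactly the given mount point
# (second field of each line in the mounts file).
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "$MOUNTS_FILE"
}

# Poll once per second until the mount point appears, giving up after
# the given number of seconds (default: 60).
wait_for_mount() {
    mp=$1
    remaining=${2:-60}
    until is_mounted "$mp"; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
}

# Example use in a post-init hook:
#   wait_for_mount /afs 120 || echo "/afs did not appear in time" >&2
```

The timeout keeps the boot from hanging forever if the mount never appears. This only establishes that the mount exists; it says nothing about whether a particular cell below /afs is reachable yet.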
[OpenAFS] Re: Starting AFS cache scan...
Russ Allbery r...@stanford.edu writes:
> Is there any reason not to do this? If not, I can just make this change in the Debian package. I don't recall why start-stop-daemon was used there in the first place.

+1 (but then again I'm a runit guy)

- a
[OpenAFS] Starting AFS cache scan...
I've got my openafs cache in tmpfs (via a loopback-mount), so every time the machine comes up it sees an empty cache directory and spends some time "Starting AFS cache scan..." -- creating the directory structure.

Unfortunately the AFS startup script returns before this process finishes. This means that other startup scripts -- which depend on /afs being mounted -- will end up running before /afs has been mounted.

Is there any way to change this behavior so that /etc/init.d/openafs-client doesn't yield control until it has at least attempted to mount /afs?

- a
[OpenAFS] Re: group prefix doesn't match owner
Derrick Brashear sha...@gmail.com writes:
>>> When creating a group foo:bar as admin, I often find that I have to use the -owner parameter to set the owner to foo(something).
>> I see. Is it official AFS policy that this usage is supported?
> Which usage? I'm not sure what you're asking.

Sorry, let me rephrase. The following sequence of commands generates an error, but appears to work -- by which I mean that it leaves me in a state where there is a group named blah:booh but no user named blah.

  $ pts cu blah
  $ pts creategroup blah:booh -owner blah
  $ pts delete blah
  $ pts ex blah:booh

Is it official AFS policy that this is supposed to work this way, and will continue to work this way in the future?

If so, perhaps we should consider changing the error "Badly formed name (group prefix doesn't match owner)" into a warning if it's being invoked by system:administrators (who could just use the sequence of commands above instead). Or maybe let -force override the error.

- a
[OpenAFS] Re: experience of SQLite on AFS
Simon Wilkinson s...@inf.ed.ac.uk writes:
>> Please notify me once this is fixed in the Linux CM and I will test it for you.
> Derrick has pushed changes

Still working on this. It's been a long time since I had this setup running, so I have to reproduce the corruption first before I can see if it has been fixed.

- a
[OpenAFS] group prefix doesn't match owner
Is there any reason why pts won't let system:administrators create groups whose prefix does not match any user?

  $ pts ex blah
  pts: User or group doesn't exist so couldn't look up id for blah
  $ pts creategroup blah:booh
  pts: Badly formed name (group prefix doesn't match owner?) ; unable to create group blah:booh

Clearly this can be circumvented by system:administrators:

  $ pts cu blah
  User blah has id 100015
  $ pts creategroup blah:booh -owner blah
  group blah:booh has id -1012
  $ pts delete blah
  $ pts ex blah:booh
  Name: blah:booh, id: -1012, owner: 0, creator: megacz, membership: 0, flags: S-M--, group quota: 0.

Is there a danger in doing this, other than perhaps confusion?

- a
[OpenAFS] Re: experience of SQLite on AFS
Jeffrey Altman jalt...@secure-endpoints.com writes:
> When a whole file lock is write-held, all of the dirty data in the cache must be written back to the file server before the lock is released. This is currently not being done and as a result, the database becomes corrupted. I suspect this will be fixed shortly.

Please notify me once this is fixed in the Linux CM and I will test it for you.

- a
[OpenAFS] Re: experience of SQLite on AFS
Ken Dreyer ktdre...@ktdreyer.com writes:
> SQLite has an option in os_unix.c (SQLITE_ENABLE_LOCKING_STYLE) to automatically figure out the database's filesystem type and use the most appropriate locking mechanism for that filesystem. Adam Megacz wrote a patch to SQLite back in 2006 that added AFS to this list of filesystems SQLite could detect. I'm not certain, but I think this only works for OSX (Adam, correct me if I'm wrong :-)

IIRC that is correct. Also, DRHipp never merged the patch (even though I sent him the legal papers he asked for).

> Additionally, SQLite also has the (undocumented?) ability to define a fixed locking style at compile-time with SQLITE_FIXED_LOCKING_STYLE.

I must hasten to add that I have never been able to get sqlite working in a scenario where multiple client machines are concurrently accessing the same database -- even when whole file locking is in use. I originally thought that using whole-file locks only (and no byte-range locks) would work, but as far as I have been able to determine, it *does not*.

> We hope we can make use of byte-range locking some day when OpenAFS supports this on *nix.

Me too, but my hopes are not high. The fact that the databases become corrupted when using whole-file locks only suggests that there is a more subtle problem lurking here.

- a
[OpenAFS] sqlite on AFS will not work, even with whole-file locking
Brandon Simmons brandon.m.simm...@gmail.com writes:
> Thanks for the response. It seems like whole-file locking in sqlite would be a good choice for me in any case,
> In a situation where the whole-file locking scheme is used, would AFS be an acceptable choice? Would it be better than NFS?

I had the same idea, and tried it. It does not work. Your databases will get corrupted. I never figured out why, although I did confirm that sqlite was in fact requesting only whole-file locks.

It would be nice if it worked, though. There are a lot of applications out there where writes to the database are extremely rare, so invalidating all the clients' caches is not a problem.

- a
[OpenAFS] why PAGs?
I recently found out that Coda does not have PAGs, and deliberately omits them (it's not just that they haven't had time to implement them). This got me to wondering: why does AFS have PAGs?

Restricting the focus to UNIX for a moment, if we assume that there is a local userid for every PTS identity, are PAGs really necessary? Even for something like mod_waklog, it should be possible to use local userids for credential isolation.

Just curious. I'm not seriously proposing getting rid of PAGs or anything like that. Just trying to understand things.

- a
[OpenAFS] Re: another MacOS cache manager wedging
Derrick Brashear sha...@gmail.com writes:
> I'd be interested to know if this still happens with 1.5.72,

No, it does not still happen. I've been using 1.5.72 for at least a week now and I am VERY happy with it. On MacOS it is a massive improvement over the 1.4.x series. Thank you!

- a
[OpenAFS] more kernel panics
Hope these reports are helpful... - a meg...@quine:~$sudo ~/bin/openafs-decode.pl -i /Library/Logs/DiagnosticReports/Kernel_2010-02-14-170743_Adam-Megaczs-MacBook.panic /Library/OpenAFS/Tools/root.client/usr/vice/etc/afs.kext appears to be loadable (not including linkage for on-disk libraries). /tmp/afsdebugb4JqKK/gdb.input:6: Error in sourced command file: Cannot access memory at address 0x0 Interval Since Last Panic Report: 299638 sec Panics Since Last Report: 2 Anonymous UUID:06FDBD1C-6E37-46D3-92E6-009678548AC0 Sun Feb 14 17:07:43 2010 panic(cpu 1 caller 0x2a7ac2): Kernel trap at 0x, type 14=page fault, registers: CR0: 0x8001003b, CR2: 0x, CR3: 0x00101000, CR4: 0x06e0 EAX: 0x4613b080, EBX: 0x, ECX: 0x, EDX: 0x051dea04 CR2: 0x, EBP: 0x3513bcc0, ESI: 0x00b8, EDI: 0x EFL: 0x00010216, EIP: 0x, CS: 0x0004, DS: 0x3513000c Error code: 0x0010 Backtrace (CPU 1), Frame : Return Address (4 potential args on stack) 0x3513ba78 : 0x21b2bd (0x5cf868 0x3513baac 0x223719 0x0) 0x3513bac8 : 0x2a7ac2 (0x591c30 0x0 0xe 0x591dfa) 0x3513bba8 : 0x29d968 (0x3513bbc0 0x3513bbf8 0x3513bcc0 0x0) 0x3513bbb8 : 0x0 (0xe 0x35130048 0x2a000c 0xc) 0x3513bcc0 : 0x460bedca (0x45e56864 0x5e43004 0xb8 0x0) 0x3513bdd0 : 0x460c3746 (0x5e43004 0x45e56864 0x3513bf5c 0x4) 0x3513bf18 : 0x460e80e1 (0x45e56864 0x3513bf5c 0x4 0x3513bf3c) 0x3513bf3c : 0x460a6c38 (0x45e56864 0x3513bf5c 0x227595 0x4613bd00) 0x3513bf80 : 0x460a72c1 (0x4613bd00 0x0 0x3fc 0x57e0390) 0x3513bfac : 0x46127f87 (0x4613ba48 0x3513bfc8 0x227595 0x2) 0x3513bfc8 : 0x29d68c (0x35173ad0 0x0 0x29d69b 0x3fe1078) Kernel Extensions in backtrace (with dependencies): org.openafs.filesystems.afs(1.5.71)@0x4609a000-0x46148fff BSD process name corresponding to current thread: kernel_task Mac OS version: 10C540 Kernel version: Darwin Kernel Version 10.2.0: Tue Nov 3 10:37:10 PST 2009; root:xnu-1486.2.11~1/RELEASE_I386 System model name: MacBook1,1 (Mac-F4208CC8) System uptime in nanoseconds: 10802080978895 unloaded kexts: 
com.apple.iokit.IOUSBMassStorageClass 2.5.1 (addr 0x34e81000, size 0x45056) - last unloaded 3666399322849 loaded kexts: org.virtualbox.kext.VBoxNetAdp 3.0.6 - last loaded 3537690793655 org.virtualbox.kext.VBoxNetFlt 3.0.6 org.virtualbox.kext.VBoxUSB 3.0.6 org.virtualbox.kext.VBoxDrv 3.0.6 com.cisco.nke.ipsec 2.0.1 org.openafs.filesystems.afs 1.5.71 com.apple.driver.IOBluetoothBNEPDriver 2.2.4f3 com.apple.filesystems.autofs2.1.0 com.apple.Dont_Steal_Mac_OS_X 7.0.0 com.apple.iokit.CHUDUtils 201 com.apple.driver.AppleIntelYonahProfile 14 com.apple.iokit.CHUDProf214 com.apple.driver.AudioIPCDriver 1.1.2 com.apple.driver.AppleHDA 1.7.9a4 com.apple.driver.AppleUpstreamUserClient3.1.0 com.apple.driver.AppleIntelGMA950 6.0.6 com.apple.driver.SMCMotionSensor3.0.0d4 com.apple.iokit.AppleYukon2 3.1.14b1 com.apple.driver.AirPort.Atheros421.19.8 com.apple.driver.ACPI_SMC_PlatformPlugin4.0.1d0 com.apple.driver.AppleLPC 1.4.9 com.apple.driver.AppleBacklight 170.0.14 com.apple.driver.AppleIntelIntegratedFramebuffer6.0.6 com.apple.driver.AppleUSBTrackpad 1.8.0b4 com.apple.driver.AppleUSBTCKeyEventDriver 1.8.0b4 com.apple.driver.AppleUSBTCKeyboard 1.8.0b4 com.apple.driver.AppleIRController 251.1.4 com.apple.driver.AppleRAID 4.0.6 com.apple.BootCache 31 com.apple.iokit.IOAHCIBlockStorage 1.6.0 com.apple.driver.AppleUSBHub3.8.4 com.apple.AppleFSCompression.AppleFSCompressionTypeZlib 1.0.0d1 com.apple.driver.AppleAHCIPort 2.0.1 com.apple.driver.AppleUSBEHCI 3.7.5 com.apple.driver.AppleEFINVRAM 1.3.0 com.apple.driver.AppleFWOHCI4.4.0 com.apple.driver.AppleUSBUHCI 3.7.5 com.apple.driver.AppleIntelPIIXATA 2.5.0 com.apple.driver.AppleRTC 1.3 com.apple.driver.AppleHPET 1.4 com.apple.driver.AppleSmartBatteryManager 160.0.0 com.apple.driver.AppleACPIButtons 1.3 com.apple.driver.AppleSMBIOS1.4 com.apple.driver.AppleACPIEC1.3 com.apple.driver.AppleAPIC 1.4 com.apple.driver.AppleIntelCPUPowerManagementClient 96.0.0 com.apple.security.sandbox 0 com.apple.security.quarantine 0 
com.apple.nke.applicationfirewall 2.1.11 com.apple.driver.AppleIntelCPUPowerManagement 96.0.0 com.apple.driver.AppleProfileReadCounterAction 17 com.apple.driver.AppleProfileTimestampAction10 com.apple.driver.AppleProfileThreadInfoAction 14 com.apple.driver.AppleProfileRegisterStateAction10 com.apple.driver.AppleProfileKEventAction 10 com.apple.driver.AppleProfileCallstackAction20 com.apple.iokit.IOSurface 73.0 com.apple.iokit.IOBluetoothSerialManager2.2.4f3 com.apple.iokit.CHUDKernLib 207 com.apple.driver.DspFuncLib 1.7.9a4 com.apple.iokit.IOSerialFamily 10.0.3 com.apple.iokit.IOFireWireIP2.0.3 com.apple.iokit.IO80211Family
[OpenAFS] another kernel panic with 1.5.71
meg...@quine:~$sudo ~/bin/openafs-decode.pl -i /Library/Logs/DiagnosticReports/Kernel_2010-02-07-174609_Adam-Megaczs-MacBook.panic /Library/OpenAFS/Tools/root.client/usr/vice/etc/afs.kext appears to be loadable (not including linkage for on-disk libraries). /tmp/afsdebuggvaRGC/gdb.input:6: Error in sourced command file: Cannot access memory at address 0x0 Interval Since Last Panic Report: 1655062 sec Panics Since Last Report: 9 Anonymous UUID:06FDBD1C-6E37-46D3-92E6-009678548AC0 Sun Feb 7 17:46:09 2010 panic(cpu 1 caller 0x2a7ac2): Kernel trap at 0x, type 14=page fault, registers: CR0: 0x8001003b, CR2: 0x, CR3: 0x00101000, CR4: 0x06e0 EAX: 0x45f54080, EBX: 0x, ECX: 0x, EDX: 0x094a6c04 CR2: 0x, EBP: 0x3467bbfc, ESI: 0x0657, EDI: 0x EFL: 0x00010206, EIP: 0x, CS: 0x0004, DS: 0x3467000c Error code: 0x0010 Backtrace (CPU 1), Frame : Return Address (4 potential args on stack) 0x3467b9b8 : 0x21b2bd (0x5cf868 0x3467b9ec 0x223719 0x0) 0x3467ba08 : 0x2a7ac2 (0x591c30 0x0 0xe 0x591dfa) 0x3467bae8 : 0x29d968 (0x3467bafc 0x3467bbfc 0x0 0xe) 0x3467baf4 : 0x0 (0xe 0x34670048 0x2a000c 0xc) 0x3467bbfc : 0x45ed7dca (0x45ce4664 0x5ade004 0x657 0x0) 0x3467bd0c : 0x45edc746 (0x5ade004 0x45ce4664 0x3467be88 0x1) 0x3467be54 : 0x45f02fcd (0x45ce4664 0x3467be88 0x1 0x0) 0x3467beb0 : 0x45f3eebc (0x45ce4664 0x42d8280 0x43f53d4 0x88336f0) 0x3467bed8 : 0x2f6d6c (0x3467befc 0x3711fcb9 0x0 0x0) 0x3467bf28 : 0x2e3e3c (0x88336f0 0x1 0x5aee7c4 0x3467bf5c) 0x3467bf78 : 0x4ee5dc (0x49f6000 0x5aee6c0 0x5aee704 0x0) 0x3467bfc8 : 0x29deb8 (0x4fc3014 0x0 0x4 0x4fc3014) No mapping exists for frame pointer Backtrace terminated-invalid frame pointer 0xbfffe5d8 Kernel Extensions in backtrace (with dependencies): org.openafs.filesystems.afs(1.5.71)@0x45eb3000-0x45f61fff BSD process name corresponding to current thread: emacs Mac OS version: 10C540 Kernel version: Darwin Kernel Version 10.2.0: Tue Nov 3 10:37:10 PST 2009; root:xnu-1486.2.11~1/RELEASE_I386 System model name: MacBook1,1 (Mac-F4208CC8) System 
uptime in nanoseconds: 240706717768 unloaded kexts: com.apple.driver.AppleFileSystemDriver 2.0 (addr 0x2ebb2000, size 0x12288) - last unloaded 146893998952 loaded kexts: org.virtualbox.kext.VBoxNetAdp 3.0.6 - last loaded 52888618916 org.virtualbox.kext.VBoxNetFlt 3.0.6 org.virtualbox.kext.VBoxUSB 3.0.6 org.virtualbox.kext.VBoxDrv 3.0.6 com.cisco.nke.ipsec 2.0.1 org.openafs.filesystems.afs 1.5.71 com.FTDI.driver.FTDIUSBSerialDriver 2.2.14 com.apple.driver.IOBluetoothBNEPDriver 2.2.4f3 com.apple.filesystems.autofs2.1.0 com.apple.Dont_Steal_Mac_OS_X 7.0.0 com.apple.iokit.CHUDUtils 201 com.apple.driver.AppleIntelYonahProfile 14 com.apple.iokit.CHUDProf214 com.apple.driver.AppleIntelGMA950 6.0.6 com.apple.driver.AudioIPCDriver 1.1.2 com.apple.driver.AppleHDA 1.7.9a4 com.apple.driver.AppleUpstreamUserClient3.1.0 com.apple.driver.AppleIntelIntegratedFramebuffer6.0.6 com.apple.driver.SMCMotionSensor3.0.0d4 com.apple.iokit.AppleYukon2 3.1.14b1 com.apple.driver.AirPort.Atheros421.19.8 com.apple.driver.ACPI_SMC_PlatformPlugin4.0.1d0 com.apple.driver.AppleLPC 1.4.9 com.apple.driver.AppleBacklight 170.0.14 com.apple.iokit.SCSITaskUserClient 2.6.0 com.apple.driver.AppleIRController 251.1.4 com.apple.driver.AppleUSBTrackpad 1.8.0b4 com.apple.driver.AppleUSBTCKeyEventDriver 1.8.0b4 com.apple.driver.AppleUSBTCKeyboard 1.8.0b4 com.apple.iokit.IOAHCIBlockStorage 1.6.0 com.apple.driver.AppleRAID 4.0.6 com.apple.driver.AppleUSBHub3.8.4 com.apple.BootCache 31 com.apple.AppleFSCompression.AppleFSCompressionTypeZlib 1.0.0d1 com.apple.driver.AppleUSBEHCI 3.7.5 com.apple.driver.AppleFWOHCI4.4.0 com.apple.driver.AppleAHCIPort 2.0.1 com.apple.driver.AppleIntelPIIXATA 2.5.0 com.apple.driver.AppleUSBUHCI 3.7.5 com.apple.driver.AppleEFINVRAM 1.3.0 com.apple.driver.AppleRTC 1.3 com.apple.driver.AppleHPET 1.4 com.apple.driver.AppleSmartBatteryManager 160.0.0 com.apple.driver.AppleACPIButtons 1.3 com.apple.driver.AppleSMBIOS1.4 com.apple.driver.AppleACPIEC1.3 com.apple.driver.AppleAPIC 1.4 
com.apple.driver.AppleIntelCPUPowerManagementClient
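The raw counters in a panic report are easier to compare across reports once converted to ordinary durations. A minimal Python sketch, fed with the two figures from the report above; the helper names are made up for illustration and are not part of any OpenAFS tool:

```python
from datetime import timedelta


def uptime_from_ns(nanoseconds):
    """Convert the 'System uptime in nanoseconds' field to a timedelta."""
    # timedelta has microsecond resolution, so the sub-microsecond part is dropped.
    return timedelta(microseconds=nanoseconds // 1000)


def interval_from_sec(seconds):
    """Convert the 'Interval Since Last Panic Report' field to a timedelta."""
    return timedelta(seconds=seconds)


if __name__ == "__main__":
    print(uptime_from_ns(240706717768))   # uptime at the time of the panic
    print(interval_from_sec(1655062))     # interval since the last report
```

Read this way, the panic above happened about four minutes after boot, and roughly nineteen days after the previous panic report.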
[OpenAFS] Re: advice on troubleshooting blocked cache manager on MacOS?
Derrick Brashear sha...@gmail.com writes:

> Ok, so, can you gather rxdebug (hungclient) 7001 and perhaps a couple minutes of tcpdump -s 1500 -n -w /tmp/packets host (hungclient) and port 7001 at this point? (specify an ethernet interface with -i if it's not the default that's your upstream)

I can't predict when the hangs will happen, so I can't start the tcpdump until it's already hung. Is this still going to be useful? If so, I will gather the data you request.

By the way, I tried upgrading to 1.5.71 which -- except for the issue I'm just about to post -- has been really great. The blocking issue has not happened yet (although I've only been using it a week), and fs precache is a godsend for streaming media -- works beautifully.

- a
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] reproducible kernel panic with 1.5.71 on MacOS 10.6
I can get this to happen consistently. - a meg...@quine:/tmp$./decode.pl -i /Library/Logs/DiagnosticReports/Kernel_2010-02-06-135623_Adam-Megaczs-MacBook.panic /Library/OpenAFS/Tools/root.client/usr/vice/etc/afs.kext appears to be loadable (not including linkage for on-disk libraries). /var/folders/I6/I6COgALlF4K3h0E+O9vmUE+++TI/-Tmp-/afsdebugI8kfyu/gdb.input:6: Error in sourced command file: Cannot access memory at address 0x0 Can't write to folder /var/db/openafs/logs. at ./decode.pl line 268 main::write_dump_file('/var/db/openafs/logs/crash.dump', 'HASH(0x811c10)', 'add symbol table from file /var/folders/I6/I6COgALlF4K3h0E+O...') called at ./decode.pl line 101 .. Interval Since Last Panic Report: 1586223 sec Panics Since Last Report: 2 Anonymous UUID:06FDBD1C-6E37-46D3-92E6-009678548AC0 Sat Feb 6 13:56:23 2010 panic(cpu 0 caller 0x2a7ac2): Kernel trap at 0x, type 14=page fault, registers: CR0: 0x8001003b, CR2: 0x, CR3: 0x00101000, CR4: 0x06e0 EAX: 0x45fc7080, EBX: 0x, ECX: 0x, EDX: 0x04b94404 CR2: 0x, EBP: 0x34f63cc0, ESI: 0x180c, EDI: 0x EFL: 0x00010216, EIP: 0x, CS: 0x0004, DS: 0x34f6000c Error code: 0x0010 Backtrace (CPU 0), Frame : Return Address (4 potential args on stack) 0x34f63a78 : 0x21b2bd (0x5cf868 0x34f63aac 0x223719 0x0) 0x34f63ac8 : 0x2a7ac2 (0x591c30 0x0 0xe 0x591dfa) 0x34f63ba8 : 0x29d968 (0x34f63bc0 0x34f63bf8 0x34f63cc0 0x0) 0x34f63bb8 : 0x0 (0xe 0x34f60048 0x2a000c 0xc) 0x34f63cc0 : 0x45f4adca (0x45ce2b94 0x9089004 0x180c 0x0) 0x34f63dd0 : 0x45f4f746 (0x9089004 0x45ce2b94 0x34f63f5c 0x4) 0x34f63f18 : 0x45f740e1 (0x45ce2b94 0x34f63f5c 0x4 0x34f63f3c) 0x34f63f3c : 0x45f32c38 (0x45ce2b94 0x34f63f5c 0x246 0x45fc7d00) 0x34f63f80 : 0x45f332c1 (0x45fc7d00 0x34f63fac 0x487ab6 0x34f73ad0) 0x34f63fac : 0x45fb3f87 (0x34f73ad0 0x34f63fcc 0x227595 0x2) 0x34f63fc8 : 0x29d68c (0x34f73ad0 0x0 0x8 0x3dec334) Kernel Extensions in backtrace (with dependencies): org.openafs.filesystems.afs(1.5.71)@0x45f26000-0x45fd4fff BSD process name corresponding to current 
thread: kernel_task Mac OS version: 10C540 Kernel version: Darwin Kernel Version 10.2.0: Tue Nov 3 10:37:10 PST 2009; root:xnu-1486.2.11~1/RELEASE_I386 System model name: MacBook1,1 (Mac-F4208CC8) System uptime in nanoseconds: 217490384250 unloaded kexts: com.apple.driver.AppleFileSystemDriver 2.0 (addr 0x2ebaa000, size 0x12288) - last unloaded 197935465108 loaded kexts: org.virtualbox.kext.VBoxNetAdp 3.0.6 - last loaded 54522995514 org.virtualbox.kext.VBoxNetFlt 3.0.6 org.virtualbox.kext.VBoxUSB 3.0.6 org.virtualbox.kext.VBoxDrv 3.0.6 com.cisco.nke.ipsec 2.0.1 org.openafs.filesystems.afs 1.5.71 com.FTDI.driver.FTDIUSBSerialDriver 2.2.14 com.apple.driver.IOBluetoothBNEPDriver 2.2.4f3 com.apple.filesystems.autofs2.1.0 com.apple.Dont_Steal_Mac_OS_X 7.0.0 com.apple.iokit.CHUDUtils 201 com.apple.driver.AppleIntelYonahProfile 14 com.apple.iokit.CHUDProf214 com.apple.driver.AppleIntelGMA950 6.0.6 com.apple.driver.AudioIPCDriver 1.1.2 com.apple.driver.AppleHDA 1.7.9a4 com.apple.driver.AppleUpstreamUserClient3.1.0 com.apple.driver.AppleIntelIntegratedFramebuffer6.0.6 com.apple.driver.SMCMotionSensor3.0.0d4 com.apple.iokit.AppleYukon2 3.1.14b1 com.apple.driver.AirPort.Atheros421.19.8 com.apple.driver.AppleLPC 1.4.9 com.apple.driver.ACPI_SMC_PlatformPlugin4.0.1d0 com.apple.driver.AppleBacklight 170.0.14 com.apple.iokit.SCSITaskUserClient 2.6.0 com.apple.driver.AppleIRController 251.1.4 com.apple.driver.AppleUSBTrackpad 1.8.0b4 com.apple.driver.AppleUSBTCKeyEventDriver 1.8.0b4 com.apple.driver.AppleUSBTCKeyboard 1.8.0b4 com.apple.iokit.IOAHCIBlockStorage 1.6.0 com.apple.driver.AppleRAID 4.0.6 com.apple.driver.AppleFWOHCI4.4.0 com.apple.driver.AppleUSBHub3.8.4 com.apple.driver.AppleAHCIPort 2.0.1 com.apple.driver.AppleUSBEHCI 3.7.5 com.apple.BootCache 31 com.apple.AppleFSCompression.AppleFSCompressionTypeZlib 1.0.0d1 com.apple.driver.AppleIntelPIIXATA 2.5.0 com.apple.driver.AppleEFINVRAM 1.3.0 com.apple.driver.AppleUSBUHCI 3.7.5 com.apple.driver.AppleRTC 1.3 
com.apple.driver.AppleHPET 1.4 com.apple.driver.AppleSmartBatteryManager
[OpenAFS] bug report: MacOS installer says a newer version is installed even if it was uninstalled
To reproduce:

1. Install OpenAFS 1.5.71
2. Reboot
3. Run the OpenAFS 1.5.71 Uninstall.command script
4. Reboot
5. Attempt to install OpenAFS 1.4.12rc2

It will refuse to install.

- a
[OpenAFS] Re: bug report: MacOS installer says a newer version is installed even if it was uninstalled
Derrick Brashear sha...@gmail.com writes:

> that's true. 1.5.x > 1.4.x for their rules. the 1.5 installers do let you backrev. we could backport it to 1.4

Actually, I think the problem is that the 1.4.x uninstaller does not run pkgutil --forget org.openafs.OpenAFS.pkg. That ought to fix it.

- a
[OpenAFS] Re: another MacOS cache manager wedging
Derrick Brashear sha...@gmail.com writes:

> On Sun, Jan 31, 2010 at 2:23 PM, Adam Megacz a...@megacz.com wrote:
>>   server 169.229.3.178 partition /vicepa RO Site -- New release
>>   server gentzen.megacz.com partition /vicepa RO Site -- Old release
>>   server gentzen.megacz.com partition /vicepa RW Site -- New release
>>   server gentzen.megacz.com partition /vicepa RO Site -- New release
>
> Uh. This isn't right. You have 2 RO sites on the same server and partition.

Ok, fixed, but the problem persists (see below). I'm upgrading to 1.4.12rc2 now; I will see if that solves the issue.

- a

meg...@quine:~$ cmdebug localhost
Lock afs_xvcache status: (upgrade_waiting, write_locked(pid:30499 at:335), 1 waiters)
Lock afs_xserver status: (none_waiting, 1 read_locks(pid:0))
Lock afs_xvcb status: (writer_waiting, write_locked(pid:0 at:273), 1 waiters)
[OpenAFS] Re: another MacOS cache manager wedging
Thanks for taking the time to check this out.

This issue with the CM blocking unnecessarily seems to have been happening intermittently ever since I upgraded from 1.4.6 to 1.4.11 (and continued with 1.4.12 while I was using it). Unfortunately I can't downgrade to see if it goes away because I run Snow Leopard now. I'm pretty darn sure this is an issue with the client -- either something that changed in OpenAFS or else something that changed in MacOS, because it was all humming quite nicely beforehand.

Derrick Brashear sha...@gmail.com writes:

>> Lock afs_xserver status: (none_waiting, 1 read_locks(pid:0))
>> Lock afs_xvcb status: (writer_waiting, write_locked(pid:0 at:273), 2 waiters)
>> ** Cache entry @ 0x45dee6c4 for 200.536870919.44.36878 [megacz.com]
>>     locks: (none_waiting, write_locked(pid:10589 at:54))
>
> Which server is 536870919 on,

The vos ex output is below; let me know if you need additional information.

> and why (from the client's perspective) would it not be answering?

I don't know. There is a significant possibility of very small, steady (like 3%) packet loss on the path between the client and the server, but otherwise both servers (169.229.3.178 and gentzen.megacz.com) are up, healthy, and responsive.

root.cell.readonly                536870919 RO         28 K  On-line
    169.229.3.178 /vicepa
    RWrite  536870918 ROnly          0 Backup          0
    MaxQuota       5000 K
    Creation    Wed Jan 13 19:36:04 2010
    Copy        Sun May 11 11:42:30 2008
    Backup      Wed Jan 13 17:00:11 2010
    Last Update Wed Jan 13 19:34:12 2010
    89 accesses in the past day (i.e., vnode references)

root.cell.readonly                536870919 RO         28 K  On-line
    gentzen.megacz.com /vicepa
    RWrite  536870918 ROnly  536880267 Backup  536870920
    MaxQuota       5000 K
    Creation    Wed Jan 13 19:36:04 2010
    Copy        Wed Jan 13 19:36:04 2010
    Backup      Wed Jan 13 17:00:11 2010
    Last Update Wed Jan 13 19:34:12 2010
    205 accesses in the past day (i.e., vnode references)

    RWrite: 536870918     ROnly: 536870919     Backup: 536870920     RClone: 536870919
    number of sites - 4
       server 169.229.3.178 partition /vicepa RO Site -- New release
       server gentzen.megacz.com partition /vicepa RO Site -- Old release
       server gentzen.megacz.com partition /vicepa RW Site -- New release
       server gentzen.megacz.com partition /vicepa RO Site -- New release
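The site list at the end of that output contains two RO entries for the same server and partition, which is easy to miss by eye. A small Python sketch that flags such duplicates is shown here; it assumes only the "server ... partition ... RO/RW Site" line shape seen in the output above, and the function name is hypothetical:

```python
import re
from collections import Counter

# Matches the site lines in the VLDB part of the output quoted above.
SITE_RE = re.compile(r"server (\S+) partition (\S+) (RO|RW) Site")


def duplicate_sites(vldb_text):
    """Return (server, partition, type) tuples that are listed more than once."""
    counts = Counter(SITE_RE.findall(vldb_text))
    return sorted(site for site, n in counts.items() if n > 1)


SAMPLE = """\
number of sites - 4
   server 169.229.3.178 partition /vicepa RO Site -- New release
   server gentzen.megacz.com partition /vicepa RO Site -- Old release
   server gentzen.megacz.com partition /vicepa RW Site -- New release
   server gentzen.megacz.com partition /vicepa RO Site -- New release
"""

if __name__ == "__main__":
    for server, partition, kind in duplicate_sites(SAMPLE):
        print(f"duplicate {kind} site: {server} {partition}")
```

An RW site and an RO site sharing a server and partition is not flagged, since the site type is part of the key.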
[OpenAFS] Re: another MacOS cache manager wedging
Here's another one:

** Cache entry @ 0x45eb543c for 200.536879758.1.1 [megacz.com]
    locks: (none_waiting, 1 read_locks(pid:6896))
    6144 bytes  DV 438 refcnt 1
    callback 0677db04  expires 1265005388
    0 opens  0 writers
    volume root
    states (0x1), stat'd
[OpenAFS] Re: another MacOS cache manager wedging
Derrick Brashear sha...@gmail.com writes:

>> ** Cache entry @ 0x45e0f004 for 1.1.1.1 [dynroot]
>>     locks: (writer_waiting, write_locked(pid:2870 at:54), 2 waiters)
>
> I don't even have to look at this one. 54 is FetchStatus. Oddly, it's dynroot, so there's something off here.

Here's another:

Lock afs_xserver status: (none_waiting, 1 read_locks(pid:0))
Lock afs_xvcb status: (writer_waiting, write_locked(pid:0 at:273), 2 waiters)
** Cache entry @ 0x45dee6c4 for 200.536870919.44.36878 [megacz.com]
    locks: (none_waiting, write_locked(pid:10589 at:54))
    0 bytes  DV 0 refcnt 1
    callback expires 0
    0 opens  0 writers
    normal file
    states (0x4), read-only
** Cache entry @ 0x45deeafc for 200.536880268.1.1 [megacz.com]
    locks: (none_waiting, 1 read_locks(pid:10548))
    14336 bytes  DV 325 refcnt 1
    callback 07dc1284  expires 1264896318
    3 opens  0 writers
    volume root
    states (0x1), stat'd
[OpenAFS] Re: Recommended way to start up OpenAFS on Solaris 10?
Atro Tossavainen atro.tossavainen+open...@helsinki.fi writes:

> Is everybody still writing their own SMF bits to start OpenAFS on Solaris 10 without /etc/init.d bits, or is there already a Received Way of doing this?

For the server component, I use the script below (with runit, sort of like SMF for Linux). If my "make bosserver handle SIGTERM properly" patch is merged, this mess will get a lot simpler (and more reliable):

    #!/bin/bash
    DAEMON=openafs-fileserver
    mkdir -p /etc/service/$DAEMON/control
    # Build the control/t script (run by runsv before it would send TERM)
    # in a temporary file, then move it into place.
    echo '#!/bin/bash' > /etc/service/.tmpfile-$DAEMON
    echo '/usr/bin/bos shutdown -wait -localauth `hostname`' >> /etc/service/.tmpfile-$DAEMON
    # Double quotes so that $DAEMON is expanded now (it is not set when
    # control/t runs); the escaped backticks are written out literally.
    echo "kill \`cat /etc/service/$DAEMON/supervise/pid\`" >> /etc/service/.tmpfile-$DAEMON
    chmod +x /etc/service/.tmpfile-$DAEMON
    mv /etc/service/.tmpfile-$DAEMON /etc/service/$DAEMON/control/t
    exec /usr/sbin/bosserver -nofork
[OpenAFS] another MacOS cache manager wedging
So, I successfully demultihomed all servers in the cell in question. Unfortunately the random blocking still seems to be happening. The one shown below was particularly nasty: it did not resolve after any reasonable approximation to the timeout value (stayed stuck for well over 30 minutes before I gave up and rebooted the client).

- a

[Attachments: hosed (binary data), hosed.long (binary data)]
[OpenAFS] Re: advice on troubleshooting blocked cache manager on MacOS?
Derrick Brashear sha...@gmail.com writes:

>> I might be able to try that, but it will take a few days.
>
> if true, you should see output in cmdebug now

Okay, I just caught it red-handed. Can anybody help with reading the tea leaves here?

meg...@quine:~$ cmdebug localhost
Lock afs_xvcache status: (none_waiting, write_locked(pid:11013 at:335))
Lock afs_xserver status: (none_waiting, 1 read_locks(pid:0))
Lock afs_xvcb status: (writer_waiting, write_locked(pid:0 at:273), 1 waiters)

- a
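For anyone collecting several of these dumps, the "Lock ... status" lines can be pulled apart mechanically. The Python sketch below is only an illustration built from the three line shapes shown above; it is not part of cmdebug, the field names in the dictionary are invented, and other lock states that cmdebug may print are not handled:

```python
import re

# Covers the line shapes seen in the cmdebug output above:
#   Lock NAME status: (WAITING, write_locked(pid:P at:S))
#   Lock NAME status: (WAITING, N read_locks(pid:P))
# each optionally followed by ", N waiters" before the closing parenthesis.
LOCK_RE = re.compile(
    r"Lock (\w+) status: \((\w+), "
    r"(write_locked|\d+ read_locks)\(pid:(\d+)(?: at:(\d+))?\)"
    r"(?:, (\d+) waiters)?\)"
)


def parse_locks(text):
    """Return one dict per 'Lock ... status' line found in text."""
    locks = []
    for name, waiting, held, pid, site, waiters in LOCK_RE.findall(text):
        locks.append({
            "lock": name,
            "waiting": waiting,
            "held": held,
            "pid": int(pid),
            # The number after "at:" identifies where the lock was taken.
            "site": int(site) if site else None,
            "waiters": int(waiters) if waiters else 0,
        })
    return locks


SAMPLE = """\
Lock afs_xvcache status: (none_waiting, write_locked(pid:11013 at:335))
Lock afs_xserver status: (none_waiting, 1 read_locks(pid:0))
Lock afs_xvcb status: (writer_waiting, write_locked(pid:0 at:273), 1 waiters)
"""

if __name__ == "__main__":
    for entry in parse_locks(SAMPLE):
        if entry["waiters"]:
            print(entry["lock"], "is contended; held at site", entry["site"])
```

Filtering on a nonzero waiter count, as the example does, picks out afs_xvcb in this dump.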
[OpenAFS] Re: advice on troubleshooting blocked cache manager on MacOS?
Derrick Brashear sha...@gmail.com writes:

>> Lock afs_xvcache status: (none_waiting, write_locked(pid:11013 at:335))

Ah, so I am to interpret the thing after the comma as the name of a function somewhere within the openafs source code. Knowing that helps a lot!

> assuming you're not running disconnected and actively trying to disconnect,

Correct.

> So then the question is why FlushVCBs is blocking you. well, you said you had multihomed fileservers.

To be completely precise, one of my fileservers is a machine with two IP addresses, with a one-line NetInfo file. By "multihomed", did you mean "on a machine with two public IPs" or "the AFS server somehow knows about both IPs"?

> RXAFS_GiveUpCallBacks is called here. you didn't perchance grab rxdebug output for the client at this point?

Sorry, no; I will do that next time.

> could we address this? yes! how? well, i suppose we could on network events (macos has support for this) and when a new server is discovered, probe all addresses, so any unreachable addresses are marked down in advance.

How do I ask the cache manager to tell me what IPs it thinks a particular server has?

- a
[OpenAFS] Re: advice on troubleshooting blocked cache manager on MacOS?
Derrick Brashear sha...@gmail.com writes:

> You don't. You can ask the vlserver, which is how the CM found out anyhow:
>
>   vos listaddrs -printuuid -noresolve

Yikes, that list is full of incorrect addresses. How on earth is the list compiled?

- a
[OpenAFS] advice on troubleshooting blocked cache manager on MacOS?
Hi, lately I've been encountering a lot of situations where a process seems to block for a really long time trying to access something in /afs; it usually succeeds, but only after several minutes. This seems to happen only on MacOS (1.4.11, although I saw it with 1.4.10 too). Can anybody give me some advice on how to go about discovering exactly which file access is blocked, and perhaps why? I have control of the fileserver in this situation. Thanks!

- a
[OpenAFS] Re: advice on troubleshooting blocked cache manager on MacOS?
Ken Hornstein k...@cmf.nrl.navy.mil writes:

> If it matters at all, I saw the exact same thing. It seemed to be caused by a combination of a multihomed fileserver and AFS client behind a NAT (yeah, it's easy to see how that would be an issue).

Wow, that is really interesting; *both* of those factors are in play in my situation as well.

> Once the multihomed fileserver went away, it all fixed itself.

I might be able to try that, but it will take a few days.

- a
[OpenAFS] Re: 1.4.12fc1 kernel panics
Simon Wilkinson s...@inf.ed.ac.uk writes:

> However, there is a tool that will help you do this - decode-panic. I'm not sure if we're installing it in the 1.4.x series,

No, you aren't. I very very very strongly urge the gatekeepers to arrange for the installers to always install all tools necessary to create a complete bug report. I take the time to gather the information you guys ask for, but most users won't.

- a
[OpenAFS] Re: 1.4.12fc1 kernel panics
Simon Wilkinson s...@inf.ed.ac.uk writes:

> http://git.openafs.org/?p=openafs.git;a=blob_plain;f=src/packaging/MacOS/decode-panic;h=a775b9a82b1deea7abdc2c0f109dc04446371e60;hb=HEAD

That did not work.

meg...@quine:/tmp$ sudo ./x.pl
Can't find panic file: /Library/Logs/panic.log! at ./x.pl line 75
meg...@quine:/tmp$ ls /Library/Logs/panic.log
ls: /Library/Logs/panic.log: No such file or directory

- a
[OpenAFS] Re: 1.4.12fc1 kernel panics
Simon Wilkinson s...@inf.ed.ac.uk writes:

> Unfortunately, just a bare panic log isn't that much use when it comes to tracking the problem down. Unless we've got exactly the same kernel,

Mac OS X 10.6.2, build 10C540 (general release)

> architecture,

i386 (32-bit)

> OpenAFS build,

Check the subject line.

- a
[OpenAFS] Re: 1.4.12fc1 kernel panics
Derrick Brashear sha...@gmail.com writes:

> Instead, give it some arguments, like -i /Library/Logs/DiagnosticReports/(path to kernel report).

Ok,

Panic Date: Interval Since Last Panic Report: 472905 sec
Kernel Version: Darwin Kernel Version 10.2.0: Tue Nov 3 10:37:10 PST 2009; root:xnu-1486.2.11~1/RELEASE_I386
OpenAFS Version: org.openafs.filesystems.afs(1.4.12fc1)
=
add symbol table from file /tmp/afsdebugLAjeJl/org.openafs.filesystems.afs.sym?
0x21b2bd panic+445: mov 0x8011d0,%eax
0x2a7ac2 kernel_trap+1530: jmp 0x2a7ade kernel_trap+1558
0x29d968 lo_alltraps+712: mov %edi,%esp
0x4607e500 afs_GetDCache+7832: mov 0x64(%edx),%ebx
0x46078a18 BPrefetch+144: mov %eax,-0x3c(%ebp)
0x4607928d afs_BackgroundDaemon+573: jmp 0x460792cb afs_BackgroundDaemon+635
0x460e76a7 afsd_thread+719: call 0x2a013e current_thread
0x29d68c call_continuation+28: add $0x10,%esp

- a
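Since the decoder has to be pointed at a specific report, a helper that picks the most recent one can save some typing. This is a Python sketch under the assumption that kernel reports follow the Kernel_YYYY-MM-DD-HHMMSS_hostname.panic naming seen in the reports posted to this list; the function name is hypothetical:

```python
import re

# Assumed naming scheme, taken from the report file names quoted on this list.
REPORT_RE = re.compile(r"^Kernel_(\d{4}-\d{2}-\d{2}-\d{6})_.+\.panic$")


def newest_panic_report(filenames):
    """Return the file name with the latest embedded timestamp, or None."""
    stamped = []
    for name in filenames:
        match = REPORT_RE.match(name)
        if match:
            # Zero-padded timestamps sort correctly as plain strings.
            stamped.append((match.group(1), name))
    return max(stamped)[1] if stamped else None


SAMPLE = [
    "Kernel_2010-02-06-135623_Adam-Megaczs-MacBook.panic",
    "Kernel_2010-02-14-170743_Adam-Megaczs-MacBook.panic",
    "Kernel_2010-02-07-174609_Adam-Megaczs-MacBook.panic",
    "notes.txt",
]

if __name__ == "__main__":
    print(newest_panic_report(SAMPLE))
```

In practice the list of names would come from something like os.listdir("/Library/Logs/DiagnosticReports"), and the result would be passed to the decoder with -i.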
[OpenAFS] Re: 1.4.12fc1 kernel panics
Simon Wilkinson s...@inf.ed.ac.uk writes:

> If you're using the dmg's from the OpenAFS website,

yes
[OpenAFS] 1.4.12fc1 kernel panics
Interval Since Last Panic Report: 1043811 sec Panics Since Last Report: 2 Anonymous UUID:06FDBD1C-6E37-46D3-92E6-009678548AC0 Fri Jan 15 20:40:17 2010 panic(cpu 1 caller 0x2a7ac2): Kernel trap at 0x4607e500, type 14=page fault, registers: CR0: 0x8001003b, CR2: 0x0064, CR3: 0x00101000, CR4: 0x06e0 EAX: 0x0010, EBX: 0x, ECX: 0x460870e2, EDX: 0x CR2: 0x0064, EBP: 0x34cabf1c, ESI: 0x0bff4004, EDI: 0x EFL: 0x00010297, EIP: 0x4607e500, CS: 0x0004, DS: 0x000c Error code: 0x Backtrace (CPU 1), Frame : Return Address (4 potential args on stack) 0x34cabc48 : 0x21b2bd (0x5cf868 0x34cabc7c 0x223719 0x0) 0x34cabc98 : 0x2a7ac2 (0x591c30 0x4607e500 0xe 0x591dfa) 0x34cabd78 : 0x29d968 (0x34cabd90 0x460b0d6d 0x34cabf1c 0x4607e500) 0x34cabd88 : 0x4607e500 (0xe 0xbff0048 0x34ca000c 0x34ca000c) 0x34cabf1c : 0x46078a18 (0x45da800c 0xf10 0x0 0x34cabf58) 0x34cabf80 : 0x4607928d (0x46100420 0x34cabfac 0x487ab6 0x34883ab8) 0x34cabfac : 0x460e76a7 (0x34883ab8 0x34cabfc8 0x227595 0x2) 0x34cabfc8 : 0x29d68c (0x34883ab8 0x0 0x8 0x50ba33c) Kernel Extensions in backtrace (with dependencies): org.openafs.filesystems.afs(1.4.12fc1)@0x4606c000-0x4610dfff BSD process name corresponding to current thread: kernel_task Mac OS version: 10C540 Kernel version: Darwin Kernel Version 10.2.0: Tue Nov 3 10:37:10 PST 2009; root:xnu-1486.2.11~1/RELEASE_I386 System model name: MacBook1,1 (Mac-F4208CC8) System uptime in nanoseconds: 16634816639383 unloaded kexts: com.apple.driver.AppleFileSystemDriver 2.0 (addr 0x2eda2000, size 0x12288) - last unloaded 161942628549 loaded kexts: org.virtualbox.kext.VBoxNetAdp 3.0.6 - last loaded 34152370293 org.virtualbox.kext.VBoxNetFlt 3.0.6 org.virtualbox.kext.VBoxUSB 3.0.6 org.virtualbox.kext.VBoxDrv 3.0.6 com.cisco.nke.ipsec 2.0.1 org.openafs.filesystems.afs 1.4.12fc1 com.FTDI.driver.FTDIUSBSerialDriver 2.2.14 com.apple.driver.IOBluetoothBNEPDriver 2.2.4f3 com.apple.filesystems.autofs2.1.0 com.apple.Dont_Steal_Mac_OS_X 7.0.0 com.apple.iokit.CHUDUtils 201 
com.apple.driver.AppleIntelYonahProfile 14 com.apple.iokit.CHUDProf214 com.apple.driver.AppleHDA 1.7.9a4 com.apple.driver.AudioIPCDriver 1.1.2 com.apple.driver.AppleUpstreamUserClient3.1.0 com.apple.driver.AppleIntelGMA950 6.0.6 com.apple.driver.SMCMotionSensor3.0.0d4 com.apple.iokit.AppleYukon2 3.1.14b1 com.apple.driver.AirPort.Atheros421.19.8 com.apple.driver.ACPI_SMC_PlatformPlugin4.0.1d0 com.apple.driver.AppleLPC 1.4.9 com.apple.driver.AppleIntelIntegratedFramebuffer6.0.6 com.apple.driver.AppleBacklight 170.0.14 com.apple.driver.AppleIRController 251.1.4 com.apple.driver.AppleUSBTrackpad 1.8.0b4 com.apple.driver.AppleUSBTCKeyEventDriver 1.8.0b4 com.apple.driver.AppleUSBTCKeyboard 1.8.0b4 com.apple.iokit.IOAHCIBlockStorage 1.6.0 com.apple.driver.AppleRAID 4.0.6 com.apple.driver.AppleUSBHub3.8.4 com.apple.driver.AppleUSBEHCI 3.7.5 com.apple.BootCache 31 com.apple.AppleFSCompression.AppleFSCompressionTypeZlib 1.0.0d1 com.apple.driver.AppleFWOHCI4.4.0 com.apple.driver.AppleUSBUHCI 3.7.5 com.apple.driver.AppleEFINVRAM 1.3.0 com.apple.driver.AppleAHCIPort 2.0.1 com.apple.driver.AppleIntelPIIXATA 2.5.0 com.apple.driver.AppleRTC 1.3 com.apple.driver.AppleHPET 1.4 com.apple.driver.AppleSmartBatteryManager 160.0.0 com.apple.driver.AppleACPIButtons 1.3 com.apple.driver.AppleSMBIOS1.4 com.apple.driver.AppleACPIEC1.3 com.apple.driver.AppleAPIC 1.4 com.apple.driver.AppleIntelCPUPowerManagementClient 96.0.0 com.apple.security.sandbox 0 com.apple.security.quarantine 0 com.apple.nke.applicationfirewall 2.1.11 com.apple.driver.AppleIntelCPUPowerManagement 96.0.0 com.apple.iokit.IOSCSIArchitectureModelFamily 2.6.0 com.apple.driver.AppleProfileReadCounterAction 17 com.apple.driver.AppleProfileTimestampAction10 com.apple.driver.AppleProfileThreadInfoAction 14 com.apple.driver.AppleProfileRegisterStateAction10 com.apple.driver.AppleProfileKEventAction 10 com.apple.driver.AppleProfileCallstackAction20 com.apple.iokit.IOSurface
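Even without symbols, the report says which backtrace frames lie inside the OpenAFS kext: the "Kernel Extensions in backtrace" line gives the load range, and each frame's return address can be compared against it. A rough Python sketch of that check follows, using an excerpt of the report above (stack arguments shortened to "..."); the function name is invented and the regular expressions assume only the layout visible in this report:

```python
import re

FRAME_RE = re.compile(r"(0x[0-9a-f]+) : (0x[0-9a-f]+) \(")
KEXT_RE = re.compile(r"([\w.]+)\(([^)]+)\)@(0x[0-9a-f]+)-(0x[0-9a-f]+)")


def kext_frames(report):
    """Return (return_address, offset_into_kext) for frames inside the kext."""
    kext = KEXT_RE.search(report)
    if kext is None:
        return []
    start, end = int(kext.group(3), 16), int(kext.group(4), 16)
    frames = []
    for _frame_pointer, return_address in FRAME_RE.findall(report):
        address = int(return_address, 16)
        if start <= address <= end:
            frames.append((return_address, hex(address - start)))
    return frames


SAMPLE = """\
0x34cabc48 : 0x21b2bd (...)
0x34cabc98 : 0x2a7ac2 (...)
0x34cabd78 : 0x29d968 (...)
0x34cabd88 : 0x4607e500 (...)
0x34cabf1c : 0x46078a18 (...)
0x34cabf80 : 0x4607928d (...)
0x34cabfac : 0x460e76a7 (...)
0x34cabfc8 : 0x29d68c (...)
Kernel Extensions in backtrace (with dependencies):
org.openafs.filesystems.afs(1.4.12fc1)@0x4606c000-0x4610dfff
"""

if __name__ == "__main__":
    for address, offset in kext_frames(SAMPLE):
        print(address, "is afs.kext +", offset)
```

Four of the eight frames fall inside the kext here. The offsets are relative to the kext load address, not to any function, so they still need a symbol file to become names.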
[OpenAFS] Re: afs/c...@realm vs a...@realm vs 1.5.68
Derrick Brashear sha...@gmail.com writes:

> did you ask aklog what it's doing? -d (the debug switch) should tell you exactly what it's doing.

Yep

gentzen:/usr/src# aklog -d -c research.cs.berkeley.edu
Authenticating to cell (server afs.research.cs.berkeley.EDU).
Trying to authenticate to user's realm RESEARCH.CS.BERKELEY.EDU.
Getting tickets: a...@research.cs.berkeley.edu
We've deduced that we need to authenticate to realm RESEARCH.CS.BERKELEY.EDU.
Getting tickets: a...@research.cs.berkeley.edu
Getting tickets: a...@research.cs.berkeley.edu
Kerberos error code returned by get_cred : -1765328377
aklog: Couldn't get AFS tickets:
aklog: Server not found in Kerberos database while getting AFS tickets
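The large negative number that aklog prints is a com_err-style code: a table base plus the Kerberos protocol error number. The Python sketch below decodes it, assuming the MIT krb5 table base of -1765328384 and listing only two of the RFC 4120 protocol codes; the helper names are made up for illustration:

```python
# Base of the krb5 com_err error table in MIT Kerberos (ERROR_TABLE_BASE_krb5).
KRB5_ERROR_TABLE_BASE = -1765328384

# A small excerpt of the protocol error numbers from RFC 4120.
PROTOCOL_ERRORS = {
    6: "KDC_ERR_C_PRINCIPAL_UNKNOWN",  # client not found in Kerberos database
    7: "KDC_ERR_S_PRINCIPAL_UNKNOWN",  # server not found in Kerberos database
}


def krb5_protocol_code(com_err_code):
    """Strip the com_err table base to get the protocol error number."""
    return com_err_code - KRB5_ERROR_TABLE_BASE


def describe(com_err_code):
    code = krb5_protocol_code(com_err_code)
    return PROTOCOL_ERRORS.get(code, "protocol error %d" % code)


if __name__ == "__main__":
    print(describe(-1765328377))
```

For the value in the output above this gives protocol error 7, which agrees with the "Server not found in Kerberos database" message aklog prints next.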
[OpenAFS] Re: afs/c...@realm vs a...@realm vs 1.5.68
Derrick Brashear sha...@gmail.com writes:

>> gentzen:/usr/src# aklog -d -c research.cs.berkeley.edu
>> Authenticating to cell (server afs.research.cs.berkeley.EDU).
>
> Authenticating to cell %s
>
> So for some reason the cell configuration it's getting back from get_cellconfig doesn't include a cell name in it.

Why does it care? I specified the cell to use on the command line, explicitly, like I always do.

- a
[OpenAFS] afs/c...@realm vs a...@realm vs 1.5.68
Many AFS tools will accept either of these two principals for the vlserver/dbserver/fileserver:

  afs/c...@realm
  a...@realm

Is one preferred over the other for new cells?

Moreover, there seems to be some sort of change in the behavior of the 1.5.68 aklog relative to 1.4.11; the new aklog only appears to attempt the latter one.

- a
[OpenAFS] Re: afs/c...@realm vs a...@realm vs 1.5.68
Derrick Brashear sha...@gmail.com writes:

>> Moreover, there seems to be some sort of change in the behavior of the 1.5.68 aklog relative to 1.4.11; the new aklog only appears to attempt the latter one.
>
> There seems not to be.

Well, when I hold tickets for afsad...@research.cs.berkeley.edu and attempt to aklog to research.cs.berkeley.edu (which uses principal afs/research.cs.berkeley@research.cs.berkeley.edu), I see this on RESEARCH.CS.BERKELEY.EDU's KDC:

2009-12-28_19:16:48.25167 Dec 28 11:16:48 research.cs.berkeley.edu krb5kdc[2979](info): TGS_REQ (1 etypes {1}) 65.23.129.159: UNKNOWN_SERVER: authtime 1262027795, afsad...@research.cs.berkeley.edu for a...@research.cs.berkeley.edu, Server not found in Kerberos database
2009-12-28_19:16:48.39314 Dec 28 11:16:48 research.cs.berkeley.edu krb5kdc[2979](info): TGS_REQ (1 etypes {1}) 65.23.129.159: UNKNOWN_SERVER: authtime 1262027795, afsad...@research.cs.berkeley.edu for a...@research.cs.berkeley.edu, Server not found in Kerberos database
2009-12-28_19:16:48.53461 Dec 28 11:16:48 research.cs.berkeley.edu krb5kdc[2979](info): TGS_REQ (1 etypes {1}) 65.23.129.159: UNKNOWN_SERVER: authtime 1262027795, afsad...@research.cs.berkeley.edu for a...@research.cs.berkeley.edu, Server not found in Kerberos database

When I downgrade the client (65.23.129.159) to 1.4.11, everything works fine. I'm sure this is a configuration error on my part, and I've just lucked out in some way that the 1.4.11 client is more forgiving about.

- a
[OpenAFS] More EINVAL's with 1.5.68
Strange, I can cat the file in question, but for some reason Ruby programs (I don't know Ruby) choke on it.

- a

/usr/lib/ruby/1.8/fileutils.rb:1039:in `read': Invalid argument - /afs/megacz.com/user/m/me/megacz/.netstiff/store.new (Errno::EINVAL)
    from /usr/lib/ruby/1.8/fileutils.rb:1039:in `fu_copy_stream0'
    from /usr/lib/ruby/1.8/fileutils.rb:470:in `copy_stream'
    from /usr/lib/ruby/1.8/pstore.rb:369:in `commit_new'
    from /usr/lib/ruby/1.8/pstore.rb:368:in `open'
    from /usr/lib/ruby/1.8/pstore.rb:368:in `commit_new'
    from /usr/lib/ruby/1.8/pstore.rb:297:in `transaction'
    from /usr/bin/netstiff:245:in `each'
    from /usr/bin/netstiff:737:in `cleanup'
    from /usr/bin/netstiff:733:in `initialize'
    from /usr/bin/netstiff:1104:in `initialize'
    from /usr/bin/netstiff:1350:in `new'
    from /usr/bin/netstiff:1350

/usr/lib/ruby/1.8/fileutils.rb:1039:in `read': Invalid argument - /afs/megacz.com/user/m/me/megacz/.netstiff/store.new (Errno::EINVAL)
    from /usr/lib/ruby/1.8/fileutils.rb:1039:in `fu_copy_stream0'
    from /usr/lib/ruby/1.8/fileutils.rb:470:in `copy_stream'
    from /usr/lib/ruby/1.8/pstore.rb:369:in `commit_new'
    from /usr/lib/ruby/1.8/pstore.rb:368:in `open'
    from /usr/lib/ruby/1.8/pstore.rb:368:in `commit_new'
    from /usr/lib/ruby/1.8/pstore.rb:297:in `transaction'
    from /usr/bin/netstiff:245:in `each'
    from /usr/bin/netstiff:737:in `cleanup'
    from /usr/bin/netstiff:733:in `initialize'
    from /usr/bin/netstiff:1049:in `initialize'
    from /usr/bin/netstiff:1337:in `new'
    from /usr/bin/netstiff:1337
[OpenAFS] Re: More EINVAL's with 1.5.68
Andrew Deason adea...@sinenomine.net writes: They are probably performing a different sequence of syscalls on the file, or using different flags, etc. Is this reliably reproducible? fstrace dumps could be more enlightening, Yes, but only until I reboot. I posted the fstrace (see previous thread) but was told that I had to grab one shortly after reboot. Unfortunately this problem doesn't manifest itself until the machine has been up for a while. So we're kinda stuck. - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: Issue report: invalid argument with 1.5.68 but not with 1.4.11
Simon Wilkinson s...@inf.ed.ac.uk writes: Still no sign of anything failing in that log. There's also not much that looks like it's accessing git data, either. Derrick has had a look over it and can't see anything either. Could you try rebooting your machine, then immediately start fstrace and try to reproduce the bug - we might get a bit more information that way. Hrm, disturbing news: the problem went away after rebooting. - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: Current OpenAFS Backup Recommendations
Holger Rauch holger.ra...@empic.de writes: do you have any scripts available that you'd be willing to/allowed to share? I'd highly appreciate that. Sure. I think I posted them here before, but here's where it lives: /afs/megacz.com/srv/bin/dump.sh Also handy for manipulating the incrementals are: /afs/megacz.com/srv/bin/compress.sh /afs/megacz.com/srv/bin/expand.sh The backups are kept as a full dump of the most recent nightly plus a chain of backwards diffs. Only nightlies are supported right now (no concept of monthlies/annuals). - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
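The retention scheme described above (a full copy of the most recent nightly plus a chain of backward diffs) can be sketched with a toy model. This is not dump.sh and makes no claim about how it works; a snapshot is modeled here as a hypothetical dict mapping a path to its content rather than a vos dump file, and the names reverse_delta, apply_delta and BackwardChain are invented for illustration.

```python
def reverse_delta(new, old):
    """Describe the changes needed to turn snapshot `new` back into `old`."""
    restore = {k: v for k, v in old.items() if new.get(k) != v}
    delete = [k for k in new if k not in old]
    return {"set": restore, "delete": delete}


def apply_delta(snapshot, delta):
    """Return a new snapshot with `delta` applied; the input is not modified."""
    result = {k: v for k, v in snapshot.items() if k not in delta["delete"]}
    result.update(delta["set"])
    return result


class BackwardChain:
    """Keep only the latest snapshot in full, plus diffs leading backward."""

    def __init__(self, first_snapshot):
        self.latest = dict(first_snapshot)
        # deltas[0] goes from the latest night to the night before it,
        # deltas[1] goes one night further back, and so on.
        self.deltas = []

    def add_nightly(self, snapshot):
        self.deltas.insert(0, reverse_delta(snapshot, self.latest))
        self.latest = dict(snapshot)

    def restore(self, nights_ago):
        """Rebuild the snapshot from `nights_ago` nights before the latest."""
        state = self.latest
        for delta in self.deltas[:nights_ago]:
            state = apply_delta(state, delta)
        return state
```

Restoring the newest night costs nothing, while older nights cost one diff application per night of distance, which matches the usual trade-off of keeping the diffs pointing backward.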
[OpenAFS] Re: Issue report: invalid argument with 1.5.68 but not with 1.4.11
Simon Wilkinson s...@inf.ed.ac.uk writes: *) The kernel's message log with any messages output when the git clone failed None. *) A fstrace log of the git clone operation (see http://blob.inf.ed.ac.uk/sxw/2009/01/24/using-fstrace-to-debug-the-afs-cache-manager/ ) Ok, it's here: /afs/megacz.com/.pub/afs-cm-dump *) A packet trace of traffic between the client and the fileserver, showing the last 10 or so RPCs before the git clone fails Hrm, that might be trickier. If the fstrace log doesn't have the information you need, could you recommend a particular tcpdump invocation I should use? Thanks Simon! - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: Current OpenAFS Backup Recommendations
Holger Rauch holger.ra...@empic.de writes: Which entry do I have to add to my CellServDB file in order to be able to create the mount point for the cell megacz.com? Make sure you are passing the -afsdb argument to afsd when you start it. Then you won't need to hardwire the IP address into your configuration files. - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: Issue report: invalid argument with 1.5.68 but not with 1.4.11
Russ Allbery r...@stanford.edu writes: The Debian packaging may still be putting it one level up, without the C subdirectory, which may be the problem. Yep, that was it. Simon Wilkinson s...@inf.ed.ac.uk writes: If you could install that, and try the whole thing again, that would be great! Done. New dump is at /afs/megacz.com/.pub/afs-cm-dump Thanks again! - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Issue report: invalid argument with 1.5.68 but not with 1.4.11
I just upgraded one of my machines to the 1.5.68 client, and I've been experiencing some mysterious behavior when using git. The one reproducible oddity is this:
$ git clone /afs/megacz.com/debian/openafs/1.5/openafs-debian.git
Initialize openafs-debian/.git
Initialized empty Git repository in /tmp/openafs-debian/.git/
error: copy-fd: read returned Invalid argument
fatal: failed to copy file to openafs-debian/.git/objects/60/ae957b9b417ed139ccc0156ba9eca9542a48a6
Executing the same command on a machine with the 1.4.11 client works fine. The directory above is world-readable, so feel free to try (but I may need to vos_release later this week, so please try soon if you can). Thanks, - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: Linux tmpfs
Simon Wilkinson s...@inf.ed.ac.uk writes: Crickey - this is a thread from almost 12 months ago - talk about necromancy! Ah yes, the magic of gmane ;) Anyway, the first point is solved by Marc Dionne's LINUX_USE_FH patches in the 1.5.x series - these let you use pretty much any filesystem as a disk cache, and are automatically enabled for kernels that are new enough to lack the iget interface - check to see if LINUX_USE_FH is defined for your build and, if it isn't, define it. That sounds great; looks like you've got another 1.5.x beta tester now. Thanks Simon! - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: Current OpenAFS Backup Recommendations
Russ Allbery r...@stanford.edu writes: You can get remarkably good compression by using xdelta3 on dump files. Don't you have to do full dumps in order to calculate the xdelta3 differences? I pipe the output of vos dump directly to xdelta3, so the full dump never hits the disk. If you perform the dump and diff on the fileserver itself, then the network traffic is also proportional to the size of the diff (rather than the size of the whole dump). You can then send the diff over the network to secondary storage. Secondary storage needs to keep a copy of the full dump and reapply the diff, so the amount of effort is proportional to the size of the dump, but the network traffic is proportional to the diff. It takes a few tries to set it up, but works really smoothly once you've got it going. IMHO it's the best incremental backup solution for AFS that doesn't involve the risk of your proprietary software suddenly becoming unsupported. The ideal situation would be to have the volserver emit an RFC3284 file directly using its knowledge of which file blocks have changed since a given date. This would give you block-granularity incremental dumps instead of the file-granularity dumps without inventing a new dump format; just use an RFC3284-of-dumpfiles. - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
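The pipeline described above, where the output of vos dump is fed straight into xdelta3 so that the new full dump is never written to disk, might look roughly like the following. This is a sketch under assumptions, not the poster's script: it assumes vos and xdelta3 are on PATH, that the caller already has sufficient AFS credentials, and that the previous full dump exists as a regular file for xdelta3 to use as its source. The function names are hypothetical, and the flags shown (vos dump -id/-time, xdelta3 -e/-c/-s) should be checked against the installed versions.

```python
import subprocess


def dump_command(volume):
    """argv for a full volume dump written to stdout (no -file argument)."""
    return ["vos", "dump", "-id", volume, "-time", "0"]


def delta_command(previous_full):
    """argv for xdelta3: encode stdin against previous_full, delta to stdout."""
    return ["xdelta3", "-e", "-c", "-s", previous_full]


def write_delta(volume, previous_full, delta_path):
    """Pipe `vos dump` into xdelta3 and store only the resulting delta."""
    with open(delta_path, "wb") as out:
        dump = subprocess.Popen(dump_command(volume), stdout=subprocess.PIPE)
        delta = subprocess.Popen(
            delta_command(previous_full), stdin=dump.stdout, stdout=out
        )
        # Close our copy of the pipe so vos sees a broken pipe if xdelta3 dies.
        dump.stdout.close()
        delta_rc = delta.wait()
        dump_rc = dump.wait()
    if dump_rc != 0 or delta_rc != 0:
        raise RuntimeError(
            f"pipeline failed: vos exited {dump_rc}, xdelta3 exited {delta_rc}"
        )
```

A call such as write_delta("user.example", "/backup/user.example.full", "/backup/user.example.delta") would then leave only the delta on disk; the volume name and paths here are placeholders.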
[OpenAFS] Re: Linux tmpfs
Rainer Toebbicke r...@pclella.cern.ch writes: 1. tmpfs on linux just works fine, if you have a (small) patch that glues the inode-centric file opens to the dentry-centric tmpfs files. 2. AFS files end up twice in memory, once in the mapping of the AFS file itself and then in the mapping of the cache chunk. We've addressed this by short-circuiting the VM layer for the AFS file, a relatively straightforward mod, but which gets messy as you still need that layer for everything that is memory-mapped, such as executables. Hi, does anybody have copies of one or both of these patches? I'd be quite interested in trying them out. - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: Current OpenAFS Backup Recommendations
Russ Allbery r...@stanford.edu writes: Given the low churn of most of our AFS space and compression of the vos dump files, the backups actually take slightly less space than the entirety of our cell, with 30 day retention. You can get remarkably good compression by using xdelta3 on dump files. I have almost two years' worth of *daily* backups and they consume only 10x the space of the active data itself (and they live on much cheaper/slower storage). I should probably only keep monthlies, but I haven't gotten around to adjusting the scripts. - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] libnss-afs v2.0 released
I am pleased to announce the availability of libnss-afs v2.0: http://www.megacz.com/software/libnss-afs.html This version will return not found if nscd is not running. This resolves numerous minor issues including some which could cause extraordinarily long delays when shutting down or rebooting. http://git.megacz.com/?p=libnss-afs.git;a=commitdiff;h=b5f46b9 A Name Service Switch (NSS) plugin is a shared library used by glibc to -- among other things -- translate between usernames and numeric userids and between group names and numeric groupids. The libnss-afs library is an NSS plugin which answers these queries using the information stored in the AFS ptserver, avoiding the need to duplicate (and update) this information in /etc/passwd or LDAP. The library also synthesizes the name AfsPag- for the fake group ids that are used to represent AFS PAGs. - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
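To make the role of an NSS plugin concrete: any program that asks glibc to translate a numeric group id goes through the configured NSS sources, so a plugin such as the one announced above is consulted without the program knowing about it. The sketch below uses Python's standard grp module, which calls into the same lookup path; the helper name group_label is made up for this example, and falling back to the bare number is just one possible policy.

```python
import grp
import os


def group_label(gid):
    """Return the group name for gid, or the number itself if NSS has no answer."""
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:
        # No configured NSS source could name this gid.
        return str(gid)


def current_group_labels():
    """Names (or bare numbers) for every group of the current process."""
    return [group_label(gid) for gid in os.getgroups()]
```

On a host where no source can name a gid, the fallback branch is what a tool like id or groups would otherwise report as a missing name.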
[OpenAFS] Re: LDAP backend for PTS?
Holger Rauch holger.ra...@empic.de writes: Is there a PTS backend for OpenLDAP available and actively maintained (in the sense that it can be used in conjunction with OpenAFS 1.4.x or 1.5.x)? What you want is actually an NSS module, not an LDAP module. You're probably using the LDAP module for NSS; you want to augment that with a ptserver module. Here's where you can get it: http://www.megacz.com/software/libnss-afs.html - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: Combined AFS/Kerberos Apache 2 module
Hey, neat! Kevin Hildebrand ke...@umd.edu writes: In addition, when obtaining AFS tokens, it's possible to do so before the Apache directory walk phase, which is a current limitation of mod_waklog. Well, not entirely... http://article.gmane.org/gmane.comp.file-systems.afs.modwaklog.devel/114 When using this module, the use of mod_waklog is not required. What is the equivalent for mod_waklog's WaklogLocationPrincipal directive? - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: problems with cygwin
Same here; all of Cygwin has lost access to AFS. Sadly I too will need to downgrade. - a Lars Schimmer l.schim...@cgv.tugraz.at writes: Hi! One of my students sent me this regression: After upgrading OpenAFS Client on WindowsXP (32 bit) from openafs-en_US-1-5-61.msi to openafs-en_US-1-5-62.msi I can't read files inside AFS with any Cygwin app. Example: $ cat test.txt cat: test.txt: Invalid request code I also tested other apps: nano, less, lighttpd. None can read any file. Tested on several different PCs with different users, files and paths. After downgrading to 1.5.61 things work again. Best regards, Lars Schimmer -- - TU Graz, Institut für ComputerGraphik WissensVisualisierung Tel: +43 316 873-5405 E-Mail: l.schim...@cgv.tugraz.at Fax: +43 316 873-5402 PGP-Key-ID: 0x4A9B1723 -- ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: OpenAFS 1.5.61 released
Derrick Brashear sha...@openafs.org writes: MacOS: * GUI installer now asks for local cell information. The no local cell option seems to be missing (IMHO it should be the default). - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: Trouble with libnss-afs (on amd64)
Derrick Brashear sha...@gmail.com writes: Krull gkli...@cs.uni-goettingen.de wrote: the latest version 1.08 of libnss-afs cannot be built on an amd64 Ubuntu Jaunty. Hi Krull. I don't have root access to any amd64 machines, so unfortunately I can't reproduce this. If you find a solution, please let me know. - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: Trouble with libnss-afs (on amd64)
Russ Allbery r...@stanford.edu writes: Adam, on Debian you want to link with -lafsauthent_pic -lafsrpc_pic. At least if they're available, which for lenny may require using a backport since they only went in the 1.4.10 release. I see. Russ, as a Debian guru, can you advise me on the officially-approved way to pick the right libraries at deb-install-time? It's more important that libnss-afs continue to work properly with etch's native version of openafs than that it work on amd64. Thanks, - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: Trouble with libnss-afs
Russ Allbery r...@stanford.edu writes: Adam, on Debian you want to link with -lafsauthent_pic -lafsrpc_pic. At least if they're available, which for lenny may require using a backport since they only went in the 1.4.10 release. One possibility: I could require >=1.4.10 in order to build, but allow installation on earlier versions of OpenAFS. Will this work? In other words, if libnss_afs.so statically links to libafsauthent_pic.a/libafsrpc_pic.a from 1.4.10 and then the shared library is introduced to (say) a 1.4.2 system, will it cause problems? - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: Trouble with libnss-afs
Russ Allbery r...@stanford.edu writes: One possibility: I could require >=1.4.10 in order to build, but allow installation on earlier versions of OpenAFS. Will this work? Yup. Okay, libnss-afs 1.09 has been released. Debs are in the usual place. http://git.hcoop.net/?p=megacz/libnss-afs.git;h=tags/1.09 Please let me know if this works on amd64 (can't test it myself). - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: Trouble with libnss-afs
mozafar roshany m.rosh...@gmail.com writes: I am working on OpenAFS through this document on Debian Lenny: ... I should mention that I've installed the libnss-afs_1.08_i386.deb package or its 1.07 version. Hi, Mozafar. Could you please try upgrading to libnss-afs 1.08? I recently fixed a linking problem that only causes trouble for users of newer libc's: http://git.hcoop.net/?p=megacz/libnss-afs.git;a=commit;h=491cdaa05effc3 /afs/hcoop.net/user/m/me/megacz/public/libnss-afs/libnss-afs_1.08_i386.deb Note that the debs with 1.08 in their name were rebuilt recently; git is the authoritative source of version numbers. Thanks, - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: afs: Lost contact with file server on the same machine?
Jason Edgecombe ja...@rampaginggeek.com writes: Have you increased the fileserver logging to the maximum level? Yes, we tried that a while back. Apparently throttling doesn't get logged. - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: afs: Lost contact with file server on the same machine?
Andrew Deason adea...@sinenomine.net writes: No, there's no log message that indicates that this is happening; How unfortunate. but having logs for this would somewhat defeat the purpose of the throttling. If you're triggering the throttling behavior, logging it would almost certainly really slow down the fileserver. Presumably the fileserver would only log this message at most once per client IP per hour or something like that. The least disruptive way to see if it's happening is probably correlating kernel error messages with the problem, as Russ mentioned earlier. I don't think he ever explained what server-side event he was correlating the client-side kernel messages with. Although, if it's tolerable, it may be easiest to just disable throttling by passing -abortthreshold 0 to the fileserver, and see if the problem goes away. THANK YOU. That's the sort of solution I was looking for. We will implement this immediately. - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
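The suggestion above, that a throttling notice be logged at most once per client IP per hour, is a small rate-limiting problem. The following is a generic sketch of that idea only; it is not OpenAFS code, the fileserver is not known to offer such an option, and the class name ThrottleLogLimiter and its parameters are invented for illustration.

```python
import time


class ThrottleLogLimiter:
    """Allow at most one log message per client within a fixed interval."""

    def __init__(self, interval_seconds=3600.0, clock=time.monotonic):
        self.interval = interval_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self.last_logged = {}  # client IP -> time of the last emitted message

    def should_log(self, client_ip):
        """Return True if a message for client_ip may be written now."""
        now = self.clock()
        last = self.last_logged.get(client_ip)
        if last is not None and now - last < self.interval:
            return False
        self.last_logged[client_ip] = now
        return True
```

The cost per throttled request is a single dictionary lookup, which is the point of the suggestion: the logging overhead stays bounded even when a misbehaving client triggers throttling continuously.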
[OpenAFS] Re: afs: Lost contact with file server on the same machine?
Adam Megacz meg...@hcoop.net writes: - Have you tried using rxdebug to see if the fileserver is getting caught up on something? Try running it when one of the clients claims it's lost contact with the server. Unfortunately we can't reproduce the bug on demand. It tends to happen when nobody's looking, and goes away quickly enough that by the time somebody gets an admin's attention it has gone away. I should add, however, that I am eager to add or enable any sort of instrumentation that might help in determining the cause of these problems and/or fix them. If there's any sort of logging I can/should turn on, that would be great. But an action that needs to be performed manually while the bug is in progress isn't really feasible. - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: afs: Lost contact with file server on the same machine?
Russ Allbery r...@stanford.edu writes: This sounds identical to the problem that we were having with our web servers that was mostly caused by CGI script tokens expiring and then scripts continuing to try to access AFS until the file server started throttling Rx connections. Can I get the fileserver to log a message indicating that it has decided to throttle connections from a host? - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: afs: Lost contact with file server on the same machine?
Esther Filderman mizmo...@gmail.com writes: - Does the lost contact with server occur on all clients at the same time? Or is it scattered which one loses contact? It is definitely scattered; we've seen situations where one client lost contact while another seemed to be having no troubles. - For how long does the lost contact occur? Is it seconds or minutes or longer? Around 10-15 minutes, or until the next fs checks, whichever comes first. Some users know to run fs checks to make this go away, but most don't. Others are seeing unsupervised cron/at jobs fail as a result of this. - Simple, stupid question: Have you confirmed your hardware is OK and not causing hiccups in the system? Yes. - Have you tried using rxdebug to see if the fileserver is getting caught up on something? Try running it when one of the clients claims it's lost contact with the server. Unfortunately we can't reproduce the bug on demand. It tends to happen when nobody's looking, and goes away quickly enough that by the time somebody gets an admin's attention it has gone away. Thanks for your help, - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: Connection Timed Out errors occasionally when accessing openafs drive
FWIW, we are still experiencing this problem as well after upgrading to 1.4.10, although it seems to occur less often than it did before. - a Ken Elkabany k...@elkabany.com writes: I upgraded our server and client to 1.4.10. Unfortunately, I am still receiving Connection Timed Out errors. They rarely occur, but when they do they are a severe hindrance. My use case is as follows: Three different unix user accounts (root, www-data, aux) are all running multiple background processes (~9 total) which access the afs mount. They each automatically acquire, or re-acquire, tickets and tokens, and then proceed to read, copy, and write files. Occasionally, upon creating a directory using a Python os call similar to mkdir -p (os.makedirs), I receive a Connection Timed Out error. The processes must then be restarted. Any other suggestions? Ken On Sun, May 10, 2009 at 7:41 PM, Derrick Brashear sha...@gmail.com wrote: it probably matters in the server here, but both. Derrick On May 10, 2009, at 10:35 PM, Ken Elkabany k...@elkabany.com wrote: Is this bug fixed in the client or the server? Thanks. Ken On Sun, May 10, 2009 at 7:22 PM, Derrick Brashear sha...@gmail.com wrote: I'd venture this is a bug fixed in 1.4.10, with idle dead time computation in rx. Derrick On May 10, 2009, at 9:53 PM, Ken Elkabany k...@elkabany.com wrote: Hello, I have openafs 1.4.9 client and server running on two separate machines across a WAN. The client has scripts that access the /afs/our.cell/ directory. Occasionally, the script will fail to complete, and the logs will report Connection Timed Out on a mkdir -p /afs/our.cell/x/y/z command. The frequency of the errors is approximately 1 in 100, small enough to not be easily reproducible manually, but enough to hamper our project. The scripts run as the root user and are guaranteed to have the proper ticket and token. It's also important to note that these scripts often run in parallel (4 at a time, all root, modifying our cell).
When one fails, all scripts running concurrently will fail with the same error, and I typically either unlog;kdestroy or restart the openafs-client (I am unsure which of those solutions is necessary or sufficient). I will soon have an additional LAN setup, and will determine if the same error occurs. Has anyone dealt with this issue before? Thank you for the assistance, Ken ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info -- ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: afs: Lost contact with file server on the same machine?
For the benefit of the mailing list archives, I'd just like to mention that upgrading from 1.4.6 to 1.4.10 seems to have helped quite a bit, but the problem remains. It just happens less frequently. - a Adam Megacz meg...@hcoop.net writes: Hello, We've got a situation where clients seem to be encountering afs: Lost contact with file server fairly frequently (at least once a week). This is happening both for a client machine which is on the same ethernet switch as the fileserver (no NAT going on) as well as the OpenAFS client running on the server machine losing contact with the fileserver process running on the very same machine (so it's unlikely to actually be the network). Sending kill -TSTP to the fileserver to increase the logging level hasn't revealed anything interesting happening at the time that contact is lost. Is there any way to get more detailed information about the reason why the client decided that it had lost contact? For example, whether the failure was due to a timeout, an ICMP unreachable, or no-route-to-host, etc? All machines in question are running OpenAFS 1.4.6 (client and server), using the debian packages. The fileserver is running with these arguments: -p 23 -busyat 600 -rxpck 400 -s 1200 -l 1200 -cb 65535 -b 240 -vc 1200 Thanks for any suggestions... - a -- ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: Notes on how to get filedrawers minimally working with R/W AFS access on Debian (at least).
David Boyes dbo...@sinenomine.net writes: A bit of googling reveals that Adam Megacz has actually done a Debian package of filedrawers. This saves some time: /afs/hcoop.net/user/m/me/megacz/public/filedrawers/ FWIW, this is probably way out of date. If the filedrawers folks are interested in distributing my debianization as part of the package (so it remains up to date), I will update it for them. So, it turns out that Adam Megacz expects you to still have Apache 1 installed in order to build the damn thing. Yeah, this is because both the Apache1 module and the Apache2 module are built from the same source package. If there are any Debian wizards out there who know how to make a multiple-binary-packages-from-one-source-package that lets you selectively build some of the binary packages but not others, and doesn't require you to have the Build-Depends's for the packages you haven't selected, please let me know. This might not be possible. - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: Notes on how to get filedrawers minimally working with R/W AFS access on Debian (at least).
Jason Edgecombe ja...@rampaginggeek.com writes: So, it turns out that Adam Megacz expects you to still have Apache 1 installed in order to build the damn thing. Yeah, this is because both the Apache1 module and the Apache2 module are built from the same source package. If there are any Debian wizards out there who know how to make a multiple-binary-packages-from-one-source-package that lets you selectively build some of the binary packages but not others, Why not build all of the packages and let the user choose at apt-get time? That's what it currently does. - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: LDAP-AFS interaction (best practice?)
Stephen Joyce step...@physics.unc.edu writes: We currently use cfengine and custom scripts to manage /etc/passwd by sourcing a central file and checking AFS PTS group memberships to build the local file hourly. You might want to check out libnss-afs. http://deleuze.hcoop.net/~megacz/software/libnss-afs.html - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] afs: Lost contact with file server on the same machine?
Hello, We've got a situation where clients seem to be encountering afs: Lost contact with file server fairly frequently (at least once a week). This is happening both for a client machine which is on the same ethernet switch as the fileserver (no NAT going on) as well as the OpenAFS client running on the server machine losing contact with the fileserver process running on the very same machine (so it's unlikely to actually be the network). Sending kill -TSTP to the fileserver to increase the logging level hasn't revealed anything interesting happening at the time that contact is lost. Is there any way to get more detailed information about the reason why the client decided that it had lost contact? For example, whether the failure was due to a timeout, an ICMP unreachable, or no-route-to-host, etc? All machines in question are running OpenAFS 1.4.6 (client and server), using the debian packages. The fileserver is running with these arguments: -p 23 -busyat 600 -rxpck 400 -s 1200 -l 1200 -cb 65535 -b 240 -vc 1200 Thanks for any suggestions... - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: retaining AFS-specific nameless group IDs (PAG) in `id' and `groups'
Jim Meyering [EMAIL PROTECTED] writes: Since you guys are interested in AFS, I'm hoping one of you will respond to the above. http://lists.openafs.org/pipermail/openafs-info/2008-April/029132.html I'll wait a few days, after which, if I don't hear anything, I'll just revert to the old behavior. If old behavior means no special action for GIDs that might be PAGs, I think that is the right course of action. - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
Re: [OpenAFS] Re: coreutils-6.11 released
Didi [EMAIL PROTECTED] writes: the main problem is that through this the 'groups' command becomes utterly useless and confused quite a lot of users. $ groups users id: cannot find name for group ID 1091323188 If you would like that numeric groupid to resolve to some alphanumeric group name, the right way to do that is to use the NSS: http://www.hcoop.net/~megacz/software/libnss-afs.html If someone can provide code to determine efficiently whether a nameless GID is a PAG then we can probably make everyone happy. The code you are looking for appears in libnss-afs, but it is based on assumptions that are only valid on a system known to be running the OpenAFS client. In other words, unless coreutils somehow detects the presence of the OpenAFS client (pioctls?), it probably shouldn't be trying to guess at what is or isn't a PAG GID the way libnss-afs does. - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
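As an illustration of the kind of guess being discussed, here is one heuristic for spotting a GID that is probably a PAG. It rests on an assumption that is not stated in this thread and is not taken from the libnss-afs source: that the Linux OpenAFS client encodes a one-group PAG as a 32-bit GID whose top byte is ASCII 'A' (0x41). As the message above points out, such a check only means anything on a host known to be running the OpenAFS client. The function name is hypothetical.

```python
PAG_MARKER = 0x41  # ASCII 'A'; assumed top byte of a one-group PAG GID


def looks_like_pag_gid(gid):
    """Heuristic: True if gid is a 32-bit value whose top byte is 'A'.

    This is a guess based on an assumed encoding, not an authoritative test.
    """
    if not 0 <= gid <= 0xFFFFFFFF:
        return False
    return (gid >> 24) == PAG_MARKER
```

The nameless group ID quoted in the message, 1091323188, is 0x410C4534 in hexadecimal, so it matches this pattern, while ordinary small GIDs do not.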
[OpenAFS] do the servers pick up CellServDB changes without a restart?
When a new server is added to a cell, is it necessary to bos restart the existing servers to make them notice the changes to CellServDB, or are these changes picked up automatically by vlserver/ptserver/etc? - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: maildir on openafs [new faq entry]
David Bear [EMAIL PROTECTED] writes: I seem to distantly recall some discussion about storing maildir directories on openafs, but I don't remember if it was safe, discouraged, or otherwise problematic. Anyone see problems with putting maildir in afs? HCoop is doing this with courier, and it works, although the gadgetry currently used to acquire tokens is really, really sketchy. Many users really like having shell access to their mailbox, backup volumes of their mail, the ability to grep their mail, etc. The (sole) SMTP server, (sole) IMAP server, and AFS fileserver all happen to be the same (fairly powerful) machine, so we may be dodging some of the performance issues that other people see. Robert Banz [EMAIL PROTECTED] writes: It's a mess. AFS is not for mail. Unix user accounts are not for mail. Use an actual mail system and do it right ;) This sentiment comes up here often, and although there is much truth in it, I think that stating it so dogmatically might not be the most productive route to take. Mail uses storage; AFS provides storage; so, let's not imply that putting mail in AFS is obviously stupid! (perhaps it's only non-obviously stupid). The problems, as I am aware of them, are: - AFS does not perform well under the sort of multiple-machine concurrent access scenario that certain mail architectures (large-site Cyrus) use. - Unlike POP, the IMAP protocol offers many features which are best implemented by backend storage which is more database-like than filesystem-like in nature. AFS is a less than ideal storage medium for databases, for reasons explained elsewhere. Anyways, I've started a FAQ entry to collect concrete reasons why mail should or should not be stored in AFS. http://www.dementia.org/twiki/bin/view/AFSLore/AdminFAQ#3_52_Is_it_a_good_idea_to_store I will update the FAQ entry with the proceeds of this thread; please share your reasoning for encouraging or discouraging mail in AFS.
- a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] openafs-devel not processing messages?
I've posted two messages to openafs-devel lately (one yesterday, one today) and neither has come through... could somebody perhaps check on that list to see if it is operating correctly? Thanks, - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] Re: other-realm groups in ACLs?
Jeffrey Altman [EMAIL PROTECTED] writes: Please clarify what you are asking. Are you asking if you can use the group definitions from cell A on ACLs in cell B? Yes. Derrick Brashear [EMAIL PROTECTED] writes: No. And my server has no creds to do a lookup in your realm Sorry, I should have indicated that I was assuming a cross-realm trust between the home kerberos realms of the two cells. (and it could be hazardous if i lost contact with you) Ah, very good point. Now I see why this isn't practical. - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info
[OpenAFS] other-realm groups in ACLs?
Does AFS support the use of pts groups in a remote (trusted) cell on ACLs in the home cell? In other words, is there a way to get this to work? fs sa /afs/home.edu/xyz/ somebody:[EMAIL PROTECTED] rli Or is system:[EMAIL PROTECTED] the only group that is allowed to have an @ in it? - a ___ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info